Pivot Tables in R – Basic Pivot table, columns and metrics

Posted on: September 28, 2017 Posted by: Guy Manova Comments: 0

Pivot Tables in R – Basic Pivot table, columns and metrics

Creating basic pivot tables in R with different metrics (measures)

follow the step by step below or download the R file and load into R studio from github to create basic pivot tables in R:

in this tutorial we’ll create basic pivot tables in R (summary tables), looking at it from an Excel pivot perspective first, then sql, then R:

# clean up the environment

rm(list = ls())

# we’ll need dplyr and tidyr for this, by far easier than with base R

library(dplyr)
library(tidyr)

 

# read in the Canadian SuperStore dataset from this dropbox address:

df <- read.csv(“https://www.dropbox.com/s/kj9yioc24iq4pdb/superstore.csv?dl=1”)
write.csv(df, “superstore.csv”)

let’s have a look in excel; we’ll create a pivot table on the region, product category and customer segment as attributes and the sales (sum) as metric:

pivot table in R

pivot table in R

# pivot tables in R takes a very sql-like approach with the dplyr package: let’s say we wish to
# pivot so we see the total sales per product category, region and customer segment.
# in sql, it would be something like:
# SELECT Product.Category, Region, Customer.Segment, SUM(Sales)
# FROM df
# GROUP BY Product.Category, Region, Customer.Segment

pivot table in R

 

# in dplyr, we follow a similar logic:

# first select the columns

pivot <- df %>%
select(Product.Category, Region, Customer.Segment, Sales)

head(pivot)

pivot table in R

# then group by

pivot <- df %>%
select(Product.Category, Region, Customer.Segment, Sales)%>%
group_by(Product.Category, Region, Customer.Segment)

# then tell R how to sum the sales (sum, count, average etcs):

pivot <- df %>%
select(Product.Category, Region, Customer.Segment, Sales)%>%
group_by(Product.Category, Region, Customer.Segment) %>%
summarise(TotalSales = sum(Sales))

head(pivot)

pivot table in R

# and so there’s our pivot “select” statement, so to speak.

# let’s add a few things, as we would in Excel pivot or sql – more computed columns,
# adding avg sales and number (count) of sales transactions:

Excel – drag the sales into the values grid a couple more times, changing the metric to average and count instead of sum:

pivot table in R

In SQL, add a couple of aggregate columns:

pivot table in R

and the R equivalent:

pivot <- df %>%
select(Product.Category, Region, Customer.Segment, Sales)%>%
group_by(Product.Category, Region, Customer.Segment) %>%
summarise(TotalSales = sum(Sales), AvgSales = mean(Sales,na.rm = TRUE),
NumSales = length(!is.na(Sales)))

head(pivot)

pivot table in R

Leave a Reply:

Your email address will not be published. Required fields are marked *