Pivot tables in R – calculated field (or measure)

Posted on: September 28, 2017 Posted by: Guy Manova Comments: 0

Pivot tables in R – calculated field (or measure)

How to create calculated fields in R pivot tables

In this tutorial, we’ll see how you can replicate an Excel pivot table calculated field (or calculated measure depending on your version) or aggregate SQL and create a calculated field in R, using the dplyr package. follow step by step below or download the R file from github. 

the previous tutorial explaining how the pivot is created is here:

# clean up the environment

rm(list = ls())

# we’ll need dplyr and tidyr for this, by far easier than with base R

library(dplyr)
library(tidyr)

# read in the Canadian SuperStore dataset from this dropbox address:

df <- read.csv(“https://www.dropbox.com/s/kj9yioc24iq4pdb/superstore.csv?dl=1”)
write.csv(df, “superstore.csv”)

# in a previous tutorial, we created this summary table:

pivot <- df %>%
select(Product.Category, Region, Customer.Segment, Sales)%>%
group_by(Product.Category, Region, Customer.Segment) %>%
summarise(TotalSales = sum(Sales), AvgSales = mean(Sales,na.rm = TRUE),
NumSales = length(!is.na(Sales)))

head(pivot)

calculated field in R pivot table

# Now, let’s add a calculated column from two other columns and calculate
# profitability. we’ll add “Profit” to the summarize function, and use profit/sales
# in the calculated column:

calculated field in Excel:

calculated field in R pivot table excel

calculated field in R pivot table excel

calculated field in R pivot table excel

calculated field in SQL:

calculated field in R pivot table sql

calculated field in R:

pivot <- df %>%
select(Product.Category, Region, Customer.Segment, Sales, Profit)%>%
group_by(Product.Category, Region, Customer.Segment) %>%
summarise(TotalSales = sum(Sales), TotalProfit = sum(Profit),
AvgSales = mean(Sales,na.rm = TRUE),
NumSales = length(!is.na(Sales))) %>%
mutate(AvgProfitMar = sum(TotalProfit)/TotalSales) %>%
arrange(Region, desc(AvgSales))

head(pivot)

calculated field in R pivot table

# note that MUTATE can be used here (creating a new column) but since the new column is a summary
# operation, it can also simply be added to the summary function:

pivot <- df %>%
select(Product.Category, Region, Customer.Segment, Sales, Profit)%>%
group_by(Product.Category, Region, Customer.Segment) %>%
summarise(TotalSales = sum(Sales), TotalProfit = sum(Profit),
AvgSales = mean(Sales,na.rm = TRUE),
NumSales = length(!is.na(Sales)),
AvgProfitMar = sum(TotalProfit)/TotalSales) %>%
arrange(Region, desc(AvgSales))

head(pivot)

calculated field in R pivot table

 

Leave a Reply:

Your email address will not be published. Required fields are marked *