Pivot table value filters and label filters in R

Posted on: September 30, 2017 Posted by: Guy Manova Comments: 0

Pivot table value filters and label filters in R

R pivot tables filtering – learn how to create value filters and label filters in R

In this tutorial we’ll see how to replicate excel pivot tables value and label filters, by creating value filters and label filters in R and distinguish between the differences also using sql’s “where” and “having” clauses as they are identical in logic to how it’s done in R.

follow the step by step below and copy paste the code to your R console, or download the R file from github.

You can get the details of how the pivot is constructed in the previous vids here, here and here.

 

# clean up the environment

rm(list = ls())

# we’ll need dplyr and tidyr for this, by far easier than with base R

library(dplyr)
library(tidyr)

# read in the Canadian SuperStore dataset from this dropbox address:

df <- read.csv(“https://www.dropbox.com/s/kj9yioc24iq4pdb/superstore.csv?dl=1”)
write.csv(df, “superstore.csv”)

 

# in a previous tutorial we created a summary table with this dplyr code, summarising
# sales by product category, region and customer segment:

pivot2 <- df %>%
select(Product.Category, Region, Customer.Segment, Sales) %>%
group_by(Product.Category, Region, Customer.Segment) %>%
summarise(TotalSales = sum(Sales))

 

# first, let’s create a label filter emulating a WHERE clause, so only
#product category Furniture will be taken into account:

Excel:

pivot table value filter label filter excel

pivot table value filter label filter excel

SQL with WHERE clause:

pivot table value filter label filter sql

In R with dplyr, filtering before the group by is similar to WHERE:

pivot2 <- df %>%
select(Product.Category, Region, Customer.Segment, Sales) %>%
filter(Product.Category==”Furniture”) %>%
group_by(Product.Category, Region, Customer.Segment) %>%
summarise(TotalSales = sum(Sales))

pivot table value filter label filter R

pivot table value filter label filter R

# same, but product category being EITHER Furniture OR Technology

pivot2 <- df %>%
select(Product.Category, Region, Customer.Segment, Sales) %>%
filter(Product.Category==”Furniture”|Product.Category==”Technology”) %>%
group_by(Product.Category, Region, Customer.Segment) %>%
summarise(TotalSales = sum(Sales))

head(pivot2)

pivot table value filter label filter R

# same, but product category being Furniture AND customer segment being Consumer

pivot2 <- df %>%
select(Product.Category, Region, Customer.Segment, Sales) %>%
filter(Product.Category==”Furniture”& Customer.Segment==”Consumer”) %>%
group_by(Product.Category, Region, Customer.Segment) %>%
summarise(TotalSales = sum(Sales))

head(pivot2)

pivot table value filter label filter R

# now let’s look at value filters, either before grouping (=WHERE) or after (=HAVING)

to filter only single sales > 10000$ in Excel – we would need to ditch the pivot, go back to the data, filter only the sales > 10000 rows,

and create a new data set and pivot that. cumbersome. in R (and SQL) it’s very easy:

SQL:

pivot table value filter label filter excel

# similarly in R introduce the filter BEFORE the summarize function:

pivot2 <- df %>%
select(Product.Category, Region, Customer.Segment, Sales) %>%
filter(Sales >= 10000) %>%
group_by(Product.Category, Region, Customer.Segment) %>%
summarise(TotalSales = sum(Sales))

pivot table value filter label filter R

# To filter the summary totals (HAVING clause), introduce the filter AFTER
# the summarise function:

SQL:

pivot table value filter label filter sql

pivot2 <- df %>%
select(Product.Category, Region, Customer.Segment, Sales) %>%
group_by(Product.Category, Region, Customer.Segment) %>%
summarise(TotalSales = sum(Sales)) %>%
filter(TotalSales >= 100000)

pivot table value filter label filter R

# and of course, can combine:

pivot2 <- df %>%
select(Product.Category, Region, Customer.Segment, Sales) %>%
filter(Sales >= 10000) %>%
group_by(Product.Category, Region, Customer.Segment) %>%
summarise(TotalSales = sum(Sales)) %>%
filter(TotalSales >= 100000)

pivot table value filter label filter R

 

Leave a Reply:

Your email address will not be published. Required fields are marked *