### Pivot table value filters and label filters in R

## R pivot tables filtering – learn how to create value filters and label filters in R

In this tutorial we’ll see how to replicate excel pivot tables value and label filters, by creating value filters and label filters in R and distinguish between the differences also using sql’s “where” and “having” clauses as they are identical in logic to how it’s done in R.

follow the step by step below and copy paste the code to your R console, or download the R file from github.

You can get the details of how the pivot is constructed in the previous vids here, here and here.

# clean up the environment

rm(list = ls())

# we’ll need dplyr and tidyr for this, by far easier than with base R

library(dplyr)

library(tidyr)

# read in the Canadian SuperStore dataset from this dropbox address:

df <- read.csv(“https://www.dropbox.com/s/kj9yioc24iq4pdb/superstore.csv?dl=1”)

write.csv(df, “superstore.csv”)

# in a previous tutorial we created a summary table with this dplyr code, summarising

# sales by product category, region and customer segment:

pivot2 <- df %>%

select(Product.Category, Region, Customer.Segment, Sales) %>%

group_by(Product.Category, Region, Customer.Segment) %>%

summarise(TotalSales = sum(Sales))

# first, let’s create a label filter emulating a WHERE clause, so only

#product category Furniture will be taken into account:

Excel:

SQL with WHERE clause:

In R with dplyr, filtering before the group by is similar to WHERE:

pivot2 <- df %>%

select(Product.Category, Region, Customer.Segment, Sales) %>%

filter(Product.Category==”Furniture”) %>%

group_by(Product.Category, Region, Customer.Segment) %>%

summarise(TotalSales = sum(Sales))

# same, but product category being EITHER Furniture OR Technology

pivot2 <- df %>%

select(Product.Category, Region, Customer.Segment, Sales) %>%

filter(Product.Category==”Furniture”|Product.Category==”Technology”) %>%

group_by(Product.Category, Region, Customer.Segment) %>%

summarise(TotalSales = sum(Sales))head(pivot2)

# same, but product category being Furniture AND customer segment being Consumer

pivot2 <- df %>%

select(Product.Category, Region, Customer.Segment, Sales) %>%

filter(Product.Category==”Furniture”& Customer.Segment==”Consumer”) %>%

group_by(Product.Category, Region, Customer.Segment) %>%

summarise(TotalSales = sum(Sales))head(pivot2)

# now let’s look at value filters, either before grouping (=WHERE) or after (=HAVING)

to filter only single sales > 10000$ in Excel – we would need to ditch the pivot, go back to the data, filter only the sales > 10000 rows,

and create a new data set and pivot that. cumbersome. in R (and SQL) it’s very easy:

SQL:

# similarly in R introduce the filter BEFORE the summarize function:

pivot2 <- df %>%

select(Product.Category, Region, Customer.Segment, Sales) %>%

filter(Sales >= 10000) %>%

group_by(Product.Category, Region, Customer.Segment) %>%

summarise(TotalSales = sum(Sales))

# To filter the summary totals (HAVING clause), introduce the filter AFTER

# the summarise function:

SQL:

pivot2 <- df %>%

select(Product.Category, Region, Customer.Segment, Sales) %>%

group_by(Product.Category, Region, Customer.Segment) %>%

summarise(TotalSales = sum(Sales)) %>%

filter(TotalSales >= 100000)

# and of course, can combine:

pivot2 <- df %>%

select(Product.Category, Region, Customer.Segment, Sales) %>%

filter(Sales >= 10000) %>%

group_by(Product.Category, Region, Customer.Segment) %>%

summarise(TotalSales = sum(Sales)) %>%

filter(TotalSales >= 100000)