Running totals (cumulative sums) in R

Posted on: December 5, 2017 Posted by: Guy Manova Comments: 0

Running totals (cumulative sums) in R

Running totals (cumulative sums) in R

in this short tutorial, learn how to create Running totals (cumulative sums) in R using cumsum() in different

scenerios including pivot table, just like you would use Excel’s pivot table “show values as” and selecting “running total”

or “% running total”

In Excel

 

running total cumulative sum R

running total cumulative sum R

In R

# pre session options

rm(list = ls())
# getwd()
# setwd(“C:\\Users\\User\\Dropbox (Personal)\\Personal\\Courses\\Excel2R\\Dataframes”)

# load up the libraries (or install first if you don’t have them)

library(dplyr)
library(tidyr)

df <- read.csv(“https://www.dropbox.com/s/shcok8kh2vpnaet/cumsum.csv?dl=1”)
write.csv(df, “cumsum.csv”)

# in the most basic form it’s very easy – use the cumsum() function from base R:

df$cumsum <- cumsum(df$Sales)
df

running total cumulative sum R

# let’s take a more realistic scenerio where you first pivot the data, and then add cumulative sum
# (cumsum) and also the total sales and % of the total achieved (so called “running total”) –
# in this case there is only Date, so the cumulative runs over it:

by_date <- df %>%
select(Date, Sales) %>%
group_by (Date) %>%
summarise(Sales=sum(Sales)) %>%
mutate(CumSales=cumsum(Sales)) %>%
mutate(totalSales = sum(Sales)) %>%
mutate(percent_of_total = cumsum(Sales)/sum(Sales)*100)

running total cumulative sum R

# in the second case, “state” is added , and the cumsum is reset for each state + date

In Excel

running total cumulative sum R

running total cumulative sum R

In R

by_date_2 <- df %>%
select(State, Date, Sales) %>%
group_by (State,Date) %>%
summarise(Sales=sum(Sales)) %>%
mutate(CumSales=cumsum(Sales)) %>%
mutate(totalSales = sum(Sales)) %>%
mutate(percent_of_total = cumsum(Sales)/sum(Sales)*100)

running total cumulative sum R

# in this case, we pivot rows to columns (with “spread”) – and create corresponding
# cumulative sums for each

In Excel

running total cumulative sum R

running total cumulative sum R

In R

by_date_3 <- df %>%
select(State, Date, Sales) %>%
group_by (State,Date) %>%
summarise(Sales=sum(Sales)) %>%
spread(State, Sales) %>%
mutate(CumCalifornia=cumsum(California)) %>%
mutate(CumFlorida=cumsum(Florida)) %>%
mutate(CumNewYork=cumsum(NewYork)) %>%
select(Date, California, CumCalifornia,Florida,CumFlorida,NewYork,CumNewYork)

running total cumulative sum R

 

Leave a Reply:

Your email address will not be published. Required fields are marked *