This vignette provides some more information about the workings of
pivottabler
and in particular makes clearer how a pivot
table is constructed.
The following pivot table is used as the basis of the examples in the rest of this vignette:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()
Each cell in a pivot table will sit under one or more row and column headings (data groups). These headings determine the filters that are applied to the data to calculate the value of that cell. This is the cell context.
For example, in the pivot table below, the cell highlighted in green (value 33792) is subject to the following filters:
i.e. the “Ordinary Passenger” and “London Midland” row/column headings provide context that define the data filters for the cell.
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$getCells(specifyCellsAsList=TRUE, cellCoordinates=list(c(3, 2)))
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot()
For illustrative purposes, it is possible to display the
context/filters in a pivot table using the
includeHeaderValues
and includeRCFilters
parameters when rendering a pivot table:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$getCells(specifyCellsAsList=TRUE, cellCoordinates=list(c(3, 2)))
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot(includeHeaderValues=TRUE, includeRCFilters=TRUE)
There are four different ways a cell value can be calculated, as explained in the Data Groups vignette. The calculation steps for each of these (and where the cell context fits in) are described below.
This is the simplest case. The data frame is filtered using the cell filters. The summarise function is then applied to the filtered data to get the cell value.
The cell filters are not applied to these calculations. They are applied as part of the calculation of the summarised values that this derived value calculation is based on.
For example, in the “Method 2” example in the Data Groups vignette, the cell filters are applied as part of the “Trains Arr. 5+ Mins Late” and “Total Trains” calculations (which are both “Method 1” summary calculations), but not as part of the “% Trains Arr. 5+ Mins Late” derived value calculation.
The cell filters are explicitly applied in code as part of the custom function definition. The author of the custom function has full control over if and when the cell filters are applied.
For example, in the “Method 3” example in the Data Groups vignette, the cell filters
are explicitly applied by the line
filteredTrains <- pivotCalculator$getFilteredDataFrame(trains, netFilters)
.
Here, the netFilters
parameter is a
PivotFilters
object that contains the filters for the
current cell. Remember, the custom function is invoked once for each
cell in the pivot table that relates to that calculation.
It is possible to display more cell information as illustrated below:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$getCells(specifyCellsAsList=TRUE, cellCoordinates=list(c(3, 2)))
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot(includeHeaderValues=TRUE, includeRCFilters=TRUE,
includeCalculationFilters=TRUE, includeWorkingData=TRUE, includeEvaluationFilters=TRUE,
includeCalculationNames=TRUE, includeRawValue=TRUE, includeTotalInfo=TRUE)
The evaluation filters shown in each cell are the actual filters that were used when calculating the cell value. These are typically the same as the row-column filters. One notable exception is When filters are also specified as part of a calculation definition, i.e. when calculation filters are present. For example, calculating the percentage of trains that ran at the weekend (where only the “Arriva Trains Wales” train operating company is shown to reduce the size of the pivot table):
library(dplyr)
library(lubridate)
library(pivottabler)
# get the date of each train and whether that date is a weekday or weekend
trains <- bhmtrains %>%
mutate(GbttDateTime=if_else(is.na(GbttArrival), GbttDeparture, GbttArrival),
DayNumber=wday(GbttDateTime),
WeekdayOrWeekend=ifelse(DayNumber %in% c(1,7), "Weekend", "Weekday"))
# render the pivot table
pt <- PivotTable$new()
pt$addData(trains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC", fromData=FALSE,
explicitListOfValues=list("Arriva Trains Wales"), addTotal=FALSE)
weekendFilter <- PivotFilters$new(pt, variableName="WeekdayOrWeekend", values="Weekend")
pt$defineCalculation(calculationName="WeekendTrains", summariseExpression="n()",
filters=weekendFilter)
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$defineCalculation(calculationName="WeekendTrainsPercentage",
type="calculation", basedOn=c("WeekendTrains", "TotalTrains"),
format="%.1f %%",
calculationExpression="values$WeekendTrains/values$TotalTrains*100")
pt$addRowCalculationGroups()
pt$renderPivot(includeHeaderValues=TRUE, includeRCFilters=TRUE,
includeCalculationFilters=TRUE, includeWorkingData=TRUE, includeEvaluationFilters=TRUE,
includeCalculationNames=TRUE, includeRawValue=TRUE, includeTotalInfo=TRUE)
In the above example, for the WeekendTrains calculation, the calculation filters differ from the row-column filters.