vignettes/v11-irregularlayout.Rmd
v11-irregularlayout.Rmd
This is a more advanced topic. It requires a good understanding of the material in the previous vignettes.
This vignette starts with a discussion of irregular layout, then defines some low-level methods for building a pivot table before moving onto several different examples of irregular pivot tables.
Let’s start by talking about regular layout. The following is an example pivot table that has appeared in previous vignettes:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()
This is a regular pivot table: Each level of row and column headings is related to a single variable only, i.e.
Returning to the original question (What is irregular layout?), the definition for our purposes is: Irregular layout is any layout where a single level of row or column headings relates to more than one variable.
Irregular layouts inevitably require more effort to construct. Irregular layouts also tend to be harder to understand, so they need careful consideration before being used. Often using multiple separate regular pivot tables is a better idea.
The process for building an irregular layout is:
Most of the extra effort in building an irregular/custom pivot table
is in step 1 and potentially step 3. Step 2 is identical for both
regular and irregular pivot tables and involves nothing more than
calling pt$evaluatePivot()
or similar. Step 4 is also
identical for both regular and irregular pivot tables.
Each row and column heading in a pivot table is a data group. If the hierarchy on the rows or columns has multiple levels, then each heading in each level is a data group. Put another way, each heading cell in the table is a data group. This is explored more in the Navigating a Pivot Table vignette.
Each data group is an instance of the R6 PivotDataGroup
class. See the Appendix: Class Overview
vignette for more details.
A data group can relate to a single value from a single variable (column) in a data frame. This is the most common case, e.g. in the pivot table below, there are five data groups on the rows axis and three data groups on the columns axis. The totals data groups relate to more than one value on each axis.
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="NumberOfTrains", caption="Number of Trains",
summariseExpression="n()")
pt$renderPivot()
A data group can also relate to multiple values from a single variable. In irregular pivot tables, a data group even relate to multiple values from multiple variables.
There is more discussion of data group variables and values in the Cell Context vignette.
A data group can also represent just a calculation. This is most common when a pivot table contains more than one visible calculation. In pivot tables with more than one calculation, a calculation data group is created for each calculation under each of the existing data groups on an axis, e.g.
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="NumberOfTrains", caption="Number of Trains",
summariseExpression="n()")
pt$defineCalculation(calculationName="MaximumSpeedMPH", caption="Maximum Speed (MPH)",
summariseExpression="max(SchedSpeedMPH, na.rm=TRUE)")
pt$renderPivot()
Constructing pivot tables that have an irregular layout typically
requires more lines of R. This is because the helper functions
pt$addColumnDataGroups()
and
pt$addRowDataGroups()
that have been used throughout all of
the previous vignettes to easily build-up the structure of a pivot table
can no longer solely be used, since they generate a regular layout only.
Instead the layout must be built in a more granular way.
The table below describes the different methods that exist:
Comparison | High-Level Method | Low-Level Method |
---|---|---|
Number of data groups added per execution | Typically adds multiple groups per execution | Adds one group per execution |
Variations | Different methods for data groups and calculations | Single method can add a data group or a calculation group |
Add at levels | Can add groups at any level in the hierarchy. | Can only add child groups of the current group. |
Usability | Generally easier to use. | Generally harder to use. |
Safety | More checks that parameter values make sense. | Fewer checks potentially leading to non-sensical or invalid pivot table states. |
Pivot Table methods |
pt$addRowDataGroups() pt$addColumnDataGroups() pt$addRowCalculationGroups() pt$addColumnCalculationGroups()
|
pt$addRowGroup() pt$addColumnGroup()
|
Data Group methods |
grp$addDataGroups() grp$addCalculationGroups()
|
grp$addChildGroup() |
Notes:
pt$addRowCalculationGroups()
and
pt$addColumnCalculationGroups()
do not feature in many
examples in the documentation and many users will rarely have used them
explicitly, however other functions such as
pt$evaluatePivot()
and pt$renderPivot()
internally call them.pt$addRowDataGroups()
internally calls
pt$rowGroup$addDataGroups()
.The low level methods have the following parameters:
variableName
- a character value that specifies the
name of the variable in the data frame that the new group relates to and
will filter (if any).filterType
- must be one of “ALL”, “VALUES”, or “NONE”
to specify the filter type:
variableName
is filtered to only values
.values
specifies the filter values applied to
variableName
to select the data to match this row/column in
the pivot table.doNotExpand
- a logical value, default
FALSE
- specify TRUE
to prevent the high-level
methods from adding child groups.isEmpty
- a logical value, default FALSE
-
specify TRUE
to mark that this data group is empty.isOutline
- a logical value, default FALSE
- specify TRUE
to mark that this data group is an outline
group.
doNotExpand
, sortAnchor
and
outlineLinkedGroupId
parameters are set for an outline
group.captionTemplate
- a character value that specifies the
template for the data group caption.
{values}
simply
displays the value without any prefix or suffix.caption
- effectively a hard-coded caption that
overrides the built-in logic for generating a caption.isTotal
- a logical value, default FALSE
-
specify TRUE
to mark that this data group is a total.isLevelSubTotal
- a logical value, default
FALSE
- specify TRUE
to mark that this data
group is a sub-total within a level.
isLevelTotal
- a logical value, default
FALSE
- specify TRUE
to mark that this data
group is level total.calculationGroupName
- for calculation groups, this
character value specifies the calculation group that
calculationName
belongs to.calculationName
- for calculation groups, this
character value specifies the name of the calculation.baseStyleName
- the style name for the data group.styleDeclarations
- a list of CSS style declarations to
overlay on top of the base style.insertAtIndex
- the one-based location specifying where
to insert the new data group.insertBeforeGroup
- a reference data group specifying
where to insert the new data group.insertAfterGroup
- a reference data group specifying
where to insert the new data group.mergeEmptySpace
- a character value that specifies how
empty space should be merged.
mergeEmptySpace
must be one of “doNotMerge”,
“dataGroupsOnly”, “cellsOnly”, “dataGroupsAndCellsAs1” or
“dataGroupsAndCellsAs2”.cellBaseStyleName
- the style name for cells related to
this data group.cellStyleDeclarations
- a list of CSS style
declarations to overlay on top of the base style for cells related to
this data group.sortAnchor
- used to specify sort behaviour for outline
groups, must be one of “fixed”, “next” or “previous”.outlineLinkedGroupId
- used to link an outline group to
the value data group which has the child data groups.resetCells
- a logical value, default value
TRUE
- which clears the current pivot table cells (if the
pivot table has already been evaluated).
The properties and methods of a data group can be examined in code, e.g.
pt$topRowGroups[[1]]
Some data group properties are read-only (i.e. the R6 active bindings do not support changing the private values), though they will not error if an attempt is made - rather the internal value simply won’t be updated.
It is nonetheless possible to change many properties of the data groups, however, doing so should be done with caution as this may invalidate the state of the pivot table and such code could be broken by future changes to the package.
If you find that modifying data group properties is necessary for your requirements, please log an issue in the package GitHub site/repository to inform the package author of the use-case.
The data groups are tied into a hierarchy. Moving data groups is not currently supported.
The following methods can be used to remove groups:
grp$removeChildGroup()
- removes a child group of the
current group, parameters:
index
- specifies the one-based index of the child to
remove.group
- specifies the child group to remove.resetCells
- a logical value, default value
TRUE
- which clears the current pivot table cells (if the
pivot table has already been evaluated).grp$removeGroup()
- removes the current group,
parameters:
removeAncestorsIfNoRemainingChildren
- a logical value,
default value FALSE
, that specifies whether the parent
group should also be removed if it has no remaining child groups.removedRelatedOutlineGroups
- a logical value, default
value FALSE
, that specifies whether the related outline
groups (siblings of the current group) should also be removed.resetCells
- a logical value, default value
TRUE
- which clears the current pivot table cells (if the
pivot table has already been evaluated).The properties and methods of a cell can also be examined in code, e.g.
pt$getCell(r=1, c=1)
Again, while some properties are read-only, it is possible to change many properties of the cells, however, doing so should be done with caution as this may invalidate the state of the pivot table and such code could be broken by future changes to the package.
If you find that modifying cell properties is necessary for your requirements, please log an issue in the package GitHub site/repository to inform the package author of the use-case.
Consider we are only interested in the number of express trains (i.e. TrainCategory=“Express Passenger”) and the number of DMU trains (i.e. PowerType=“DMU”), for each train operating company. Since these requirements relate to two different variables, we need to construct an irregular layout. To do this, these two data groups are added individually to the root column group:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()
The functions pt$addColumnGroup()
and
pt$addRowGroup()
add a single data group at the top level
of the pivot table, where the variable name and value are explicitly
specified.
No totals column is added to the above pivot table. This wouldn’t make sense for this pivot table anyway, since some express passenger trains are also DMU trains, so a simple total would double count some trains.
Showing the cell context, as described in the Cell Context vignette, makes the irregularity clearer:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot(includeHeaderValues=TRUE, includeRCFilters=TRUE)
Further data groups can be added to the pivot table. These data groups can be regular or irregular. Several different examples are shown below. These examples are rather contrived for demonstration purposes.
If a regular level is desired, this can simply be added using the
regular addColumnDataGroups()
and
addRowDataGroups()
functions, for example adding the train
status:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addColumnDataGroups("Status")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()
The addColumnGroup()
and addRowGroup()
functions return the new data group that has been added. The
addChildGroup()
function can be called on each of these
groups to add further groups underneath:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
cg1 <- pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
cg2 <- pt$addColumnGroup(variableName="PowerType", values="DMU")
cg1$addChildGroup(variableName="Status", values="A")
cg1$addChildGroup(variableName="Status", values="R")
cg2$addChildGroup(variableName="SchedSpeedMPH", values="100")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()
Instead of adding groups one at a time, the
addDataGroups()
function can be called on these data groups
to add different data groups for different variables underneath
each:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
cg1 <- pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
cg2 <- pt$addColumnGroup(variableName="PowerType", values="DMU")
cg1$addDataGroups("Status")
cg2$addDataGroups("SchedSpeedMPH")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()
The addColumnDataGroups()
,
addRowDataGroups()
and addDataGroups()
functions1 all return zero, one or multiple data
groups, in the form of an R list. This list can be iterated or used with
functions such as lapply()
in the usual ways. Each data
group has a set of properties that can be used to examine the group. For
example, adding further child groups:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
cgrps <- pt$addColumnDataGroups("PowerType", addTotal=FALSE)
add2Groups <- function(grp) {
if(!grp$isTotal) {
grp$addChildGroup(variableName="Status", values="A")
grp$addChildGroup(variableName="Status", values="R")
}
}
invisible(lapply(cgrps, add2Groups))
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()
Another example - setting the text colour:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
cgrps <- pt$addColumnDataGroups("PowerType")
colorText <- function(grp) {
if(isTRUE(grp$values=="DMU")) grp$setStyling(list(color="blue"))
else if(isTRUE(grp$values=="EMU")) grp$setStyling(list(color="green"))
else if(isTRUE(grp$values=="HST")) grp$setStyling(list(color="red"))
}
invisible(lapply(cgrps, colorText))
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()
Sometimes it is desirable to use different calculations in different parts of the pivot table. Returning to the initial irregular example:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()
Suppose we require the number of express trains but the maximum scheduled speed of the DMU trains. We might naively try the following R:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", caption="Count",
summariseExpression="n()")
pt$defineCalculation(calculationName="MaxSpeedMPH", caption="Maximum Speed",
summariseExpression="max(SchedSpeedMPH, na.rm=TRUE)")
pt$renderPivot()
This has replicated both calculations across the whole pivot table, which is not what we wanted.
Instead the solution is to define an additional2 calculation group, then explicitly add the two calculation groups to the relevant parts of the pivot table:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
cg1 <- pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
cg2 <- pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$defineCalculation(calculationGroupName="calcGrp2", calculationName="MaxSpeedMPH",
summariseExpression="max(SchedSpeedMPH, na.rm=TRUE)")
cg1$addCalculationGroups("default")
cg2$addCalculationGroups("calcGrp2")
pt$renderPivot()
Pivot tables like the above are quite likely to cause confusion. Either the caption of the existing data groups needs changing or additional empty groups with captions should be added. Both of these are demonstrated below.
The example below explicitly specifies the captions of the column data groups.
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
cg1 <- pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger",
caption="Express Passenger (Count)")
cg2 <- pt$addColumnGroup(variableName="PowerType", values="DMU",
caption="DMU (Maximum Speed)")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$defineCalculation(calculationGroupName="calcGrp2", calculationName="MaxSpeedMPH",
summariseExpression="max(SchedSpeedMPH, na.rm=TRUE)")
cg1$addCalculationGroups("default")
cg2$addCalculationGroups("calcGrp2")
pt$renderPivot()
The example below adds two data groups that display a caption only - purely for display purposes.
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
cg1 <- pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
cg2 <- pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$defineCalculation(calculationGroupName="calcGrp2", calculationName="MaxSpeedMPH",
summariseExpression="max(SchedSpeedMPH, na.rm=TRUE)")
cg3 <- cg1$addChildGroup(caption="Count")
cg4 <- cg2$addChildGroup(caption="Maximum Speed")
cg3$addCalculationGroups("default")
cg4$addCalculationGroups("calcGrp2")
pt$renderPivot()
The example below combines two pivot tables in one, calling
pt$addRowDataGroups()
twice specifying
atLevel=1
both times:
# some sample school student grade data
gender <- c("F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F",
"F", "F", "F", "F", "F", "F", "M", "M", "M", "M", "M", "M", "M", "M",
"M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M")
age <- c(19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 22, 22, 22,
22, 22, 19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 22,
22, 22, 22, 22)
grade <- c("A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "A", "B", "C", "D",
"E", "A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "A", "B", "C",
"D", "E", "A", "B", "C", "D", "E", "A", "B", "C", "D", "E")
counts <- c(6, 16, 56, 37, 213, 14, 21, 61, 45, 191, 30, 54, 74, 82, 246, 91, 46,
29, 71, 296, 3, 6, 21, 35, 162, 14, 11, 29, 22, 204, 15, 30, 49, 75,
253, 45, 22, 30, 30, 319)
df <- data.frame(gender, age, grade, counts)
# two pivot tables in one
library(pivottabler)
pt <- PivotTable$new()
pt$addData(df)
pt$addColumnDataGroups("grade")
pt$addRowGroup(caption="Age", isEmpty=TRUE, styleDeclarations=list(color="blue"))
pt$addRowDataGroups("age", atLevel=1)
pt$addRowGroup(caption="Gender", isEmpty=TRUE, styleDeclarations=list(color="blue"))
pt$addRowDataGroups("gender", atLevel=1)
pt$defineCalculation(calculationName="GradeCounts", summariseExpression="sum(counts)")
pt$renderPivot()
The example above allows each set of row groups to render a total
row. To prevent this and instead render a single total row at the bottom
of the pivot table, specify addTotal=FALSE
when calling
pt$addRowDataGroups()
, then call
pt$addRowGroup()
a third time to add the total row at the
bottom:
# some sample school student grade data
gender <- c("F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F",
"F", "F", "F", "F", "F", "F", "M", "M", "M", "M", "M", "M", "M", "M",
"M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M")
age <- c(19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 22, 22, 22,
22, 22, 19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 22,
22, 22, 22, 22)
grade <- c("A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "A", "B", "C", "D",
"E", "A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "A", "B", "C",
"D", "E", "A", "B", "C", "D", "E", "A", "B", "C", "D", "E")
counts <- c(6, 16, 56, 37, 213, 14, 21, 61, 45, 191, 30, 54, 74, 82, 246, 91, 46,
29, 71, 296, 3, 6, 21, 35, 162, 14, 11, 29, 22, 204, 15, 30, 49, 75,
253, 45, 22, 30, 30, 319)
df <- data.frame(gender, age, grade, counts)
# two pivot tables in one
library(pivottabler)
pt <- PivotTable$new()
pt$addData(df)
pt$addColumnDataGroups("grade")
pt$addRowGroup(caption="Age", isEmpty=TRUE, isOutline=TRUE,
styleDeclarations=list(color="blue"))
pt$addRowDataGroups("age", atLevel=1, addTotal=FALSE)
pt$addRowGroup(caption="Gender", isEmpty=TRUE, isOutline=TRUE,
styleDeclarations=list(color="blue"))
pt$addRowDataGroups("gender", atLevel=1, addTotal=FALSE)
pt$defineCalculation(calculationName="GradeCounts", summariseExpression="sum(counts)")
pt$addRowGroup(caption="Total", isOutline=TRUE, isTotal=TRUE, isLevelTotal=TRUE,
styleDeclarations=list(color="blue"), cellStyleDeclarations=list(color="blue"))
pt$renderPivot()
Another example of a pivot table that effectively contains two pivot tables:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
# Rows: TOC breakdown
grp1 <- pt$addRowGroup(caption="By TOC")
grp1$addDataGroups("TOC", addTotal=FALSE)
# Rows: Power Type breakdown
grp2 <- pt$addRowGroup(caption="By Power Type")
grp2$addDataGroups("PowerType", addTotal=FALSE)
# Rows: Total
grp3 <- pt$addRowGroup(caption="Total")
# Row Group Headings
pt$setRowDataGroupHeader(levelNumber=1, header="Breakdown")
pt$setRowDataGroupHeader(levelNumber=2, header="Subset")
# Finish...
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$theme <- getStandardTableTheme(pt)
pt$renderPivot(showRowGroupHeaders=TRUE)
Rendering the same table as in outline layout:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
# Rows: TOC breakdown
grp1a <- pt$addRowGroup(caption="By TOC", isOutline=TRUE, isEmpty=TRUE,
sortAnchor="next", styleDeclarations=list(color="blue"))
grp1b <- pt$addRowGroup()
grp1a$outlineLinkedGroupId <- grp1b$instanceId
grp1b$addDataGroups("TOC", addTotal=FALSE)
# Rows: Power Type breakdown
grp2a <- pt$addRowGroup(caption="By Power Type", isOutline=TRUE, isEmpty=TRUE,
sortAnchor="next", styleDeclarations=list(color="blue"))
grp2b <- pt$addRowGroup()
grp2a$outlineLinkedGroupId <- grp2b$instanceId
grp2b$addDataGroups("PowerType", addTotal=FALSE)
# Rows: Total
grp3 <- pt$addRowGroup(caption="Total", isOutline=TRUE, sortAnchor="fixed",
styleDeclarations=list(color="blue"))
# Finish...
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()
In the example above, sortAnchor
links that data group
to the next/previous group which is also linked via
outlineLinkedGroupId
. These are set so that the outline
group and those groups beneath it stay together when the data groups in
the pivot table are sorted. If there is no intention to sort the pivot
table using pt$sortRowDataGroups()
then these
arguments/lines are not needed.
Sometimes, after a pivot table is evaluated, some rows and/or columns will contain no calculation results (i.e. NULL values). For example:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$addRowDataGroups("PowerType", onlyCombinationsThatExist=FALSE)
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
pt$renderPivot()
The rows/columns containing no calculation values can be found using
the pt$getEmptyRows()
or pt$getEmptyColumns()
functions:
## Empty rows: 2, 3, 6, 11, 15
“Empty” is defined as the following cell values:
NAasEmpty=FALSE
.zeroAsEmpty=TRUE
.The empty rows/columns can also be easily removed as described in the next section.
It is possible to remove specific rows and/or columns from a pivot table using any one of the following:
pt$removeRow(2)
removes the second row from the pivot
table.pt$removeRows(c(2, 4))
removes the second and fourth
rows from the pivot table.pt$removeEmptyRows()
removes all rows from the pivot
table where all cells (i.e. calculation values) have no value (i.e. are
NULL).pt$removeColumn(2)
removes the second column from the
pivot table.pt$removeColumns(c(2, 4))
removes the second and fourth
columns from the pivot table.pt$removeEmptyColumns()
removes all columns from the
pivot table where all cells (i.e. calculation values) have no value
(i.e. are NULL).Note the row/column numbers above do not include the headings.
pt$removeEmptyRows()
and
pt$removeEmptyColumns()
also support the
NAasEmpty
and zeroAsEmpty
arguments as
described in the previous section.
As an example of removing rows, consider the second and fourth rows highlighted in the pivot table below:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$getCells(rowNumbers=c(2, 4))
pt$setStyling(cells=cells, declarations=list("background-color"="#FFCC66"))
pt$renderPivot()
To remove these rows:
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
pt$removeRows(rowNumbers=c(2, 4))
pt$renderPivot()
Note that removing rows and/or columns does NOT recalculate the other cells in the pivot table, e.g. the values of the totals are not updated.
The asBasicTable()
function allows a pivot table to be
converted to a basic table - from the basictabler
package.
The basictabler
package allows free-form tables to be
constructed, in contrast to pivottabler
which creates pivot
tables with relatively fixed structures. pivottabler
contains calculation logic - to calculate the values of cells within the
pivot table. basictabler
contains no calculation logic -
cell values must be provided either from a data frame, row-by-row,
column-by-column or cell-by-cell.
Converting a pivot table to a basic table allows the structure of pivot tables to be altered after they have been created, e.g.
library(pivottabler)
library(dplyr)
library(lubridate)
trains <- mutate(bhmtrains,
GbttDate=if_else(is.na(GbttArrival), GbttDeparture, GbttArrival),
GbttMonth=make_date(year=year(GbttDate), month=month(GbttDate), day=1))
pt <- PivotTable$new()
pt$addData(trains)
pt$addColumnDataGroups("GbttMonth", dataFormat=list(format="%B %Y"))
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
# convert the pivot table to a basic table, insert a new row, merge cells and highlight
bt <- pt$asBasicTable()
bt$cells$insertRow(5)
bt$cells$setCell(5, 2, rawValue="The values below are significantly higher than expected.",
styleDeclarations=list("text-align"="left", "background-color"="yellow",
"font-weight"="bold", "font-style"="italic"))
bt$mergeCells(rFrom=5, cFrom=2, rSpan=1, cSpan=13)
bt$setStyling(rFrom=6, cFrom=2, rTo=6, cTo=14,
declarations=list("text-align"="left", "background-color"="yellow"))
bt$renderTable()