In This Vignette

  • Introductory Note
  • What is Irregular Layout?
  • Caution
  • Constructing an Irregular Layout
  • Building a Data Group Hierarchy
  • Modifying a Pivot Table
  • Simple Example of an Irregular Layout
  • Extending the Simple Example
  • Another Example
  • Multiple Calculation Groups
  • Combining Multiple Pivot Tables
  • Custom Layout Changes
  • Further Reading

Introductory Note

This is a more advanced topic. It requires a good understanding of the material in the previous vignettes.

This vignette starts with a discussion of irregular layout, then defines some low-level methods for building a pivot table before moving onto several different examples of irregular pivot tables.

What is Irregular Layout?

Let’s start by talking about regular layout. The following is an example pivot table that has appeared in previous vignettes:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()

This is a regular pivot table: Each level of row and column headings is related to a single variable only, i.e.

  • The first level of column headings (“Express Passenger”, “Ordinary Passenger”, etc.) are all related to the “TrainCategory” variable.
  • The second level of column headings (“DMU”, “EMU”, etc.) are all related to the “PowerType” variable.
  • The first level of row headings (“Arriva Trains Wales”, “CrossCountry”, etc.) are all related to the “TOC” (Train Operating Company) variable.

Returning to the original question (What is irregular layout?), the definition for our purposes is: Irregular layout is any layout where a single level of row or column headings relates to more than one variable.

Caution

Irregular layouts inevitably require more effort to construct. Irregular layouts also tend to be harder to understand, so they need careful consideration before being used. Often using multiple separate regular pivot tables is a better idea.

Constructing an Irregular Layout

The process for building an irregular layout is:

  1. Building the data group hierarchies (row headings and column headings)
  2. Evaluate the pivot table - to generate and calculate the cells in the body of the pivot table
  3. Modify the resulting pivot table - if needed
  4. Render the pivot table

Most of the extra effort in building an irregular/custom pivot table is in step 1 and potentially step 3. Step 2 is identical for both regular and irregular pivot tables and involves nothing more than calling pt$evaluatePivot() or similar. Step 4 is also identical for both regular and irregular pivot tables.

Building a Data Group Hierarchy

Reminder: What are data groups?

Each row and column heading in a pivot table is a data group. If the hierarchy on the rows or columns has multiple levels, then each heading in each level is a data group. Put another way, each heading cell in the table is a data group. This is explored more in the Navigating a Pivot Table vignette.

Each data group is an instance of the R6 PivotDataGroup class. See the Appendix: Class Overview vignette for more details.

A data group can relate to a single value from a single variable (column) in a data frame. This is the most common case, e.g. in the pivot table below, there are five data groups on the rows axis and three data groups on the columns axis. The totals data groups relate to more than one value on each axis.

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="NumberOfTrains", caption="Number of Trains",
                     summariseExpression="n()")
pt$renderPivot()

A data group can also relate to multiple values from a single variable. In irregular pivot tables, a data group even relate to multiple values from multiple variables.

There is more discussion of data group variables and values in the Cell Context vignette.

A data group can also represent just a calculation. This is most common when a pivot table contains more than one visible calculation. In pivot tables with more than one calculation, a calculation data group is created for each calculation under each of the existing data groups on an axis, e.g.

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="NumberOfTrains", caption="Number of Trains",
                     summariseExpression="n()")
pt$defineCalculation(calculationName="MaximumSpeedMPH", caption="Maximum Speed (MPH)",
                     summariseExpression="max(SchedSpeedMPH, na.rm=TRUE)")
pt$renderPivot()

High-level methods vs. Low-level methods

Constructing pivot tables that have an irregular layout typically requires more lines of R. This is because the helper functions pt$addColumnDataGroups() and pt$addRowDataGroups() that have been used throughout all of the previous vignettes to easily build-up the structure of a pivot table can no longer solely be used, since they generate a regular layout only. Instead the layout must be built in a more granular way.

The table below describes the different methods that exist:

Comparison High-Level Method Low-Level Method
Number of data groups added per execution Typically adds multiple groups per execution Adds one group per execution
Variations Different methods for data groups and calculations Single method can add a data group or a calculation group
Add at levels Can add groups at any level in the hierarchy. Can only add child groups of the current group.
Usability Generally easier to use. Generally harder to use.
Safety More checks that parameter values make sense. Fewer checks potentially leading to non-sensical or invalid pivot table states.
Pivot Table methods pt$addRowDataGroups()
pt$addColumnDataGroups()
pt$addRowCalculationGroups()
pt$addColumnCalculationGroups()
pt$addRowGroup()
pt$addColumnGroup()
Data Group methods grp$addDataGroups()
grp$addCalculationGroups()
grp$addChildGroup()

Notes:

  • It is possible to mix the high-level and low-level methods when building a pivot table, e.g. level 1 could be built using the low-level methods, then level 2 built using one of the high-level methods. This can be seen in the examples below.
  • The pivot table versions of the low-level methods add the new group into level 1 of the hierarchy.
  • pt$addRowCalculationGroups() and pt$addColumnCalculationGroups() do not feature in many examples in the documentation and many users will rarely have used them explicitly, however other functions such as pt$evaluatePivot() and pt$renderPivot() internally call them.
  • The high-level methods internally make use of the low-level methods.
  • The pivot table versions are generally thin wrappers of the data group versions, e.g. pt$addRowDataGroups() internally calls pt$rowGroup$addDataGroups().

Low level method parameter details

The low level methods have the following parameters:

  • variableName - a character value that specifies the name of the variable in the data frame that the new group relates to and will filter (if any).
  • filterType - must be one of “ALL”, “VALUES”, or “NONE” to specify the filter type:
    • ALL means no filtering is applied.
    • VALUEs is the typical value used to specify that variableName is filtered to only values.
    • NONE means no data will match this data group.
  • values specifies the filter values applied to variableName to select the data to match this row/column in the pivot table.
  • doNotExpand - a logical value, default FALSE - specify TRUE to prevent the high-level methods from adding child groups.
  • isEmpty- a logical value, default FALSE - specify TRUE to mark that this data group is empty.
  • isOutline - a logical value, default FALSE - specify TRUE to mark that this data group is an outline group.
    • Typically sets of two or three groups (header/before, value and footer/after) are added for each data value when generating outline groups.
    • Normally, the doNotExpand, sortAnchor and outlineLinkedGroupId parameters are set for an outline group.
    • See the Regular Layout vignette for more information about outline groups.
  • captionTemplate - a character value that specifies the template for the data group caption.
    • The default caption template of {values} simply displays the value without any prefix or suffix.
    • See the Data Groups vignette for an example of setting the caption template.
  • caption - effectively a hard-coded caption that overrides the built-in logic for generating a caption.
  • isTotal - a logical value, default FALSE - specify TRUE to mark that this data group is a total.
  • isLevelSubTotal - a logical value, default FALSE - specify TRUE to mark that this data group is a sub-total within a level.
    • Typically there are multiple sub-totals within a level.
  • isLevelTotal - a logical value, default FALSE - specify TRUE to mark that this data group is level total.
  • calculationGroupName - for calculation groups, this character value specifies the calculation group that calculationName belongs to.
  • calculationName - for calculation groups, this character value specifies the name of the calculation.
  • baseStyleName - the style name for the data group.
  • styleDeclarations - a list of CSS style declarations to overlay on top of the base style.
  • insertAtIndex - the one-based location specifying where to insert the new data group.
  • insertBeforeGroup - a reference data group specifying where to insert the new data group.
  • insertAfterGroup - a reference data group specifying where to insert the new data group.
  • mergeEmptySpace - a character value that specifies how empty space should be merged.
    • This is typically only used with outline groups (so applies to row groups only, not column groups)
    • mergeEmptySpace must be one of “doNotMerge”, “dataGroupsOnly”, “cellsOnly”, “dataGroupsAndCellsAs1” or “dataGroupsAndCellsAs2”.
    • See the Regular Layout vignette for more information.
  • cellBaseStyleName - the style name for cells related to this data group.
  • cellStyleDeclarations - a list of CSS style declarations to overlay on top of the base style for cells related to this data group.
  • sortAnchor - used to specify sort behaviour for outline groups, must be one of “fixed”, “next” or “previous”.
  • outlineLinkedGroupId - used to link an outline group to the value data group which has the child data groups.
  • resetCells - a logical value, default value TRUE - which clears the current pivot table cells (if the pivot table has already been evaluated).
    • Clearing the pivot table cells is normally desirable, since changes to the data groups means they no longer align with the previously calculated cells.

Modifying a Pivot Table

Modifying data group properties

The properties and methods of a data group can be examined in code, e.g.

pt$topRowGroups[[1]]

Some data group properties are read-only (i.e. the R6 active bindings do not support changing the private values), though they will not error if an attempt is made - rather the internal value simply won’t be updated.

It is nonetheless possible to change many properties of the data groups, however, doing so should be done with caution as this may invalidate the state of the pivot table and such code could be broken by future changes to the package.

If you find that modifying data group properties is necessary for your requirements, please log an issue in the package GitHub site/repository to inform the package author of the use-case.

Moving data groups

The data groups are tied into a hierarchy. Moving data groups is not currently supported.

Removing data groups

The following methods can be used to remove groups:

  • grp$removeChildGroup() - removes a child group of the current group, parameters:
    • index - specifies the one-based index of the child to remove.
    • group - specifies the child group to remove.
    • resetCells - a logical value, default value TRUE - which clears the current pivot table cells (if the pivot table has already been evaluated).
  • grp$removeGroup() - removes the current group, parameters:
    • removeAncestorsIfNoRemainingChildren - a logical value, default value FALSE, that specifies whether the parent group should also be removed if it has no remaining child groups.
    • removedRelatedOutlineGroups - a logical value, default value FALSE, that specifies whether the related outline groups (siblings of the current group) should also be removed.
    • resetCells - a logical value, default value TRUE - which clears the current pivot table cells (if the pivot table has already been evaluated).

Modifying data group properties

The properties and methods of a cell can also be examined in code, e.g.

pt$getCell(r=1, c=1)

Again, while some properties are read-only, it is possible to change many properties of the cells, however, doing so should be done with caution as this may invalidate the state of the pivot table and such code could be broken by future changes to the package.

If you find that modifying cell properties is necessary for your requirements, please log an issue in the package GitHub site/repository to inform the package author of the use-case.

Moving cells

The cells are tied into a grid structure. Moving cells is not supported (and unlikely ever to be supported).

Removing cells

Removing individual cells is not currently supported.

Removing rows and columns

It is possible to remove entire rows and columns. This does not reset the cells - so recalculating the pivot table is not needed after removing rows or columns.

Removing rows/columns is discussed in more detail below.

Simple Example of an Irregular Layout

Consider we are only interested in the number of express trains (i.e. TrainCategory=“Express Passenger”) and the number of DMU trains (i.e. PowerType=“DMU”), for each train operating company. Since these requirements relate to two different variables, we need to construct an irregular layout. To do this, these two data groups are added individually to the root column group:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()

The functions pt$addColumnGroup() and pt$addRowGroup() add a single data group at the top level of the pivot table, where the variable name and value are explicitly specified.

No totals column is added to the above pivot table. This wouldn’t make sense for this pivot table anyway, since some express passenger trains are also DMU trains, so a simple total would double count some trains.

Showing the cell context, as described in the Cell Context vignette, makes the irregularity clearer:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot(includeHeaderValues=TRUE, includeRCFilters=TRUE)

Extending the Simple Example

Further data groups can be added to the pivot table. These data groups can be regular or irregular. Several different examples are shown below. These examples are rather contrived for demonstration purposes.

If a regular level is desired, this can simply be added using the regular addColumnDataGroups() and addRowDataGroups() functions, for example adding the train status:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addColumnDataGroups("Status")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()

The addColumnGroup() and addRowGroup() functions return the new data group that has been added. The addChildGroup() function can be called on each of these groups to add further groups underneath:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
cg1 <- pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
cg2 <- pt$addColumnGroup(variableName="PowerType", values="DMU")
cg1$addChildGroup(variableName="Status", values="A")
cg1$addChildGroup(variableName="Status", values="R")
cg2$addChildGroup(variableName="SchedSpeedMPH", values="100")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()

Instead of adding groups one at a time, the addDataGroups() function can be called on these data groups to add different data groups for different variables underneath each:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
cg1 <- pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
cg2 <- pt$addColumnGroup(variableName="PowerType", values="DMU")
cg1$addDataGroups("Status")
cg2$addDataGroups("SchedSpeedMPH")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()

Iterating Groups

The addColumnDataGroups(), addRowDataGroups() and addDataGroups() functions1 all return zero, one or multiple data groups, in the form of an R list. This list can be iterated or used with functions such as lapply() in the usual ways. Each data group has a set of properties that can be used to examine the group. For example, adding further child groups:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
cgrps <- pt$addColumnDataGroups("PowerType", addTotal=FALSE)
add2Groups <- function(grp) {
  if(!grp$isTotal) {
    grp$addChildGroup(variableName="Status", values="A")
    grp$addChildGroup(variableName="Status", values="R")
  }
}
invisible(lapply(cgrps, add2Groups))
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()

Another example - setting the text colour:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
cgrps <- pt$addColumnDataGroups("PowerType")
colorText <- function(grp) {
  if(isTRUE(grp$values=="DMU")) grp$setStyling(list(color="blue"))
  else if(isTRUE(grp$values=="EMU")) grp$setStyling(list(color="green"))
  else if(isTRUE(grp$values=="HST")) grp$setStyling(list(color="red"))
}
invisible(lapply(cgrps, colorText))
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()

Multiple Calculation Groups

Sometimes it is desirable to use different calculations in different parts of the pivot table. Returning to the initial irregular example:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()

Suppose we require the number of express trains but the maximum scheduled speed of the DMU trains. We might naively try the following R:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", caption="Count", 
                     summariseExpression="n()")
pt$defineCalculation(calculationName="MaxSpeedMPH", caption="Maximum Speed", 
                     summariseExpression="max(SchedSpeedMPH, na.rm=TRUE)")
pt$renderPivot()

This has replicated both calculations across the whole pivot table, which is not what we wanted.

Instead the solution is to define an additional2 calculation group, then explicitly add the two calculation groups to the relevant parts of the pivot table:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
cg1 <- pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
cg2 <- pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$defineCalculation(calculationGroupName="calcGrp2", calculationName="MaxSpeedMPH", 
                     summariseExpression="max(SchedSpeedMPH, na.rm=TRUE)")
cg1$addCalculationGroups("default")
cg2$addCalculationGroups("calcGrp2")
pt$renderPivot()

Pivot tables like the above are quite likely to cause confusion. Either the caption of the existing data groups needs changing or additional empty groups with captions should be added. Both of these are demonstrated below.

Changing the data group captions

The example below explicitly specifies the captions of the column data groups.

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
cg1 <- pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger", 
                         caption="Express Passenger (Count)")
cg2 <- pt$addColumnGroup(variableName="PowerType", values="DMU", 
                         caption="DMU (Maximum Speed)")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$defineCalculation(calculationGroupName="calcGrp2", calculationName="MaxSpeedMPH", 
                     summariseExpression="max(SchedSpeedMPH, na.rm=TRUE)")
cg1$addCalculationGroups("default")
cg2$addCalculationGroups("calcGrp2")
pt$renderPivot()

Adding purely descriptive data groups

The example below adds two data groups that display a caption only - purely for display purposes.

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
cg1 <- pt$addColumnGroup(variableName="TrainCategory", values="Express Passenger")
cg2 <- pt$addColumnGroup(variableName="PowerType", values="DMU")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$defineCalculation(calculationGroupName="calcGrp2", calculationName="MaxSpeedMPH", 
                     summariseExpression="max(SchedSpeedMPH, na.rm=TRUE)")
cg3 <- cg1$addChildGroup(caption="Count")
cg4 <- cg2$addChildGroup(caption="Maximum Speed")
cg3$addCalculationGroups("default")
cg4$addCalculationGroups("calcGrp2")
pt$renderPivot()

Combining Multiple Pivot Tables

Example 1

The example below combines two pivot tables in one, calling pt$addRowDataGroups() twice specifying atLevel=1 both times:

# some sample school student grade data
gender <- c("F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", 
            "F", "F", "F", "F", "F", "F", "M", "M", "M", "M", "M", "M", "M", "M", 
            "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M")
age <- c(19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 22, 22, 22, 
         22, 22, 19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 22, 
         22, 22, 22, 22)
grade <- c("A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "A", "B", "C", "D", 
           "E", "A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "A", "B", "C", 
           "D", "E", "A", "B", "C", "D", "E", "A", "B", "C", "D", "E")
counts <- c(6, 16, 56, 37, 213, 14, 21, 61, 45, 191, 30, 54, 74, 82, 246, 91, 46, 
            29, 71, 296, 3, 6, 21, 35, 162, 14, 11, 29, 22, 204, 15, 30, 49, 75, 
            253, 45, 22, 30, 30, 319)
df <- data.frame(gender, age, grade, counts)

# two pivot tables in one
library(pivottabler)
pt <- PivotTable$new()
pt$addData(df)
pt$addColumnDataGroups("grade")
pt$addRowGroup(caption="Age", isEmpty=TRUE, styleDeclarations=list(color="blue"))
pt$addRowDataGroups("age", atLevel=1)
pt$addRowGroup(caption="Gender", isEmpty=TRUE, styleDeclarations=list(color="blue"))
pt$addRowDataGroups("gender", atLevel=1)
pt$defineCalculation(calculationName="GradeCounts", summariseExpression="sum(counts)")
pt$renderPivot()

The example above allows each set of row groups to render a total row. To prevent this and instead render a single total row at the bottom of the pivot table, specify addTotal=FALSE when calling pt$addRowDataGroups(), then call pt$addRowGroup() a third time to add the total row at the bottom:

# some sample school student grade data
gender <- c("F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", 
            "F", "F", "F", "F", "F", "F", "M", "M", "M", "M", "M", "M", "M", "M", 
            "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M")
age <- c(19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 22, 22, 22, 
         22, 22, 19, 19, 19, 19, 19, 20, 20, 20, 20, 20, 21, 21, 21, 21, 21, 22, 
         22, 22, 22, 22)
grade <- c("A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "A", "B", "C", "D", 
           "E", "A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "A", "B", "C", 
           "D", "E", "A", "B", "C", "D", "E", "A", "B", "C", "D", "E")
counts <- c(6, 16, 56, 37, 213, 14, 21, 61, 45, 191, 30, 54, 74, 82, 246, 91, 46, 
            29, 71, 296, 3, 6, 21, 35, 162, 14, 11, 29, 22, 204, 15, 30, 49, 75, 
            253, 45, 22, 30, 30, 319)
df <- data.frame(gender, age, grade, counts)

# two pivot tables in one
library(pivottabler)
pt <- PivotTable$new()
pt$addData(df)
pt$addColumnDataGroups("grade")
pt$addRowGroup(caption="Age", isEmpty=TRUE, isOutline=TRUE, 
               styleDeclarations=list(color="blue"))
pt$addRowDataGroups("age", atLevel=1, addTotal=FALSE)
pt$addRowGroup(caption="Gender", isEmpty=TRUE, isOutline=TRUE, 
               styleDeclarations=list(color="blue"))
pt$addRowDataGroups("gender", atLevel=1, addTotal=FALSE)
pt$defineCalculation(calculationName="GradeCounts", summariseExpression="sum(counts)")
pt$addRowGroup(caption="Total", isOutline=TRUE, isTotal=TRUE, isLevelTotal=TRUE,
               styleDeclarations=list(color="blue"), cellStyleDeclarations=list(color="blue"))
pt$renderPivot()

Example 2

Another example of a pivot table that effectively contains two pivot tables:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
# Rows: TOC breakdown
grp1 <- pt$addRowGroup(caption="By TOC")
grp1$addDataGroups("TOC", addTotal=FALSE)
# Rows: Power Type breakdown
grp2 <- pt$addRowGroup(caption="By Power Type")
grp2$addDataGroups("PowerType", addTotal=FALSE)
# Rows: Total
grp3 <- pt$addRowGroup(caption="Total")
# Row Group Headings
pt$setRowDataGroupHeader(levelNumber=1, header="Breakdown")
pt$setRowDataGroupHeader(levelNumber=2, header="Subset")
# Finish...
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$theme <- getStandardTableTheme(pt)
pt$renderPivot(showRowGroupHeaders=TRUE)

Rendering the same table as in outline layout:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
# Rows: TOC breakdown
grp1a <- pt$addRowGroup(caption="By TOC", isOutline=TRUE, isEmpty=TRUE, 
                        sortAnchor="next", styleDeclarations=list(color="blue"))
grp1b <- pt$addRowGroup()
grp1a$outlineLinkedGroupId <- grp1b$instanceId
grp1b$addDataGroups("TOC", addTotal=FALSE)
# Rows: Power Type breakdown
grp2a <- pt$addRowGroup(caption="By Power Type", isOutline=TRUE, isEmpty=TRUE, 
                        sortAnchor="next", styleDeclarations=list(color="blue"))
grp2b <- pt$addRowGroup()
grp2a$outlineLinkedGroupId <- grp2b$instanceId
grp2b$addDataGroups("PowerType", addTotal=FALSE)
# Rows: Total
grp3 <- pt$addRowGroup(caption="Total", isOutline=TRUE, sortAnchor="fixed",
                       styleDeclarations=list(color="blue"))
# Finish...
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()

In the example above, sortAnchor links that data group to the next/previous group which is also linked via outlineLinkedGroupId. These are set so that the outline group and those groups beneath it stay together when the data groups in the pivot table are sorted. If there is no intention to sort the pivot table using pt$sortRowDataGroups() then these arguments/lines are not needed.

Custom Layout Changes

Working with empty rows and/or columns

Sometimes, after a pivot table is evaluated, some rows and/or columns will contain no calculation results (i.e. NULL values). For example:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$addRowDataGroups("PowerType", onlyCombinationsThatExist=FALSE)
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
pt$renderPivot()

The rows/columns containing no calculation values can be found using the pt$getEmptyRows() or pt$getEmptyColumns() functions:

cat("Empty rows: ", paste(pt$getEmptyRows(), collapse=", "))
## Empty rows:  2, 3, 6, 11, 15

“Empty” is defined as the following cell values:

  • NULL is always interpreted as empty.
  • NA is interpreted as empty by default, though this can be overridden using the argument NAasEmpty=FALSE.
  • 0 is NOT interpreted as empty by default, though this can be overridden using the argument zeroAsEmpty=TRUE.

The empty rows/columns can also be easily removed as described in the next section.

Removing rows and/or columns

It is possible to remove specific rows and/or columns from a pivot table using any one of the following:

  • pt$removeRow(2) removes the second row from the pivot table.
  • pt$removeRows(c(2, 4)) removes the second and fourth rows from the pivot table.
  • pt$removeEmptyRows() removes all rows from the pivot table where all cells (i.e. calculation values) have no value (i.e. are NULL).
  • pt$removeColumn(2) removes the second column from the pivot table.
  • pt$removeColumns(c(2, 4)) removes the second and fourth columns from the pivot table.
  • pt$removeEmptyColumns() removes all columns from the pivot table where all cells (i.e. calculation values) have no value (i.e. are NULL).

Note the row/column numbers above do not include the headings.

pt$removeEmptyRows() and pt$removeEmptyColumns() also support the NAasEmpty and zeroAsEmpty arguments as described in the previous section.

As an example of removing rows, consider the second and fourth rows highlighted in the pivot table below:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$getCells(rowNumbers=c(2, 4))
pt$setStyling(cells=cells, declarations=list("background-color"="#FFCC66"))
pt$renderPivot()

To remove these rows:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
pt$removeRows(rowNumbers=c(2, 4))
pt$renderPivot()

Note that removing rows and/or columns does NOT recalculate the other cells in the pivot table, e.g. the values of the totals are not updated.

Converting to a basictabler table

The asBasicTable() function allows a pivot table to be converted to a basic table - from the basictabler package.

The basictabler package allows free-form tables to be constructed, in contrast to pivottabler which creates pivot tables with relatively fixed structures. pivottabler contains calculation logic - to calculate the values of cells within the pivot table. basictabler contains no calculation logic - cell values must be provided either from a data frame, row-by-row, column-by-column or cell-by-cell.

Converting a pivot table to a basic table allows the structure of pivot tables to be altered after they have been created, e.g.

library(pivottabler)
library(dplyr)
library(lubridate)
trains <- mutate(bhmtrains, 
                 GbttDate=if_else(is.na(GbttArrival), GbttDeparture, GbttArrival),
                 GbttMonth=make_date(year=year(GbttDate), month=month(GbttDate), day=1))

pt <- PivotTable$new()
pt$addData(trains)
pt$addColumnDataGroups("GbttMonth", dataFormat=list(format="%B %Y"))
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()

# convert the pivot table to a basic table, insert a new row, merge cells and highlight
bt <- pt$asBasicTable()
bt$cells$insertRow(5)
bt$cells$setCell(5, 2, rawValue="The values below are significantly higher than expected.", 
                 styleDeclarations=list("text-align"="left", "background-color"="yellow",
                                        "font-weight"="bold", "font-style"="italic"))
bt$mergeCells(rFrom=5, cFrom=2, rSpan=1, cSpan=13)
bt$setStyling(rFrom=6, cFrom=2, rTo=6, cTo=14, 
              declarations=list("text-align"="left", "background-color"="yellow"))
bt$renderTable()

  1. Use the addColumnDataGroups() and addRowDataGroups() functions against the pivot table. Use the addDataGroups() function against data groups.↩︎

  2. Every pivot table has a default calculation group named “default”.↩︎