vignettes/vA3-appendix.Rmd
vA3-appendix.Rmd
The pivottabler
package is comprised of R6 classes. This
section provides a short description of the main classes, their function
and relationships in the package.
The PivotTable
class represents a single pivot table. In
all of the example code, the instance of the pivot table is represented
by the pt
variable, on which various functions/methods are
invoked e.g. pivot tables are created by:
pt <- PivotTable$new()
pt$addData(bhmtrains)
...
The PivotData
class contains the references to the
source data (i.e. data frames) used to build a pivot table and is
accessible via pt$data
.
A pivot table can be built from more than one data frame, though this is typically only viable if the variable names and values in the data frames are consistent - e.g. if a given variable is present in multiple data frames, it should have the same name in all of them.
A pivot table has a set of column headings and a set of row headings,
each of which is an instance of the PivotDataGroup
class. A
data group can (and normally does) have child groups.
The data groups exist in a hierarchy. A single invisible data group
pt$rowGroup
acts as the top-level parent for the row
groups, and another single invisible data group
pt$columnGroup
acts as the top-level parent for the column
groups.
The first level of visible data groups headings can be accessed via
pt$topRowGroups
or pt$topColumnGroups
. The
leaf level data groups can be accessed using
pt$leafRowGroups
or pt$leafColumnGroups
.
The parent of a given PivotDataGroup
can be accessed
using the parentGroup
property.
The children for a given PivotDataGroup
can be accessed
using the childGroups
property, e.g. the child data groups
of a data group in the top level level can then be accessed via
pt$topRowGroups[[1]]$childGroups
.
Calling pt$addColumnDataGroups(...)
or
pt$addRowDataGroups(...)
generates a new level of child
data groups. Each child data group is also an instance of the
PivotDataGroup
.
The display value of a data group is accessible via the caption
property, e.g. pt$topRowGroups[[1]]$caption
.
See the Data Groups vignette for more details.
A pivot table normally has one or more calculations. Each calculation
definition is an instance of the PivotCalculation
class.
Calculation definitions are grouped together by a
PivotCalculationGroup
, of which there is normally only a
single default group in a pivot table. If this calculation group
contains three calculation definitions, then in the populated pivot
table there will be three cells underneath each data group (one for each
calculation), e.g.
library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="NumberOfTrains", caption="Number of Trains",
summariseExpression="n()")
pt$defineCalculation(calculationName="MinimumpeedMPH", caption="Minimum Speed (MPH)",
summariseExpression="min(SchedSpeedMPH, na.rm=TRUE)")
pt$defineCalculation(calculationName="MaximumSpeedMPH", caption="Maximum Speed (MPH)",
summariseExpression="max(SchedSpeedMPH, na.rm=TRUE)")
pt$renderPivot()
It is possible (though not typical) for a pivot table to contain multiple calculation groups. See the Irregular Layout vignette for an example.
pt$calculationGroups
contains the calculation groups
present in a pivot table. The calculation definitions within the default
group are accessed via
pt$calculationGroups$defaultGroup$calculations
.
see the Calculations vignette for more details.
A pivot table contains a set of cells represented by the
PivotCells
class. The cells of a pivot table are accessed
via pt$cells
. The dimensions of a pivot table (excluding
data group headings) can be checked with pt$rowCount
and
pt$columnCount
.
Individual cells are represented by the PivotCell
class.
The easiest way to access individual cells (i.e. individual
PivotCell
objects) is
pt$cells$getCell(r, c)
.
Within the PivotCells
object is a list of rows, each of
which is a list of PivotCell
objects. So a more direct (but
less-safe) way to access individual cells is
pt$cells$rows[[r]][[c]]
.
Each cell has a raw value (typically of data type numeric) and a
formatted value (typically of data type character). These can be
accessed via cell$rawValue
and
cell$formattedValue
respectively.
It is possible to find the leaf-level data groups (i.e. right-most
row heading and bottom column heading) that relate to a cell using
cell$rowLeafGroup
and cell$columnLeafGroup
. If
the data group has other parent groups, then these can be accessed
recursively via cell$rowLeafGroup
,
cell$rowLeafGroup$parentGroup
,
cell$rowLeafGroup$parentGroup$parentGroup
, etc.
The Finding and Formatting vignette describes other ways of accessing cells and data group headings.
Each data group typically acts as a filter for the data in that
row/column of the pivot table. E.g. a heading of “France” typically
implies “Country=France”. This filter condition is represented by the
PivotFilter
class. Every cell in a pivot table has a set of
filters (one filter from each row/column heading) and these filters are
represented by the PivotFilters
class. See the Cell Context vignette for more
details.
The PivotFilterOverrides
class provides a mechanism for
individual calculation definitions to override the default filters
associated with a cell. See the Calculations and Appendix: Calculations vignettes for more
details.
The PivotCalculator
class provides much of the
functionality for calculating the values in a pivot table. Users of the
pivottabler
package typically do not directly interact with
this internal class.
The PivotBatch
, PivotBatchCalculator
and
PivotBatchStatistics
classes provide functionality for
calculating the values of batches of cells in one/a small number of
dplyr/data.table calculations. These classes are also internal.
The PivotStyle
class represents a list of style
declarations in the form of name-value pairs (similar to a CSS style
declaration) that would be applied to a single data group or cell in the
pivot table (or sets of data groups / cells).
The PivotStyles
class is a set of styles that are
applied to different parts of a pivot table (headings, cells, totals,
etc). The PivotStyles
class can be thought of as a theme
applied to a pivot table.
The PivotOpenXlsxStyle
and
PivotOpenXlsxStyles
classes are similar, except these are
specific to Excel export.
See the Styling vignette for more details.
The PivotHtmlRenderer
, PivotLatexRenderer
and PivotOpenXlsxRenderer
classes provide rendering logic
for each of the formats that a pivot table can be output in.
See the Outputs vignette for more details.
PivotTable
class represents a single pivot table,
typically named pt
in the package examples.pt$data
is an instance of the PivotData
class that wraps the data frames used to build the pivot table.pt$rowGroup
and pt$columnGroup
are the
invisible top-level data groups on each axis - each is an instance of
the PivotDataGroup
class.pt$topRowGroups
and pt$topColumnGroups
provide the direct way to access the first visible level of row or
column data groups in a pivot table.pt$leafRowGroups
and pt$leafColumnGroups
provide the direct way to access the lowest level of row or column data
groups in a pivot table.PivotDataGroup
can be accessed
using the parentGroup
property.pt$topRowGroups[[1]]$childGroups[[1]]$childGroups
,
etc.pt$calculationGroups
contains the calculation groups
present in a pivot table. The calculation definitions within the default
calculation group are accessed via
pt$calculationGroups$defaultGroup$calculations
.pt$rowCount
and
pt$columnCount
.PivotCell
class and can be accessed via
pt$cells$getCell(r, c)
.cell$rowLeafGroup
,
cell$rowLeafGroup$parentGroup
,
cell$rowLeafGroup$parentGroup$parentGroup
, etc.