User Guide: 1 Debugging ggplots
‘gginnards’ 0.2.0
Pedro J. Aphalo
2024-06-27
Source:vignettes/user-guide-1.Rmd
Preliminaries
library(gginnards)
## Loading required package: ggplot2
We generate some artificial data.
set.seed(4321)
# generate artificial data
x <- 1:100
y <- (x + x^2 + x^3) + rnorm(length(x), mean = 0, sd = mean(x^3) / 4)
my.data <- data.frame(x,
                      y,
                      group = c("A", "B"),
                      y2 = y * c(0.5, 2),
                      block = c("a", "a", "b", "b"))
We change the default theme to an uncluttered one.
ggplot construction
Package ‘ggplot2’ defines its own class system, and function ggplot() can be considered a constructor.
## [1] "gg" "ggplot"
These objects contain all the information needed to render a plot into graphical output, but not the rendered plot itself. They are list-like objects with heterogeneous named members.
The structure of objects of class "ggplot" can be explored with R’s method str(), as is the case for any structured R object. Package ‘gginnards’ defines a specialization of str() for class "ggplot". Our str() allows us to see the different slots of this special type of list. The difference from the default str() method is in the values of default arguments, and in the ability to control which components or members are displayed.
We will use str() to follow the step-by-step construction of a "ggplot" object.
If we pass no arguments to the ggplot() constructor, an empty plot will be rendered if we print it.
p0 <- ggplot()
p0
Object p0 contains members, but "data", "layers", "mapping", "theme" and "labels" are empty lists.
str(p0)
## Object size: 4.4 kB
## List of 11
## $ data : list()
## $ layers : list()
## $ scales :Classes 'ScalesList', 'ggproto', 'gg' <ggproto object: Class ScalesList, gg>
## add: function
## add_defaults: function
## add_missing: function
## backtransform_df: function
## clone: function
## find: function
## get_scales: function
## has_scale: function
## input: function
## map_df: function
## n: function
## non_position_scales: function
## scales: NULL
## train_df: function
## transform_df: function
## super: <ggproto object: Class ScalesList, gg>
## $ guides :Classes 'Guides', 'ggproto', 'gg' <ggproto object: Class Guides, gg>
## add: function
## assemble: function
## build: function
## draw: function
## get_custom: function
## get_guide: function
## get_params: function
## get_position: function
## guides: NULL
## merge: function
## missing: <ggproto object: Class GuideNone, Guide, gg>
## add_title: function
## arrange_layout: function
## assemble_drawing: function
## available_aes: any
## build_decor: function
## build_labels: function
## build_ticks: function
## build_title: function
## draw: function
## draw_early_exit: function
## elements: list
## extract_decor: function
## extract_key: function
## extract_params: function
## get_layer_key: function
## hashables: list
## measure_grobs: function
## merge: function
## override_elements: function
## params: list
## process_layers: function
## setup_elements: function
## setup_params: function
## train: function
## transform: function
## super: <ggproto object: Class GuideNone, Guide, gg>
## package_box: function
## print: function
## process_layers: function
## setup: function
## subset_guides: function
## train: function
## update_params: function
## super: <ggproto object: Class Guides, gg>
## $ mapping : Named list()
## $ theme : list()
## $ coordinates:Classes 'CoordCartesian', 'Coord', 'ggproto', 'gg' <ggproto object: Class CoordCartesian, Coord, gg>
## aspect: function
## backtransform_range: function
## clip: on
## default: TRUE
## distance: function
## expand: TRUE
## is_free: function
## is_linear: function
## labels: function
## limits: list
## modify_scales: function
## range: function
## render_axis_h: function
## render_axis_v: function
## render_bg: function
## render_fg: function
## setup_data: function
## setup_layout: function
## setup_panel_guides: function
## setup_panel_params: function
## setup_params: function
## train_panel_guides: function
## transform: function
## super: <ggproto object: Class CoordCartesian, Coord, gg>
## $ facet :Classes 'FacetNull', 'Facet', 'ggproto', 'gg' <ggproto object: Class FacetNull, Facet, gg>
## compute_layout: function
## draw_back: function
## draw_front: function
## draw_labels: function
## draw_panels: function
## finish_data: function
## init_scales: function
## map_data: function
## params: list
## setup_data: function
## setup_params: function
## shrink: TRUE
## train_scales: function
## vars: function
## super: <ggproto object: Class FacetNull, Facet, gg>
## $ plot_env :<environment: R_GlobalEnv>
## $ layout :Classes 'Layout', 'ggproto', 'gg' <ggproto object: Class Layout, gg>
## coord: NULL
## coord_params: list
## facet: NULL
## facet_params: list
## finish_data: function
## get_scales: function
## layout: NULL
## map_position: function
## panel_params: NULL
## panel_scales_x: NULL
## panel_scales_y: NULL
## render: function
## render_labels: function
## reset_scales: function
## resolve_label: function
## setup: function
## setup_panel_guides: function
## setup_panel_params: function
## train_position: function
## super: <ggproto object: Class Layout, gg>
## $ labels : Named list()
If we pass an argument to parameter data, the data is copied into the list slot with name data. As we also map the data to aesthetics, this mapping is stored in slot mapping.
p1 <- ggplot(my.data, aes(x, y, colour = group))
str(p1)
## Object size: 12 kB
## List of 11
## $ data :'data.frame': 100 obs. of 5 variables:
## $ layers : list()
## $ scales :Classes 'ScalesList', 'ggproto', 'gg' <ggproto object: Class ScalesList, gg>
## add: function
## add_defaults: function
## add_missing: function
## backtransform_df: function
## clone: function
## find: function
## get_scales: function
## has_scale: function
## input: function
## map_df: function
## n: function
## non_position_scales: function
## scales: NULL
## train_df: function
## transform_df: function
## super: <ggproto object: Class ScalesList, gg>
## $ guides :Classes 'Guides', 'ggproto', 'gg' <ggproto object: Class Guides, gg>
## add: function
## assemble: function
## build: function
## draw: function
## get_custom: function
## get_guide: function
## get_params: function
## get_position: function
## guides: NULL
## merge: function
## missing: <ggproto object: Class GuideNone, Guide, gg>
## add_title: function
## arrange_layout: function
## assemble_drawing: function
## available_aes: any
## build_decor: function
## build_labels: function
## build_ticks: function
## build_title: function
## draw: function
## draw_early_exit: function
## elements: list
## extract_decor: function
## extract_key: function
## extract_params: function
## get_layer_key: function
## hashables: list
## measure_grobs: function
## merge: function
## override_elements: function
## params: list
## process_layers: function
## setup_elements: function
## setup_params: function
## train: function
## transform: function
## super: <ggproto object: Class GuideNone, Guide, gg>
## package_box: function
## print: function
## process_layers: function
## setup: function
## subset_guides: function
## train: function
## update_params: function
## super: <ggproto object: Class Guides, gg>
## $ mapping :List of 3
## $ theme : list()
## $ coordinates:Classes 'CoordCartesian', 'Coord', 'ggproto', 'gg' <ggproto object: Class CoordCartesian, Coord, gg>
## aspect: function
## backtransform_range: function
## clip: on
## default: TRUE
## distance: function
## expand: TRUE
## is_free: function
## is_linear: function
## labels: function
## limits: list
## modify_scales: function
## range: function
## render_axis_h: function
## render_axis_v: function
## render_bg: function
## render_fg: function
## setup_data: function
## setup_layout: function
## setup_panel_guides: function
## setup_panel_params: function
## setup_params: function
## train_panel_guides: function
## transform: function
## super: <ggproto object: Class CoordCartesian, Coord, gg>
## $ facet :Classes 'FacetNull', 'Facet', 'ggproto', 'gg' <ggproto object: Class FacetNull, Facet, gg>
## compute_layout: function
## draw_back: function
## draw_front: function
## draw_labels: function
## draw_panels: function
## finish_data: function
## init_scales: function
## map_data: function
## params: list
## setup_data: function
## setup_params: function
## shrink: TRUE
## train_scales: function
## vars: function
## super: <ggproto object: Class FacetNull, Facet, gg>
## $ plot_env :<environment: R_GlobalEnv>
## $ layout :Classes 'Layout', 'ggproto', 'gg' <ggproto object: Class Layout, gg>
## coord: NULL
## coord_params: list
## facet: NULL
## facet_params: list
## finish_data: function
## get_scales: function
## layout: NULL
## map_position: function
## panel_params: NULL
## panel_scales_x: NULL
## panel_scales_y: NULL
## render: function
## render_labels: function
## reset_scales: function
## resolve_label: function
## setup: function
## setup_panel_guides: function
## setup_panel_params: function
## train_position: function
## super: <ggproto object: Class Layout, gg>
## $ labels :List of 3
str(p1, max.level = 2, components = "data")
## Object size: 5.3 kB
## List of 1
## $ data:'data.frame': 100 obs. of 5 variables:
## ..$ x : int [1:100] 1 2 3 4 5 ...
## ..$ y : num [1:100] -27205 -14243 ...
## ..$ group: chr [1:100] "A" "B" ...
## ..$ y2 : num [1:100] -13603 -28485 ...
## ..$ block: chr [1:100] "a" "a" ...
A geometry adds a layer.
p2 <- p1 + geom_point()
str(p2)
## Object size: 12.5 kB
## List of 11
## $ data :'data.frame': 100 obs. of 5 variables:
## $ layers :List of 1
## $ scales :Classes 'ScalesList', 'ggproto', 'gg' <ggproto object: Class ScalesList, gg>
## add: function
## add_defaults: function
## add_missing: function
## backtransform_df: function
## clone: function
## find: function
## get_scales: function
## has_scale: function
## input: function
## map_df: function
## n: function
## non_position_scales: function
## scales: list
## train_df: function
## transform_df: function
## super: <ggproto object: Class ScalesList, gg>
## $ guides :Classes 'Guides', 'ggproto', 'gg' <ggproto object: Class Guides, gg>
## add: function
## assemble: function
## build: function
## draw: function
## get_custom: function
## get_guide: function
## get_params: function
## get_position: function
## guides: NULL
## merge: function
## missing: <ggproto object: Class GuideNone, Guide, gg>
## add_title: function
## arrange_layout: function
## assemble_drawing: function
## available_aes: any
## build_decor: function
## build_labels: function
## build_ticks: function
## build_title: function
## draw: function
## draw_early_exit: function
## elements: list
## extract_decor: function
## extract_key: function
## extract_params: function
## get_layer_key: function
## hashables: list
## measure_grobs: function
## merge: function
## override_elements: function
## params: list
## process_layers: function
## setup_elements: function
## setup_params: function
## train: function
## transform: function
## super: <ggproto object: Class GuideNone, Guide, gg>
## package_box: function
## print: function
## process_layers: function
## setup: function
## subset_guides: function
## train: function
## update_params: function
## super: <ggproto object: Class Guides, gg>
## $ mapping :List of 3
## $ theme : list()
## $ coordinates:Classes 'CoordCartesian', 'Coord', 'ggproto', 'gg' <ggproto object: Class CoordCartesian, Coord, gg>
## aspect: function
## backtransform_range: function
## clip: on
## default: TRUE
## distance: function
## expand: TRUE
## is_free: function
## is_linear: function
## labels: function
## limits: list
## modify_scales: function
## range: function
## render_axis_h: function
## render_axis_v: function
## render_bg: function
## render_fg: function
## setup_data: function
## setup_layout: function
## setup_panel_guides: function
## setup_panel_params: function
## setup_params: function
## train_panel_guides: function
## transform: function
## super: <ggproto object: Class CoordCartesian, Coord, gg>
## $ facet :Classes 'FacetNull', 'Facet', 'ggproto', 'gg' <ggproto object: Class FacetNull, Facet, gg>
## compute_layout: function
## draw_back: function
## draw_front: function
## draw_labels: function
## draw_panels: function
## finish_data: function
## init_scales: function
## map_data: function
## params: list
## setup_data: function
## setup_params: function
## shrink: TRUE
## train_scales: function
## vars: function
## super: <ggproto object: Class FacetNull, Facet, gg>
## $ plot_env :<environment: R_GlobalEnv>
## $ layout :Classes 'Layout', 'ggproto', 'gg' <ggproto object: Class Layout, gg>
## coord: NULL
## coord_params: list
## facet: NULL
## facet_params: list
## finish_data: function
## get_scales: function
## layout: NULL
## map_position: function
## panel_params: NULL
## panel_scales_x: NULL
## panel_scales_y: NULL
## render: function
## render_labels: function
## reset_scales: function
## resolve_label: function
## setup: function
## setup_panel_guides: function
## setup_panel_params: function
## train_position: function
## super: <ggproto object: Class Layout, gg>
## $ labels :List of 3
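The effect of adding a geometry can also be checked directly on the layers member, without printing the whole structure. The following is a minimal sketch, assuming a ‘ggplot2’ version in which plot members are accessible with the $ operator (as in the str() output above); the object names p.empty and p.points are used only for illustration.

```r
library(ggplot2)

# A plot without layers and a plot with a single point layer.
p.empty <- ggplot()
p.points <- ggplot(mpg, aes(cyl, hwy)) + geom_point()

# Number of layers stored in each plot object.
length(p.empty$layers)
length(p.points$layers)

# Each layer stores the geom and stat it uses as ggproto objects.
class(p.points$layers[[1]]$geom)
class(p.points$layers[[1]]$stat)
```

Each call to a geom or stat function appends one element to the layers list, in the order in which the layers were added to the plot.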
A summary() method that produces a more compact output is available in recent versions of ‘ggplot2’. However, it does not reveal the internal structure of the objects.
summary(p2)
## data: x, y, group, y2, block [100x5]
## mapping: x = ~x, y = ~y, colour = ~group
## faceting: <ggproto object: Class FacetNull, Facet, gg>
## compute_layout: function
## draw_back: function
## draw_front: function
## draw_labels: function
## draw_panels: function
## finish_data: function
## init_scales: function
## map_data: function
## params: list
## setup_data: function
## setup_params: function
## shrink: TRUE
## train_scales: function
## vars: function
## super: <ggproto object: Class FacetNull, Facet, gg>
## -----------------------------------
## geom_point: na.rm = FALSE
## stat_identity: na.rm = FALSE
## position_identity
str(p2, max.level = 2, components = "mapping")
## Object size: 3 kB
## List of 1
## $ mapping:List of 3
## ..$ x : language ~x
## ..$ y : language ~y
## ..$ colour: language ~group
p3 <- p2 + theme_classic()
str(p3)
## Object size: 88.4 kB
## List of 11
## $ data :'data.frame': 100 obs. of 5 variables:
## $ layers :List of 1
## $ scales :Classes 'ScalesList', 'ggproto', 'gg' <ggproto object: Class ScalesList, gg>
## add: function
## add_defaults: function
## add_missing: function
## backtransform_df: function
## clone: function
## find: function
## get_scales: function
## has_scale: function
## input: function
## map_df: function
## n: function
## non_position_scales: function
## scales: list
## train_df: function
## transform_df: function
## super: <ggproto object: Class ScalesList, gg>
## $ guides :Classes 'Guides', 'ggproto', 'gg' <ggproto object: Class Guides, gg>
## add: function
## assemble: function
## build: function
## draw: function
## get_custom: function
## get_guide: function
## get_params: function
## get_position: function
## guides: NULL
## merge: function
## missing: <ggproto object: Class GuideNone, Guide, gg>
## add_title: function
## arrange_layout: function
## assemble_drawing: function
## available_aes: any
## build_decor: function
## build_labels: function
## build_ticks: function
## build_title: function
## draw: function
## draw_early_exit: function
## elements: list
## extract_decor: function
## extract_key: function
## extract_params: function
## get_layer_key: function
## hashables: list
## measure_grobs: function
## merge: function
## override_elements: function
## params: list
## process_layers: function
## setup_elements: function
## setup_params: function
## train: function
## transform: function
## super: <ggproto object: Class GuideNone, Guide, gg>
## package_box: function
## print: function
## process_layers: function
## setup: function
## subset_guides: function
## train: function
## update_params: function
## super: <ggproto object: Class Guides, gg>
## $ mapping :List of 3
## $ theme :List of 136
## $ coordinates:Classes 'CoordCartesian', 'Coord', 'ggproto', 'gg' <ggproto object: Class CoordCartesian, Coord, gg>
## aspect: function
## backtransform_range: function
## clip: on
## default: TRUE
## distance: function
## expand: TRUE
## is_free: function
## is_linear: function
## labels: function
## limits: list
## modify_scales: function
## range: function
## render_axis_h: function
## render_axis_v: function
## render_bg: function
## render_fg: function
## setup_data: function
## setup_layout: function
## setup_panel_guides: function
## setup_panel_params: function
## setup_params: function
## train_panel_guides: function
## transform: function
## super: <ggproto object: Class CoordCartesian, Coord, gg>
## $ facet :Classes 'FacetNull', 'Facet', 'ggproto', 'gg' <ggproto object: Class FacetNull, Facet, gg>
## compute_layout: function
## draw_back: function
## draw_front: function
## draw_labels: function
## draw_panels: function
## finish_data: function
## init_scales: function
## map_data: function
## params: list
## setup_data: function
## setup_params: function
## shrink: TRUE
## train_scales: function
## vars: function
## super: <ggproto object: Class FacetNull, Facet, gg>
## $ plot_env :<environment: R_GlobalEnv>
## $ layout :Classes 'Layout', 'ggproto', 'gg' <ggproto object: Class Layout, gg>
## coord: NULL
## coord_params: list
## facet: NULL
## facet_params: list
## finish_data: function
## get_scales: function
## layout: NULL
## map_position: function
## panel_params: NULL
## panel_scales_x: NULL
## panel_scales_y: NULL
## render: function
## render_labels: function
## reset_scales: function
## resolve_label: function
## setup: function
## setup_panel_guides: function
## setup_panel_params: function
## train_position: function
## super: <ggproto object: Class Layout, gg>
## $ labels :List of 3
Themes are stored as nested lists. To keep the output short we use max.level = 2, although max.level = 3 would be needed to see all nested members.
str(p3, max.level = 2, components = "theme")
## Object size: 76.2 kB
## List of 1
## $ theme:List of 136
## ..$ line :List of 6
## ..$ rect :List of 5
## ..$ text :List of 11
## ..$ title : NULL
## ..$ aspect.ratio : NULL
## ..$ axis.title : NULL
## ..$ axis.title.x :List of 11
## ..$ axis.title.x.top :List of 11
## ..$ axis.title.x.bottom : NULL
## ..$ axis.title.y :List of 11
## ..$ axis.title.y.left : NULL
## ..$ axis.title.y.right :List of 11
## ..$ axis.text :List of 11
## ..$ axis.text.x :List of 11
## ..$ axis.text.x.top :List of 11
## ..$ axis.text.x.bottom : NULL
## ..$ axis.text.y :List of 11
## ..$ axis.text.y.left : NULL
## ..$ axis.text.y.right :List of 11
## ..$ axis.text.theta : NULL
## ..$ axis.text.r :List of 11
## ..$ axis.ticks :List of 6
## ..$ axis.ticks.x : NULL
## ..$ axis.ticks.x.top : NULL
## ..$ axis.ticks.x.bottom : NULL
## ..$ axis.ticks.y : NULL
## ..$ axis.ticks.y.left : NULL
## ..$ axis.ticks.y.right : NULL
## ..$ axis.ticks.theta : NULL
## ..$ axis.ticks.r : NULL
## ..$ axis.minor.ticks.x.top : NULL
## ..$ axis.minor.ticks.x.bottom : NULL
## ..$ axis.minor.ticks.y.left : NULL
## ..$ axis.minor.ticks.y.right : NULL
## ..$ axis.minor.ticks.theta : NULL
## ..$ axis.minor.ticks.r : NULL
## ..$ axis.ticks.length : 'simpleUnit' num 2.75points
## ..$ axis.ticks.length.x : NULL
## ..$ axis.ticks.length.x.top : NULL
## ..$ axis.ticks.length.x.bottom : NULL
## ..$ axis.ticks.length.y : NULL
## ..$ axis.ticks.length.y.left : NULL
## ..$ axis.ticks.length.y.right : NULL
## ..$ axis.ticks.length.theta : NULL
## ..$ axis.ticks.length.r : NULL
## ..$ axis.minor.ticks.length : 'rel' num 0.75
## ..$ axis.minor.ticks.length.x : NULL
## ..$ axis.minor.ticks.length.x.top : NULL
## ..$ axis.minor.ticks.length.x.bottom: NULL
## ..$ axis.minor.ticks.length.y : NULL
## ..$ axis.minor.ticks.length.y.left : NULL
## ..$ axis.minor.ticks.length.y.right : NULL
## ..$ axis.minor.ticks.length.theta : NULL
## ..$ axis.minor.ticks.length.r : NULL
## ..$ axis.line :List of 6
## ..$ axis.line.x : NULL
## ..$ axis.line.x.top : NULL
## ..$ axis.line.x.bottom : NULL
## ..$ axis.line.y : NULL
## ..$ axis.line.y.left : NULL
## ..$ axis.line.y.right : NULL
## ..$ axis.line.theta : NULL
## ..$ axis.line.r : NULL
## ..$ legend.background :List of 5
## ..$ legend.margin : 'margin' num [1:4] 5.5points 5.5points 5.5points 5.5points
## ..$ legend.spacing : 'simpleUnit' num 11points
## ..$ legend.spacing.x : NULL
## ..$ legend.spacing.y : NULL
## ..$ legend.key : NULL
## ..$ legend.key.size : 'simpleUnit' num 1.2lines
## ..$ legend.key.height : NULL
## ..$ legend.key.width : NULL
## ..$ legend.key.spacing : 'simpleUnit' num 5.5points
## ..$ legend.key.spacing.x : NULL
## ..$ legend.key.spacing.y : NULL
## ..$ legend.frame : NULL
## ..$ legend.ticks : NULL
## ..$ legend.ticks.length : 'rel' num 0.2
## ..$ legend.axis.line : NULL
## ..$ legend.text :List of 11
## ..$ legend.text.position : NULL
## ..$ legend.title :List of 11
## ..$ legend.title.position : NULL
## ..$ legend.position : chr "right"
## ..$ legend.position.inside : NULL
## ..$ legend.direction : NULL
## ..$ legend.byrow : NULL
## ..$ legend.justification : chr "center"
## ..$ legend.justification.top : NULL
## ..$ legend.justification.bottom : NULL
## ..$ legend.justification.left : NULL
## ..$ legend.justification.right : NULL
## ..$ legend.justification.inside : NULL
## ..$ legend.location : NULL
## ..$ legend.box : NULL
## ..$ legend.box.just : NULL
## ..$ legend.box.margin : 'margin' num [1:4] 0cm 0cm 0cm 0cm
## ..$ legend.box.background : list()
## ..$ legend.box.spacing : 'simpleUnit' num 11points
## .. [list output truncated]
Data mappings in ggplots
How does mapping work? Geometries (geoms) and statistics (stats) do not “see” the original variable names; instead, the columns of the data passed to them are named after the aesthetics to which the user's variables are mapped. Geoms and stats work in tandem, with geoms doing the actual plotting and stats summarizing or transforming the data. It can be instructive to see what data is received as input by a geom or stat, and what data is returned by a stat.
Both geoms and stats can have either panel or group functions. Panel functions receive as input the subset of the data that corresponds to a whole panel, mapped to the aesthetics and with factors indicating the grouping (set by the user by mapping to a discrete scale). Group functions receive as input the subset of the data corresponding to a single group based on the mapping, and are called once for each group present in a panel.
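The once-per-group calling of group functions can be observed with a small custom statistic. This is only an illustrative sketch: StatRecordRows and rows.per.call are hypothetical names that are not part of ‘ggplot2’ or ‘gginnards’, and the stat simply returns its input unchanged after recording how many rows it received.

```r
library(ggplot2)

# Records the number of rows seen in each call to compute_group().
rows.per.call <- integer(0)

# Hypothetical stat for illustration: relays the data unchanged.
StatRecordRows <- ggproto("StatRecordRows", Stat,
  required_aes = c("x", "y"),
  compute_group = function(data, scales) {
    rows.per.call <<- c(rows.per.call, nrow(data))
    data
  }
)

df <- data.frame(x = 1:6, y = 1:6, g = rep(c("A", "B"), 3))

p <- ggplot(df, aes(x, y, colour = g)) +
  geom_point(stat = StatRecordRows)

# The stat is run when the plot is built, not when it is defined.
built <- ggplot_build(p)

rows.per.call
```

With two groups in a single panel, compute_group() is called twice, each time with the three rows of one group. The debug stats in ‘gginnards’ serve the same purpose without the need to define a new stat.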
The motivation for writing the “debug” stats and geoms included in package ‘gginnards’ is that at the moment it is in many cases not possible to set breakpoints inside the code of stats and geoms, because nameless panel and group functions are frequently stored within list-like "ggplot" objects, as seen above.
This can make it tedious to analyse how these functions work, as one may need to add print statements to their definitions to see the data. I wrote the “debug” stats and geoms as tools to help in the development of my packages ‘ggpmisc’ and ‘ggspectra’, and as a way of learning how data are passed around within the different components of a ggplot object when it is printed.
Data input to geometries
Data pass through a statistic before being received by a geometry. However, many geometries, like geom_point() and geom_line(), use by default stat_identity(), which simply relays the unmodified data to the geometry.
The debug geometries and statistics in package ‘gginnards’ by default do not add any graphical element to the plot; instead they make visible the data received as their input.
The geometry geom_debug_panel() uses stat_identity() by default. Here the same data as rendered by geom_point() is printed to the R console. We can see that the columns are named according to the aesthetics to which the variables in the user-supplied data have been mapped. In the case of colour, the levels of the factor have been replaced by colour definitions. Columns PANEL and group have also been added.
ggplot(mpg, aes(cyl, hwy, colour = factor(cyl))) +
  geom_point() +
  geom_debug_panel()
## [1] "PANEL 1; group(s) 1, 2, 3, 4; 'draw_panel()' input 'data' (head):"
## colour x y PANEL group
## 1 #F8766D 4 29 1 1
## 2 #F8766D 4 29 1 1
## 3 #F8766D 4 31 1 1
## 4 #F8766D 4 30 1 1
## 5 #00BFC4 6 26 1 3
## 6 #00BFC4 6 26 1 3
## [1] "PANEL 1; group(s) 1, 2, 3, 4; 'draw_panel()' input 'params' (summary):"
## Length Class Mode
## x 11 ViewScale environment
## x.sec 11 ViewScale environment
## x.range 2 -none- numeric
## y 11 ViewScale environment
## y.sec 11 ViewScale environment
## y.range 2 -none- numeric
## guides 4 Guides environment
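As a cross-check that does not depend on ‘gginnards’, the function layer_data() from ‘ggplot2’ returns the data of a layer after the plot has been built. This is a sketch rather than an equivalent of geom_debug_panel(): layer_data() returns the data after the geom's default aesthetics have been filled in, so additional columns are expected. The names p.mpg and built.data are used only for illustration.

```r
library(ggplot2)

p.mpg <- ggplot(mpg, aes(cyl, hwy, colour = factor(cyl))) +
  geom_point()

# Data for the first (and only) layer, as built for rendering.
built.data <- layer_data(p.mpg, 1)

# Columns are named after aesthetics, not after the user's variables.
colnames(built.data)
head(built.data[ , c("colour", "x", "y", "PANEL", "group")])
```

The variable names cyl and hwy do not appear among the columns; only the aesthetic names x, y and colour do, together with PANEL and group.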
Below we show how geom_debug_panel() can be used together with functions that take a data frame as input and return a value that can be printed. We use here head(), but other functions such as summary(), nrow() and colnames(), as well as user-defined functions, can be useful when data is large. As shown here, additional arguments can be passed by name to the function.
ggplot(my.data, aes(x, y, colour = group)) +
  geom_point() +
  geom_debug_panel(dbgfun.data = head, dbgfun.data.args = list(n = 3))
## [1] "PANEL 1; group(s) 1, 2; 'draw_panel()' input 'data' (anonymous function):"
## colour x y PANEL group
## 1 #F8766D 1 -27205.45 1 1
## 2 #00BFC4 2 -14242.65 1 2
## 3 #F8766D 3 45790.92 1 1
## [1] "PANEL 1; group(s) 1, 2; 'draw_panel()' input 'params' (summary):"
## Length Class Mode
## x 11 ViewScale environment
## x.sec 11 ViewScale environment
## x.range 2 -none- numeric
## y 11 ViewScale environment
## y.sec 11 ViewScale environment
## y.range 2 -none- numeric
## guides 4 Guides environment
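A user-defined function can be passed in the same way. The sketch below assumes only what is shown above, namely that dbgfun.data accepts a function taking the data frame as its first argument; the helper count_by_group() is hypothetical and defined here for illustration.

```r
library(gginnards)

# Hypothetical helper: number of rows per group in the data received.
count_by_group <- function(d) {
  table(group = d$group)
}

p.counts <- ggplot(mpg, aes(cyl, hwy, colour = factor(cyl))) +
  geom_point() +
  geom_debug_panel(dbgfun.data = count_by_group)

# The debug output is produced when the plot is printed.
p.counts
```

A compact summary like this one is easier to read than the first rows of the data when the aim is only to check how the observations were split into groups.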
When using a statistic that modifies the data, we can pass "debug_panel" as the argument for geom in the call to this statistic. In this way the data printed to the console will be those returned by the statistic and received by the geometry.
ggplot(mpg, aes(cyl, hwy)) +
  stat_summary(fun.data = "mean_se") +
  stat_summary(fun.data = "mean_se", geom = "debug_panel")
## [1] "PANEL 1; group(s) -1; 'draw_function()' input 'data' (head):"
## x group y ymin ymax PANEL flipped_aes orientation
## 1 4 -1 28.80247 28.30080 29.30414 1 FALSE NA
## 2 5 -1 28.75000 28.50000 29.00000 1 FALSE NA
## 3 6 -1 22.82278 22.40812 23.23745 1 FALSE NA
## 4 8 -1 17.62857 17.23865 18.01849 1 FALSE NA
As shown above, an important use of geom_debug_panel() is to display the data returned by a statistic and received as input by geometries. Not all extensions to ‘ggplot2’ document all the computed variables returned by statistics. In other cases, as in the next example, the values returned will depend on the arguments passed. While in the previous example the statistic returned a data frame with one row per value of x, here the returned data frame has 160 rows. The data are by default plotted as a line with a confidence band.
ggplot(my.data, aes(x, y, colour = group)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ poly(x, 2)) +
  stat_smooth(method = "lm", formula = y ~ poly(x, 2),
              geom = "debug_panel", dbgfun.data = head)
## [1] "PANEL 1; group(s) 1, 2; 'draw_function()' input 'data' (anonymous function):"
## colour x y ymin ymax se flipped_aes PANEL
## 1 #F8766D 1.000000 23456.375 -26310.65 73223.40 24738.29 FALSE 1
## 2 #F8766D 2.240506 18461.879 -28860.45 65784.21 23523.08 FALSE 1
## 3 #F8766D 3.481013 13882.703 -31095.30 58860.71 22357.76 FALSE 1
## 4 #F8766D 4.721519 9718.845 -33018.17 52455.86 21243.80 FALSE 1
## 5 #F8766D 5.962025 5970.307 -34632.24 46572.85 20182.79 FALSE 1
## 6 #F8766D 7.202532 2637.088 -35940.96 41215.13 19176.45 FALSE 1
## group orientation
## 1 1 NA
## 2 1 NA
## 3 1 NA
## 4 1 NA
## 5 1 NA
## 6 1 NA
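The count of 160 rows can be checked with layer_data() from ‘ggplot2’. The sketch below regenerates a reduced version of the artificial data so that it is self-contained; it relies on the documented default n = 80 of stat_smooth(), the number of points at which the fitted curve is evaluated for each group. The name smooth.data is used only for illustration.

```r
library(ggplot2)

# Same artificial data as in the Preliminaries section (needed columns only).
set.seed(4321)
x <- 1:100
y <- (x + x^2 + x^3) + rnorm(length(x), mean = 0, sd = mean(x^3) / 4)
my.data <- data.frame(x, y, group = c("A", "B"))

p.smooth <- ggplot(my.data, aes(x, y, colour = group)) +
  stat_smooth(method = "lm", formula = y ~ poly(x, 2))

# Data returned by the statistic, as built for the single layer.
smooth.data <- layer_data(p.smooth, 1)

nrow(smooth.data)          # 2 groups x 80 evaluation points
table(smooth.data$group)   # rows per group
```

Passing a different value for n to stat_smooth() changes the number of rows returned, which is one way in which the data received by the geom depends on the arguments passed to the stat.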
Data input to statistics
Statistics can be defined to operate on data corresponding to a whole panel or separately on data corresponding to each individual group, as created by mapping aesthetics to factors. The statistics described below by default print a summary of their data input to the console. In addition, these statistics return a data frame containing summary information, including labels suitable for “plotting” with geom = "text" or geom = "label". However, package ‘gginnards’ defines a “null” geom, geom_null(), which is used as default by the debug statistics. This geom is similar to the more recently added ggplot2::geom_blank().
Using the "null" geom allows us to add the debug stats for their side effect of console output, without altering the rendering of the plot, when there is at least one other plot layer. The default geom "null" neither alters the rendering of the plot nor prints to the console the data output by the debug stats.
Because of the way ‘ggplot2’ works, the values are listed to the console at the time when the ggplot object is printed. As shown here, no other geom or stat is required; however, in the remaining examples we add geom_point() to make the data also visible in the plot.
ggplot(my.data, aes(x, y)) +
  stat_debug_group()
## [1] "PANEL 1; group(s) -1; 'compute_group()' input 'data' (head):"
## x y PANEL group
## 1 1 -27205.450 1 -1
## 2 2 -14242.651 1 -1
## 3 3 45790.918 1 -1
## 4 4 53731.420 1 -1
## 5 5 -8028.578 1 -1
## 6 6 102863.943 1 -1
In the absence of facets or groups we get the printout of a single data frame, which is similar to that returned by geom_debug_panel(). Without grouping, group is set to -1 for all observations. If we override the default geom with geom_debug_panel(), as in the next example, a summary of the data returned by the stat is also printed to the console.
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_debug_group(geom = "debug_panel")
## [1] "PANEL 1; group(s) -1; 'compute_group()' input 'data' (head):"
## x y PANEL group
## 1 1 -27205.450 1 -1
## 2 2 -14242.651 1 -1
## 3 3 45790.918 1 -1
## 4 4 53731.420 1 -1
## 5 5 -8028.578 1 -1
## 6 6 102863.943 1 -1
## [1] "PANEL 1; group(s) -1; 'draw_function()' input 'data' (head):"
## x y PANEL group
## 1 1 -27205.450 1 -1
## 2 2 -14242.651 1 -1
## 3 3 45790.918 1 -1
## 4 4 53731.420 1 -1
## 5 5 -8028.578 1 -1
## 6 6 102863.943 1 -1
In a plot with no grouping, there is no difference in the data input for compute_panel() and compute_group() functions except for the order of the variables or columns in the data frame (this applies in general to ggplot statistics).
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_debug_panel()
## [1] "PANEL 1; group(s) -1; 'compute_panel()' input 'data' (head):"
## x y PANEL group
## 1 1 -27205.450 1 -1
## 2 2 -14242.651 1 -1
## 3 3 45790.918 1 -1
## 4 4 53731.420 1 -1
## 5 5 -8028.578 1 -1
## 6 6 102863.943 1 -1
By mapping the colour aesthetic we create a grouping. In this case, compute_group() is called with the data subset by group, and a separate data frame is displayed for each call to compute_group(), each corresponding to a level in the mapped factor. In this case group takes as values positive consecutive integers. As a factor was mapped to colour, colour is encoded as a factor.
ggplot(my.data, aes(x, y, colour = group)) +
  geom_point() +
  stat_debug_group()
## [1] "PANEL 1; group(s) 1; 'compute_group()' input 'data' (head):"
## x y colour PANEL group
## 1 1 -27205.450 A 1 1
## 3 3 45790.918 A 1 1
## 5 5 -8028.578 A 1 1
## 7 7 -18547.282 A 1 1
## 9 9 79924.325 A 1 1
## 11 11 -2823.736 A 1 1
## [1] "PANEL 1; group(s) 2; 'compute_group()' input 'data' (head):"
## x y colour PANEL group
## 2 2 -14242.65 B 1 2
## 4 4 53731.42 B 1 2
## 6 6 102863.94 B 1 2
## 8 8 13080.52 B 1 2
## 10 10 -44711.50 B 1 2
## 12 12 23839.55 B 1 2
Without facets, we still have only one panel.
ggplot(my.data, aes(x, y, colour = group)) +
  geom_point() +
  stat_debug_panel()
## [1] "PANEL 1; group(s) 1, 2; 'compute_panel()' input 'data' (head):"
## x y colour PANEL group
## 1 1 -27205.450 A 1 1
## 2 2 -14242.651 B 1 2
## 3 3 45790.918 A 1 1
## 4 4 53731.420 B 1 2
## 5 5 -8028.578 A 1 1
## 6 6 102863.943 B 1 2
When we map the same factor to a different aesthetic the data remain similar, except for the column named after the aesthetic, in this case shape.
ggplot(my.data, aes(x, y, shape = group)) +
geom_point() +
stat_debug_group()
## [1] "PANEL 1; group(s) 1; 'compute_group()' input 'data' (head):"
## x y shape PANEL group
## 1 1 -27205.450 A 1 1
## 3 3 45790.918 A 1 1
## 5 5 -8028.578 A 1 1
## 7 7 -18547.282 A 1 1
## 9 9 79924.325 A 1 1
## 11 11 -2823.736 A 1 1
## [1] "PANEL 1; group(s) 2; 'compute_group()' input 'data' (head):"
## x y shape PANEL group
## 2 2 -14242.65 B 1 2
## 4 4 53731.42 B 1 2
## 6 6 102863.94 B 1 2
## 8 8 13080.52 B 1 2
## 10 10 -44711.50 B 1 2
## 12 12 23839.55 B 1 2
Facets based on factors create panels within a plot. Here we create a plot with both facets and grouping. In this case, the compute_panel() function is called once for each panel, with a subset of the data that corresponds to that panel but is not split by groups. For our example, it is called twice.
ggplot(my.data, aes(x, y, colour = group)) +
geom_point() +
stat_debug_panel(dbgfun.data = "nrow") +
facet_wrap(~block)
## [1] "PANEL 1; group(s) 1, 2; 'compute_panel()' input 'data' (nrow):"
## [1] 50
## [1] "PANEL 2; group(s) 1, 2; 'compute_panel()' input 'data' (nrow):"
## [1] 50
With grouping and facets, within each panel the compute_group() function is called once for each group, in total four times.
ggplot(my.data, aes(x, y, colour = group)) +
geom_point() +
stat_debug_group(dbgfun.data = "nrow") +
facet_wrap(~block)
## [1] "PANEL 1; group(s) 1; 'compute_group()' input 'data' (nrow):"
## [1] 25
## [1] "PANEL 1; group(s) 2; 'compute_group()' input 'data' (nrow):"
## [1] 25
## [1] "PANEL 2; group(s) 1; 'compute_group()' input 'data' (nrow):"
## [1] 25
## [1] "PANEL 2; group(s) 2; 'compute_group()' input 'data' (nrow):"
## [1] 25
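The row counts reported above can be checked without building a plot. The sketch below tabulates a data frame with the same group and block columns as my.data. It assumes only that panels follow the levels of block and groups follow the levels of group; the names rows.per.panel and rows.per.group are introduced here for illustration.

```r
# same layout of grouping and faceting variables as in my.data
my.data <- data.frame(x = 1:100,
                      group = c("A", "B"),
                      block = c("a", "a", "b", "b"))

# rows seen by each call to compute_panel(): one call per panel
rows.per.panel <- table(my.data$block)
rows.per.panel

# rows seen by each call to compute_group(): one call per
# combination of panel and group
rows.per.group <- table(my.data$block, my.data$group)
rows.per.group
```

The number of cells in each table matches the number of calls: two for compute_panel() and four for compute_group().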
Controlling the debug output
In the examples above we have demonstrated the use of the statistics and geometries with their default arguments. Here we show examples that generate other types of debug output.
stat_debug_group() and stat_debug_panel() return summary data that can be inspected using a geometry, in addition to printing the data they receive as argument. If we use geom_debug_panel(), a summary is printed to the console. With two groups, we get two summaries when we use stat_debug_group().
ggplot(my.data, aes(x, y, shape = group)) +
geom_point() +
stat_debug_group(geom = "debug_panel")
## [1] "PANEL 1; group(s) 1; 'compute_group()' input 'data' (head):"
## x y shape PANEL group
## 1 1 -27205.450 A 1 1
## 3 3 45790.918 A 1 1
## 5 5 -8028.578 A 1 1
## 7 7 -18547.282 A 1 1
## 9 9 79924.325 A 1 1
## 11 11 -2823.736 A 1 1
## [1] "PANEL 1; group(s) 2; 'compute_group()' input 'data' (head):"
## x y shape PANEL group
## 2 2 -14242.65 B 1 2
## 4 4 53731.42 B 1 2
## 6 6 102863.94 B 1 2
## 8 8 13080.52 B 1 2
## 10 10 -44711.50 B 1 2
## 12 12 23839.55 B 1 2
## [1] "PANEL 1; group(s) 1, 2; 'draw_function()' input 'data' (head):"
## shape x y PANEL group
## 1 16 1 -27205.450 1 1
## 2 16 3 45790.918 1 1
## 3 16 5 -8028.578 1 1
## 4 16 7 -18547.282 1 1
## 5 16 9 79924.325 1 1
## 6 16 11 -2823.736 1 1
If we use stat_debug_panel() we get a single summary.
ggplot(my.data, aes(x, y, shape = group)) +
geom_point() +
stat_debug_panel(geom = "debug_panel")
## [1] "PANEL 1; group(s) 1, 2; 'compute_panel()' input 'data' (head):"
## x y shape PANEL group
## 1 1 -27205.450 A 1 1
## 2 2 -14242.651 B 1 2
## 3 3 45790.918 A 1 1
## 4 4 53731.420 B 1 2
## 5 5 -8028.578 A 1 1
## 6 6 102863.943 B 1 2
## [1] "PANEL 1; group(s) 1, 2; 'draw_function()' input 'data' (head):"
## shape x y PANEL group
## 1 16 1 -27205.450 1 1
## 2 17 2 -14242.651 1 2
## 3 16 3 45790.918 1 1
## 4 17 4 53731.420 1 2
## 5 16 5 -8028.578 1 1
## 6 17 6 102863.943 1 2
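In the output above, the stat receives shape as the factor levels A and B, while the geom receives the shape codes 16 and 17, because the scale is applied between the two steps. The sketch below imitates that mapping in base R. It takes the two codes from the printed output as given, and the name shape.palette is hypothetical.

```r
# data as seen by the stat: factor levels, not yet mapped
group <- rep(c("A", "B"), times = 3)

# assumption: first two shape codes, as seen in the draw function output
shape.palette <- c(16, 17)

# map each factor level to a shape code by position
shape <- shape.palette[as.integer(factor(group))]

data.frame(group, shape)
```

Comparing the stat input with the geom input in this way helps locate whether a problem arises before or after the scales are applied.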
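In principle one can use other geoms to annotate the plot with the debug summary. In this case we suppress the printing of the data to the R console by passing as dbgfun.data a function that returns NULL (the header lines are still printed), and use the stat as any other ggplot stat.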
ggplot(my.data, aes(x, y, colour = group)) +
geom_point() +
stat_debug_group(geom = "text",
mapping = aes(label = sprintf("group = %i",
after_stat(group))),
dbgfun.data = function(x) {NULL})
## [1] "PANEL 1; group(s) 1; 'compute_group()' input 'data' (anonymous function):"
## NULL
## [1] "PANEL 1; group(s) 2; 'compute_group()' input 'data' (anonymous function):"
## NULL
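As the examples above show, dbgfun.data accepts either the name of a function, such as "nrow", or a function definition. A user-defined summary can therefore be written and tested on its own before being used in a plot. The sketch below assumes that the function is called with the data frame as its only argument, as the examples suggest; summarize_xy and group.A are hypothetical names.

```r
# hypothetical helper: compact summary of the data frame received
summarize_xy <- function(d) {
  c(nrow = nrow(d), x.min = min(d$x), x.max = max(d$x))
}

# stand-alone check, with x values like those of group "A" above
group.A <- data.frame(x = seq(1, 99, by = 2), y = 0)
summarize_xy(group.A)
```

The function could then be passed to the stat as stat_debug_group(dbgfun.data = summarize_xy), in the same way as the anonymous function in the last example.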