stat_normalmix_eq() fits a Normal mixture model, by default with
normalmixEM(). Predicted values are
computed and, by default, plotted.
Usage
stat_normalmix_eq(
mapping = NULL,
data = NULL,
geom = "text_npc",
position = "identity",
...,
method = "normalmixEM",
method.args = list(),
n.min = 10L * k,
level = 0.95,
k = 2,
free.mean = TRUE,
free.sd = TRUE,
se = FALSE,
seed = NA,
fm.values = TRUE,
components = NULL,
eq.with.lhs = TRUE,
eq.digits = 2,
label.x = "left",
label.y = "top",
hstep = 0,
vstep = NULL,
output.type = NULL,
na.rm = FALSE,
orientation = "x",
parse = NULL,
show.legend = NA,
inherit.aes = TRUE
)
Arguments
- mapping
The aesthetic mapping, usually constructed with aes. Only needs to be set at the layer level if you are overriding the plot defaults.
- data
A layer specific dataset, only needed if you want to override the plot defaults.
- geom
The geometric object to use to display the data.
- position
The position adjustment to use for overlapping points on this layer.
- ...
other arguments passed on to layer. This can include aesthetics whose values you want to set, not map. See layer for more details.
- method
function or character If character, "normalmixEM" or the name of a model fit function are accepted, possibly followed by the fit function's method argument separated by a colon. The function must return a model fit object of class mixEM.
- method.args
named list with additional arguments.
- n.min
integer Minimum number of distinct values in the mapped variable for fitting to be attempted.
- level
Level of confidence interval to use (0.95 by default).
- k
integer Number of mixture components to fit.
- free.mean, free.sd
logical If TRUE, allow the fitted mean and/or fitted sd to vary among the component Normal distributions.
- se
logical, if TRUE standard errors for parameter estimates are obtained by bootstrapping.
- seed
RNG seed argument passed to set.seed(). Defaults to NA, which means that set.seed() will not be called.
- fm.values
logical Add parameter estimates and their standard errors to the returned values (`FALSE` by default.)
- components
character One of "all", "sum", or "members"; selects which densities are returned.
- eq.with.lhs
If character the string is pasted to the front of the equation label before parsing or a logical (see note).
- eq.digits
integer Number of digits after the decimal point to use for parameters in labels. If Inf, use exponential notation with three decimal places.
- label.x, label.y
numeric with range 0..1 "normalized parent coordinates" (npc units) or character if using geom_text_npc() or geom_label_npc(). If using geom_text() or geom_label() numeric in native data units. If too short they will be recycled.
- hstep, vstep
numeric in npc units, the horizontal and vertical step used between labels for different mixture model components.
- output.type
character One of "expression", "LaTeX", "text", "markdown" or "numeric".
- na.rm
a logical indicating whether NA values should be stripped before the computation proceeds.
- orientation
character Either "x" or "y", the mapping of the values to which the mixture model is to be fitted. NOT YET IMPLEMENTED!
- parse
logical Passed to the geom. If TRUE, the labels will be parsed into expressions and displayed as described in ?plotmath. Default is TRUE if output.type = "expression" and FALSE otherwise.
- show.legend
logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes.
- inherit.aes
If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders.
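The n.min argument can be pictured as a count of distinct values compared against a threshold. The following is a minimal sketch in base R, not the package's own code: the helper name enough_distinct() is hypothetical, and only the default of 10L * k is taken from the usage shown above.

```r
# Hypothetical helper (not part of the package): returns TRUE when x has
# at least n.min distinct non-missing values, so that a fit is worth attempting.
enough_distinct <- function(x, k = 2, n.min = 10L * k) {
  length(unique(x[!is.na(x)])) >= n.min
}

enough_distinct(faithful$waiting)        # many distinct waiting times
enough_distinct(c(50, 50, 80, 80, 80))   # only two distinct values
```

Because the default for n.min is an expression in k, asking for more components raises the threshold automatically.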
Value
The value returned by the statistic is a data frame, with n
rows of predicted density for each component of the mixture plus their
sum and the corresponding vector of x values. Optionally it will
also include additional values related to the model fit.
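To make the relation between the per-component densities and their sum concrete, the sketch below evaluates a two-component Normal mixture in base R. The parameter values are the estimates for the faithful data shown in the debug output in the Examples; the function name mix_density() and the 512-point grid are assumptions made only for this illustration.

```r
# Estimates copied from the debug output in the Examples (faithful$waiting)
lambda <- c(0.3608872, 0.6391128)  # mixing proportions
mu     <- c(54.6149, 80.0911)      # component means
sigma  <- c(5.871252, 5.867711)    # component standard deviations

# Hypothetical helper: one column per weighted component plus their sum
mix_density <- function(x, lambda, mu, sigma) {
  members <- sapply(seq_along(lambda),
                    function(i) lambda[i] * dnorm(x, mean = mu[i], sd = sigma[i]))
  members <- matrix(members, nrow = length(x))
  colnames(members) <- paste0("comp.", seq_along(lambda))
  cbind(members, comp.sum = rowSums(members))
}

x <- seq(35, 105, length.out = 512)
dens <- mix_density(x, lambda, mu, sigma)
head(dens, 3)
```

Each component is weighted by its lambda, so the columns for the members add up to the comp.sum column, and the area under comp.sum is close to one.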
Details
This statistic is similar to stat_density but
instead of fitting a single distribution it can fit a mixture of two or
more Normal distributions, using an approach related to clustering.
Defaults are consistent between stat_normalmix_line() and
stat_normalmix_eq(). Parameter seed if not NA is used
in a call to set.seed() immediately before calling the model fit
function. As the fitting procedure makes use of the (pseudo-)random number
generator (RNG), convergence can depend on it, and in such cases setting
seed to the same value in stat_normalmix_line() and in
stat_normalmix_eq() can ensure consistency, and more
generally, reproducibility.
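The role of seed can be illustrated without the package. The sketch below uses stats::kmeans() as a stand-in for a fitting procedure that starts from random values; it is not the method used by stat_normalmix_eq(), and the wrapper fit_with_seed() is hypothetical. It only mirrors the documented behaviour that set.seed() is called immediately before the fit when seed is not NA.

```r
# Hypothetical wrapper: seed the RNG right before a fit that uses random starts
fit_with_seed <- function(x, seed = NA, k = 2) {
  if (!is.na(seed)) set.seed(seed)  # skipped when seed is NA
  kmeans(x, centers = k)            # stand-in for the mixture model fit
}

fit.a <- fit_with_seed(faithful$waiting, seed = 123)
fit.b <- fit_with_seed(faithful$waiting, seed = 123)

# the same seed gives the same starting values and hence the same fit
identical(fit.a$centers, fit.b$centers)
```

Passing the same seed to two layers plays the same role: both fits start from the same RNG state.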
A mixture model, as described above, is fitted for k >= 2, while
k == 1 is treated as a special case and a single Normal distribution is
fitted with function fitdistr(). In this case the SE values
are exact estimates.
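For the k == 1 case, the kind of fit involved can be sketched directly, assuming that the fitdistr() referred to above is MASS::fitdistr(). For a Normal distribution the maximum-likelihood estimates and their standard errors have closed forms, which the second half of the sketch computes by hand for comparison.

```r
library(MASS)

# same subset as used in the "no mixture" example
x <- subset(faithful, waiting > 66)$waiting

fit <- fitdistr(x, densfun = "normal")
fit$estimate  # named vector with elements "mean" and "sd"
fit$sd        # standard errors of the two estimates

# closed-form equivalents for the Normal distribution
n <- length(x)
sd.ml <- sqrt((n - 1) / n) * sd(x)                   # ML estimate of sigma
c(mean = mean(x), sd = sd.ml)                        # estimates
c(mean = sd.ml / sqrt(n), sd = sd.ml / sqrt(2 * n))  # standard errors
```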
Computed variables
stat_normalmix_eq() provides the following variables, some of which
depend on the orientation:
- y
the location of text labels
- eq.label
character string for equations
- n.label
character string for number of observations
- method.label
character string for model fit method
- lambda
numeric the estimate of the contribution of the component of the mixture towards the joint density
- mu
numeric the estimate of the mean
- sigma
numeric the estimate of the standard deviation
- component
A factor indexing the components of the mixture and/or their sum
If se = TRUE is passed then columns with standard errors for the
parameter estimates are added:
- mu.se
numeric the standard error of the estimate of the mean
- sigma.se
numeric the standard error of the estimate of the standard deviation
If fm.values = TRUE is passed then columns with diagnostics and
parameter estimates are added, with the same value in each row within a
group:
- n
numeric the number of x values
- .size
numeric the number of density values
- fm.class
character the most derived class of the fitted model object
- fm.method
character the method, as given by the ft field of the fitted model objects
This is wasteful and disabled by default, but provides a simple and robust approach to achieve effects like colouring or hiding of the model fit line by group depending on the outcome of model fitting.
Aesthetics
stat_normalmix_eq() expects observations mapped to
x from a numeric variable. By default, component is mapped to the
group aesthetic, which adds a new grouping, and
eq.label is mapped to the label aesthetic. Additional aesthetics as
understood by the geom ("text_npc" by default) can be set.
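The eq.label strings are plotmath expressions stored as text. As a rough illustration, the sketch below rebuilds the label shown for one component in the debug output in the Examples and checks that it parses. The helper component_label() is hypothetical, and rounding to two significant digits is an assumption chosen to reproduce that output, not a description of the package's formatting code.

```r
# Hypothetical helper: assemble a plotmath string for one mixture component
component_label <- function(lambda, mu, sigma, digits = 2) {
  fmt <- function(z) format(signif(z, digits))
  paste0("DF~`=`~", fmt(lambda),
         " %*% italic(N)(mu*`=`*", fmt(mu),
         ", sigma*`=`*", fmt(sigma), ")")
}

# estimates for one component, copied from the debug output in the Examples
lab <- component_label(0.3608872, 54.6149, 5.871252)
lab

# with parse = TRUE the geom turns such strings into expressions
parsed <- parse(text = lab)
```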
See also
Other ggplot statistics for mixture model fits:
stat_normalmix_line()
Examples
ggplot(faithful, aes(x = waiting)) +
stat_normalmix_line(components = "sum") +
stat_normalmix_eq()
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> number of iterations= 31
#> number of iterations= 35
ggplot(faithful, aes(x = waiting)) +
stat_normalmix_line(components = "sum") +
stat_normalmix_eq(use_label("eq", "n", "method"))
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> number of iterations= 15
#> number of iterations= 22
ggplot(faithful, aes(x = waiting)) +
stat_normalmix_line(components = "sum") +
stat_normalmix_eq(geom = "label_npc")
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> number of iterations= 18
#> number of iterations= 44
ggplot(faithful, aes(x = waiting)) +
stat_normalmix_line(components = "sum") +
stat_normalmix_eq(geom = "text", label.x = "center", label.y = "bottom")
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> number of iterations= 20
#> number of iterations= 27
ggplot(faithful, aes(x = waiting)) +
stat_normalmix_line(components = "sum") +
stat_normalmix_eq(geom = "text", hjust = "inward")
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> number of iterations= 39
#> number of iterations= 29
ggplot(faithful, aes(x = waiting)) +
stat_normalmix_line(components = "members") +
stat_normalmix_eq(components = "members")
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> number of iterations= 24
#> number of iterations= 41
ggplot(faithful, aes(x = waiting)) +
stat_normalmix_line(components = "members") +
stat_normalmix_eq(components = "members", se = TRUE)
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> number of iterations= 56
#> number of iterations= 27
#> number of iterations= 11
#> number of iterations= 14
#> number of iterations= 25
#> number of iterations= 41
#> number of iterations= 14
#> number of iterations= 13
#> number of iterations= 10
#> number of iterations= 20
#> number of iterations= 20
#> number of iterations= 15
#> number of iterations= 11
#> number of iterations= 14
#> number of iterations= 18
#> number of iterations= 14
#> number of iterations= 15
#> number of iterations= 10
#> number of iterations= 11
#> number of iterations= 9
#> number of iterations= 23
#> number of iterations= 11
#> number of iterations= 17
#> number of iterations= 11
#> number of iterations= 14
#> number of iterations= 21
#> number of iterations= 10
#> number of iterations= 19
#> number of iterations= 13
#> number of iterations= 19
#> number of iterations= 14
#> number of iterations= 18
#> number of iterations= 15
#> number of iterations= 17
#> number of iterations= 25
#> number of iterations= 14
#> number of iterations= 11
#> number of iterations= 14
#> number of iterations= 14
#> number of iterations= 14
#> number of iterations= 30
#> number of iterations= 11
#> number of iterations= 15
#> number of iterations= 14
#> number of iterations= 26
#> number of iterations= 17
#> number of iterations= 16
#> number of iterations= 10
#> number of iterations= 15
#> number of iterations= 11
#> number of iterations= 22
#> number of iterations= 14
#> number of iterations= 18
#> number of iterations= 12
#> number of iterations= 12
#> number of iterations= 11
#> number of iterations= 15
#> number of iterations= 18
#> number of iterations= 13
#> number of iterations= 18
#> number of iterations= 16
#> number of iterations= 18
#> number of iterations= 16
#> number of iterations= 20
#> number of iterations= 13
#> number of iterations= 16
#> number of iterations= 6
#> number of iterations= 17
#> number of iterations= 21
#> number of iterations= 9
#> number of iterations= 14
#> number of iterations= 16
#> number of iterations= 21
#> number of iterations= 14
#> number of iterations= 17
#> number of iterations= 14
#> number of iterations= 14
#> number of iterations= 12
#> number of iterations= 20
#> number of iterations= 16
#> number of iterations= 13
#> number of iterations= 19
#> number of iterations= 30
#> number of iterations= 12
#> number of iterations= 9
#> number of iterations= 13
#> number of iterations= 14
#> number of iterations= 13
#> number of iterations= 20
#> number of iterations= 13
#> number of iterations= 11
#> number of iterations= 16
#> number of iterations= 11
#> number of iterations= 10
#> number of iterations= 12
#> number of iterations= 13
#> number of iterations= 19
#> number of iterations= 23
#> number of iterations= 8
#> number of iterations= 21
#> number of iterations= 19
#> number of iterations= 17
# ggplot(faithful, aes(y = waiting)) +
# stat_normalmix_eq(orientation = "y")
ggplot(faithful, aes(x = waiting)) +
  geom_histogram(aes(y = after_stat(density)), bins = 20) +
  stat_normalmix_line(aes(colour = after_stat(component),
                          fill = after_stat(component)),
                      geom = "area", linewidth = 1, alpha = 0.25) +
  stat_normalmix_eq(aes(colour = after_stat(component)))
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> number of iterations= 37
#> number of iterations= 32
ggplot(faithful, aes(x = waiting)) +
  stat_normalmix_line(aes(colour = after_stat(component),
                          fill = after_stat(component)),
                      geom = "area", linewidth = 1, alpha = 0.25,
                      components = "members") +
  stat_normalmix_eq(aes(colour = after_stat(component)),
                    components = "members")
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> number of iterations= 23
#> number of iterations= 37
ggplot(faithful, aes(x = waiting)) +
  stat_normalmix_line(geom = "area", linewidth = 1, alpha = 0.25,
                      colour = "black", outline.type = "upper",
                      components = "sum", se = FALSE) +
  stat_normalmix_eq(components = "sum")
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> number of iterations= 32
#> number of iterations= 30
# special case of no mixture
ggplot(subset(faithful, waiting > 66), aes(x = waiting)) +
stat_normalmix_line(k = 1) +
stat_normalmix_eq(k = 1)
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> With k = 1 one Normal distribution is fitted. Irrelevant parameters ignored!
#> With k = 1 one Normal distribution is fitted. Irrelevant parameters ignored!
ggplot(subset(faithful, waiting > 66), aes(x = waiting)) +
stat_normalmix_line(k = 1) +
stat_normalmix_eq(k = 1, se = TRUE)
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> With k = 1 one Normal distribution is fitted. Irrelevant parameters ignored!
#> With k = 1 one Normal distribution is fitted. Irrelevant parameters ignored!
# Inspecting the returned data using geom_debug()
gginnards.installed <- requireNamespace("gginnards", quietly = TRUE)
if (gginnards.installed)
library(gginnards)
if (gginnards.installed)
ggplot(faithful, aes(x = waiting)) +
stat_normalmix_line(geom = "debug", components = "all")
#> number of iterations= 36
#> [1] "PANEL 1; group(s) comp.1 , comp.2 , comp.sum; 'draw_function()' input 'data' (head):"
#> x component density flipped_aes PANEL group y
#> 1 35.29541 comp.sum 1.092383e-04 FALSE 1 comp.sum 1.092383e-04
#> 2 35.29541 comp.1 1.092383e-04 FALSE 1 comp.1 1.092383e-04
#> 3 35.29541 comp.2 9.599836e-15 FALSE 1 comp.2 9.599836e-15
#> 4 35.61754 comp.sum 1.306554e-04 FALSE 1 comp.sum 1.306554e-04
#> 5 35.61754 comp.1 1.306554e-04 FALSE 1 comp.1 1.306554e-04
#> 6 35.61754 comp.2 1.457559e-14 FALSE 1 comp.2 1.457559e-14
#> orientation
#> 1 x
#> 2 x
#> 3 x
#> 4 x
#> 5 x
#> 6 x
stat_normalmix_eq(geom = "debug", components = "all")
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> geom_debug: na.rm = FALSE
#> stat_normalmix_eq: method = normalmixEM, method.name = normalmixEM, se = FALSE, seed = NA, level = 0.95, na.rm = FALSE, orientation = x, method.args = list(), k = 2, free.mean = TRUE, free.sd = TRUE, components = all, n.min = 20, eq.with.lhs = TRUE, eq.digits = 2, label.x = left, label.y = top, hstep = 0, vstep = 0.05, npc.used = FALSE, output.type = expression, parse = TRUE
#> position_identity
if (gginnards.installed)
ggplot(faithful, aes(x = waiting)) +
stat_normalmix_eq(geom = "debug", components = "sum")
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> number of iterations= 27
#> [1] "PANEL 1; group(s) comp.sum; 'draw_function()' input 'data' (head):"
#> lambda mu sigma k converged n fm.class fm.method component
#> 1 1 NA NA NA TRUE 272 mixEM normalmixEM comp.sum
#> eq.label
#> 1 DF~`=`~0.36 %*% italic(N)(mu*`=`*55, sigma*`=`*5.9) + 0.64 %*% italic(N)(mu*`=`*80, sigma*`=`*5.9)
#> n.label method.label npcx npcy PANEL group
#> 1 n~`=`~272 "method: normalmixEM" NA NA 1 comp.sum
#> label
#> 1 DF~`=`~0.36 %*% italic(N)(mu*`=`*55, sigma*`=`*5.9) + 0.64 %*% italic(N)(mu*`=`*80, sigma*`=`*5.9)
#> x y orientation
#> 1 0.05 0.95 x
if (gginnards.installed)
ggplot(faithful, aes(x = waiting)) +
stat_normalmix_eq(geom = "debug", components = "members")
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> number of iterations= 33
#> [1] "PANEL 1; group(s) comp.1, comp.2; 'draw_function()' input 'data' (head):"
#> lambda mu sigma k converged n fm.class fm.method component
#> 1 0.6391128 80.0911 5.867711 2 TRUE 272 mixEM normalmixEM comp.1
#> 2 0.3608872 54.6149 5.871252 2 TRUE 272 mixEM normalmixEM comp.2
#> eq.label n.label
#> 1 DF~`=`~0.64 %*% italic(N)(mu*`=`*80, sigma*`=`*5.9) n~`=`~272
#> 2 DF~`=`~0.36 %*% italic(N)(mu*`=`*55, sigma*`=`*5.9) n~`=`~272
#> method.label npcx x npcy PANEL group
#> 1 "method: normalmixEM" NA 0.05 NA 1 comp.1
#> 2 "method: normalmixEM" NA 0.05 NA 1 comp.2
#> label y orientation
#> 1 DF~`=`~0.64 %*% italic(N)(mu*`=`*80, sigma*`=`*5.9) 0.95 x
#> 2 DF~`=`~0.36 %*% italic(N)(mu*`=`*55, sigma*`=`*5.9) 0.9 x
if (gginnards.installed)
ggplot(faithful, aes(x = waiting)) +
stat_normalmix_eq(geom = "debug",
components = "members",
fm.values = TRUE)
#> Warning: Duplicated aesthetics after name standardisation: na.rm and orientation
#> number of iterations= 27
#> [1] "PANEL 1; group(s) comp.1, comp.2; 'draw_function()' input 'data' (head):"
#> lambda mu sigma k converged n fm.class fm.method component
#> 1 0.3608872 54.6149 5.871252 2 TRUE 272 mixEM normalmixEM comp.1
#> 2 0.6391128 80.0911 5.867711 2 TRUE 272 mixEM normalmixEM comp.2
#> eq.label n.label
#> 1 DF~`=`~0.36 %*% italic(N)(mu*`=`*55, sigma*`=`*5.9) n~`=`~272
#> 2 DF~`=`~0.64 %*% italic(N)(mu*`=`*80, sigma*`=`*5.9) n~`=`~272
#> method.label npcx x npcy PANEL group
#> 1 "method: normalmixEM" NA 0.05 NA 1 comp.1
#> 2 "method: normalmixEM" NA 0.05 NA 1 comp.2
#> label y orientation
#> 1 DF~`=`~0.36 %*% italic(N)(mu*`=`*55, sigma*`=`*5.9) 0.95 x
#> 2 DF~`=`~0.64 %*% italic(N)(mu*`=`*80, sigma*`=`*5.9) 0.9 x
