
Predicted values and a confidence band are computed and, by default, plotted. stat_ma_line() behaves like stat_smooth() except that the model is fitted with lmodel2::lmodel2(), with "MA" as the default method.

Usage

stat_ma_line(
  mapping = NULL,
  data = NULL,
  geom = "smooth",
  position = "identity",
  ...,
  orientation = NA,
  method = "lmodel2:MA",
  method.args = list(),
  n.min = 2L,
  formula = NULL,
  range.y = NULL,
  range.x = NULL,
  se = TRUE,
  fit.seed = NA,
  fm.values = FALSE,
  n = 80,
  nperm = 99,
  fullrange = FALSE,
  level = 0.95,
  na.rm = FALSE,
  show.legend = NA,
  inherit.aes = TRUE
)

Arguments

mapping

The aesthetic mapping, usually constructed with aes. Only needs to be set at the layer level if you are overriding the plot defaults.

data

A layer specific dataset, only needed if you want to override the plot defaults.

geom

The geometric object to use to display the data.

position

The position adjustment to use for overlapping points on this layer.

...

other arguments passed on to layer. This can include aesthetics whose values you want to set, not map. See layer for more details.

orientation

character Either "x" or "y" controlling the default for formula. The letter indicates the aesthetic considered the explanatory variable in the model fit.

method

function or character If character, "MA", "SMA", "RMA" or "OLS"; alternatively "lmodel2" or the name of a model fit function is accepted, possibly followed by the fit function's method argument separated by a colon (e.g. "lmodel2:MA"). If a function other than lmodel2() is passed, it must accept arguments named formula, data, range.y, range.x and nperm and return a model fit object of class lmodel2.

method.args

named list with additional arguments. Not data or weights which are always passed through aesthetic mappings.

n.min

integer Minimum number of distinct values in the explanatory variable (on the rhs of formula) for fitting to be attempted.

formula

a formula object. Using aesthetic names x and y instead of original variable names.

range.y, range.x

character Pass "relative" or "interval" if method "RMA" is to be computed.

se

logical Return confidence interval around smooth? (`TRUE` by default, see `level` to control.)

fit.seed

RNG seed argument passed to set.seed(). Defaults to NA, indicating that set.seed() should not be called.

fm.values

logical Add metadata and parameter estimates extracted from the fitted model object; FALSE by default.

n

Number of points at which to predict with the fitted model.

nperm

integer Number of permutations used to estimate significance.

fullrange

Should the fit span the full range of the plot, or just the range of the data group used in each fit?

level

Level of confidence interval to use (only 0.95 currently).

na.rm

a logical indicating whether NA values should be stripped before the computation proceeds.

show.legend

logical. Should this layer be included in the legends? NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes.

inherit.aes

If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders.

Value

The value returned by the statistic is a data frame, that will have n rows of predicted values and their confidence limits. Optionally it will also include additional values related to the model fit.

Details

This statistic fits major axis ("MA") and other model II regressions with function lmodel2. Model II regression is called for when both x and y are subject to random variation and the intention is not to predict y from x by means of the model but rather to study the relationship between two independent variables. Allometric relationships among body parts are a frequent case in biology.

As the fitted line is the same whether x or y is on the rhs of the model equation, orientation even if accepted does not have an effect on the fitted line. In contrast, geom_smooth treats each axis differently and can thus have two orientations. The orientation is easy to deduce from the argument passed to formula. Thus, stat_ma_line() will by default guess which orientation the layer should have. If no argument is passed to formula, the orientation can be specified directly passing an argument to the orientation parameter, which can be either "x" or "y". The value gives the axis that is on the rhs of the model equation, "x" being the default orientation. Package 'ggpmisc' does not define new geometries matching the new statistics as they are not needed and conceptually transformations of data are expressed as statistics.
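The symmetry of the MA fit can be checked directly with lmodel2::lmodel2(), outside of any plot. The sketch below is only an illustration and not part of the package examples: the data frame d and the helper ma_slope() are made up for this check. Swapping the two variables in the formula should give a slope that is the reciprocal of the original one, which describes the same line in the x-y plane.

```r
library(lmodel2)

# artificial data, made up for this illustration
set.seed(4321)
d <- data.frame(x = rnorm(50))
d$y <- d$x + rnorm(50, sd = 0.5)

# hypothetical helper: extract the major axis ("MA") slope from the fit
ma_slope <- function(formula, data) {
  fm <- lmodel2(formula, data = data, nperm = 0)
  res <- fm$regression.results
  res$Slope[res$Method == "MA"]
}

b.yx <- ma_slope(y ~ x, d)  # x on the rhs
b.xy <- ma_slope(x ~ y, d)  # y on the rhs

# the product of reciprocal slopes is 1, i.e., both fits give the same line
all.equal(b.yx * b.xy, 1)
```

The same check with the "OLS" row of regression.results would not be expected to hold, which is why orientation matters for stat_smooth() but not for the MA line.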

The minimum number of observations with distinct values can be set through parameter n.min. The default n.min = 2L is the smallest possible value. However, model fits with very few observations are of little interest and using a larger number for n.min than the default is wise. As model fitting functions could depend on the RNG, fit.seed if different to NA is used as argument in a call to set.seed() immediately ahead of model fitting.
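As a minimal sketch of these two parameters, the plot below uses artificial data in which one group is much smaller than the other. The data frame d and the chosen values n.min = 10L and fit.seed = 4321 are arbitrary assumptions for illustration. With these settings a line should be drawn only for the larger group, and the permutation-based results should repeat between renderings.

```r
library(ggpmisc)

# artificial data: group "A" has 50 observations, group "B" only 4
set.seed(1234)
d <- data.frame(x = rnorm(54),
                group = rep(c("A", "B"), times = c(50, 4)))
d$y <- d$x + rnorm(54, sd = 0.5)

# fitting is attempted only for groups with at least 10 distinct x values;
# fit.seed is passed to set.seed() immediately ahead of each model fit
p <- ggplot(d, aes(x, y, colour = group)) +
  geom_point() +
  stat_ma_line(n.min = 10L, fit.seed = 4321)
p
```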

Note

stat_ma_line understands x and y, to be referenced in the formula. Both must be mapped to numeric variables.

Computed variables

`stat_ma_line()` provides the following variables, some of which depend on the orientation:

y or x

predicted value

ymin or xmin

lower pointwise confidence interval around the mean

ymax or xmax

upper pointwise confidence interval around the mean

se

standard error

If fm.values = TRUE is passed then columns based on the summary of the model fit are added, with the same value in each row within a group. This is wasteful and disabled by default, but provides a simple and robust approach to achieve effects like colouring or hiding of the model fit line based on P-values, r-squared or the number of observations.
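One way to use these extra columns is to map them with after_stat(). The sketch below is an assumption-laden illustration rather than a package example: it reuses the artificial data from the Examples section under the name d, and the 0.05 threshold is arbitrary. The line for each group is coloured according to whether the permutation P-value falls below the threshold.

```r
library(ggpmisc)

# same artificial data as in the Examples section
set.seed(98723)
d <- data.frame(x = rnorm(100) + (0:99) / 10 - 5,
                y = rnorm(100) + (0:99) / 10 - 5,
                group = c("A", "B"))

# p.value is one of the columns added when fm.values = TRUE;
# it is constant within each group, so each line gets a single colour
p <- ggplot(d, aes(x, y, group = group)) +
  geom_point() +
  stat_ma_line(fm.values = TRUE,
               mapping = aes(colour = after_stat(p.value < 0.05)))
p
```

The variables r.squared and n can be used in the same way, for example to fade out lines fitted to few observations.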

Model fit methods supported

Several model fit functions are supported explicitly (see tables), and some of their differences smoothed out. Compatibility is checked late, based on the class of the returned fitted model object. This makes it possible to use wrapper functions that do model selection or other adjustments to the fit procedure on a per panel or per group basis. Moreover, if the value returned as model fit object is NULL no layer is added to the plot on a per group within panel basis.

In the case of fitted model objects of classes not explicitly supported, an attempt is made to find the usual accessors and/or fitted object members, and if found, either complete or partial support is frequently achieved. In this case a message is issued encouraging users to check the validity of the values extracted.

The argument to parameter method can be either the name of a function object, possibly using double colon notation, or a character string matching the function name. This approach makes it possible to support model fit functions that are not dependencies of 'ggpmisc': either attach the package where the function is defined and pass the function by name or as a string, or use double colon notation when passing the name of the function. User-defined functions can be passed as argument to parameter method as long as they have parameters formula, data, subset and possibly weights. Additional arguments can be passed to any method as a named list passed to parameter method.args. As in stat_smooth(), prior weights are passed to the model fit functions' weights (plural!) parameter by mapping a numeric variable to plot aesthetic weight (singular!).

The table below lists natively supported model fit functions, with the caveat that only some 'broom' methods' specializations have been actually tested with statistics from 'ggpmisc'. In addition, the statistics based on 'broom' methods require the user to tailor their behaviour by passing additional arguments in the call.

| Statistic | \(f\) | Supported model fit methods |
|---|---|---|
| stat_poly_line() | G | "lm", "rlm", "lts", "sma", "ma", "gls", others with methods predict() or fitted() |
| stat_poly_eq() | G | "lm", "rlm", "lts", "sma", "ma", "gls", others with needed accessors |
| stat_quant_line() | G | "rq", "rqss" |
| stat_quant_band() | G | "rq", "rqss" |
| stat_quant_eq() | G | "rq", "rqss" |
| stat_ma_line() | G | "SMA", "MA", "RMA", "OLS" |
| stat_ma_eq() | G | "SMA", "MA", "RMA", "OLS" |
| stat_fit_residuals() | G | "lm", "rlm", "lts", "sma", "ma", "gls", "rq", "rqss", others with method residuals() |
| stat_fit_fitted() | G | "lm", "rlm", "lts", "gls", "rq", "rqss", others with method fitted() |
| stat_fit_deviations() | G | "lm", "rlm", "lts", "gls", "rq", "rqss", others with methods fitted() and weights() |
| stat_fit_augment() | G | any with 'broom' method augment() |
| stat_fit_glance() | G | any with 'broom' method glance() |
| stat_fit_tidy() | G | any with 'broom' method tidy() |
| stat_fit_tb() | P | any with 'broom' method tidy() |

The table below lists the names for fit methods coded in the statistics as given in the table above. The single colon notation is based on parsing the name and is available whenever passing the name of the fit method as a character string. In a string such as "head:tail" the "head" gives the name of the model fit function and the "tail" gives the argument to pass to its method parameter. In some cases the default formula = y ~ x needs to be overridden with an explicit argument.

| Predefined method names | Model fit methods | R package | Object class |
|---|---|---|---|
| "lm", "lm:qr" | lm() | 'stats' | "lm" |
| "rlm", "rlm:M", "rlm:MM" | rlm() | 'MASS' | "rlm" ("lm") |
| "lts", "ltsReg" | ltsReg() | 'robustbase' | "lts" |
| "ma", "sma", "sma:SMA", "sma:MA", "sma:OLS" | sma() | 'smatr' | "ma" or "sma" |
| "gls", "gls:REML", "gls:ML" | gls() | 'nlme' | "gls" |
| "rq", "rq:sfn", "rq:sfnc", "rq:lasso" | rq() | 'quantreg' | "rq" |
| "rqss", "rqss:sfn", "rqss:sfnc", "rqss:lasso" | rqss() | 'quantreg' | "rqss" |
| "SMA", "MA", "RMA", "OLS" | lmodel2() | 'lmodel2' | |
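For stat_ma_line(), the short predefined names and the "head:tail" strings are described as alternative spellings of the same request. The sketch below, which assumes that "lmodel2:SMA" is parsed in the same way as the documented "lmodel2:MA", builds the same plot with both spellings and compares the predicted values. The data frame d and the plot names p1 and p2 are made up for the comparison.

```r
library(ggpmisc)

# artificial data, as in the Examples section but without groups
set.seed(98723)
d <- data.frame(x = rnorm(100) + (0:99) / 10 - 5,
                y = rnorm(100) + (0:99) / 10 - 5)

# short predefined name
p1 <- ggplot(d, aes(x, y)) +
  geom_point() +
  stat_ma_line(method = "SMA")

# the same fit requested with single colon notation
p2 <- ggplot(d, aes(x, y)) +
  geom_point() +
  stat_ma_line(method = "lmodel2:SMA")

# compare the predicted values computed by the second layer of each plot
all.equal(layer_data(p1, 2)$y, layer_data(p2, 2)$y)
```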

See also

Other ggplot statistics for major axis regression: stat_ma_eq()

Aesthetics

stat_ma_line() understands the following aesthetics. Required aesthetics are displayed in bold and defaults are displayed for optional aesthetics:

x
y
group → inferred

Learn more about setting these aesthetics in vignette("ggplot2-specs").

Examples

# attach 'ggpmisc'; 'ggplot2' is attached along with it
library(ggpmisc)

# generate artificial data
set.seed(98723)
my.data <- data.frame(x = rnorm(100) + (0:99) / 10 - 5,
                      y = rnorm(100) + (0:99) / 10 - 5,
                      group = c("A", "B"))

ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_ma_line()


ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_ma_line(method = "MA")


ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_ma_line(method = "SMA")


ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_ma_line(method = "RMA",
               range.y = "interval", range.x = "interval")


ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_ma_line(method = "OLS")


# plot line to the ends of range of data (the default)
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_ma_line(fullrange = FALSE) +
  expand_limits(x = c(-10, 10), y = c(-10, 10))


# plot line to the limits of the scales
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_ma_line(fullrange = TRUE) +
  expand_limits(x = c(-10, 10), y = c(-10, 10))


# plot line to the limits of the scales, with y as the explanatory variable
ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_ma_line(orientation = "y", fullrange = TRUE) +
  expand_limits(x = c(-10, 10), y = c(-10, 10))


ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_ma_line(formula = x ~ y)


# Smooths are automatically fit to each group (defined by categorical
# aesthetics or the group aesthetic) and for each facet.

ggplot(my.data, aes(x, y, colour = group)) +
  geom_point() +
  stat_ma_line()


ggplot(my.data, aes(x, y)) +
  geom_point() +
  stat_ma_line() +
  facet_wrap(~group)


# Inspecting the returned data using geom_debug_group()
gginnards.installed <- requireNamespace("gginnards", quietly = TRUE)

if (gginnards.installed)
  library(gginnards)

if (gginnards.installed)
  ggplot(my.data, aes(x, y)) +
    stat_ma_line(geom = "debug_group")

#> [1] "PANEL 1; group(s) -1; 'draw_function()' input 'data' (head):"
#>           x         y      ymin      ymax flipped_aes PANEL group orientation
#> 1 -6.560610 -6.050104 -6.711421 -5.450245       FALSE     1    -1           x
#> 2 -6.392213 -5.890429 -6.534468 -5.306240       FALSE     1    -1           x
#> 3 -6.223816 -5.730753 -6.357516 -5.162236       FALSE     1    -1           x
#> 4 -6.055419 -5.571077 -6.180563 -5.018232       FALSE     1    -1           x
#> 5 -5.887021 -5.411402 -6.003610 -4.874228       FALSE     1    -1           x
#> 6 -5.718624 -5.251726 -5.826657 -4.730224       FALSE     1    -1           x

if (gginnards.installed)
  ggplot(my.data, aes(x, y)) +
    stat_ma_line(geom = "debug_group", fm.values = TRUE)

#> [1] "PANEL 1; group(s) -1; 'draw_function()' input 'data' (head):"
#>           x         y      ymin      ymax p.value r.squared   n fm.class
#> 1 -6.560610 -6.050104 -6.711421 -5.450245    0.01 0.7917998 100  lmodel2
#> 2 -6.392213 -5.890429 -6.534468 -5.306240    0.01 0.7917998 100  lmodel2
#> 3 -6.223816 -5.730753 -6.357516 -5.162236    0.01 0.7917998 100  lmodel2
#> 4 -6.055419 -5.571077 -6.180563 -5.018232    0.01 0.7917998 100  lmodel2
#> 5 -5.887021 -5.411402 -6.003610 -4.874228    0.01 0.7917998 100  lmodel2
#> 6 -5.718624 -5.251726 -5.826657 -4.730224    0.01 0.7917998 100  lmodel2
#>    fm.method fm.formula fm.formula.chr flipped_aes PANEL group orientation
#> 1 lmodel2:MA      y ~ x          y ~ x       FALSE     1    -1           x
#> 2 lmodel2:MA      y ~ x          y ~ x       FALSE     1    -1           x
#> 3 lmodel2:MA      y ~ x          y ~ x       FALSE     1    -1           x
#> 4 lmodel2:MA      y ~ x          y ~ x       FALSE     1    -1           x
#> 5 lmodel2:MA      y ~ x          y ~ x       FALSE     1    -1           x
#> 6 lmodel2:MA      y ~ x          y ~ x       FALSE     1    -1           x