These functions find peaks (maxima) and valleys (minima) in a numeric vector,
using a user-selectable span and global and local size thresholds, and return
a logical vector.
Usage
find_peaks(
  x,
  global.threshold = NULL,
  local.threshold = NULL,
  local.reference = "median",
  threshold.range = NULL,
  span = 3,
  strict = FALSE,
  na.rm = FALSE
)
find_valleys(
  x,
  global.threshold = NULL,
  local.threshold = NULL,
  local.reference = "median",
  threshold.range = NULL,
  span = 3,
  strict = FALSE,
  na.rm = FALSE
)
Arguments
- x
numeric vector.
- global.threshold
numeric. A value belonging to class "AsIs" is interpreted as an absolute minimum height or depth expressed in data units. A bare numeric value (normally between 0.0 and 1.0) is interpreted as relative to threshold.range. In both cases it sets a global height (depth) threshold below which peaks (valleys) are ignored. A bare negative numeric value indicates the global height (depth) threshold below which peaks (valleys) are ignored. If global.threshold = NULL, no threshold is applied and all peaks are returned.
- local.threshold
numeric. A value belonging to class "AsIs" is interpreted as an absolute minimum height (depth) expressed in data units relative to a reference value computed within the window. A bare numeric value (normally between 0.0 and 1.0) is interpreted as expressed in units relative to threshold.range. In both cases local.threshold sets a local height (depth) threshold below which peaks (valleys) are ignored. If local.threshold = NULL or if span spans the whole of x, no threshold is applied.
- local.reference
character. One of "median", "median.log", "median.sqrt", "farthest", "farthest.log" or "farthest.sqrt". The reference used to assess the height of the peak is either the minimum/maximum value within the window (the "farthest" variants) or the median of all values in the window (the "median" variants).
- threshold.range
numeric vector. If of length 2 or longer, range(threshold.range) is used to scale both thresholds. With NULL, the default, range(x) is used, and with a vector of length one range(threshold.range, x) is used, i.e., the range is expanded.
- span
odd positive integer. A peak is defined as an element in a sequence which is greater than all other elements within a moving window of width span centred at that element. The default value is 3, meaning that a peak is taller than its two nearest neighbours. span = NULL extends the span to the whole length of x.
- strict
logical flag: if TRUE, an element must be strictly greater than all other values in its window to be considered a peak. Default: FALSE (since version 0.13.1).
- na.rm
logical indicating whether NA values should be stripped before searching for peaks.
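How a non-negative bare threshold might be converted into data units can be sketched in base R. The helper scale_threshold() below is hypothetical and not part of the package; it assumes that a bare value is mapped linearly onto the range selected through threshold.range, which may differ from the arithmetic the package actually uses.

```r
# Hypothetical helper, not a package function. Assumption: a bare
# threshold t maps to min + t * (max - min) of the selected range.
scale_threshold <- function(threshold, x, threshold.range = NULL) {
  if (inherits(threshold, "AsIs")) {
    # I(...) marks the value as already expressed in data units
    return(as.numeric(unclass(threshold)))
  }
  rng <- if (is.null(threshold.range)) {
    range(x, na.rm = TRUE)
  } else if (length(threshold.range) == 1L) {
    range(threshold.range, x, na.rm = TRUE)  # range of x expanded
  } else {
    range(threshold.range)
  }
  rng[1] + threshold * diff(rng)
}

x <- c(10, 40, 20, 110, 30)
scale_threshold(0.5, x)                               # midpoint of 10..110
scale_threshold(0.5, x, threshold.range = 0)          # midpoint of 0..110
scale_threshold(0.5, x, threshold.range = c(0, 200))  # midpoint of 0..200
scale_threshold(I(50), x)                             # absolute, unchanged
```

Under these assumptions the four calls give 60, 55, 100 and 50, which shows why wrapping a value in I() is the way to state a threshold directly in data units.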
Value
A vector of logical values of the same length as x. Values that are TRUE
correspond to local peaks in vector x and can be used to extract the rows
corresponding to peaks from a data frame.
Details
As find_valleys, stat_peaks and stat_valleys call find_peaks to search for
peaks or valleys, this description applies to all four functions.
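The relationship between the two searches can be illustrated with a base R sketch. It assumes that valleys are located as the peaks of the negated vector; simple_peaks() and simple_valleys() are hypothetical stand-ins, not the package's code.

```r
# Hypothetical stand-in for the window search (non-strict, odd span).
# Assumption: elements closer than (span - 1) / 2 to either end are
# never flagged.
simple_peaks <- function(x, span = 3) {
  h <- (span - 1) %/% 2
  out <- logical(length(x))
  if (length(x) < span) return(out)
  for (i in seq(h + 1, length(x) - h)) {
    out[i] <- x[i] >= max(x[(i - h):(i + h)])
  }
  out
}

# Assumption: a valley in x is a peak in -x
simple_valleys <- function(x, span = 3) simple_peaks(-x, span)

x <- c(1, 3, 2, 0, 4, 6, 5)
which(simple_peaks(x))    # positions 2 and 6
which(simple_valleys(x))  # position 4
```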
Function find_peaks is a wrapper built onto function peaks from splus2R. It
adds support for peak height thresholds and handles span = NULL and
non-finite (including NA) values differently than splus2R::peaks. Instead of
giving an error when na.rm = FALSE and x contains NA values, NA values are
replaced with the smallest finite value in x. span = NULL is treated as a
special case and selects max(x).
Passing `strict = TRUE` ensures that multiple global and within-window
maxima are ignored, and can result in no peaks being returned.
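The effect of strict, of NA replacement and of span = NULL can be tried out with a small base R sketch. window_peaks() is a hypothetical function written for illustration only; it assumes that with strict = FALSE every tied maximum in a window is flagged and that elements too close to the ends of x are never flagged, neither of which is confirmed here for the package itself.

```r
# Hypothetical sketch, not the package's implementation.
window_peaks <- function(x, span = 3, strict = FALSE) {
  # As described above: NA values take the smallest finite value in x
  x[is.na(x)] <- min(x[is.finite(x)])
  n <- length(x)
  if (is.null(span)) {
    # special case: the whole vector is treated as a single window
    is.max <- x == max(x)
    if (strict && sum(is.max) > 1L) is.max[] <- FALSE
    return(is.max)
  }
  h <- (span - 1) %/% 2
  out <- logical(n)
  if (n < span) return(out)
  for (i in seq(h + 1, n - h)) {
    others <- x[setdiff((i - h):(i + h), i)]
    out[i] <- if (strict) all(x[i] > others) else all(x[i] >= others)
  }
  out
}

x <- c(1, 5, 5, 2, NA, 7, 3)
which(window_peaks(x))                 # tied maxima are both flagged: 2 3 6
which(window_peaks(x, strict = TRUE))  # tied maxima are dropped: 6
which(window_peaks(x, span = NULL))    # global maximum only: 6
```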
Two tests make it possible to ignore irrelevant peaks. One test
(global.threshold) is based on the absolute height of the peaks and can be
used in all cases to ignore globally low peaks. A second test
(local.threshold) is available when the window defined by `span` does not
include all observations and can be used to ignore peaks that are not locally
prominent. In this second test the height of each peak is compared to a
summary computed from other values within the window of width equal to span
where it was found, and the reference value used within each window
containing a peak is given by local.reference. Parameter threshold.range
determines how the bare numeric values passed as arguments to
global.threshold and local.threshold are scaled. The default, NULL, uses the
range of x. Thresholds for ignoring too small peaks are applied after peaks
are searched for, and threshold values can in some cases result in no peaks
being found. If either threshold is not available (NA), the returned value is
an NA vector of the same length as x.
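The local test can be sketched for a single candidate peak as follows. local_prominence() is a hypothetical helper, not a package function; it assumes that the reference is computed from all values in the window and that "farthest" means the window minimum when searching for peaks.

```r
# Hypothetical sketch of the local test for the candidate peak at index i.
local_prominence <- function(x, i, span = 5,
                             reference = c("median", "farthest")) {
  reference <- match.arg(reference)
  h <- (span - 1) %/% 2
  window <- x[max(1, i - h):min(length(x), i + h)]
  ref <- switch(reference,
                median = median(window),
                farthest = min(window))  # for peaks: the lowest value
  x[i] - ref  # height of the peak above the local reference
}

x <- c(2, 3, 10, 4, 1, 2, 3, 2, 1)
local_prominence(x, 3)                          # 10 - median(2, 3, 10, 4, 1)
local_prominence(x, 3, reference = "farthest")  # 10 - min(2, 3, 10, 4, 1)
local_prominence(x, 7)                          # 3 - median(1, 2, 3, 2, 1)

# Keep a peak only if it rises more than `threshold` data units above
# its local reference.
threshold <- 5
c(local_prominence(x, 3) > threshold, local_prominence(x, 7) > threshold)
```

With this toy vector the tall peak at position 3 passes the test while the small bump at position 7 does not, even though both are maxima within their windows.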
The local.threshold argument is used as is when local.reference is "median"
or "farthest", i.e., the same distance between peak and reference is used as
cut-off irrespective of the value of the reference. In cases when the
prominence of peaks is positively correlated with the baseline, a
local.threshold that increases together with increasing computed
within-window median or farthest value applies a less stringent height
requirement in regions with overall low height. In this case, natural
logarithm or square root weighting can be requested by passing
`"median.log"`, `"farthest.log"`, `"median.sqrt"` or `"farthest.sqrt"` as
argument to local.reference.
Note
The default for parameter strict is FALSE in functions find_peaks() and
find_valleys(), while it is strict = TRUE in splus2R::peaks.
See also
Other peaks and valleys functions:
find_spikes()
Examples
# lynx is a time.series object
lynx_num.df <-
  try_tibble(lynx,
             col.names = c("year", "lynx"),
             as.numeric = TRUE) # years -> as numeric
which(find_peaks(lynx_num.df$lynx, span = 5))
#> [1] 8 18 28 37 46 55 65 75 84 93 96 105
which(find_valleys(lynx_num.df$lynx, span = 5))
#> [1] 12 22 32 41 49 59 69 78 88 99 109
lynx_num.df[find_peaks(lynx_num.df$lynx, span = 5), ]
#> # A tibble: 12 × 2
#> year lynx
#> <dbl> <dbl>
#> 1 1828 5943
#> 2 1838 3409
#> 3 1848 2536
#> 4 1857 2871
#> 5 1866 6721
#> 6 1875 2251
#> 7 1885 4431
#> 8 1895 4031
#> 9 1904 6991
#> 10 1913 3800
#> 11 1916 3790
#> 12 1925 3574
lynx_num.df[find_peaks(lynx_num.df$lynx, span = 51), ]
#> # A tibble: 2 × 2
#> year lynx
#> <dbl> <dbl>
#> 1 1866 6721
#> 2 1904 6991
lynx_num.df[find_peaks(lynx_num.df$lynx, span = NULL), ]
#> # A tibble: 1 × 2
#> year lynx
#> <dbl> <dbl>
#> 1 1904 6991
lynx_num.df[find_peaks(lynx_num.df$lynx,
                       span = 15,
                       global.threshold = 2/3), ]
#> # A tibble: 3 × 2
#> year lynx
#> <dbl> <dbl>
#> 1 1828 5943
#> 2 1866 6721
#> 3 1904 6991
lynx_num.df[find_peaks(lynx_num.df$lynx,
                       span = 15,
                       global.threshold = I(4000)), ]
#> # A tibble: 5 × 2
#> year lynx
#> <dbl> <dbl>
#> 1 1828 5943
#> 2 1866 6721
#> 3 1885 4431
#> 4 1895 4031
#> 5 1904 6991
lynx_num.df[find_peaks(lynx_num.df$lynx,
                       span = 15,
                       local.threshold = 0.5), ]
#> # A tibble: 5 × 2
#> year lynx
#> <dbl> <dbl>
#> 1 1828 5943
#> 2 1866 6721
#> 3 1885 4431
#> 4 1895 4031
#> 5 1904 6991