Chromatographic peak detection using the centWave method
Source:R/DataClasses.R
, R/functions-Params.R
, R/methods-OnDiskMSnExp.R
, and 1 more
findChromPeaks-centWave.Rd
The centWave algorithm perform peak density and wavelet based chromatographic peak detection for high resolution LC/MS data in centroid mode [Tautenhahn 2008].
The CentWaveParam
class allows to specify all settings
for a chromatographic peak detection using the centWave method. Instances
should be created with the CentWaveParam
constructor.
The findChromPeaks,OnDiskMSnExp,CentWaveParam
method
performs chromatographic peak detection using the centWave
algorithm on all samples from an OnDiskMSnExp
object. OnDiskMSnExp
objects encapsule all
experiment specific data and load the spectra data (mz and intensity
values) on the fly from the original files applying also all eventual
data manipulations.
ppm
,ppm<-
: getter and setter for the ppm
slot of the object.
peakwidth
,peakwidth<-
: getter and setter for the
peakwidth
slot of the object.
snthresh
,snthresh<-
: getter and setter for the
snthresh
slot of the object.
prefilter
,prefilter<-
: getter and setter for the
prefilter
slot of the object.
mzCenterFun
,mzCenterFun<-
: getter and setter for the
mzCenterFun
slot of the object.
integrate
,integrate<-
: getter and setter for the
integrate
slot of the object.
mzdiff
,mzdiff<-
: getter and setter for the
mzdiff
slot of the object.
fitgauss
,fitgauss<-
: getter and setter for the
fitgauss
slot of the object.
noise
,noise<-
: getter and setter for the
noise
slot of the object.
verboseColumns
,verboseColumns<-
: getter and
setter for the verboseColumns
slot of the object.
roiList
,roiList<-
: getter and setter for the
roiList
slot of the object.
fistBaselineCheck
,firstBaselineCheck<-
: getter
and setter for the firstBaselineCheck
slot of the object.
roiScales
,roiScales<-
: getter and setter for the
roiScales
slot of the object.
Usage
CentWaveParam(
ppm = 25,
peakwidth = c(20, 50),
snthresh = 10,
prefilter = c(3, 100),
mzCenterFun = "wMean",
integrate = 1L,
mzdiff = -0.001,
fitgauss = FALSE,
noise = 0,
verboseColumns = FALSE,
roiList = list(),
firstBaselineCheck = TRUE,
roiScales = numeric(),
extendLengthMSW = FALSE,
verboseBetaColumns = FALSE
)
# S4 method for class 'OnDiskMSnExp,CentWaveParam'
findChromPeaks(
object,
param,
BPPARAM = bpparam(),
return.type = "XCMSnExp",
msLevel = 1L,
...
)
# S4 method for class 'CentWaveParam'
ppm(object)
# S4 method for class 'CentWaveParam'
ppm(object) <- value
# S4 method for class 'CentWaveParam'
peakwidth(object)
# S4 method for class 'CentWaveParam'
peakwidth(object) <- value
# S4 method for class 'CentWaveParam'
snthresh(object)
# S4 method for class 'CentWaveParam'
snthresh(object) <- value
# S4 method for class 'CentWaveParam'
prefilter(object)
# S4 method for class 'CentWaveParam'
prefilter(object) <- value
# S4 method for class 'CentWaveParam'
mzCenterFun(object)
# S4 method for class 'CentWaveParam'
mzCenterFun(object) <- value
# S4 method for class 'CentWaveParam'
integrate(f)
# S4 method for class 'CentWaveParam'
integrate(object) <- value
# S4 method for class 'CentWaveParam'
mzdiff(object)
# S4 method for class 'CentWaveParam'
mzdiff(object) <- value
# S4 method for class 'CentWaveParam'
fitgauss(object)
# S4 method for class 'CentWaveParam'
fitgauss(object) <- value
# S4 method for class 'CentWaveParam'
noise(object)
# S4 method for class 'CentWaveParam'
noise(object) <- value
# S4 method for class 'CentWaveParam'
verboseColumns(object)
# S4 method for class 'CentWaveParam'
verboseColumns(object) <- value
# S4 method for class 'CentWaveParam'
roiList(object)
# S4 method for class 'CentWaveParam'
roiList(object) <- value
# S4 method for class 'CentWaveParam'
firstBaselineCheck(object)
# S4 method for class 'CentWaveParam'
firstBaselineCheck(object) <- value
# S4 method for class 'CentWaveParam'
roiScales(object)
# S4 method for class 'CentWaveParam'
roiScales(object) <- value
# S4 method for class 'CentWaveParam'
as.list(x, ...)
Arguments
- ppm
numeric(1)
defining the maximal tolerated m/z deviation in consecutive scans in parts per million (ppm) for the initial ROI definition.- peakwidth
numeric(2)
with the expected approximate peak width in chromatographic space. Given as a range (min, max) in seconds.- snthresh
numeric(1)
defining the signal to noise ratio cutoff.- prefilter
numeric(2)
:c(k, I)
specifying the prefilter step for the first analysis step (ROI detection). Mass traces are only retained if they contain at leastk
peaks with intensity>= I
.- mzCenterFun
Name of the function to calculate the m/z center of the chromatographic peak. Allowed are:
"wMean"
: intensity weighted mean of the peak's m/z values,"mean"
: mean of the peak's m/z values,"apex"
: use the m/z value at the peak apex,"wMeanApex3"
: intensity weighted mean of the m/z value at the peak apex and the m/z values left and right of it and"meanApex3"
: mean of the m/z value of the peak apex and the m/z values left and right of it.- integrate
Integration method. For
integrate = 1
peak limits are found through descent on the mexican hat filtered data, forintegrate = 2
the descent is done on the real data. The latter method is more accurate but prone to noise, while the former is more robust, but less exact.- mzdiff
numeric(1)
representing the minimum difference in m/z dimension required for peaks with overlapping retention times; can be negative to allow overlap. During peak post-processing, peaks defined to be overlapping are reduced to the one peak with the largest signal.- fitgauss
logical(1)
whether or not a Gaussian should be fitted to each peak. This affects mostly the retention time position of the peak.- noise
numeric(1)
allowing to set a minimum intensity required for centroids to be considered in the first analysis step (centroids with intensity< noise
are omitted from ROI detection).- verboseColumns
logical(1)
whether additional peak meta data columns should be returned.- roiList
An optional list of regions-of-interest (ROI) representing detected mass traces. If ROIs are submitted the first analysis step is omitted and chromatographic peak detection is performed on the submitted ROIs. Each ROI is expected to have the following elements specified:
scmin
(start scan index),scmax
(end scan index),mzmin
(minimum m/z),mzmax
(maximum m/z),length
(number of scans),intensity
(summed intensity). Each ROI should be represented by alist
of elements or a single rowdata.frame
.- firstBaselineCheck
logical(1)
. IfTRUE
continuous data within regions of interest is checked to be above the first baseline. In detail, a first rough estimate of the noise is calculated and peak detection is performed only in regions in which multiple sequential signals are higher than this first estimated baseline/noise level.- roiScales
Optional numeric vector with length equal to
roiList
defining the scale for each region of interest inroiList
that should be used for the centWave-wavelets.- extendLengthMSW
Option to force centWave to use all scales when running centWave rather than truncating with the EIC length. Uses the "open" method to extend the EIC to a integer base-2 length prior to being passed to
convolve
rather than the default "reflect" method. See https://github.com/sneumann/xcms/issues/445 for more information.- verboseBetaColumns
Option to calculate two additional metrics of peak quality via comparison to an idealized bell curve. Adds
beta_cor
andbeta_snr
to thechromPeaks
output, corresponding to a Pearson correlation coefficient to a bell curve with several degrees of skew as well as an estimate of signal-to-noise using the residuals from the best-fitting bell curve. See https://github.com/sneumann/xcms/pull/685 and https://doi.org/10.1186/s12859-023-05533-4 for more information.- object
For
findChromPeaks
: anOnDiskMSnExp
object containing the MS- and all other experiment-relevant data.For all other methods: a parameter object.
- param
An
CentWaveParam
object containing all settings for the centWave algorithm.- BPPARAM
A parameter class specifying if and how parallel processing should be performed. It defaults to
bpparam
. See documentation of theBiocParallel
for more details. If parallel processing is enabled, peak detection is performed in parallel on several of the input samples.- return.type
Character specifying what type of object the method should return. Can be either
"XCMSnExp"
(default),"list"
or"xcmsSet"
.- msLevel
integer(1)
defining the MS level on which the peak detection should be performed. Defaults tomsLevel = 1
.- ...
ignored.
- value
The value for the slot.
- f
For
integrate
: aCentWaveParam
object.- x
The parameter object.
Value
The CentWaveParam
function returns a CentWaveParam
class instance with all of the settings specified for chromatographic
peak detection by the centWave method.
For findChromPeaks
: if return.type = "XCMSnExp"
an
XCMSnExp
object with the results of the peak detection.
If return.type = "list"
a list of length equal to the number of
samples with matrices specifying the identified peaks.
If return.type = "xcmsSet"
an xcmsSet
object
with the results of the peak detection.
Details
The centWave algorithm is most suitable for high resolution
LC/{TOF,OrbiTrap,FTICR}-MS data in centroid mode. In the first phase
the method identifies regions of interest (ROIs) representing
mass traces that are characterized as regions with less than ppm
m/z deviation in consecutive scans in the LC/MS map. In detail, starting
with a single m/z, a ROI is extended if a m/z can be found in the next scan
(spectrum) for which the difference to the mean m/z of the ROI is smaller
than the user defined ppm
of the m/z. The mean m/z of the ROI is then
updated considering also the newly included m/z value.
These ROIs are then, after some cleanup, analyzed using continuous wavelet
transform (CWT) to locate chromatographic peaks on different scales.
The first analysis step is skipped, if regions of interest are passed
via the param
parameter.
Parallel processing (one process per sample) is supported and can
be configured either by the BPPARAM
parameter or by globally
defining the parallel processing mode using the
register
method from the BiocParallel
package.
Slots
ppm,peakwidth,snthresh,prefilter,mzCenterFun,integrate,mzdiff,fitgauss,noise,verboseColumns,roiList,firstBaselineCheck,roiScales,extendLengthMSW,verboseBetaColumns
See corresponding parameter above. Slots values should exclusively be accessed via the corresponding getter and setter methods listed above.
Note
These methods and classes are part of the updated and modernized
xcms
user interface which will eventually replace the
findPeaks
methods. It supports peak detection on
OnDiskMSnExp
objects (defined in the MSnbase
package). All of the settings to the centWave algorithm can be passed
with a CentWaveParam
object.
References
Ralf Tautenhahn, Christoph Böttcher, and Steffen Neumann "Highly sensitive feature detection for high resolution LC/MS" BMC Bioinformatics 2008, 9:504
See also
The do_findChromPeaks_centWave
core API function and
findPeaks.centWave
for the old user interface.
peaksWithCentWave
for functions to perform centWave peak
detection in purely chromatographic data.
XCMSnExp
for the object containing the results of
the peak detection.
Other peak detection methods:
findChromPeaks()
,
findChromPeaks-centWaveWithPredIsoROIs
,
findChromPeaks-massifquant
,
findChromPeaks-matchedFilter
,
findPeaks-MSW
Examples
## Create a CentWaveParam object. Note that the noise is set to 10000 to
## speed up the execution of the example - in a real use case the default
## value should be used, or it should be set to a reasonable value.
cwp <- CentWaveParam(ppm = 20, noise = 10000, prefilter = c(3, 10000))
## Change snthresh parameter
snthresh(cwp) <- 25
cwp
#> Object of class: CentWaveParam
#> Parameters:
#> - ppm: [1] 20
#> - peakwidth: [1] 20 50
#> - snthresh: [1] 25
#> - prefilter: [1] 3 10000
#> - mzCenterFun: [1] "wMean"
#> - integrate: [1] 1
#> - mzdiff: [1] -0.001
#> - fitgauss: [1] FALSE
#> - noise: [1] 10000
#> - verboseColumns: [1] FALSE
#> - roiList: list()
#> - firstBaselineCheck: [1] TRUE
#> - roiScales: numeric(0)
#> - extendLengthMSW: [1] FALSE
#> - verboseBetaColumns: [1] FALSE
## Perform the peak detection using centWave on some of the files from the
## faahKO package. Files are read using the `readMsExperiment` function
## from the MsExperiment package
library(faahKO)
library(xcms)
library(MsExperiment)
fls <- dir(system.file("cdf/KO", package = "faahKO"), recursive = TRUE,
full.names = TRUE)
raw_data <- readMsExperiment(fls[1])
## Perform the peak detection using the settings defined above.
res <- findChromPeaks(raw_data, param = cwp)
head(chromPeaks(res))
#> mz mzmin mzmax rt rtmin rtmax into intb maxo
#> CP001 453.2 453.2 453.2 2506.073 2501.378 2527.982 1007409.0 1007380.8 38152
#> CP002 302.0 302.0 302.0 2617.185 2595.275 2640.659 687146.6 671297.8 30552
#> CP003 344.0 344.0 344.0 2679.783 2646.919 2709.517 5210015.9 5135916.9 152320
#> CP004 381.0 381.0 381.0 2678.218 2637.529 2720.472 2180565.2 2023571.9 52504
#> CP005 430.1 430.1 430.1 2681.348 2639.094 2712.647 2395840.3 2299899.6 65752
#> CP006 366.0 366.0 366.0 2679.783 2642.224 2718.907 3365174.0 3279468.3 79928
#> sn sample
#> CP001 38151 1
#> CP002 46 1
#> CP003 68 1
#> CP004 37 1
#> CP005 42 1
#> CP006 49 1