Peak detection in the chromatographic time domain
Source: R/DataClasses.R, R/functions-Params.R, R/methods-OnDiskMSnExp.R, and 1 more
findChromPeaks-matchedFilter.Rd
The matchedFilter algorithm identifies peaks in the chromatographic time
domain as described in [Smith 2006]. The intensity values are binned by
cutting the LC/MS data into slices (bins) that are binSize m/z wide.
Within each bin the maximal intensity is selected. Chromatographic peak
detection is then performed in each bin by extending it based on the
steps parameter to generate slices comprising bins
current_bin - steps + 1 to current_bin + steps - 1. Each of these slices
is then filtered with matched filtration using a second-derivative
Gaussian as the model peak shape. After filtration, peaks are detected
using a signal-to-noise ratio cut-off. For more details and illustrations
see [Smith 2006].
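The filtration step described above can be sketched in base R. This is an illustrative toy and not the xcms implementation: the synthetic chromatogram, the kernel extent of 4 sigma on each side, and the median-based noise estimate are all assumptions made for the example.

```r
## Illustrative sketch only; NOT the xcms implementation.
set.seed(1)

## Slice extension: with steps = 2 the slice around bin 10 spans
## bins current_bin - steps + 1 to current_bin + steps - 1.
steps <- 2
current_bin <- 10
slice_bins <- seq(current_bin - steps + 1, current_bin + steps - 1)

## Synthetic chromatogram (assumption): one Gaussian peak at 300 s plus noise.
rt <- seq(0, 600, by = 1)
fwhm <- 30
sigma <- fwhm / 2.3548
signal <- 1000 * exp(-(rt - 300)^2 / (2 * sigma^2)) +
    rnorm(length(rt), sd = 20)

## Negated second derivative of a Gaussian as the model peak shape
## (positive at the centre, negative lobes on both sides).
k <- seq(-4 * ceiling(sigma), 4 * ceiling(sigma))
model <- (1 - k^2 / sigma^2) * exp(-k^2 / (2 * sigma^2))

## Filter the chromatogram with the model; edge values become NA.
filtered <- as.numeric(stats::filter(signal, model, sides = 2))

## Hypothetical noise estimate and signal-to-noise cut-off.
noise <- median(abs(filtered), na.rm = TRUE)
snthresh <- 10
apex <- which.max(filtered)          # NA values are ignored
sn <- filtered[apex] / noise
is_peak <- sn > snthresh
```

Because the model has zero mean, a constant baseline contributes nothing to the filtered trace, which is one reason a second-derivative Gaussian is a convenient model peak shape.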
The MatchedFilterParam class allows specifying all settings for
chromatographic peak detection using the matchedFilter method. Instances
should be created with the MatchedFilterParam constructor.
The findChromPeaks,OnDiskMSnExp,MatchedFilterParam method performs peak
detection using the matchedFilter algorithm on all samples from an
OnDiskMSnExp object. OnDiskMSnExp objects encapsulate all
experiment-specific data and load the spectra data (mz and intensity
values) on the fly from the original files, also applying any pending
data manipulations.
binSize, binSize<-: getter and setter for the binSize slot of the object.
impute, impute<-: getter and setter for the impute slot of the object.
baseValue, baseValue<-: getter and setter for the baseValue slot of the object.
distance, distance<-: getter and setter for the distance slot of the object.
fwhm, fwhm<-: getter and setter for the fwhm slot of the object.
sigma, sigma<-: getter and setter for the sigma slot of the object.
max, max<-: getter and setter for the max slot of the object.
snthresh, snthresh<-: getter and setter for the snthresh slot of the object.
steps, steps<-: getter and setter for the steps slot of the object.
mzdiff, mzdiff<-: getter and setter for the mzdiff slot of the object.
index, index<-: getter and setter for the index slot of the object.
Usage
MatchedFilterParam(
binSize = 0.1,
impute = "none",
baseValue = numeric(),
distance = numeric(),
fwhm = 30,
sigma = fwhm/2.3548,
max = 5,
snthresh = 10,
steps = 2,
mzdiff = 0.8 - binSize * steps,
index = FALSE
)
# S4 method for class 'OnDiskMSnExp,MatchedFilterParam'
findChromPeaks(
object,
param,
BPPARAM = bpparam(),
return.type = "XCMSnExp",
msLevel = 1L,
...
)
# S4 method for class 'MatchedFilterParam'
binSize(object)
# S4 method for class 'MatchedFilterParam'
binSize(object) <- value
# S4 method for class 'MatchedFilterParam'
impute(object)
# S4 method for class 'MatchedFilterParam'
impute(object) <- value
# S4 method for class 'MatchedFilterParam'
baseValue(object)
# S4 method for class 'MatchedFilterParam'
baseValue(object) <- value
# S4 method for class 'MatchedFilterParam'
distance(object)
# S4 method for class 'MatchedFilterParam'
distance(object) <- value
# S4 method for class 'MatchedFilterParam'
fwhm(object)
# S4 method for class 'MatchedFilterParam'
fwhm(object) <- value
# S4 method for class 'MatchedFilterParam'
sigma(object)
# S4 method for class 'MatchedFilterParam'
sigma(object) <- value
# S4 method for class 'MatchedFilterParam'
max(x)
# S4 method for class 'MatchedFilterParam'
max(object) <- value
# S4 method for class 'MatchedFilterParam'
snthresh(object)
# S4 method for class 'MatchedFilterParam'
snthresh(object) <- value
# S4 method for class 'MatchedFilterParam'
steps(object)
# S4 method for class 'MatchedFilterParam'
steps(object) <- value
# S4 method for class 'MatchedFilterParam'
mzdiff(object)
# S4 method for class 'MatchedFilterParam'
mzdiff(object) <- value
# S4 method for class 'MatchedFilterParam'
index(object)
# S4 method for class 'MatchedFilterParam'
index(object) <- value
Arguments
- binSize
numeric(1) specifying the width of the bins/slices in the m/z dimension.
- impute
Character string specifying the method to be used for missing value imputation. Allowed values are "none" (no linear interpolation), "lin" (linear interpolation), "linbase" (linear interpolation within a certain bin-neighborhood) and "intlin". See imputeLinInterpol for more details.
- baseValue
The base value to which empty elements should be set. This is only considered for method = "linbase" and corresponds to the profBinLinBase's baselevel argument.
- distance
For method = "linbase": number of non-empty neighboring elements of an empty element that should be considered for linear interpolation. See details section for more information.
- fwhm
numeric(1) specifying the full width at half maximum of the matched filtration Gaussian model peak. Only used to calculate the actual sigma, see below.
- sigma
numeric(1) specifying the standard deviation (width) of the matched filtration model peak.
- max
numeric(1) representing the maximum number of peaks that are expected/will be identified per slice.
- snthresh
numeric(1) defining the signal-to-noise cutoff to be used in the chromatographic peak detection step.
- steps
numeric(1) defining the number of bins to be merged before filtration (i.e. the number of neighboring bins that will be joined to the slice in which filtration and peak detection will be performed).
- mzdiff
numeric(1) defining the minimum difference in m/z for peaks with overlapping retention times.
- index
logical(1) specifying whether indices should be returned instead of values for m/z and retention times.
- object
For findChromPeaks: an OnDiskMSnExp object containing the MS- and all other experiment-relevant data. For all other methods: a parameter object.
- param
A MatchedFilterParam object containing all settings for the matchedFilter algorithm.
- BPPARAM
A parameter class specifying if and how parallel processing should be performed. It defaults to bpparam. See the documentation of the BiocParallel package for more details. If parallel processing is enabled, peak detection is performed in parallel on several of the input samples.
- return.type
Character specifying what type of object the method should return. Can be either "XCMSnExp" (default), "list" or "xcmsSet".
- msLevel
integer(1) defining the MS level on which the peak detection should be performed. Defaults to msLevel = 1.
- ...
ignored.
- value
The value for the slot.
- x
For max: a MatchedFilterParam object.
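The two derived defaults in the constructor signature (sigma = fwhm/2.3548 and mzdiff = 0.8 - binSize * steps) can be reproduced with plain arithmetic. The sketch below uses base R only; the variable names are chosen for illustration and the comparison with the exact Gaussian conversion factor 2 * sqrt(2 * log(2)) is added here as background, not taken from the package code.

```r
## Default sigma: a Gaussian's FWHM equals 2 * sqrt(2 * log(2)) * sigma,
## and 2.3548 is that factor rounded to four decimals.
fwhm <- 30
sigma_default <- fwhm / 2.3548
sigma_exact <- fwhm / (2 * sqrt(2 * log(2)))

## Default mzdiff for the constructor defaults (binSize = 0.1, steps = 2) ...
mzdiff_default <- 0.8 - 0.1 * 2

## ... and for a deliberately large binSize = 5, where it becomes negative.
mzdiff_large_bins <- 0.8 - 5 * 2
```

Note that mzdiff is derived from binSize and steps only at construction time, so a wide bin combined with the default formula yields a negative minimum m/z difference.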
Value
The MatchedFilterParam function returns a MatchedFilterParam class
instance with all of the settings specified for chromatographic peak
detection by the matchedFilter method.
For findChromPeaks: if return.type = "XCMSnExp", an XCMSnExp object with
the results of the peak detection. If return.type = "list", a list of
length equal to the number of samples with matrices specifying the
identified peaks. If return.type = "xcmsSet", an xcmsSet object with the
results of the peak detection.
Details
The intensities are binned by the provided m/z values within each
spectrum (scan). Binning is performed such that the bins are centered
around the m/z values (i.e. the first bin includes all m/z values between
min(mz) - bin_size/2 and min(mz) + bin_size/2).
For more details on binning and missing value imputation see the binYonX
and imputeLinInterpol methods.
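The centered binning and the idea behind linear imputation can be sketched in base R. This is a simplified illustration and does not call binYonX or imputeLinInterpol; the toy m/z and intensity values, the variable names, and the use of approx() for the interpolation are assumptions made for the example.

```r
## Illustrative sketch only; does not use binYonX or imputeLinInterpol.
mz <- c(200.02, 200.04, 200.13, 200.31)   # toy m/z values of one spectrum
intensity <- c(10, 50, 30, 20)
bin_size <- 0.1

## Bin edges chosen so that the first bin is centered on min(mz),
## i.e. it spans min(mz) - bin_size/2 to min(mz) + bin_size/2.
breaks <- seq(min(mz) - bin_size / 2, max(mz) + bin_size, by = bin_size)
bin_idx <- findInterval(mz, breaks)
n_bins <- length(breaks) - 1

## Keep the maximal intensity per bin; empty bins are NA.
binned <- as.vector(
    tapply(intensity, factor(bin_idx, levels = seq_len(n_bins)), max)
)

## Fill empty bins by linear interpolation between non-empty neighbours.
empty <- is.na(binned)
filled <- binned
filled[empty] <- approx(which(!empty), binned[!empty],
                        xout = which(empty))$y
```

Here the third bin contains no data points, so it is filled with the value halfway between its two neighbours.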
Parallel processing (one process per sample) is supported and can be
configured either by the BPPARAM parameter or by globally defining the
parallel processing mode using the register method from the BiocParallel
package.
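A minimal sketch of the global configuration route, assuming the BiocParallel package is installed. It registers a serial back-end, which then becomes the value returned by bpparam() and therefore the default for the BPPARAM argument; the guard with requireNamespace is only there so the snippet runs without error when the package is missing.

```r
## Assumes BiocParallel is installed; skipped otherwise.
if (requireNamespace("BiocParallel", quietly = TRUE)) {
    ## Register a serial back-end globally (no parallel processing).
    BiocParallel::register(BiocParallel::SerialParam())

    ## bpparam() now returns the registered back-end.
    registered <- BiocParallel::bpparam()
    print(class(registered))
}
```

The per-call alternative is to pass a back-end object directly through the BPPARAM argument of findChromPeaks, which leaves the global registry untouched.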
Slots
binSize, impute, baseValue, distance, fwhm, sigma, max, snthresh, steps, mzdiff, index
See the corresponding parameter above. Slot values should be accessed exclusively via the corresponding getter and setter methods listed above.
Note
These methods and classes are part of the updated and modernized xcms
user interface which will eventually replace the findPeaks methods. It
supports chromatographic peak detection on OnDiskMSnExp objects (defined
in the MSnbase package). All of the settings to the matchedFilter
algorithm can be passed with a MatchedFilterParam object.
References
Colin A. Smith, Elizabeth J. Want, Grace O'Maille, Ruben Abagyan and Gary Siuzdak. "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 2006, 78:779-787.
See also
The do_findChromPeaks_matchedFilter core API function and
findPeaks.matchedFilter for the old user interface.
peaksWithMatchedFilter for functions to perform matchedFilter peak
detection in purely chromatographic data.
XCMSnExp for the object containing the results of the chromatographic
peak detection.
Other peak detection methods: findChromPeaks(), findChromPeaks-centWave,
findChromPeaks-centWaveWithPredIsoROIs, findChromPeaks-massifquant,
findPeaks-MSW
Examples
## Create a MatchedFilterParam object. Note that we use an unnecessarily large
## binSize parameter to reduce the run-time of the example.
mfp <- MatchedFilterParam(binSize = 5)
## Change snthresh parameter
snthresh(mfp) <- 15
mfp
#> Object of class: MatchedFilterParam
#> Parameters:
#> - binSize: [1] 5
#> - impute: [1] "none"
#> - baseValue: numeric(0)
#> - distance: numeric(0)
#> - fwhm: [1] 30
#> - sigma: [1] 12.73994
#> - max: [1] 5
#> - snthresh: [1] 15
#> - steps: [1] 2
#> - mzdiff: [1] -9.2
#> - index: [1] FALSE
## Perform the peak detection using matchedFilter on the files from the
## faahKO package. Files are read using the readMSData function from the
## MSnbase package
library(faahKO)
library(MSnbase)
fls <- dir(system.file("cdf/KO", package = "faahKO"), recursive = TRUE,
full.names = TRUE)
raw_data <- readMSData(fls[1], mode = "onDisk")
#> Polarity can not be extracted from netCDF files, please set manually the polarity with the 'polarity' method.
## Perform the chromatographic peak detection using the settings defined
## above. Parallel processing could be disabled for this example by
## registering a "SerialParam" beforehand (not done here).
res <- findChromPeaks(raw_data, param = mfp)
head(chromPeaks(res))
#> mz mzmin mzmax rt rtmin rtmax into intf maxo
#> CP001 205.0000 205.0 205.0 2784.635 2770.550 2800.284 1778568.9 3580020 84280
#> CP002 205.0000 205.0 205.0 2784.635 2770.550 2800.284 1778568.9 3577971 84280
#> CP003 241.1460 241.1 241.2 3662.574 3646.924 3682.918 1465988.7 2234510 49728
#> CP004 241.1460 241.1 241.2 3662.574 3646.924 3682.918 1465988.7 2234510 49728
#> CP005 244.1000 244.1 244.1 2828.453 2814.369 2842.538 598990.3 1145078 31312
#> CP006 249.1591 249.1 249.2 3659.444 3643.794 3678.223 1435000.7 2367467 49040
#> maxf i sn sample
#> CP001 194233.12 1 63.28090 1
#> CP002 194213.46 1 66.00099 1
#> CP003 96022.23 1 25.42409 1
#> CP004 96022.23 1 25.42643 1
#> CP005 64181.64 2 16.99513 1
#> CP006 104291.09 1 36.83500 1