
The matchedFilter algorithm identifies peaks in the chromatographic time domain as described in [Smith 2006]. The intensity values are binned by cutting the LC/MS data into slices (bins) that are binSize m/z wide. Within each bin the maximal intensity is selected. Chromatographic peak detection is then performed for each bin by extending it, based on the steps parameter, into a slice comprising bins current_bin - steps + 1 to current_bin + steps - 1. Each of these slices is then filtered with matched filtration using a second-derivative Gaussian as the model peak shape. After filtration, peaks are detected using a signal-to-noise ratio cut-off. For more details and illustrations see [Smith 2006].
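The slice extension and the filtration step described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the xcms implementation: slice_bins and model_peak are hypothetical helper names, the chromatogram is simulated, and the kernel is the sign-flipped second derivative of a Gaussian so that it is positive at its center.

```r
## Hypothetical helper: bins merged into the slice around 'current_bin',
## using the range given in the text and clipped to existing bins.
slice_bins <- function(current_bin, steps, n_bins) {
    idx <- seq(current_bin - steps + 1, current_bin + steps - 1)
    idx[idx >= 1 & idx <= n_bins]
}
slice_bins(5, steps = 2, n_bins = 10)   # bins 4, 5 and 6

## Hypothetical helper: sign-flipped second-derivative Gaussian model peak,
## evaluated on integer scan offsets (truncated at 4 * sigma by assumption).
model_peak <- function(sigma, half_width = ceiling(4 * sigma)) {
    x <- seq(-half_width, half_width)
    (1 - x^2 / sigma^2) * exp(-x^2 / (2 * sigma^2))
}

## Simulated chromatogram: one Gaussian peak at scan 100 on a noisy baseline.
set.seed(1)
scans <- 1:200
signal <- 1000 * exp(-(scans - 100)^2 / (2 * 5^2)) + 50 + rnorm(200, sd = 5)

## Matched filtration: the kernel is symmetric, so the centered convolution
## computed by stats::filter equals a correlation with the model peak.
kernel <- model_peak(sigma = 5)
filtered <- stats::filter(signal, kernel, sides = 2)

## The filtered trace is largest where the signal best matches the model peak.
apex <- which.max(filtered)
apex
```

Because the kernel sums to approximately zero, the constant baseline is largely removed by the filtration, which is why a simple maximum search on the filtered trace is enough in this toy case.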

The MatchedFilterParam class allows specifying all settings for a chromatographic peak detection using the matchedFilter method. Instances should be created with the MatchedFilterParam constructor.

The findChromPeaks,OnDiskMSnExp,MatchedFilterParam method performs peak detection using the matchedFilter algorithm on all samples from an OnDiskMSnExp object. OnDiskMSnExp objects encapsulate all experiment-specific data and load the spectra data (m/z and intensity values) on the fly from the original files, also applying any data manipulations queued on the object.

binSize,binSize<-: getter and setter for the binSize slot of the object.

impute,impute<-: getter and setter for the impute slot of the object.

baseValue,baseValue<-: getter and setter for the baseValue slot of the object.

distance,distance<-: getter and setter for the distance slot of the object.

fwhm,fwhm<-: getter and setter for the fwhm slot of the object.

sigma,sigma<-: getter and setter for the sigma slot of the object.

max,max<-: getter and setter for the max slot of the object.

snthresh,snthresh<-: getter and setter for the snthresh slot of the object.

steps,steps<-: getter and setter for the steps slot of the object.

mzdiff,mzdiff<-: getter and setter for the mzdiff slot of the object.

index,index<-: getter and setter for the index slot of the object.

Usage

MatchedFilterParam(
  binSize = 0.1,
  impute = "none",
  baseValue = numeric(),
  distance = numeric(),
  fwhm = 30,
  sigma = fwhm/2.3548,
  max = 5,
  snthresh = 10,
  steps = 2,
  mzdiff = 0.8 - binSize * steps,
  index = FALSE
)

# S4 method for class 'OnDiskMSnExp,MatchedFilterParam'
findChromPeaks(
  object,
  param,
  BPPARAM = bpparam(),
  return.type = "XCMSnExp",
  msLevel = 1L,
  ...
)

# S4 method for class 'MatchedFilterParam'
binSize(object)

# S4 method for class 'MatchedFilterParam'
binSize(object) <- value

# S4 method for class 'MatchedFilterParam'
impute(object)

# S4 method for class 'MatchedFilterParam'
impute(object) <- value

# S4 method for class 'MatchedFilterParam'
baseValue(object)

# S4 method for class 'MatchedFilterParam'
baseValue(object) <- value

# S4 method for class 'MatchedFilterParam'
distance(object)

# S4 method for class 'MatchedFilterParam'
distance(object) <- value

# S4 method for class 'MatchedFilterParam'
fwhm(object)

# S4 method for class 'MatchedFilterParam'
fwhm(object) <- value

# S4 method for class 'MatchedFilterParam'
sigma(object)

# S4 method for class 'MatchedFilterParam'
sigma(object) <- value

# S4 method for class 'MatchedFilterParam'
max(x)

# S4 method for class 'MatchedFilterParam'
max(object) <- value

# S4 method for class 'MatchedFilterParam'
snthresh(object)

# S4 method for class 'MatchedFilterParam'
snthresh(object) <- value

# S4 method for class 'MatchedFilterParam'
steps(object)

# S4 method for class 'MatchedFilterParam'
steps(object) <- value

# S4 method for class 'MatchedFilterParam'
mzdiff(object)

# S4 method for class 'MatchedFilterParam'
mzdiff(object) <- value

# S4 method for class 'MatchedFilterParam'
index(object)

# S4 method for class 'MatchedFilterParam'
index(object) <- value

Arguments

binSize

numeric(1) specifying the width of the bins/slices in m/z dimension.

impute

Character string specifying the method to be used for missing value imputation. Allowed values are "none" (no linear interpolation), "lin" (linear interpolation), "linbase" (linear interpolation within a certain bin-neighborhood) and "intlin". See imputeLinInterpol for more details.

baseValue

The base value to which empty elements should be set. This is only considered for impute = "linbase" and corresponds to the profBinLinBase's baselevel argument.

distance

For impute = "linbase": number of non-empty neighboring elements of an empty element that should be considered for linear interpolation. See details section for more information.

fwhm

numeric(1) specifying the full width at half maximum of the matched filtration Gaussian model peak. Only used to calculate the actual sigma, see below.

sigma

numeric(1) specifying the standard deviation (width) of the matched filtration model peak.

max

numeric(1) representing the maximum number of peaks that are expected/will be identified per slice.

snthresh

numeric(1) defining the signal to noise cutoff to be used in the chromatographic peak detection step.

steps

numeric(1) defining the number of bins to be merged before filtration (i.e. the number of neighboring bins that will be joined to the slice in which filtration and peak detection will be performed).

mzdiff

numeric(1) defining the minimum difference in m/z for peaks with overlapping retention times.

index

logical(1) specifying whether indices should be returned instead of values for m/z and retention times.

object

For findChromPeaks: an OnDiskMSnExp object containing the MS- and all other experiment-relevant data.

For all other methods: a parameter object.

param

A MatchedFilterParam object containing all settings for the matchedFilter algorithm.

BPPARAM

A parameter class specifying if and how parallel processing should be performed. It defaults to bpparam. See the documentation of the BiocParallel package for more details. If parallel processing is enabled, peak detection is performed in parallel on several of the input samples.

return.type

Character specifying what type of object the method should return. Can be either "XCMSnExp" (default), "list" or "xcmsSet".

msLevel

integer(1) defining the MS level on which the peak detection should be performed. Defaults to msLevel = 1.

...

ignored.

value

The value for the slot.

x

For max: a MatchedFilterParam object.

Value

The MatchedFilterParam function returns a MatchedFilterParam class instance with all of the settings specified for chromatographic peak detection by the matchedFilter method.

For findChromPeaks: if return.type = "XCMSnExp" an XCMSnExp object with the results of the peak detection. If return.type = "list" a list of length equal to the number of samples with matrices specifying the identified peaks. If return.type = "xcmsSet" an xcmsSet object with the results of the peak detection.

Details

The intensities are binned by the provided m/z values within each spectrum (scan). Binning is performed such that the bins are centered around the m/z values (i.e. the first bin includes all m/z values between min(mz) - bin_size/2 and min(mz) + bin_size/2).
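The centered binning described above can be illustrated with a small base R sketch. It does not call binYonX; the toy spectrum and the variable names are assumptions made for illustration, the per-bin maximum follows the description at the top of this page, and empty bins are simply left as NA here.

```r
## Hypothetical toy spectrum (m/z and intensity values).
mz  <- c(100.00, 100.04, 100.12, 100.31)
int <- c(10, 40, 25, 5)
bin_size <- 0.1

## Bins are centered on min(mz), min(mz) + bin_size, ..., so the first bin
## spans min(mz) - bin_size/2 to min(mz) + bin_size/2.
n_bins <- round((max(mz) - min(mz)) / bin_size) + 1
breaks <- min(mz) - bin_size / 2 + bin_size * (0:n_bins)

## Assign each m/z value to its bin and keep the maximal intensity per bin.
bin_idx <- findInterval(mz, breaks, rightmost.closed = TRUE)
max_per_bin <- tapply(int, bin_idx, max)

## Bins without any data point stay NA in this sketch.
binned <- rep(NA_real_, n_bins)
binned[as.integer(names(max_per_bin))] <- max_per_bin
binned   # 40 25 NA 5
```

The first two m/z values fall into the same bin, so only the larger intensity (40) is kept; the third bin contains no data point and is the kind of gap that the impute setting addresses.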

For more details on binning and missing value imputation see binYonX and imputeLinInterpol methods.

Parallel processing (one process per sample) is supported and can be configured either by the BPPARAM parameter or by globally defining the parallel processing mode using the register method from the BiocParallel package.
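The two ways of configuring parallel processing mentioned above might look as follows. This is a configuration sketch that assumes the BiocParallel package is installed; the objects raw_data and mfp are the ones created in the Examples section, so that call is left commented out.

```r
library(BiocParallel)

## Option 1: define the processing mode globally. Here serial processing is
## registered, which disables parallel processing for subsequent calls.
register(SerialParam())

## Option 2: choose the mode for a single call through the BPPARAM argument
## (raw_data and mfp as defined in the Examples section).
## res <- findChromPeaks(raw_data, param = mfp, BPPARAM = SerialParam())
```

Since peak detection is parallelized per sample, a parallel back-end only pays off when the object contains more than one sample.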

Slots

binSize,impute,baseValue,distance,fwhm,sigma,max,snthresh,steps,mzdiff,index

See corresponding parameter above. Slot values should exclusively be accessed via the corresponding getter and setter methods listed above.

Note

These methods and classes are part of the updated and modernized xcms user interface which will eventually replace the findPeaks methods. It supports chromatographic peak detection on OnDiskMSnExp objects (defined in the MSnbase package). All of the settings to the matchedFilter algorithm can be passed with a MatchedFilterParam object.

References

Colin A. Smith, Elizabeth J. Want, Grace O'Maille, Ruben Abagyan and Gary Siuzdak. "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 2006, 78:779-787.

See also

The do_findChromPeaks_matchedFilter core API function and findPeaks.matchedFilter for the old user interface.

peaksWithMatchedFilter for functions to perform matchedFilter peak detection in purely chromatographic data.

XCMSnExp for the object containing the results of the chromatographic peak detection.

Other peak detection methods: findChromPeaks(), findChromPeaks-centWave, findChromPeaks-centWaveWithPredIsoROIs, findChromPeaks-massifquant, findPeaks-MSW

Author

Colin A Smith, Johannes Rainer

Examples


## Create a MatchedFilterParam object. Note that we use an unnecessarily large
## binSize parameter to reduce the run-time of the example.
mfp <- MatchedFilterParam(binSize = 5)
## Change snthresh parameter
snthresh(mfp) <- 15
mfp
#> Object of class:  MatchedFilterParam 
#>  Parameters:
#>  - binSize: [1] 5
#>  - impute: [1] "none"
#>  - baseValue: numeric(0)
#>  - distance: numeric(0)
#>  - fwhm: [1] 30
#>  - sigma: [1] 12.73994
#>  - max: [1] 5
#>  - snthresh: [1] 15
#>  - steps: [1] 2
#>  - mzdiff: [1] -9.2
#>  - index: [1] FALSE

## Perform the peak detection using matchedFilter on the files from the
## faahKO package. Files are read using readMSData from the MSnbase
## package
library(faahKO)
library(MSnbase)
fls <- dir(system.file("cdf/KO", package = "faahKO"), recursive = TRUE,
           full.names = TRUE)
raw_data <- readMSData(fls[1], mode = "onDisk")
#> Polarity can not be extracted from netCDF files, please set manually the polarity with the 'polarity' method.
## Perform the chromatographic peak detection using the settings defined
## above. Note that we are also disabling parallel processing in this
## example by passing a "SerialParam" (from the BiocParallel package)
res <- findChromPeaks(raw_data, param = mfp,
                      BPPARAM = BiocParallel::SerialParam())
head(chromPeaks(res))
#>             mz mzmin mzmax       rt    rtmin    rtmax      into    intf  maxo
#> CP001 205.0000 205.0 205.0 2784.635 2770.550 2800.284 1778568.9 3580020 84280
#> CP002 205.0000 205.0 205.0 2784.635 2770.550 2800.284 1778568.9 3577971 84280
#> CP003 241.1460 241.1 241.2 3662.574 3646.924 3682.918 1465988.7 2234510 49728
#> CP004 241.1460 241.1 241.2 3662.574 3646.924 3682.918 1465988.7 2234510 49728
#> CP005 244.1000 244.1 244.1 2828.453 2814.369 2842.538  598990.3 1145078 31312
#> CP006 249.1591 249.1 249.2 3659.444 3643.794 3678.223 1435000.7 2367467 49040
#>            maxf i       sn sample
#> CP001 194233.12 1 63.28090      1
#> CP002 194213.46 1 66.00099      1
#> CP003  96022.23 1 25.42409      1
#> CP004  96022.23 1 25.42643      1
#> CP005  64181.64 2 16.99513      1
#> CP006 104291.09 1 36.83500      1
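
The sigma and mzdiff values printed for mfp above are not set explicitly; they follow from the default expressions in the constructor signature (sigma = fwhm/2.3548 and mzdiff = 0.8 - binSize * steps). The arithmetic can be checked in base R without xcms; the variable names below are only illustrative.

```r
## Settings used for 'mfp' in the example above.
bin_size <- 5
steps <- 2
fwhm <- 30

## Default sigma: the constant 2.3548 approximates 2 * sqrt(2 * log(2)),
## the ratio between the FWHM and the standard deviation of a Gaussian.
sigma <- fwhm / 2.3548
signif(sigma, 7)          # 12.73994
2 * sqrt(2 * log(2))      # approximately 2.35482

## Default mzdiff: becomes negative for large bins.
mzdiff <- 0.8 - bin_size * steps
mzdiff                    # -9.2
```

With the constructor's default binSize = 0.1 the same expression gives mzdiff = 0.6, so the negative value in the example is a consequence of the deliberately large bin size.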