This function identifies peaks in the chromatographic time domain as described in [Smith 2006]. The intensity values are binned by cutting the LC/MS data into slices (bins) binSize m/z wide. Within each bin the maximal intensity is selected. Peak detection is then performed for each bin by extending it, based on the steps parameter, into a slice comprising bins current_bin - steps + 1 to current_bin + steps - 1. Each of these slices is then filtered by matched filtration using a second-derivative Gaussian as the model peak shape. After filtration, peaks are detected using a signal-to-noise ratio cut-off. For more details and illustrations see [Smith 2006].
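The filtration step can be illustrated with a minimal base R sketch. It is not the xcms implementation: the chromatogram is synthetic, the kernel scaling and its truncation at four sigma are assumptions made for illustration, and stats::filter stands in for the package's internal filtering code.

```r
## Synthetic chromatogram for a single m/z slice: one Gaussian peak plus noise.
set.seed(1)
rt <- seq(0, 299, by = 1)                 # retention times in seconds
fwhm <- 30
sigma <- fwhm / 2.3548                    # same relation as the sigma default
eic <- 1000 * exp(-(rt - 150)^2 / (2 * sigma^2)) +
    rnorm(length(rt), sd = 50)

## Model peak: negated second derivative of a Gaussian, truncated at
## +/- 4 sigma (the truncation width is an assumption of this sketch).
half <- ceiling(4 * sigma)
offsets <- -half:half
kernel <- (1 - offsets^2 / sigma^2) * exp(-offsets^2 / (2 * sigma^2))

## Convolve the chromatogram with the model peak. The first and last
## `half` values are NA because the kernel does not fully overlap there.
filtered <- as.numeric(stats::filter(eic, kernel, method = "convolution",
                                     sides = 2))

## The apex of the filtered trace marks the candidate peak position.
apex <- which.max(filtered)
rt[apex]
```

The filtered trace responds most strongly where the data resemble the model peak, which is why the apex lands close to the simulated peak position even though the raw trace is noisy.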

Usage

do_findChromPeaks_matchedFilter(
  mz,
  int,
  scantime,
  valsPerSpect,
  binSize = 0.1,
  impute = "none",
  baseValue,
  distance,
  fwhm = 30,
  sigma = fwhm/2.3548,
  max = 5,
  snthresh = 10,
  steps = 2,
  mzdiff = 0.8 - binSize * steps,
  index = FALSE,
  sleep = 0
)

Arguments

mz

Numeric vector with the individual m/z values from all scans/spectra of one file/sample.

int

Numeric vector with the individual intensity values from all scans/spectra of one file/sample.

scantime

Numeric vector of length equal to the number of spectra/scans of the data representing the retention time of each scan.

valsPerSpect

Numeric vector with the number of values for each spectrum.

binSize

numeric(1) specifying the width of the bins/slices in m/z dimension.

impute

Character string specifying the method to be used for missing value imputation. Allowed values are "none" (no linear interpolation), "lin" (linear interpolation), "linbase" (linear interpolation within a certain bin-neighborhood) and "intlin". See imputeLinInterpol for more details.

baseValue

The base value to which empty elements should be set. This is only considered for impute = "linbase" and corresponds to the profBinLinBase's baselevel argument.

distance

For impute = "linbase": number of non-empty neighboring elements of an empty element that should be considered for linear interpolation. See details section for more information.

fwhm

numeric(1) specifying the full width at half maximum of the matched filtration Gaussian model peak. Only used to calculate the actual sigma, see below.

sigma

numeric(1) specifying the standard deviation (width) of the matched filtration model peak.

max

numeric(1) representing the maximum number of peaks that are expected/will be identified per slice.

snthresh

numeric(1) defining the signal to noise ratio cutoff.

steps

numeric(1) defining the number of bins to be merged before filtration (i.e. the number of neighboring bins that will be joined to the slice in which filtration and peak detection will be performed).

mzdiff

numeric(1) representing the minimum difference in m/z dimension required for peaks with overlapping retention times; can be negative to allow overlap. During peak post-processing, peaks defined to be overlapping are reduced to the one peak with the largest signal.

index

logical(1) specifying whether indices should be returned instead of values for m/z and retention times.

sleep

numeric(1) defining the number of seconds to wait between iterations. Defaults to sleep = 0. If > 0, a plot is generated visualizing the identified chromatographic peak. Note: this argument is for backward compatibility only and will be removed in the future.
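Two of the defaults above are derived from other arguments. The short sketch below only reproduces the arithmetic from the usage signature with the default values; the comparison against 2 * sqrt(2 * log(2)) is added here as a reminder of the standard relation between the FWHM and the standard deviation of a Gaussian.

```r
## Default argument values taken from the usage section.
fwhm <- 30
binSize <- 0.1
steps <- 2

## sigma default: FWHM of a Gaussian divided by 2 * sqrt(2 * log(2)),
## which the signature rounds to 2.3548.
sigma <- fwhm / 2.3548
sigma_exact <- fwhm / (2 * sqrt(2 * log(2)))

## mzdiff default: shrinks as wider slices (binSize * steps) are used.
mzdiff <- 0.8 - binSize * steps

c(sigma = sigma, sigma_exact = sigma_exact, mzdiff = mzdiff)
```

Because both defaults are computed from other arguments, changing fwhm, binSize or steps also changes sigma and mzdiff unless they are set explicitly.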

Value

A matrix, each row representing an identified chromatographic peak, with columns:

mz

Intensity weighted mean of m/z values of the peak across scans.

mzmin

Minimum m/z of the peak.

mzmax

Maximum m/z of the peak.

rt

Retention time of the peak's midpoint.

rtmin

Minimum retention time of the peak.

rtmax

Maximum retention time of the peak.

into

Integrated (original) intensity of the peak.

intf

Integrated intensity of the filtered peak.

maxo

Maximum intensity of the peak.

maxf

Maximum intensity of the filtered peak.

i

Rank of peak in merged EIC (<= max).

sn

Signal to noise ratio of the peak.
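Because the result is a plain numeric matrix with named columns, it can be subset with ordinary matrix indexing. The sketch below uses a small hand-built matrix holding three mz and sn values copied from the example output on this page, so it runs without xcms; the threshold of 15 is an arbitrary choice for illustration.

```r
## Stand-in for the function's return value: only two of the columns,
## with values copied from the first three rows of the example output.
res <- cbind(mz = c(205.0000, 205.9819, 207.0821),
             sn = c(28.33394, 16.53987, 12.87314))

## Keep peaks above an (arbitrary) signal to noise ratio; drop = FALSE
## keeps the result a matrix even if a single row remains.
strong <- res[res[, "sn"] > 15, , drop = FALSE]
strong
```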

Details

The intensities are binned by the provided m/z values within each spectrum (scan). Binning is performed such that the bins are centered around the m/z values (i.e. the first bin includes all m/z values between min(mz) - bin_size/2 and min(mz) + bin_size/2).

For more details on binning and missing value imputation see binYonX and imputeLinInterpol methods.
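The centered binning described above can be sketched in base R. This is an illustration under simplifying assumptions, not the binYonX implementation: the m/z and intensity values are made up, a single spectrum is binned, and findInterval with tapply stands in for the package's binning code.

```r
## Made-up m/z and intensity values of one spectrum.
mz <- c(200.00, 200.04, 200.06, 200.13, 200.31)
int <- c(10, 50, 30, 20, 40)
binSize <- 0.1

## Bin edges chosen so that the first bin is centered on min(mz), i.e. it
## spans min(mz) - binSize / 2 to min(mz) + binSize / 2.
breaks <- seq(min(mz) - binSize / 2, max(mz) + binSize, by = binSize)

## Assign each m/z value to its bin and keep the maximal intensity per bin.
## Empty bins are kept as factor levels and therefore yield NA.
bin <- findInterval(mz, breaks)
binned <- tapply(int, factor(bin, levels = seq_len(length(breaks) - 1)), max)
binned
```

The third bin contains no data points and is returned as NA; such gaps are what the impute argument is meant to handle.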

Note

This function exposes core peak detection functionality of the matchedFilter method. While this function can be called directly, users will generally call the corresponding method for the data object instead (e.g. the findPeaks.matchedFilter method).

References

Colin A. Smith, Elizabeth J. Want, Grace O'Maille, Ruben Abagyan and Gary Siuzdak. "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 2006, 78:779-787.

See also

binYonX for a binning function, imputeLinInterpol for the interpolation of missing values. matchedFilter for the standard user interface method.

Other core peak detection functions: do_findChromPeaks_centWave(), do_findChromPeaks_centWaveWithPredIsoROIs(), do_findChromPeaks_massifquant(), do_findPeaks_MSW()

Author

Colin A Smith, Johannes Rainer

Examples


## Load the test file
faahko_sub <- loadXcmsData("faahko_sub")

## Subset to one file and restrict to a certain retention time range
data <- filterRt(filterFile(faahko_sub, 1), c(2500, 3000))

## Get m/z and intensity values
mzs <- mz(data)
ints <- intensity(data)

## Define the values per spectrum:
valsPerSpect <- lengths(mzs)

res <- do_findChromPeaks_matchedFilter(mz = unlist(mzs), int = unlist(ints),
    scantime = rtime(data), valsPerSpect = valsPerSpect)
head(res)
#>            mz mzmin mzmax       rt    rtmin    rtmax      into      intf  maxo
#> [1,] 205.0000 205.0 205.0 2784.635 2770.550 2800.284 1778568.9 3610062.2 84280
#> [2,] 205.9819 205.9 206.0 2786.200 2772.115 2800.284  237993.6  448580.3 10681
#> [3,] 207.0821 207.0 207.1 2712.647 2698.562 2726.731  380873.0  730981.4 18800
#> [4,] 236.0956 236.0 236.1 2518.593 2504.508 2534.242  252282.0  458747.7 12957
#> [5,] 244.1000 244.1 244.1 2828.453 2814.369 2844.103  612169.9 1279308.9 31312
#> [6,] 266.0751 266.0 266.1 2828.453 2815.934 2844.103  113219.0  214886.8  5801
#>           maxf i       sn
#> [1,] 195026.48 1 28.33394
#> [2,]  23860.11 1 16.53987
#> [3,]  40065.74 1 12.87314
#> [4,]  24536.55 1 14.99012
#> [5,]  69898.24 1 24.20989
#> [6,]  11773.56 1 10.83870