Correspondence: group chromatographic peaks across samples
Source: R/AllGenerics.R, R/XcmsExperiment.R, R/functions-Params.R, and 2 more
groupChromPeaks.Rd
The groupChromPeaks method performs a correspondence analysis, i.e., it
groups chromatographic peaks across samples to define the LC-MS features.
The correspondence algorithm can be selected and configured using the
param argument. See the documentation of XcmsExperiment() and XCMSnExp()
for information on how to access and extract correspondence results.
The correspondence analysis can be performed on chromatographic peaks of
any MS level (provided chromatographic peak detection has been performed
for that MS level), defining features that combine these peaks. The MS
level can be selected with the parameter msLevel. By default, calling
groupChromPeaks will remove any previous correspondence results. This can
be disabled with add = TRUE, which will add newly defined features to
already present feature definitions.
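A minimal usage sketch of the call pattern described above, written against the XcmsExperiment method. It assumes that the xcms package (and the faahKO data package its example data refers to) is installed and that loadXcmsData("faahko_sub2") returns an XcmsExperiment with chromatographic peaks already detected. The parameter values are illustrative only, and the second call merely demonstrates add = TRUE; it is not a recommended workflow.

```r
library(xcms)

## Assumption: example data shipped with recent xcms versions, an
## XcmsExperiment with chromatographic peaks already detected.
xmse <- loadXcmsData("faahko_sub2")

## sampleGroups is mandatory for PeakDensityParam; here all samples are
## assigned to a single group.
pdp <- PeakDensityParam(sampleGroups = rep(1, length(xmse)),
                        minFraction = 0.5, bw = 30)

## Group MS1 chromatographic peaks. With the default add = FALSE any
## earlier correspondence result is removed first.
xmse <- groupChromPeaks(xmse, param = pdp, msLevel = 1L)

## Second pass with a narrower bandwidth. add = TRUE keeps the features
## defined above and appends the newly defined ones.
bw(pdp) <- 20
xmse <- groupChromPeaks(xmse, param = pdp, msLevel = 1L, add = TRUE)

## Feature definitions, one row per feature.
head(featureDefinitions(xmse))
```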
Supported param objects are:
- PeakDensityParam: correspondence using the peak density method (Smith 2006), which groups chromatographic peaks along the retention time axis within slices of (partially overlapping) m/z ranges. By default, these m/z ranges (bins) have a constant size. By setting ppm to a value larger than 0, m/z-dependent bin sizes can be used instead (better representing the m/z-dependent measurement error of some MS instruments). All peaks (from the same or from different samples) whose apex positions are close on the retention time axis are grouped into an LC-MS feature. Only samples with a non-missing sample group assignment (i.e. for which the value provided with parameter sampleGroups is not NA) are considered and counted for the feature definition. This makes it possible to exclude certain samples or groups (e.g. blanks) from the feature definition, thus avoiding features with peaks detected only in these samples. Note that this affects only the definition of new features: chromatographic peaks in these samples will still be assigned to features that were defined based on the other samples. See also do_groupChromPeaks_density() for the core API function.
- NearestPeaksParam: performs peak grouping based on the proximity of chromatographic peaks from different samples in the m/z - rt space, similar to the correspondence method of mzMine (Katajamaa 2006). The method first creates a master peak list consisting of all chromatographic peaks from the sample with the most detected peaks and then iteratively calculates distances to the peaks from the sample with the next highest number of peaks, grouping peaks together if their distance is smaller than the provided thresholds. See also do_groupChromPeaks_nearest() for the core API function.
- MzClustParam: performs high resolution peak grouping for single spectrum metabolomics data (Kazmi 2006). This method should only be used for such data, as the retention time is not considered in the correspondence analysis. See also do_groupPeaks_mzClust() for the core API function.
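The idea behind the peak density method can be sketched in base R for a single m/z slice. This is a simplified illustration, not the xcms implementation: the function name group_slice_by_density and its arguments are hypothetical, sample groups and minFraction are ignored, and a plain count of distinct samples stands in for the minSamples check.

```r
## Simplified density-based grouping within ONE m/z slice.
## Hypothetical helper for illustration; not the code used by xcms.
group_slice_by_density <- function(rt, sample, bw = 30, min_samples = 1,
                                   max_features = 50) {
    ## Kernel density estimate of the peak apex retention times.
    den <- density(rt, bw = bw, from = min(rt) - 3 * bw,
                   to = max(rt) + 3 * bw, n = 512)
    y <- den$y
    groups <- list()
    while (length(groups) < max_features && max(y) > 0) {
        ## Highest remaining density maximum.
        top <- which.max(y)
        ## Walk downhill on both sides to delimit the density peak.
        lo <- top
        while (lo > 1 && y[lo - 1] > 0 && y[lo - 1] <= y[lo])
            lo <- lo - 1
        hi <- top
        while (hi < length(y) && y[hi + 1] > 0 && y[hi + 1] <= y[hi])
            hi <- hi + 1
        ## Chromatographic peaks whose apex falls into that rt range.
        idx <- which(rt >= den$x[lo] & rt <= den$x[hi])
        ## Mask the processed region so it is not selected again.
        y[lo:hi] <- 0
        if (length(idx) > 0 && length(unique(sample[idx])) >= min_samples)
            groups[[length(groups) + 1L]] <- idx
    }
    groups
}

## Two well separated retention time clusters from three samples.
rt <- c(100, 102, 101, 300, 303, 299)
smp <- c(1, 2, 3, 1, 2, 3)
g <- group_slice_by_density(rt, smp, bw = 5)
g
```

Each list element holds the indices of the chromatographic peaks assigned to one feature. A larger bw merges peaks that are further apart on the retention time axis.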
For specific examples and a description of the methods and their settings, see the help pages of the individual parameter classes listed above.
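The nearest-peaks approach described above can be sketched in base R as follows. The helper group_nearest is hypothetical and deliberately simplified: it assigns peaks greedily in a single pass and ignores kNN, so it should be read as an illustration of the master peak list idea and of the weighted distance, not as the algorithm implemented in xcms.

```r
## Simplified nearest-peaks correspondence.
## Hypothetical helper for illustration; not the code used by xcms.
## peaks: data.frame with columns mz, rt and sample.
group_nearest <- function(peaks, mzVsRtBalance = 10, absMz = 0.2,
                          absRt = 15) {
    ## Process samples from the most to the fewest detected peaks.
    ord <- names(sort(table(peaks$sample), decreasing = TRUE))
    feature <- integer(nrow(peaks))
    ## The sample with the most peaks defines the initial master list.
    first <- which(peaks$sample == ord[1])
    feature[first] <- seq_along(first)
    master <- peaks[first, c("mz", "rt")]
    for (s in ord[-1]) {
        ## Master entries already matched by a peak of sample s.
        taken <- logical(nrow(master))
        for (i in which(peaks$sample == s)) {
            dmz <- abs(master$mz - peaks$mz[i])
            drt <- abs(master$rt - peaks$rt[i])
            ## m/z differences are up-weighted before combining with rt.
            d <- sqrt((mzVsRtBalance * dmz)^2 + drt^2)
            d[dmz > absMz | drt > absRt | taken] <- Inf
            if (any(is.finite(d))) {
                j <- which.min(d)
                taken[j] <- TRUE
            } else {
                ## No match within the thresholds: start a new entry.
                master <- rbind(master, peaks[i, c("mz", "rt")])
                taken <- c(taken, TRUE)
                j <- nrow(master)
            }
            feature[i] <- j
        }
    }
    feature
}

pks <- data.frame(mz = c(100.00, 200.00, 300.00, 100.05, 300.50),
                  rt = c(50, 120, 200, 53, 201),
                  sample = c(1, 1, 1, 2, 2))
f <- group_nearest(pks)
f
```

In this toy input the fourth peak lies within both thresholds of the first master entry and joins it, while the fifth peak differs by 0.5 in m/z (larger than absMz) and therefore starts a new feature.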
Usage
groupChromPeaks(object, param, ...)
# S4 method for class 'XcmsExperiment,Param'
groupChromPeaks(object, param, msLevel = 1L, add = FALSE)
PeakDensityParam(
sampleGroups = numeric(),
bw = 30,
minFraction = 0.5,
minSamples = 1,
binSize = 0.25,
ppm = 0,
maxFeatures = 50
)
MzClustParam(
sampleGroups = numeric(),
ppm = 20,
absMz = 0,
minFraction = 0.5,
minSamples = 1
)
NearestPeaksParam(
sampleGroups = numeric(),
mzVsRtBalance = 10,
absMz = 0.2,
absRt = 15,
kNN = 10
)
# S4 method for class 'PeakDensityParam'
sampleGroups(object)
# S4 method for class 'PeakDensityParam'
sampleGroups(object) <- value
# S4 method for class 'PeakDensityParam'
bw(object)
# S4 method for class 'PeakDensityParam'
bw(object) <- value
# S4 method for class 'PeakDensityParam'
minFraction(object)
# S4 method for class 'PeakDensityParam'
minFraction(object) <- value
# S4 method for class 'PeakDensityParam'
minSamples(object)
# S4 method for class 'PeakDensityParam'
minSamples(object) <- value
# S4 method for class 'PeakDensityParam'
binSize(object)
# S4 method for class 'PeakDensityParam'
binSize(object) <- value
# S4 method for class 'PeakDensityParam'
maxFeatures(object)
# S4 method for class 'PeakDensityParam'
maxFeatures(object) <- value
# S4 method for class 'PeakDensityParam'
ppm(object)
# S4 method for class 'MzClustParam'
sampleGroups(object)
# S4 method for class 'MzClustParam'
sampleGroups(object) <- value
# S4 method for class 'MzClustParam'
ppm(object)
# S4 method for class 'MzClustParam'
ppm(object) <- value
# S4 method for class 'MzClustParam'
absMz(object)
# S4 method for class 'MzClustParam'
absMz(object) <- value
# S4 method for class 'MzClustParam'
minFraction(object)
# S4 method for class 'MzClustParam'
minFraction(object) <- value
# S4 method for class 'MzClustParam'
minSamples(object)
# S4 method for class 'MzClustParam'
minSamples(object) <- value
# S4 method for class 'NearestPeaksParam'
sampleGroups(object)
# S4 method for class 'NearestPeaksParam'
sampleGroups(object) <- value
# S4 method for class 'NearestPeaksParam'
mzVsRtBalance(object)
# S4 method for class 'NearestPeaksParam'
mzVsRtBalance(object) <- value
# S4 method for class 'NearestPeaksParam'
absMz(object)
# S4 method for class 'NearestPeaksParam'
absMz(object) <- value
# S4 method for class 'NearestPeaksParam'
absRt(object)
# S4 method for class 'NearestPeaksParam'
absRt(object) <- value
# S4 method for class 'NearestPeaksParam'
kNN(object)
# S4 method for class 'NearestPeaksParam'
kNN(object) <- value
# S4 method for class 'PeakDensityParam'
as.list(x, ...)
# S4 method for class 'XCMSnExp,PeakDensityParam'
groupChromPeaks(object, param, msLevel = 1L, add = FALSE)
# S4 method for class 'XCMSnExp,MzClustParam'
groupChromPeaks(object, param, msLevel = 1L)
# S4 method for class 'XCMSnExp,NearestPeaksParam'
groupChromPeaks(object, param, msLevel = 1L, add = FALSE)
Arguments
- object
The data object on which the correspondence analysis should be performed. Can be an XCMSnExp() or XcmsExperiment() object.
- param
The parameter object selecting and configuring the algorithm.
- ...
Optional parameters.
- msLevel
integer(1) defining the MS level on which the correspondence analysis should be performed, i.e. the MS level of the chromatographic peaks that are grouped.
- add
logical(1) (if object already contains correspondence results) whether newly defined features should be added to the existing feature definitions. By default (add = FALSE) any additional groupChromPeaks call on a result object will remove previous correspondence results.
- sampleGroups
For PeakDensityParam: a vector with the same length as the number of samples, defining the sample group assignments (i.e. which samples belong to which sample group). This parameter is mandatory for PeakDensityParam and has to be defined even if there is no sample grouping in the experiment (in which case all samples should be assigned to the same group). Samples for which NA is provided will not be considered in the feature definition step. Providing NA for all blanks in an experiment will, for example, prevent features from being defined for signals (chromatographic peaks) present only in blank samples.
- bw
For PeakDensityParam: numeric(1) defining the bandwidth (standard deviation of the smoothing kernel) to be used. This argument is passed to the density() method.
- minFraction
For PeakDensityParam: numeric(1) defining the minimum fraction of samples in at least one sample group in which the peaks have to be present to be considered as a peak group (feature).
- minSamples
For PeakDensityParam: numeric(1) with the minimum number of samples in at least one sample group in which the peaks have to be detected to be considered a peak group (feature).
- binSize
For PeakDensityParam: numeric(1) defining the size of the overlapping slices in m/z dimension.
- ppm
For MzClustParam: numeric(1) representing the relative m/z error for the clustering/grouping (in parts per million). For PeakDensityParam: numeric(1) to define m/z-dependent, increasing m/z bin sizes. If ppm = 0 (the default) m/z bins are defined by the sequence of values from the smallest to the largest m/z value with a constant bin size of binSize. For ppm > 0 the size of each bin is additionally increased by ppm parts-per-million of the (upper) m/z boundary of the bin. The maximal bin size (used for the largest m/z values) would then be binSize plus ppm parts-per-million of the largest m/z value of all peaks in the data set.
- maxFeatures
For PeakDensityParam: numeric(1) with the maximum number of peak groups to be identified in a single m/z slice.
- absMz
For NearestPeaksParam and MzClustParam: numeric(1) maximum tolerated distance for m/z values.
- mzVsRtBalance
For NearestPeaksParam: numeric(1) representing the factor by which m/z values are multiplied before calculating the (Euclidean) distance between two peaks.
- absRt
For NearestPeaksParam: numeric(1) maximum tolerated distance for retention times.
- kNN
For NearestPeaksParam: integer(1) representing the number of nearest neighbors to check.
- value
The value for the slot.
- x
The parameter object.
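To make the effect of ppm on the bin sizes for PeakDensityParam concrete, the following base R sketch builds m/z bin boundaries following the description of the ppm argument. It is only one possible reading of that description: the helper mz_bins is hypothetical, the bins are non-overlapping for simplicity (the slices used by the method partially overlap), and the "upper boundary" is taken to be the lower boundary plus binSize.

```r
## Hypothetical helper for illustration; not the code used by xcms.
## Returns the boundaries of consecutive, non-overlapping m/z bins.
mz_bins <- function(mz_min, mz_max, bin_size = 0.25, ppm = 0) {
    breaks <- mz_min
    while (breaks[length(breaks)] < mz_max) {
        lower <- breaks[length(breaks)]
        ## Constant part plus ppm of the bin's (upper) m/z boundary.
        width <- bin_size + (lower + bin_size) * ppm * 1e-6
        breaks <- c(breaks, lower + width)
    }
    breaks
}

## ppm = 0: constant bin size.
b0 <- mz_bins(100, 101, bin_size = 0.25, ppm = 0)
diff(b0)

## ppm = 10: every bin is slightly wider than the previous one.
b10 <- mz_bins(100, 101, bin_size = 0.25, ppm = 10)
diff(b10)
```

Under this reading, 10 ppm adds roughly 0.001 to a bin near m/z 100 and roughly 0.01 to a bin near m/z 1000, so the m/z-dependent part only becomes comparable to the default binSize at very high m/z or with large ppm values.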
Value
For groupChromPeaks: either an XcmsExperiment() or XCMSnExp() object with the correspondence result.
References
Smith, C.A., Want, E.J., O'Maille, G., Abagyan, R., and Siuzdak, G. (2006) "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification". Anal. Chem. 78:779-787.
Katajamaa, M., Miettinen, J., and Oresic, M. (2006) "MZmine: Toolbox for processing and visualization of mass spectrometry based molecular profile data". Bioinformatics 22:634-636.
Kazmi, S.A., Ghosh, S., Shin, D., Hill, D.W., and Grant, D.F. (2006) "Alignment of high resolution mass spectra: development of a heuristic approach for metabolomics". Metabolomics 2(2):75-83.