
Align spectrum retention times across samples using peak groups found in most samples
Source: R/do_adjustRtime-functions.R
do_adjustRtime_peakGroups.Rd
The function performs retention time correction by assessing
the retention time deviation across all samples using peak groups
(features) containing chromatographic peaks present in most/all samples.
The retention time deviation for these features in each sample is
described by fitting either a local polynomial regression (smooth = "loess") or
a linear model (smooth = "linear") to the data points. The
models are subsequently used to adjust the retention time of each
spectrum in each sample.
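The model-fitting step described above can be sketched in base R. This is a minimal illustration on made-up numbers, not the function's actual implementation: the variable names, the toy retention times, the use of the median across samples as reference, and the choice of `span` and `degree` are all assumptions of this sketch.

```r
# Hypothetical toy data: raw retention times (seconds) of eight anchor peak
# groups in one sample, and an assumed reference (e.g. median across samples).
rt_sample <- c(100, 190, 310, 400, 520, 610, 700, 800)
rt_median <- c(102, 191.5, 311, 400, 519, 608.5, 698, 797)

# Deviation of this sample from the reference at each anchor peak group.
dev <- data.frame(rt = rt_sample, dev = rt_sample - rt_median)

# Retention times of all spectra in the sample (assumed values).
rt_spectra <- seq(100, 800, by = 50)
new_rt <- data.frame(rt = rt_spectra)

# smooth = "linear": a straight line through the deviations.
fit_lin <- lm(dev ~ rt, data = dev)
adj_lin <- rt_spectra - predict(fit_lin, newdata = new_rt)

# smooth = "loess": local polynomial regression; span sets the smoothing.
fit_lo <- loess(dev ~ rt, data = dev, span = 0.75, degree = 1,
                family = "gaussian")
adj_lo <- rt_spectra - predict(fit_lo, newdata = new_rt)
```

Both variants subtract the predicted deviation from the raw retention time of every spectrum. Note that `predict()` on a default `loess` fit returns `NA` outside the range of the anchor points, which is why the toy spectra stay within that range.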
Arguments
- peaks
a matrix or data.frame with the identified chromatographic peaks in the samples.
- peakIndex
a list of indices that provides the grouping information of the chromatographic peaks (across and within samples).
- rtime
a list of numeric vectors with the retention times per file/sample.
- minFraction
For PeakGroupsParam: numeric(1) between 0 and 1 defining the minimum required proportion of samples in which peaks for the peak group were identified. Peak groups passing this criterion will be aligned across samples, and retention times of individual spectra will be adjusted based on this alignment. For minFraction = 1 the peak group has to contain peaks in all samples of the experiment. Note that if subset is provided, the specified fraction is relative to the defined subset of samples and not to the total number of samples within the experiment (i.e., a peak has to be present in the specified proportion of subset samples).
- extraPeaks
For PeakGroupsParam: numeric(1) defining the maximal number of additional peaks for all samples to be assigned to a peak group (feature) for retention time correction. For a data set with 6 samples, extraPeaks = 1 uses all peak groups with a total peak count <= 6 + 1. The total peak count is the total number of peaks assigned to a peak group and also counts multiple peaks within a sample that are assigned to the group. This parameter is ignored for adjustRtime() on an XcmsExperimentHdf5().
- smooth
For PeakGroupsParam: character(1) defining the function to be used to interpolate corrected retention times for all peak groups. Can be either "loess" or "linear".
- span
For PeakGroupsParam: numeric(1) defining the degree of smoothing (if smooth = "loess"). This parameter is passed to the internal call to stats::loess().
- family
For PeakGroupsParam: character(1) defining the method for loess smoothing. Allowed values are "gaussian" and "symmetric". See stats::loess() for more information.
- peakGroupsMatrix
optional matrix of (raw) retention times for peak groups on which the alignment should be performed. Each column represents a sample, each row a feature/peak group. If not provided, this matrix will be determined depending on parameters minFraction and extraPeaks. If provided, minFraction and extraPeaks will be ignored.
- subset
For ObiwarpParam and PeakGroupsParam: integer with the indices of samples within the experiment on which the alignment models should be estimated. Samples not part of the subset are adjusted based on the closest subset sample. See Subset-based alignment section for details.
- subsetAdjust
For ObiwarpParam and PeakGroupsParam: character(1) specifying the method with which non-subset samples should be adjusted. Supported options are "previous" and "average" (default). See Subset-based alignment section for details.
Details
The alignment is based on the presence of compounds that can be found
in all/most samples of an experiment. The retention times of individual
spectra are then adjusted based on the alignment of the features
corresponding to these housekeeping compounds. The parameters
minFraction and extraPeaks can be used to fine-tune which
features should be used for the alignment (i.e. which features
most likely correspond to the above-mentioned housekeeping compounds).
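How minFraction and extraPeaks narrow down the candidate features can be sketched as follows. This is an illustration in base R on a made-up peak table, assuming the two criteria are applied as simple thresholds on the number of samples with a peak and on the total peak count. The names `peak_tbl`, `min_fraction` and `extra_peaks` are hypothetical and the code is not taken from the package.

```r
# Hypothetical peak table: one row per peak, with its peak group and sample.
# Six samples, three peak groups.
peak_tbl <- data.frame(
    group  = c(rep(1, 6), rep(2, 8), rep(3, 3)),
    sample = c(1:6, 1, 1, 2, 2, 3, 4, 5, 6, 1, 2, 3)
)
n_samples <- 6
min_fraction <- 0.9   # plays the role of minFraction
extra_peaks <- 1      # plays the role of extraPeaks

by_group <- split(peak_tbl$sample, peak_tbl$group)
# Number of distinct samples contributing at least one peak to each group.
n_present <- vapply(by_group, function(s) length(unique(s)), integer(1))
# Total number of peaks per group (several peaks per sample all count).
n_total <- vapply(by_group, length, integer(1))

keep <- n_present / n_samples >= min_fraction &
    n_total <= n_samples + extra_peaks
names(keep)[keep]
```

In this toy table only group 1 passes both criteria: group 2 has peaks in all six samples but a total of eight peaks (more than 6 + 1), and group 3 is present in only half of the samples.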
Parameter subset allows defining a subset of samples within the
experiment that should be aligned. All samples that are not part of the subset
will be aligned based on the adjustment of the closest sample within the
subset. This makes it possible, for example, to exclude blank samples from the
alignment process while still adjusting their retention times based on the
alignment results of the real samples.
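The idea of mapping each non-subset sample to its closest subset sample can be sketched as below. This is a simplified illustration that assumes closeness is measured by sample index and that ties go to the earlier sample. It does not reproduce the "previous" or "average" options of subsetAdjust, and the names `subset_idx` and `closest` are hypothetical.

```r
n_samples <- 8
subset_idx <- c(2L, 4L, 5L, 7L)   # assumed: samples used to fit the models
non_subset <- setdiff(seq_len(n_samples), subset_idx)

# For each non-subset sample, pick the subset sample with the smallest
# index distance. which.min() returns the first minimum, so ties are
# resolved towards the earlier sample (an assumption of this sketch).
closest <- vapply(non_subset, function(i) {
    subset_idx[which.min(abs(subset_idx - i))]
}, integer(1))

data.frame(sample = non_subset, adjusted_like = closest)
```

With blanks at positions 1, 3, 6 and 8, each blank would borrow the adjustment of a neighbouring real sample instead of contributing to the model fit itself.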
Note
The method ensures that the returned adjusted retention times are in increasing order, just like the raw retention times.
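A simple sanity check for this property, on hypothetical values, might look like the following. The vectors `rt_raw` and `rt_adj` are made up for illustration, and the check only tests the ordering; it does not show how the function enforces it.

```r
rt_raw <- c(1.0, 2.0, 3.0, 4.0, 5.0)
rt_adj <- c(0.9, 2.1, 2.9, 4.2, 5.1)   # hypothetical adjusted values

# is.unsorted() returns FALSE for a non-decreasing vector.
raw_ordered <- !is.unsorted(rt_raw)
adj_ordered <- !is.unsorted(rt_adj)

# Adjustment should keep one retention time per spectrum.
same_length <- length(rt_raw) == length(rt_adj)
```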