Alignment: Retention time correction methods.
Source: R/AllGenerics.R, R/XcmsExperiment.R, R/functions-Params.R, and 4 more (adjustRtime.Rd)
The adjustRtime method(s) perform retention time correction (alignment)
between chromatograms of different samples. Alignment is performed by default
on MS level 1 data. Retention times of spectra from other MS levels, if
present, are subsequently adjusted based on the adjusted retention times
of the MS1 spectra. Note that calling adjustRtime on an xcms result object
will remove any existing alignment results as well as any correspondence
analysis results. To run a second round of alignment, raw retention times
need to be replaced with adjusted ones using the applyAdjustedRtime()
function.
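As a rough illustration of the workflow described above, the sketch below runs two rounds of peak-groups alignment. The helper name align_two_rounds and its sample_groups argument are hypothetical and not part of xcms; the sketch assumes an installed xcms and an xcms result object xdata that already contains chromatographic peaks.

```r
# Hypothetical helper: two rounds of peak-groups alignment on an xcms
# result object `xdata` that already contains chromatographic peaks.
# `sample_groups` assigns each sample to a group (one entry per sample).
align_two_rounds <- function(xdata, sample_groups) {
  pdp <- xcms::PeakDensityParam(sampleGroups = sample_groups)
  pgp <- xcms::PeakGroupsParam(minFraction = 0.9)

  # Round 1: correspondence first, because PeakGroupsParam needs features.
  xdata <- xcms::groupChromPeaks(xdata, param = pdp)
  xdata <- xcms::adjustRtime(xdata, param = pgp)

  # Replace raw retention times with the adjusted ones so that the next
  # alignment builds on the result of the first round.
  xdata <- xcms::applyAdjustedRtime(xdata)

  # Round 2: adjustRtime() dropped the correspondence results, so features
  # have to be defined again before aligning a second time.
  xdata <- xcms::groupChromPeaks(xdata, param = pdp)
  xcms::adjustRtime(xdata, param = pgp)
}
```

The parameter values are placeholders; in practice the correspondence and alignment settings would be tuned to the data set.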
The alignment method can be specified (and configured) using a dedicated
param argument.
Supported param objects are:
- ObiwarpParam: performs retention time adjustment based on the full m/z - rt data using the obiwarp method (Prince 2006). The implementation is based on the original code but additionally supports alignment of multiple samples by aligning each against a center sample. The alignment is performed directly on the profile matrix and can hence be performed independently of peak detection or peak grouping.
- PeakGroupsParam: performs retention time correction based on the alignment of features defined in all/most samples (corresponding to housekeeping compounds or marker compounds) (Smith 2006). First, the retention time deviation of these features is described by fitting either a polynomial (smooth = "loess") or a linear (smooth = "linear") function to the data points. These functions are then used to adjust the retention time of each spectrum in each sample (including spectra of MS levels other than MS 1). Since the method is based on features (i.e. chromatographic peaks grouped across samples), an initial correspondence analysis has to be performed beforehand using the groupChromPeaks() function. Alternatively, it is also possible to manually define a numeric matrix with retention times of markers in each sample that should be used for alignment. Such a matrix can be passed to the alignment function using the peakGroupsMatrix parameter of the PeakGroupsParam parameter object. By default, the adjustRtimePeakGroups function is used to define this matrix. This function identifies peak groups (features) for alignment in object based on the parameters defined in param. See also do_adjustRtime_peakGroups() for the core API function.
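The idea behind the peak-groups approach can be sketched with base R alone. The example below is an illustration under simplifying assumptions, not the xcms implementation: the marker retention times, the drift and all variable names are made up, and a single sample is corrected against a reference retention time per marker feature.

```r
# Made-up marker features: raw retention time (seconds) in one sample and
# the deviation of that sample from a reference (e.g. the median across
# samples). The drift grows slowly over the run and wiggles slightly.
rt_raw <- seq(100, 1200, length.out = 30)
dev    <- 2 + 0.004 * rt_raw + 0.3 * sin(rt_raw / 50)
markers <- data.frame(rt_raw = rt_raw, dev = dev)

# Model the deviation as a function of retention time, either with a
# straight line (cf. smooth = "linear") or a local polynomial
# (cf. smooth = "loess"; span and family are passed on to loess()).
fit_lin   <- lm(dev ~ rt_raw, data = markers)
fit_loess <- loess(dev ~ rt_raw, data = markers, span = 0.5,
                   family = "gaussian")

# Retention times of spectra in the same sample; they lie inside the range
# covered by the markers because loess() does not extrapolate by default.
spectra_rt <- seq(150, 1150, by = 100)
new_data   <- data.frame(rt_raw = spectra_rt)

# Adjusted retention time = raw retention time minus predicted deviation.
adj_lin   <- spectra_rt - predict(fit_lin, newdata = new_data)
adj_loess <- spectra_rt - predict(fit_loess, newdata = new_data)

print(round(cbind(raw = spectra_rt, linear = adj_lin, loess = adj_loess), 2))
```

The linear fit captures only the overall drift, while the loess fit can also follow the local wiggle, which is the practical difference between the two smooth settings.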
Usage
adjustRtime(object, param, ...)
# S4 method for MsExperiment,ObiwarpParam
adjustRtime(object, param, chunkSize = 2L, BPPARAM = bpparam())
# S4 method for MsExperiment,PeakGroupsParam
adjustRtime(object, param, msLevel = 1L, ...)
PeakGroupsParam(
minFraction = 0.9,
extraPeaks = 1,
smooth = "loess",
span = 0.2,
family = "gaussian",
peakGroupsMatrix = matrix(nrow = 0, ncol = 0),
subset = integer(),
subsetAdjust = c("average", "previous")
)
ObiwarpParam(
binSize = 1,
centerSample = integer(),
response = 1L,
distFun = "cor_opt",
gapInit = numeric(),
gapExtend = numeric(),
factorDiag = 2,
factorGap = 1,
localAlignment = FALSE,
initPenalty = 0,
subset = integer(),
subsetAdjust = c("average", "previous")
)
adjustRtimePeakGroups(object, param = PeakGroupsParam(), msLevel = 1L)
# S4 method for OnDiskMSnExp,ObiwarpParam
adjustRtime(object, param, msLevel = 1L)
# S4 method for PeakGroupsParam
minFraction(object)
# S4 method for PeakGroupsParam
minFraction(object) <- value
# S4 method for PeakGroupsParam
extraPeaks(object)
# S4 method for PeakGroupsParam
extraPeaks(object) <- value
# S4 method for PeakGroupsParam
smooth(x)
# S4 method for PeakGroupsParam
smooth(object) <- value
# S4 method for PeakGroupsParam
span(object)
# S4 method for PeakGroupsParam
span(object) <- value
# S4 method for PeakGroupsParam
family(object)
# S4 method for PeakGroupsParam
family(object) <- value
# S4 method for PeakGroupsParam
peakGroupsMatrix(object)
# S4 method for PeakGroupsParam
peakGroupsMatrix(object) <- value
# S4 method for PeakGroupsParam
subset(x)
# S4 method for PeakGroupsParam
subset(object) <- value
# S4 method for PeakGroupsParam
subsetAdjust(object)
# S4 method for PeakGroupsParam
subsetAdjust(object) <- value
# S4 method for ObiwarpParam
binSize(object)
# S4 method for ObiwarpParam
binSize(object) <- value
# S4 method for ObiwarpParam
centerSample(object)
# S4 method for ObiwarpParam
centerSample(object) <- value
# S4 method for ObiwarpParam
response(object)
# S4 method for ObiwarpParam
response(object) <- value
# S4 method for ObiwarpParam
distFun(object)
# S4 method for ObiwarpParam
distFun(object) <- value
# S4 method for ObiwarpParam
gapInit(object)
# S4 method for ObiwarpParam
gapInit(object) <- value
# S4 method for ObiwarpParam
gapExtend(object)
# S4 method for ObiwarpParam
gapExtend(object) <- value
# S4 method for ObiwarpParam
factorDiag(object)
# S4 method for ObiwarpParam
factorDiag(object) <- value
# S4 method for ObiwarpParam
factorGap(object)
# S4 method for ObiwarpParam
factorGap(object) <- value
# S4 method for ObiwarpParam
localAlignment(object)
# S4 method for ObiwarpParam
localAlignment(object) <- value
# S4 method for ObiwarpParam
initPenalty(object)
# S4 method for ObiwarpParam
initPenalty(object) <- value
# S4 method for ObiwarpParam
subset(x)
# S4 method for ObiwarpParam
subset(object) <- value
# S4 method for ObiwarpParam
subsetAdjust(object)
# S4 method for ObiwarpParam
subsetAdjust(object) <- value
# S4 method for XCMSnExp,PeakGroupsParam
adjustRtime(object, param, msLevel = 1L)
# S4 method for XCMSnExp,ObiwarpParam
adjustRtime(object, param, msLevel = 1L)
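The constructors and accessors listed above can be combined as in the following sketch. It assumes that xcms is installed (the code is skipped otherwise); the chosen parameter values are arbitrary examples rather than recommendations.

```r
if (requireNamespace("xcms", quietly = TRUE)) {
  library(xcms)

  # Peak-groups alignment with a less strict presence requirement and a
  # wider loess span than the defaults.
  pgp <- PeakGroupsParam(minFraction = 0.8, span = 0.4)
  minFraction(pgp) <- 0.9   # the replacement method updates the setting
  print(minFraction(pgp))

  # Obiwarp alignment with a finer m/z bin size; gapInit and gapExtend are
  # left empty so that the defaults depending on distFun are used.
  owp <- ObiwarpParam(binSize = 0.5, distFun = "cor_opt")
  print(binSize(owp))
  print(distFun(owp))
}
```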
Arguments
- object
For adjustRtime: an OnDiskMSnExp(), XCMSnExp(), MsExperiment() or XcmsExperiment() object.
- param
The parameter object defining the alignment method (and its setting).
- ...
ignored.
- chunkSize
For adjustRtime if object is either an MsExperiment or XcmsExperiment: integer(1) defining the number of files (samples) that should be loaded into memory and processed at the same time. Alignment is then performed in parallel (per sample) on this subset of loaded data. This setting thus allows balancing memory demand against speed (due to parallel processing). Because parallel processing can only be performed on the subset of data currently loaded into memory in each iteration, the value for chunkSize should match the configured parallel processing setup. A parallel processing setup with 4 CPUs (separate processes) combined with chunkSize = 1 will not perform any parallel processing, as only the data from one sample is loaded into memory at a time. On the other hand, setting chunkSize to the total number of samples in an experiment will load the full MS data into memory and will thus in most settings cause an out-of-memory error.
- BPPARAM
Parallel processing setup. Defaults to BPPARAM = bpparam(). See bpparam() for details.
- msLevel
For adjustRtime: integer(1) defining the MS level on which the alignment should be performed.
- minFraction
For PeakGroupsParam: numeric(1) between 0 and 1 defining the minimum required proportion of samples in which peaks for the peak group were identified. Peak groups passing this criterion will be aligned across samples and retention times of individual spectra will be adjusted based on this alignment. For minFraction = 1 the peak group has to contain peaks in all samples of the experiment. Note that if subset is provided, the specified fraction is relative to the defined subset of samples and not to the total number of samples within the experiment (i.e., a peak has to be present in the specified proportion of subset samples).
- extraPeaks
For PeakGroupsParam: numeric(1) defining the maximal number of additional peaks for all samples to be assigned to a peak group (feature) for retention time correction. For a data set with 6 samples, extraPeaks = 1 uses all peak groups with a total peak count <= 6 + 1. The total peak count is the total number of peaks assigned to a peak group and also counts multiple peaks within a sample that are assigned to the group.
- smooth
For PeakGroupsParam: character(1) defining the function to be used to interpolate corrected retention times for all peak groups. Can be either "loess" or "linear".
- span
For PeakGroupsParam: numeric(1) defining the degree of smoothing (if smooth = "loess"). This parameter is passed to the internal call to loess().
- family
For PeakGroupsParam: character(1) defining the method for loess smoothing. Allowed values are "gaussian" and "symmetric". See loess() for more information.
- peakGroupsMatrix
For PeakGroupsParam: optional matrix of (raw) retention times for the (marker) peak groups on which the alignment should be performed. Each column represents a sample, each row a feature/peak group. The adjustRtimePeakGroups method is used by default to determine this matrix on the provided object.
- subset
For ObiwarpParam and PeakGroupsParam: integer with the indices of samples within the experiment on which the alignment models should be estimated. Samples not part of the subset are adjusted based on the closest subset sample. See Subset-based alignment section for details.
- subsetAdjust
For ObiwarpParam and PeakGroupsParam: character(1) specifying the method with which non-subset samples should be adjusted. Supported options are "previous" and "average" (default). See Subset-based alignment section for details.
- binSize
numeric(1) defining the bin size (in m/z dimension) to be used for the profile matrix generation. See the step parameter in the profile-matrix documentation for more details.
- centerSample
integer(1) defining the index of the center sample in the experiment. It defaults to floor(median(1:length(fileNames(object)))). Note that if subset is used, the index passed with centerSample is within these subset samples.
- response
For ObiwarpParam: numeric(1) defining the responsiveness of warping, with response = 0 giving linear warping on start and end points and response = 100 warping using all bijective anchors.
- distFun
For ObiwarpParam: character(1) defining the distance function to be used. Allowed values are "cor" (Pearson's correlation), "cor_opt" (calculate only 10% diagonal band of distance matrix; better runtime), "cov" (covariance), "prd" (product) and "euc" (Euclidean distance). The default value is distFun = "cor_opt".
- gapInit
For ObiwarpParam: numeric(1) defining the penalty for gap opening. The default value for gapInit depends on the value of distFun: for distFun = "cor" and distFun = "cor_opt" it is 0.3, for distFun = "cov" and distFun = "prd" 0.0 and for distFun = "euc" 0.9.
- gapExtend
For ObiwarpParam: numeric(1) defining the penalty for gap enlargement. The default value for gapExtend depends on the value of distFun: for distFun = "cor" and distFun = "cor_opt" it is 2.4, for distFun = "cov" 11.7, for distFun = "euc" 1.8 and for distFun = "prd" 7.8.
- factorDiag
For ObiwarpParam: numeric(1) defining the local weight applied to diagonal moves in the alignment.
- factorGap
For ObiwarpParam: numeric(1) defining the local weight for gap moves in the alignment.
- localAlignment
For ObiwarpParam: logical(1) whether a local alignment should be performed instead of the default global alignment.
- initPenalty
For ObiwarpParam: numeric(1) defining the penalty for initiating an alignment (for local alignment only).
- value
The value for the slot.
- x
An ObiwarpParam or PeakGroupsParam object.
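The interplay of minFraction and extraPeaks described above can be illustrated with a small base R sketch. The peak count matrix and the helper select_groups are hypothetical and only mirror the documented criteria; the actual selection inside xcms may differ in detail.

```r
# Hypothetical peak counts: rows are peak groups (features), columns are
# the 6 samples of an experiment.
peak_counts <- rbind(
  f1 = c(1, 1, 1, 1, 1, 1),  # one peak in every sample
  f2 = c(1, 1, 0, 1, 1, 1),  # missing in one sample
  f3 = c(2, 1, 1, 1, 1, 1),  # one additional peak
  f4 = c(2, 2, 1, 1, 1, 1),  # two additional peaks
  f5 = c(1, 0, 0, 1, 0, 1)   # present in half of the samples
)

# Keep peak groups that are present in at least `minFraction` of the
# samples and whose total peak count is at most n_samples + extraPeaks.
select_groups <- function(counts, minFraction = 0.9, extraPeaks = 1) {
  n_samples <- ncol(counts)
  present   <- rowSums(counts > 0)
  total     <- rowSums(counts)
  present / n_samples >= minFraction & total <= n_samples + extraPeaks
}

print(select_groups(peak_counts))                     # default settings
print(select_groups(peak_counts, minFraction = 0.8))  # tolerate one gap
```

With the defaults only f1 and f3 qualify; lowering minFraction to 0.8 additionally admits f2, while f4 stays excluded because its total peak count of 8 exceeds 6 + 1.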
Value
adjustRtime on an OnDiskMSnExp or XCMSnExp object will return an
XCMSnExp object with the alignment results.
adjustRtime on an MsExperiment or XcmsExperiment will return an
XcmsExperiment with the adjusted retention times stored in a new
spectra variable rtime_adjusted in the object's spectra.
ObiwarpParam and PeakGroupsParam return the respective parameter object.
adjustRtimePeakGroups returns a matrix with the retention times of marker
features in each sample (each row one feature, each column one sample).
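For an XcmsExperiment result, the stored spectra variable can be inspected roughly as follows. This is a sketch: the helper rt_shift_summary is hypothetical, and it assumes that the xcms and MsExperiment packages are installed and that x is an XcmsExperiment returned by adjustRtime.

```r
# Hypothetical helper: summarise the per-spectrum retention time shift
# (adjusted minus raw, in seconds) of an aligned XcmsExperiment `x`.
rt_shift_summary <- function(x) {
  # Fail early if no alignment results are present.
  stopifnot(xcms::hasAdjustedRtime(x))
  sps <- MsExperiment::spectra(x)
  # rtime holds the raw, rtime_adjusted the aligned retention times.
  shift <- sps$rtime_adjusted - sps$rtime
  summary(shift)
}
```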
Subset-based alignment
All alignment methods allow performing the retention time correction on a
user-selected subset of samples (e.g. QC samples), after which all samples
not part of that subset are adjusted based on the adjusted retention
times of the closest subset sample (close in terms of index within object
and hence possibly injection index). It is thus suggested to load MS data
files in the order in which their samples were injected in the measurement
run(s).
How the non-subset samples are adjusted also depends on the parameter
subsetAdjust: with subsetAdjust = "previous", each non-subset sample is
adjusted based on the closest previous subset sample, which in most cases
results in adjusted retention times of the non-subset sample being
identical to those of the subset sample on which the adjustment is based.
The second, default, option is subsetAdjust = "average", in which case
each non-subset sample is adjusted based on the average retention time
adjustment from the previous and following subset sample. For the average,
a weighted mean is used with weights being the inverse of the distance of
the non-subset sample to the subset samples used for alignment.
See also section Alignment of experiments including blanks in the xcms vignette for more details.
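The two subsetAdjust options can be illustrated with a simplified base R sketch. It is not the xcms implementation: each subset sample is reduced to a single made-up retention time shift, the helper shift_for_sample is hypothetical, and the fallback to the only available neighbour at the ends of the run is an assumption of this sketch.

```r
# Sample indices (injection order) of the subset, e.g. QC samples, and a
# made-up retention time shift (seconds) estimated for each of them.
subset_idx <- c(1, 5, 9)
shifts     <- c(-1, 0.5, 2)

# Shift applied to non-subset sample `i`; `subset_idx` must be sorted.
shift_for_sample <- function(i, subset_idx, shifts,
                             subsetAdjust = c("average", "previous")) {
  subsetAdjust <- match.arg(subsetAdjust)
  before <- which(subset_idx < i)
  after  <- which(subset_idx > i)
  # Assumption of this sketch: with a subset sample on one side only,
  # use that neighbour.
  if (length(before) == 0) return(shifts[min(after)])
  if (subsetAdjust == "previous" || length(after) == 0)
    return(shifts[max(before)])
  p <- max(before)
  n <- min(after)
  # Weights are the inverse of the index distance to each neighbour.
  w <- 1 / c(i - subset_idx[p], subset_idx[n] - i)
  weighted.mean(c(shifts[p], shifts[n]), w)
}

print(shift_for_sample(2, subset_idx, shifts, "previous"))
print(shift_for_sample(2, subset_idx, shifts, "average"))
print(shift_for_sample(3, subset_idx, shifts, "average"))
```

Sample 2 is one index away from subset sample 1 and three away from subset sample 5, so the average is dominated by the shift of sample 1; sample 3 lies midway and receives the plain mean of both shifts.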
References
Prince, J. T., and Marcotte, E. M. (2006) "Chromatographic Alignment of ESI-LC-MS Proteomic Data Sets by Ordered Bijective Interpolated Warping" Anal. Chem., 78 (17), 6140-6152.
Smith, C.A., Want, E.J., O'Maille, G., Abagyan, R. and Siuzdak, G. (2006). "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 78:779-787.
See also
plotAdjustedRtime() for visualization of alignment results.