This user’s guide provides an overview of the package ASICS. ASICS is a
fully automated procedure to identify and quantify metabolites in \(^1\)H 1D-NMR
spectra of biological mixtures (Tardivel et al., 2017). It aims to empower
NMR-based metabolomics by helping experts obtain metabolic profiles quickly and
accurately. In addition to the quantification method, several functions for
spectrum preprocessing and for statistical analysis of the quantified
metabolites are available.
library(ASICS)
library(ASICSdata)
In this user’s guide, a subset of the public dataset from Salek et al. (2007) is used. The experiment was designed to improve the understanding of the early stages of type 2 diabetes mellitus (T2DM) development. In the dataset used, the \(^1\)H-NMR human metabolome was obtained from 25 healthy volunteers and 25 T2DM patients. The raw 1D Bruker spectral data files were retrieved from the MetaboLights database (https://www.ebi.ac.uk/metabolights/, study MTBLS1).
For the most time-consuming functions, a parallel implementation is available for
Unix-like operating systems using the BiocParallel package of Bioconductor. The
number of cores used is set with the ncores option of the corresponding
functions (default: 1, no parallel environment).
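As a minimal sketch of how a value for ncores might be chosen, the snippet below uses only the base parallel package. The variable names available and n_cores are illustrative, and the rule of leaving one core free is an assumption, not a recommendation from the package.

```r
# Number of cores reported by the system (may be NA on some platforms)
available <- parallel::detectCores()
if (is.na(available)) available <- 1L

# Parallel execution is only described for Unix-like systems, so fall back
# to a single core elsewhere; otherwise keep one core free (assumption).
n_cores <- if (.Platform$OS.type == "unix") max(1L, available - 1L) else 1L
n_cores
```

The resulting value could then be passed as ncores = n_cores to the functions that accept this option.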
An object of class PureLibrary containing spectra of pure metabolites is required
to perform the quantification. Such a reference library is provided in ASICS with
191 pure metabolite spectra. These spectra are used as references for
quantification: only metabolites that are present in the library object can be
identified and quantified with ASICS.
The default library is automatically loaded at package start. Available metabolites are displayed with:
head(getSampleName(pure_library), n = 8)
## [1] "1,3-Diaminopropane" "Levoglucosan" "1-Methylhydantoin"
## [4] "1-Methyl-L-Histidine" "QuinolinicAcid" "2-AminoAdipicAcid"
## [7] "2-AminobutyricAcid" "2-Deoxyadenosine"
This library can be complemented or another library can be created with new spectra of pure metabolites. These spectra are imported from Bruker files and a new library can be created with:
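# Import pure metabolite spectra from the example Bruker files shipped with ASICS
pure_spectra <- importSpectraBruker(system.file("extdata", "example_library",
                                                package = "ASICS"))
# nb.protons gives the number of protons of each imported pure metabolite
new_pure_library <- createPureLibrary(pure_spectra,
                                      nb.protons = c(5, 4))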
A new library can also be created from txt or csv files, with samples in columns
and chemical shifts in rows (see the help page of the createPureLibrary function
for all details).
The newly created library can be used for quantification or merged with another one:
merged_pure_library <- c(pure_library[1:10], new_pure_library)
The PureLibrary merged_pure_library contains the first ten spectra of the
default library and the two newly imported spectra.
First, data are imported into a data frame from Bruker files with the
importSpectraBruker function. These spectra are baseline corrected
(Wang et al., 2013) and normalised by the area under the curve.
spectra_data <- importSpectraBruker(system.file("extdata",
                                                "Human_diabetes_example",
                                                package = "ASICSdata"))
Data can also be imported from other file types with the importSpectra function.
The only constraint is to have a data frame with spectra in columns
(column names are sample names) and chemical shifts in rows (row names
correspond to the ppm grid).
diabetes <- system.file("extdata", package = "ASICSdata")
spectra_data_txt <- importSpectra(name.dir = diabetes,
                                  name.file = "spectra_diabetes_example.txt",
                                  type = "txt")
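The expected layout (spectra in columns, ppm grid in row names) can be illustrated with a small base R sketch. The toy values, the sample names sample_A and sample_B, the file name, and the tab separator are all assumptions made for illustration; the help page of importSpectra remains the reference for the accepted formats.

```r
# Toy ppm grid (rows) and two hypothetical samples (columns)
ppm_grid <- seq(10, 0, by = -0.5)
toy_spectra <- data.frame(
  sample_A = exp(-(ppm_grid - 3)^2),  # one artificial peak around 3 ppm
  sample_B = exp(-(ppm_grid - 7)^2),  # one artificial peak around 7 ppm
  row.names = ppm_grid
)

# Write to a temporary tab-separated file, keeping the ppm grid as row names
toy_file <- file.path(tempdir(), "toy_spectra.txt")
write.table(toy_spectra, toy_file, sep = "\t", quote = FALSE, col.names = NA)

# Read it back: the first column becomes the row names (the ppm grid)
reread <- read.table(toy_file, header = TRUE, sep = "\t", row.names = 1)
head(reread, 3)
```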
Several functions for the preprocessing of spectra are also available: normalisation and alignment on a reference spectrum (based on Vu et al. (2011)).
Many types of normalisation are available. By default, spectra are normalised
to a constant sum (type.norm = "CS"). Otherwise, a normalisation method
implemented in the PepsNMR package can be used. For example:
spectra_norm <- normaliseSpectra(spectra_data_txt, type.norm = "pqn")
## Normalisation...
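To make the idea behind these normalisations concrete, here is a minimal base R sketch of constant-sum normalisation followed by probabilistic quotient normalisation (PQN) on toy data. It illustrates the general technique only and is not the implementation used by ASICS or PepsNMR; the function name pqn_sketch and the simulated matrix are hypothetical.

```r
set.seed(1)
# Toy data: 100 ppm points (rows) x 4 samples (columns), strictly positive
spectra <- matrix(runif(400, min = 1, max = 2), nrow = 100,
                  dimnames = list(NULL, paste0("sample_", 1:4)))
spectra[, 2] <- spectra[, 2] * 0.5  # simulate a dilution of the second sample

pqn_sketch <- function(x) {
  # Step 1: constant-sum normalisation (each spectrum sums to 1)
  x_cs <- sweep(x, 2, colSums(x), "/")
  # Step 2: reference spectrum = pointwise median over samples
  reference <- apply(x_cs, 1, median)
  # Step 3: quotients of each spectrum to the reference (row-wise division)
  quotients <- x_cs / reference
  # Step 4: the median quotient of each sample estimates its dilution factor
  dilution <- apply(quotients, 2, median)
  # Step 5: divide each spectrum by its estimated dilution factor
  sweep(x_cs, 2, dilution, "/")
}

normalised <- pqn_sketch(spectra)
dim(normalised)
```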
The alignment algorithm is based on Vu et al. (2011). To find the reference spectrum, the LCSS similarity is used (Vlachos et al., 2002). The alignment is then performed using FFT cross-correlation and a hierarchical classification.
spectra_align <- alignSpectra(spectra_norm)
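The FFT cross-correlation step can be illustrated on its own with base R. The sketch below estimates the shift between a reference signal and a shifted copy, then undoes it; it is not the full algorithm of Vu et al. (2011), which also relies on a hierarchical classification, and the function name fft_shift and the toy signals are hypothetical.

```r
# Estimate the circular shift (in points) of `target` relative to `ref`
fft_shift <- function(ref, target) {
  n <- length(ref)
  # Cross-correlation theorem: cc[k + 1] = sum_i ref[i] * target[i + k]
  cc <- Re(fft(Conj(fft(ref)) * fft(target), inverse = TRUE)) / n
  lag <- which.max(cc) - 1L
  if (lag > n / 2) lag <- lag - n  # map large lags to negative shifts
  lag
}

n <- 128
idx <- seq_len(n)
ref    <- exp(-(idx - 50)^2 / 8)  # toy peak centred on point 50
target <- exp(-(idx - 57)^2 / 8)  # same peak shifted by 7 points

lag <- fft_shift(ref, target)

# Undo the estimated shift with a circular re-indexing
aligned <- target[((idx - 1 + lag) %% n) + 1]
c(lag = lag, peak_after_alignment = which.max(aligned))
```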
Finally, from the data frame, a Spectra object is created. This is a required
step for the quantification.
spectra_obj <- createSpectra(spectra_norm)
Identification and quantification of metabolites can now be carried out using
only the function ASICS. This function takes approximately 2 minutes per
spectrum to run. To control randomness in the algorithm (used in the estimation
of the significance of a given metabolite concentration), the set.seed
parameter can be used.
# part of the spectrum to exclude (water and urea)
to_exclude <- matrix(c(4.5, 5.1, 5.5, 6.5), ncol = 2, byrow = TRUE)
ASICS_results <- ASICS(spectra_obj, exclusion.areas = to_exclude)
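Each row of the exclusion matrix defines one ppm interval, with the lower bound in the first column and the upper bound in the second. The base R sketch below shows which points of a toy ppm grid would fall inside these intervals; the coarse grid and the choice to treat the bounds as inclusive are assumptions made for illustration.

```r
# Same exclusion matrix as above: one row per interval (lower, upper)
to_exclude <- matrix(c(4.5, 5.1, 5.5, 6.5), ncol = 2, byrow = TRUE)

# Coarse toy ppm grid (real spectra have tens of thousands of points)
ppm <- seq(0.5, 10, by = 0.5)

# Flag every point lying in at least one interval (bounds assumed inclusive)
excluded <- rep(FALSE, length(ppm))
for (i in seq_len(nrow(to_exclude))) {
  excluded <- excluded | (ppm >= to_exclude[i, 1] & ppm <= to_exclude[i, 2])
}

ppm[excluded]
```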
Summary of ASICS results:
ASICS_results
## An object of class ASICSResults
## It contains 50 spectra of 31087 points.
##
## ASICS results:
## 141 metabolites are identified for this set of spectra.
## Most concentrated metabolites are: Creatinine, AceticAcid, Citrate, L-Cysteine, L-Glycine, L-GlutamicAcid
The quality of the results can be assessed by stacking the original and the reconstructed spectra on one plot. A pure metabolite spectrum can also be added for visual comparison. For example, the first spectrum with Creatinine:
plot(ASICS_results, idx = 1, xlim = c(2.8, 3.3), add.metab = "Creatinine")
Relative concentrations of identified metabolites are saved in a data frame
accessible via the getQuantification function:
head(getQuantification(ASICS_results), 10)[, 1:2]
## ADG10003u_007 ADG10003u_008
## Creatinine 0.007767615 0.001952661
## AceticAcid 0.005654296 0.000000000
## Citrate 0.004692770 0.002339383
## L-Cysteine 0.004257320 0.002966719
## L-Glycine 0.002948796 0.000000000
## L-GlutamicAcid 0.002862505 0.000000000
## PyroglutamicAcid 0.002456795 0.000000000
## L-Proline 0.002441877 0.002791684
## L-Arabitol 0.002318606 0.001741284
## 2-AminoAdipicAcid 0.002236440 0.001156904
Some analysis functions are also available in ASICS.
First, a design data frame is imported. In this data frame, the first column must contain the sample names of all spectra.
design <- read.table(system.file("extdata", "design_diabete_example.txt",
                                 package = "ASICSdata"), header = TRUE)
Then, preprocessing is performed on the relative quantifications: metabolites with more than 75% of null quantifications are removed, as are two samples considered to be outliers.
analysis_data <- formatForAnalysis(getQuantification(ASICS_results),
                                   design = design, zero.threshold = 75,
                                   zero.group = "condition",
                                   outliers = c("ADG10003u_007",
                                                "ADG19007u_163"))
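The zero-threshold rule can be sketched in base R on a toy quantification table (metabolites in rows, samples in columns). This simplified version computes the percentage of zeros over all samples and ignores the grouping set by zero.group; the metabolite and sample names are hypothetical.

```r
# Toy relative quantifications: 3 metabolites x 4 samples
quantif <- data.frame(
  s1 = c(0.5, 0, 0.2),
  s2 = c(0.4, 0, 0.0),
  s3 = c(0.3, 0, 0.1),
  s4 = c(0.6, 0, 0.0),
  row.names = c("metab_A", "metab_B", "metab_C")
)

# Percentage of null quantifications per metabolite
pct_zero <- 100 * rowMeans(quantif == 0)

# Keep metabolites with at most 75% of zeros
zero_threshold <- 75
filtered <- quantif[pct_zero <= zero_threshold, ]
rownames(filtered)
```

Here metab_B is null in every sample and is dropped, while metab_C (50% of zeros) is kept.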
To explore the results of the ASICS quantification, a PCA can be performed on the preprocessed data with:
resPCA <- pca(analysis_data)
plot(resPCA, graph = "ind", col.ind = "condition")
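For readers who want to see what such a PCA computes, the following base R sketch runs prcomp on a simulated quantification matrix. It is an illustration only: the pca function of ASICS may use a different backend, centring, or scaling, and the simulated data and names are hypothetical.

```r
set.seed(42)
# Simulated quantifications: 6 metabolites (rows) x 10 samples (columns)
quantif <- matrix(rexp(6 * 10), nrow = 6,
                  dimnames = list(paste0("metab_", 1:6),
                                  paste0("sample_", 1:10)))

# PCA expects observations (samples) in rows, hence the transpose;
# variables are centred and scaled to unit variance (assumption)
pca_fit <- prcomp(t(quantif), center = TRUE, scale. = TRUE)

# Coordinates of the samples on the first two principal components
scores <- pca_fit$x[, 1:2]

# Proportion of variance explained by each component
explained <- pca_fit$sdev^2 / sum(pca_fit$sdev^2)
round(explained[1:2], 3)
```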