This user’s guide provides an overview of the package ASICS. ASICS is a fully automated procedure to identify and quantify metabolites in \(^1\)H 1D-NMR spectra of biological mixtures (Tardivel et al., 2017). It aims to strengthen NMR-based metabolomics by helping experts obtain metabolic profiles quickly and accurately. In addition to the quantification method, several functions for spectrum preprocessing and for statistical analysis of the quantified metabolites are available.

library(ASICS)
library(ASICSdata)

1 Dataset

In this user’s guide, a subset of the public datasets from Salek et al. (2007) is used. The experiment was designed to improve the understanding of the early stage of type 2 diabetes mellitus (T2DM) development. In the subset used here, the \(^1\)H-NMR human metabolome was obtained from 25 healthy volunteers and 25 T2DM patients. The raw 1D Bruker spectral data files were retrieved from the MetaboLights database (https://www.ebi.ac.uk/metabolights/, study MTBLS1).

2 Parallel environment

For the most time-consuming functions, a parallel implementation is available on unix-like OS through the BiocParallel package of Bioconductor. The number of cores used is set with the ncores option of the corresponding functions (default: 1, i.e. no parallel environment).
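As a rough illustration of how a value for ncores might be chosen, the sketch below uses the base R parallel package to count the available cores and leaves one free. The variable names and the leave-one-core-free rule are assumptions made for this example, not part of ASICS.

```r
library(parallel)

# Number of logical cores reported by the system (can be NA on some platforms)
available <- detectCores()
if (is.na(available)) available <- 1L

# The parallel implementation targets unix-like OS, so fall back to 1 elsewhere
n_cores <- if (.Platform$OS.type == "unix") max(1L, available - 1L) else 1L
n_cores
```

The resulting value could then be passed as ncores = n_cores to the functions that accept this option.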

3 Library of pure NMR metabolite spectra

An object of class PureLibrary containing spectra of pure metabolites is required to perform the quantification. Such a reference library, with 191 pure metabolite spectra, is provided in ASICS. These spectra serve as references for the quantification: only metabolites that are present in the library object can be identified and quantified with ASICS.

The default library is automatically loaded at package start. Available metabolites are displayed with:

head(getSampleName(pure_library), n = 8)
## [1] "1,3-Diaminopropane"   "Levoglucosan"         "1-Methylhydantoin"   
## [4] "1-Methyl-L-Histidine" "QuinolinicAcid"       "2-AminoAdipicAcid"   
## [7] "2-AminobutyricAcid"   "2-Deoxyadenosine"

This library can be complemented or another library can be created with new spectra of pure metabolites. These spectra are imported from Bruker files and a new library can be created with:

pure_spectra <- importSpectraBruker(system.file("extdata", "example_library", 
                                                package = "ASICS"))
new_pure_library <- createPureLibrary(pure_spectra, 
                                        nb.protons = c(5, 4))

A new library can also be created from txt or csv files, with samples in columns and chemical shifts in rows (see the help page of the createPureLibrary function for details).

The newly created library can be used for quantification or merged with another one:

merged_pure_library <- c(pure_library[1:10], new_pure_library)

The PureLibrary merged_pure_library contains the first ten spectra of the default library and the two newly imported spectra.

4 Identification and quantification of metabolites with ASICS

First, data are imported into a data frame from Bruker files with the importSpectraBruker function. These spectra are baseline corrected (Wang et al., 2013) and normalised by the area under the curve.

spectra_data <- importSpectraBruker(system.file("extdata", 
                                                "Human_diabetes_example", 
                                                package = "ASICSdata"))

Data can also be imported from other file types with the importSpectra function. The only constraint is that the result must be a data frame with spectra in columns (column names are sample names) and chemical shifts in rows (row names correspond to the ppm grid).

diabetes <- system.file("extdata", package = "ASICSdata")
spectra_data_txt <- importSpectra(name.dir = diabetes, 
                                  name.file = "spectra_diabetes_example.txt",
                                  type = "txt")
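To make the expected layout concrete, here is a minimal base R sketch that writes a tiny text file with spectra in columns and chemical shifts in rows, then reads it back. The file, the sample names and the intensity values are invented for illustration, and read.table only stands in for the package's own import function.

```r
# Toy ppm grid (rows) and two made-up spectra (columns)
ppm_grid <- c(0.5, 1.0, 1.5, 2.0)
toy_spectra <- data.frame(sample_A = c(0.1, 0.4, 0.3, 0.2),
                          sample_B = c(0.2, 0.2, 0.5, 0.1),
                          row.names = as.character(ppm_grid))

# Write a temporary tab-separated file, then read it back
toy_file <- tempfile(fileext = ".txt")
write.table(toy_spectra, toy_file, sep = "\t", quote = FALSE, col.names = NA)
toy_read <- read.table(toy_file, header = TRUE, sep = "\t", row.names = 1)

colnames(toy_read)              # sample names
as.numeric(rownames(toy_read))  # ppm grid
```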

Several functions for the preprocessing of spectra are also available: normalisation and alignment on a reference spectrum (based on Vu et al. (2011)).

Several types of normalisation are available. By default, spectra are normalised to a constant sum (type.norm = "CS"). Alternatively, a normalisation method implemented in the PepsNMR package can be used. For example:

spectra_norm <- normaliseSpectra(spectra_data_txt, type.norm = "pqn")
## Normalisation method : pqn
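As an illustration of what constant-sum normalisation means, the following base R sketch scales each spectrum (column) so that its intensities sum to 1. It is a simplified stand-in written for this guide, not the code used by normaliseSpectra, and the toy values and the function name constant_sum are invented.

```r
# Two made-up spectra stored in columns
toy_spectra <- data.frame(sample_A = c(1, 4, 3, 2),
                          sample_B = c(2, 2, 5, 1))

# Divide each column by its own sum (hypothetical helper, not an ASICS function)
constant_sum <- function(spectra) {
  as.data.frame(sweep(as.matrix(spectra), 2, colSums(spectra), FUN = "/"))
}

toy_norm <- constant_sum(toy_spectra)
colSums(toy_norm)
```

After this step every column has the same total intensity, which makes spectra comparable regardless of overall dilution.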

The alignment algorithm is based on Vu et al. (2011). The reference spectrum is selected using the FFT cross-correlation. The alignment itself is then performed using the FFT cross-correlation together with hierarchical clustering.

spectra_align <- alignSpectra(spectra_norm)
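To give an idea of how an FFT cross-correlation can reveal the shift between two signals, here is a small base R sketch on a toy peak. It covers only the cross-correlation idea, not the hierarchical step and not the code used by alignSpectra; the function names fft_cross_correlation and estimate_shift are hypothetical.

```r
# Circular cross-correlation computed with the FFT: entry k + 1 equals
# sum_j reference[j] * target[j + k], with indices wrapping around
fft_cross_correlation <- function(reference, target) {
  stopifnot(length(reference) == length(target))
  n <- length(reference)
  Re(fft(fft(target) * Conj(fft(reference)), inverse = TRUE)) / n
}

# Lag (in grid points) at which the target best matches the reference
estimate_shift <- function(reference, target) {
  n <- length(reference)
  lag <- which.max(fft_cross_correlation(reference, target)) - 1
  if (lag > n / 2) lag - n else lag
}

# Toy example: the same Gaussian peak, moved 5 points to the right
grid <- seq_len(128)
reference <- exp(-(grid - 50)^2 / (2 * 3^2))
target <- exp(-(grid - 55)^2 / (2 * 3^2))
shift <- estimate_shift(reference, target)
shift
```

A positive value means the target peak lies to the right of the reference, so moving the target back by that many points would align the two.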

Finally, from the data frame, a Spectra object is created. This is a required step for the quantification.

spectra_obj <- createSpectra(spectra_align)

Identification and quantification of metabolites can now be carried out with a single call to the ASICS function. All the steps described in the following figure are included:

Steps of the quantification workflow

Recently, new methods for reference library alignment and metabolite quantification were added. Thus, multiple scenarios can be performed:

Scenarios available in ASICS

The method provided in the first version of the package is given in red. It can now be used by setting joint.align = FALSE and quantif.method = "FWER". To perform a joint alignment (blue, green and yellow scenarios), joint.align needs to be set to TRUE. The yellow scenario, which performs a joint quantification based on a simple joint alignment, is obtained by additionally setting quantif.method = "Lasso". Finally, the green scenario performs a joint quantification using the metabolites identified in a first step of independent quantification. It is obtained by setting quantif.method = "both".
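The argument combinations spelled out above can be collected in a small lookup list, as in the sketch below. Only the three scenarios whose settings are given explicitly in the text are included; the list name scenarios and the table name scenario_table are choices made for this example.

```r
# Argument sets for the scenarios whose settings are stated in the text
scenarios <- list(
  red    = list(joint.align = FALSE, quantif.method = "FWER"),
  yellow = list(joint.align = TRUE,  quantif.method = "Lasso"),
  green  = list(joint.align = TRUE,  quantif.method = "both")
)

# One row per scenario, for a quick overview
scenario_table <- do.call(rbind, lapply(scenarios, as.data.frame))
scenario_table
```

An entry could then be combined with the other arguments, for instance with do.call(ASICS, c(list(spectra_obj), scenarios$green)), assuming spectra_obj has been created as described earlier.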

With quantif.method = "both", the number of identified metabolites can be controlled using clean.thres. If clean.thres = 10, only the metabolites identified in at least 10% of the complex spectra (during the first, independent quantification step) are used in the joint quantification.
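The arithmetic behind this threshold can be sketched in base R. With the 50 spectra of this dataset, a threshold of 10% corresponds to at least 5 spectra. The identification counts and the two placeholder metabolite names below are invented, and the filter is only an illustration of the rule, not the package's internal code.

```r
n_spectra <- 50
clean_thres <- 10  # percentage, as in clean.thres = 10

# Hypothetical numbers of spectra in which each metabolite was identified
n_identified <- c(Creatinine = 50, Citrate = 12,
                  MetaboliteX = 5, MetaboliteY = 4)

# Keep metabolites identified in at least clean_thres % of the spectra
percent_identified <- 100 * n_identified / n_spectra
kept <- names(percent_identified)[percent_identified >= clean_thres]
kept
```

Here MetaboliteX (5 of 50 spectra, exactly 10%) is kept, while MetaboliteY (4 of 50 spectra, 8%) is dropped.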

More details on these new algorithms can be found in Lefort et al. (2020).

The ASICS function takes approximately 2 minutes per spectrum to run. To control randomness in the algorithm (used in the estimation of the significance of a given metabolite concentration), the set.seed parameter can be used.

# parts of the spectrum to exclude (water and urea);
# each row of the matrix is one (lower, upper) interval in ppm
to_exclude <- matrix(c(4.5, 5.1, 5.5, 6.5), ncol = 2, byrow = TRUE)
ASICS_results <- ASICS(spectra_obj, exclusion.areas = to_exclude)
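To see what the exclusion matrix encodes, the base R sketch below flags the points of a toy ppm grid that fall inside one of the two intervals. The grid is invented, and treating both interval bounds as inclusive is an assumption of this example rather than a statement about how ASICS handles them.

```r
to_exclude <- matrix(c(4.5, 5.1, 5.5, 6.5), ncol = 2, byrow = TRUE)

# Toy ppm grid: 0.25, 0.75, ..., 9.75
ppm_grid <- seq(0.25, 9.75, by = 0.5)

# TRUE for grid points that fall inside at least one excluded interval
is_excluded <- rep(FALSE, length(ppm_grid))
for (i in seq_len(nrow(to_exclude))) {
  is_excluded <- is_excluded |
    (ppm_grid >= to_exclude[i, 1] & ppm_grid <= to_exclude[i, 2])
}

ppm_grid[is_excluded]
```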

Summary of ASICS results:

ASICS_results
## An object of class ASICSResults 
## It contains 50 spectra of 31087 points. 
## 
## ASICS results: 
##  162 metabolites are identified for this set of spectra. 
## Most concentrated metabolites are: Creatinine, Citrate, AceticAcid, L-GlutamicAcid, L-Glycine, L-Proline

The quality of the results can be assessed by stacking the original and the reconstructed spectra on one plot. A pure metabolite spectrum can also be added for visual comparison. For example, the first spectrum with Creatinine:

plot(ASICS_results, idx = 1, xlim = c(2.8, 3.3), add.metab = "Creatinine")