Package: peakPantheR
Authors: Arnaud Wolfer

Package for Peak Picking and ANnoTation of High resolution Experiments in R, implemented in R and Shiny

1 Overview

peakPantheR implements functions to detect, integrate and report pre-defined features in MS files (e.g. compounds, fragments, adducts, …).

It is designed for:

- Real time feature detection and integration (see Real Time Annotation)
    - process multiple compounds in one file at a time
- Post-acquisition feature detection, integration and reporting (see Parallel Annotation)
    - process multiple compounds in multiple files in parallel, store results in a single object

peakPantheR can process LC/MS data files in NetCDF, mzML/mzXML and mzData formats, as data import is handled by Bioconductor’s mzR package.
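Before passing files to the package, it can help to confirm that their extensions match one of the formats listed above. The following is a minimal sketch in base R. The helper `has_supported_ext()` and the extension list are assumptions made for illustration and are not part of peakPantheR or mzR; files with compound extensions (for example a hypothetical `.mzData.xml`) would need extra handling.

```r
# Hypothetical helper (not part of peakPantheR): flag paths whose file
# extension matches one of the formats named in the text.
# The extension list is an assumption for illustration only.
supported_ext <- c("cdf", "mzml", "mzxml", "mzdata")

has_supported_ext <- function(paths) {
    # tools::file_ext() returns the text after the last dot;
    # comparison is case-insensitive so "ko15.CDF" is accepted
    tolower(tools::file_ext(paths)) %in% supported_ext
}

example_paths <- c("ko15.CDF", "sample.mzML", "notes.txt")
has_supported_ext(example_paths)
```

An extension check only screens file names; whether a file can actually be read still depends on its content.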

2 Installation

To install peakPantheR from Bioconductor:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("peakPantheR")

Install the development version of peakPantheR directly from GitHub with:

# Install devtools
if(!require("devtools")) install.packages("devtools")
devtools::install_github("phenomecentre/peakPantheR")

3 Input Data

Both real time and parallel compound integration require a common set of information:

- Path(s) to netCDF / mzML MS file(s)
- An expected region of interest (RT / m/z window) for each compound.

3.1 MS files

For demonstration purposes we can annotate a set of raw MS spectra (in NetCDF format) provided by the faahKO package. Briefly, this subset of the data from (Saghatelian et al. 2004) investigates the metabolic consequences of knocking out the fatty acid amide hydrolase (FAAH) gene in mice. The dataset consists of samples from the spinal cords of 6 knock-out and 6 wild-type mice. Each file contains data in centroid mode acquired in positive ion mode from 200-600 m/z and 2500-4500 seconds.

Below we install the faahKO package and locate raw CDF files of interest:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("faahKO")

library(faahKO)
## file paths
input_spectraPaths  <- c(system.file('cdf/KO/ko15.CDF', package = "faahKO"),
                        system.file('cdf/KO/ko16.CDF', package = "faahKO"),
                        system.file('cdf/KO/ko18.CDF', package = "faahKO"))
input_spectraPaths
#> [1] "/home/biocbuild/bbs-3.11-bioc/R/library/faahKO/cdf/KO/ko15.CDF"
#> [2] "/home/biocbuild/bbs-3.11-bioc/R/library/faahKO/cdf/KO/ko16.CDF"
#> [3] "/home/biocbuild/bbs-3.11-bioc/R/library/faahKO/cdf/KO/ko18.CDF"
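Because `system.file()` returns an empty string when a file cannot be found, a missing faahKO installation would silently produce empty paths. A minimal sketch of an early check is shown below; `check_paths_exist()` is a hypothetical helper written for this example, not a peakPantheR function, and a temporary file stands in for a real CDF file so the snippet runs on its own.

```r
# Hypothetical helper (not part of peakPantheR): stop early when any of the
# supplied paths does not point to an existing file.
check_paths_exist <- function(paths) {
    missing_files <- paths[!file.exists(paths)]
    if (length(missing_files) > 0) {
        stop("File(s) not found: ",
             paste0("'", missing_files, "'", collapse = ", "))
    }
    invisible(TRUE)
}

# A temporary file stands in for a real CDF file in this demonstration
tmp_file <- tempfile(fileext = ".CDF")
file.create(tmp_file)
check_paths_exist(tmp_file)

# An empty path, as returned by system.file() on failure, raises an error
result <- try(check_paths_exist(""), silent = TRUE)
inherits(result, "try-error")
```

With the paths defined above, the same check could be applied as `check_paths_exist(input_spectraPaths)`.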

3.2 Expected regions of interest

Expected regions of interest (targeted features) are specified using the following information:

- cpdID (numeric)
- cpdName (character)
- rtMin (sec)
- rtMax (sec)
- rt (sec, optional / NA)
- mzMin (m/z)
- mzMax (m/z)
- mz (m/z, optional / NA)
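The fields above lend themselves to a few sanity checks before any annotation is attempted. The sketch below is one possible set of checks in base R. `check_targetFeatTable()` is a hypothetical helper, not the package's own validation, and it assumes that window bounds must be ordered and that `rt` and `mz`, when supplied, fall inside their window.

```r
# Hypothetical helper (not part of peakPantheR): basic sanity checks on a
# table of regions of interest that uses the columns listed above.
check_targetFeatTable <- function(tbl) {
    required <- c("cpdID", "cpdName", "rtMin", "rt", "rtMax",
                  "mzMin", "mz", "mzMax")
    missing_cols <- setdiff(required, colnames(tbl))
    if (length(missing_cols) > 0) {
        stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
    }
    # Window bounds are mandatory and must be ordered (assumption)
    stopifnot(!anyNA(tbl$rtMin), !anyNA(tbl$rtMax),
              all(tbl$rtMin <= tbl$rtMax))
    stopifnot(!anyNA(tbl$mzMin), !anyNA(tbl$mzMax),
              all(tbl$mzMin <= tbl$mzMax))
    # rt and mz are optional (NA allowed); when given, they are assumed
    # to lie inside their window
    rt_ok <- is.na(tbl$rt) | (tbl$rt >= tbl$rtMin & tbl$rt <= tbl$rtMax)
    mz_ok <- is.na(tbl$mz) | (tbl$mz >= tbl$mzMin & tbl$mz <= tbl$mzMax)
    stopifnot(all(rt_ok), all(mz_ok))
    invisible(TRUE)
}

# Illustrative values only; rt is left as NA to show the optional field
example_table <- data.frame(
    cpdID = 1, cpdName = "Example", rtMin = 100, rt = NA, rtMax = 120,
    mzMin = 300.0, mz = 300.1, mzMax = 300.2, stringsAsFactors = FALSE)
check_targetFeatTable(example_table)
```

Building the table with a single `data.frame()` call, as in the example, keeps each column's type explicit from the start.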

Below we define 2 features of interest that are present in the faahKO dataset and can be employed in subsequent vignettes:

# targetFeatTable
# create an empty 2-row, 8-column data.frame with the expected column names
input_targetFeatTable <- data.frame(matrix(vector(), 2, 8, dimnames=list(c(),
                        c("cpdID", "cpdName", "rtMin", "rt", "rtMax", "mzMin",
                        "mz", "mzMax"))), stringsAsFactors=FALSE)
# fill one row per compound (values are stored as character at this stage)
input_targetFeatTable[1,] <- c(1, "Cpd 1", 3310., 3344.888, 3390., 522.194778,
                                522.2, 522.205222)
input_targetFeatTable[2,] <- c(2, "Cpd 2", 3280., 3385.577, 3440., 496.195038,
                                496.2, 496.204962)
# convert every column except cpdName back to numeric
input_targetFeatTable[,c(1,3:8)] <- sapply(input_targetFeatTable[,c(1,3:8)],
                                            as.numeric)

cpdID	cpdName	rtMin	rt	rtMax	mzMin	mz	mzMax
1	Cpd 1	3310	3344.888	3390	522.194778	522.2	522.205222
2	Cpd 2	3280	3385.577	3440	496.195038	496.2	496.204962
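The m/z bounds in the table are numerically consistent with a symmetric window of ±10 ppm around each `mz` value (for example 522.2 × 10 / 1e6 = 0.005222). Whether the bounds were originally derived this way is an assumption; the sketch below only shows how such a window could be computed. `mz_window()` is a hypothetical helper, not a peakPantheR function.

```r
# Hypothetical helper (not part of peakPantheR): symmetric m/z window
# around a centre value for a given tolerance in parts per million.
mz_window <- function(mz, ppm) {
    delta <- mz * ppm / 1e6
    data.frame(mzMin = mz - delta, mz = mz, mzMax = mz + delta)
}

# Recompute the windows for the two m/z centres used in the table
windows <- mz_window(c(522.2, 496.2), ppm = 10)

# Compare against the bounds listed in the table
all.equal(windows$mzMin, c(522.194778, 496.195038))
all.equal(windows$mzMax, c(522.205222, 496.204962))
```

A narrower or wider tolerance can be explored by changing `ppm`; the appropriate value depends on the instrument's mass accuracy.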

4 See Also

References

Saghatelian, A., S. A. Trauger, E. J. Want, E. G. Hawkins, G. Siuzdak, and B. F. Cravatt. 2004. “Assignment of Endogenous Substrates to Enzymes by Global Metabolite Profiling.” Biochemistry 43:14332–9. http://dx.doi.org/10.1021/bi0480335.

Getting Started with the peakPantheR package

2019-10-01
