Discovery, Identification, and Screening of Lipids and Oxylipins in HPLC-MS Datasets Using LOBSTAHS

James Collins

2016-10-17

Introduction

This document describes the purpose and use of the R-package LOBSTAHS: Lipid and Oxylipin Biomarker Screening through Adduct Hierarchy Sequences. The package is described in Collins et al. 2016.1 In the sections below, LOBSTAHS package functions are applied to a model dataset using example code. LOBSTAHS requires the additional packages xcms2 and CAMERA3; the model dataset, consisting of lipid data from cultures of the marine diatom Phaeodactylum tricornutum, is contained in the R data package PtH2O2lipids.

Purpose

LOBSTAHS contains several functions to help scientists discover and identify lipid and oxidized lipid biomarkers in HPLC-MS data that have been pre-processed with the popular R packages xcms and CAMERA. First, LOBSTAHS uses exact mass to make initial compound assignments from a set of customizable onboard databases. Then, a series of orthogonal screening criteria are applied to refine and winnow the list of assignments. A basic workflow based on xcms, CAMERA, and LOBSTAHS is illustrated in the schematic. Each step in the figure is described in detail in the following paragraphs.

Installing the LOBSTAHS package

Install dependencies:
source("http://bioconductor.org/biocLite.R")
biocLite("CAMERA")
biocLite("xcms")
Install RTools:

For windows: Download and install RTools from http://cran.r-project.org/bin/windows/Rtools/

For Unix: Install the R-development-packages (r-devel or r-base-dev)

Install packages needed for installation from Github:
install.packages("devtools")
Install LOBSTAHS:
library("devtools")
install_github("vanmooylipidomics/LOBSTAHS") 
Install ‘PtH2O2lipids,’ containing example data & precursor xsAnnotate object:
## install dataset 'PtH2O2lipids'
## see LOBSTAHS documentation for examples 

install_github("vanmooylipidomics/PtH2O2lipids")

Acquisition of HPLC-MS data suitable for LOBSTAHS

Data for LOBSTAHS should be acquired using a mass spectrometer with sufficiently high mass accuracy and resolution. Suitable MS acquisition platforms with which LOBSTAHS has been tested include Fourier transform ion cyclotron resonance (FT-ICR) and Orbitrap instruments. A quadrupole time-of-flight (Q-TOF) instrument could also be used if it possessed sufficient mass accuracy (i.e., < 5-7 ppm). While the software will accept any HPLC-MS data as input, insufficient mass accuracy and resolution will produce results with large numbers of database matches for each feature that cannot be distinguished from each other.

File conversion

After acquisition, each data file should be converted from manufacturer to open source (.mzXML) format in centroid (not profile) mode. If data were acquired on the instrument using ion mode switching, the data from each ionization mode (i.e., polarity) must also be extracted into a separate file. The msConvert tool (part of the ProteoWizard toolbox) can be used to accomplish these tasks. msConvert commands can be executed with either the provided GUI or at the command line; however, conversion of manufacuter file formats can only be accomplished using the Windows installation. An R script (Exactive_full_scan_process_ms1+.r) for batch conversion and extraction of data files acquired on a Thermo Exactive Orbitrap instrument is provided as part of the Van Mooy Lab Lipidomics Toolbox. This vignette is not intended as a comprehensive guide to mass spectrometer file conversion; users are encouraged to digest the msConvert documentation. However, assuming a hypothetical data file “Exactive_data.raw” was acquired on an Thermo Orbitrap instrument with ion mode switching, the following code could be within R used to convert and then extract positive and negative mode scans to separate files. (Code presumes you’ve already installed msConvert.)

Initial file conversion; saves converted file to a directory “mzXML_ms1_two_mode”:
system(paste("msconvert Exactive_data.raw --mzXML --filter \"peakPicking true 1-\" -o mzXML_ms1_two_mode -v"))
Extract positive, negative mode scans, then save in separate directories:
system(paste("msconvert mzXML_ms1_two_mode/Exactive_data.mzXML --mzXML --filter \"polarity positive\" -o mzXML_ms1_pos_mode -v"))
system(paste("msconvert mzXML_ms1_two_mode/Exactive_data.mzXML --mzXML --filter \"polarity negative\" -o mzXML_ms1_neg_mode -v"))

Pre-processing with xcms

After all data files in a particular dataset have been converted and extracted into files of like polarity, the R-package xcms can then be used to perform feature detection, retention time correction, and peak grouping. While the paragraphs below contain basic instructions for preparation of data, this vignette is not intended as ae manual or guide to the complex world of mass spectrometer data processing in xcms and CAMERA; users should acquaint themselves with the manuals (here and here) and very helpful vignettes (here and here) for the two packages.

First, data files should be given intuitive file names (containing, e.g., a sample ID along with information on experimental timepoint and treatment) and placed into a directory structure according to an index variable; the xcms vignette LC/MS Preprocessing and Analysis with xcms contains a detailed explanation. Feature detection, retention time correction, and peak grouping should then be performed. Values of parameters for xcms functions can be obtained from several sources:

The R script prepOrbidata.R (part of the Van Mooy Lab Lipidomics Toolbox) contains code for complete preparation in xcms of the PtH2O2lipids (or similar) dataset. The script allows the user to apply parameter values assembled from the literature or obtain them from IPO optimization of a subset of samples. The final parameter values used in Collins et al. 2016 for analysis of the PtH2O2lipids dataset – obtained using HPLC-ESI-MS on an Exactive Orbitrap instrument – are given in Table S5 of the electronic supplement. The settings and values listed in the table could be used as a starting point for analysis of similar data.

The xcmsSet produced using these settings from the P. tricornutum data is stored in the PtH2O2lipids package:

library(PtH2O2lipids)
ptH2O2lipids$xsAnnotate@xcmsSet
## An "xcmsSet" object with 16 samples
## 
## Time range: -1.1-1742.4 seconds (0-29 minutes)
## Mass range: 100.0397-1499.3326 m/z
## Peaks: 340991 (about 21312 per sample)
## Peak Groups: 18314 
## Sample classes: 0_uM_H2O2, 150_uM_H2O2, 30_uM_H2O2 
## 
## Profile settings: method = bin
##                   step = 0.001
## 
## Memory usage: 64.5 MB

Whatever settings are used in xcms, the processed data should contain high-quality features which have been aligned across samples; the quality of the processed data should be verified by individual inspection of a subset of features. At the conclusion of processing with xcms, the user should have a single xcmsSet object containing the dataset.

Final pre-processing with CAMERA

In the final pre-processing step, the R-package CAMERA should be used to (1) aggregate peak groups in the xcmsSet object into pseudospectra and (2) identify features in the data representing likely isotope peaks. Use of CAMERA to aggregate the peak groups into pseudospectra is critical because LOBSTAHS applies its various orthogonal screening criteria to only those peak groups within each pseudospectrum. CAMERA should not be used to eliminate adduct ions since the adduct ion hierarchy screening function in LOBSTAHS presumes that all adduct ions for a given analyte will be present in the dataset. We can create an xsAnnotate object from the P. tricornutum data by applying the wrapper function annotate() to the xcmsSet we created in the previous step:

library(xcms)
library(CAMERA)
library(LOBSTAHS)

# first, a necessary workaround to avoid a import error; see
# https://support.bioconductor.org/p/69414/

imports = parent.env(getNamespace("CAMERA"))
unlockBinding("groups", imports)
imports[["groups"]] = xcms::groups
lockBinding("groups", imports)

# create annotated xset using wrapper annotate(), allowing us to perform all
# CAMERA tasks at once

xsA = annotate(ptH2O2lipids$xsAnnotate@xcmsSet,
                   
               quick=FALSE,
               sample=NA, # use all samples
               nSlaves=1, # set to number of available cores or processors if 
                          # > 1
                   
               # group FWHM settings (defaults)
                   
               sigma=6,
               perfwhm=0.6,
                   
               # groupCorr settings (defaults)
                   
               cor_eic_th=0.75,
               graphMethod="hcs",
               pval=0.05,
               calcCiS=TRUE,
               calcIso=TRUE,
               calcCaS=FALSE,
                   
               # findIsotopes settings
                   
               maxcharge=4,
               maxiso=4,
               minfrac=0.5,
                   
               # adduct annotation settings
                   
               psg_list=NULL,
               rules=NULL,
               polarity="positive", # the PtH2O2lipids xcmsSet contains
                                    # positive-mode data
               multiplier=3,
               max_peaks=100,
                   
               # common to multiple tasks
                   
               intval="into",
               ppm=2.5,
               mzabs=0.0015
                   
               )
#> Start grouping after retention time.
#> Created 113 pseudospectra.
#> Generating peak matrix!
#> Run isotope peak annotation
#>  % finished: 10  20  30  40  50  60  70  80  90  100  
#> Found isotopes: 5692 
#> Start grouping after correlation.
#> Generating EIC's .. 
#> 
#> Calculating peak correlations in 113 Groups... 
#>  % finished: 10  20  30  40  50  60  70  80  90  100  
#> 
#> Calculating isotope assignments in 113 Groups... 
#>  % finished: 10  20  30  40  50  60  70  80  90  100  
#> Calculating graph cross linking in 113 Groups... 
#>  % finished: 10  20  30  40  50  60  70  80  90  100  
#> New number of ps-groups:  5080 
#> xsAnnotate has now 5080 groups, instead of 113 
#> Generating peak matrix for peak annotation!
#> 
#> Calculating possible adducts in 5080 Groups... 
#>  % finished: 10  20  30  40  50  60  70  80  90  100 

We now have an xsAnnotate object “xsA” to which we will next apply the screening and identification functions of LOBSTAHS. In the annotate() call, we set quick = FALSE because we want to run groupCorr(). This will also cause CAMERA to perform internal adduct annotation. While we will perform our own adduct annotation later with LOBSTAHS, allowing CAMERA to identify its own adducts doesn’t hurt, particularly if it leads to the creation of better pseudospectra.

The LOBSTAHS database: Use the default, or easily create your own

Before screening a dataset with LOBSTAHS, you should first decide whewther to use one of two default databases, or generate your own from the templates provided. LOBSTAHS databases contain a mixture of in silico and empirical data for the different adduct ions of a wide range of intact polar diacylglycerols (IP-DAG), triacylglycerols (TAG), polyunsaturated aldehydes (PUAs), free fatty acids (FFA), and common photosynthetic pigments. In addition, the latest LOBSTAHS release includes support for lyso lipids under an “IP_MAG” species class. The default databases (as of August 26, 2016) include 14,068 and 11,408 unique compounds that can be identifed in positive and negative ionization mode data, respectively. The databases can be easily customized (see below) if the user wishes to identify additional lipids in new lipid classes. LOBSTAHS databases are contained in S4 LOBdbase objects that are generated or accessed by various package functions. Each LOBdbase is specific to a particular polarity (i.e., ion mode); if evaluating positive-mode data, you will use a positive mode database (and vice versa).

We can access the default databases to examine their scope:

library(LOBSTAHS)
data(default.LOBdbase)

default.LOBdbase$positive # default positive mode database
## 
## A positive polarity "LOBdbase" object.
## 
## Contains entries for 78285 possible adduct ions of 14068 unique parent compounds.
## 
## Parent lipid types ( 6 ): DNPPE, IP_DAG, IP_MAG, pigment, PUA, TAG 
## IP-DAG classes ( 8 ): DGCC, DGDG, DGTS_DGTA, PE, PG, PC, SQDG, MGDG 
## IP-MAG classes ( 1 ): LPC 
## Pigments ( 22 ): Chl_a, 19prime_but_fuco, 19prime_hex_fuco, Allox, Alpha_carotene, Beta_carotenes, Chl_b, Chl_c2, Chl_c3, Chlide_a, Croco, Dd_Ddc, Dt, Echin, Fuco, Lut, Neox_Nos, Peri, Pheophytin_a, Pras, Viol, Zeax 
## Adducts ( 9 ): [M+2Na-H]+, [M+2Na+Cl]+, [M+C2H3Na2O2]+, [M+C4H10N3]+, [M+H]+, [M+K]+, [M+Na]+, [M+NH4]+, [M+NH4+ACN]+ 
## 
## m/z range: 97.0647914-1396.1777983 
## 
## Ranges of chemical parameters represented in molecules other than pigments:
## 
## Total number of acyl carbon atoms: 6-78 
## Total number of acyl carbon-carbon double bonds: 0-18 
## Number of additional oxygen atoms: 0-4 
## 
## Memory usage: 6.46 MB
default.LOBdbase$negative # default negative mode database
## 
## A negative polarity "LOBdbase" object.
## 
## Contains entries for 62372 possible adduct ions of 11408 unique parent compounds.
## 
## Parent lipid types ( 6 ): DNPPE, FFA, IP_DAG, IP_MAG, pigment, PUA 
## IP-DAG classes ( 8 ): DGCC, DGDG, DGTS_DGTA, PE, PG, PC, SQDG, MGDG 
## IP-MAG classes ( 1 ): LPC 
## Pigments ( 22 ): Chl_a, 19prime_but_fuco, 19prime_hex_fuco, Allox, Alpha_carotene, Beta_carotenes, Chl_b, Chl_c2, Chl_c3, Chlide_a, Croco, Dd_Ddc, Dt, Echin, Fuco, Lut, Neox_Nos, Peri, Pheophytin_a, Pras, Viol, Zeax 
## Adducts ( 11 ): [M-H]-, [M+2NaAc+Cl]-, [M+3Ac+2Na]-, [M+Cl]-, [M+HAc-H]-, [M+K-2H]-, [M+Na-2H]-, [M+Na+Cl-H]-, [M+NaAc-H]-, [M+NaAc+Cl]-, [M+NaAc+HAc-H]- 
## 
## m/z range: 95.05023848-1459.92498456 
## 
## Ranges of chemical parameters represented in molecules other than pigments:
## 
## Total number of acyl carbon atoms: 6-52 
## Total number of acyl carbon-carbon double bonds: 0-12 
## Number of additional oxygen atoms: 0-4 
## 
## Memory usage: 5.15 MB

Input tables required to generate databases

LOBSTAHS generates databases based on values of parameters defined in simple tables. The values in these tables define the range of molecular properties within each lipid class for which database entries will be created. Database generation is accomplished with the function generateLOBdbase(). Default values can be viewed by loading the default tables into the R workspace (directions below); the default input data are also given in Tables 1 and 2 of Collins et al. 2016.

A first table, the componentCompTable, defines the base elemental compositions of each molecule or lipid class that will be considered when the databsse is created. If a user wishes to add additional molecules or classes of molecules to the default set of compounds, he or she will need to add additional entrires to this table.

Two other user-editable tables define ranges of chemical properties for which databsase entires are created within each lipid class specified in the componentCompTable:

A fourth table, adductHierarchies, contains empirical data on the relative abundances of various adduct ions formed by the compounds that belong to each lipid class. This table is also user-editable, but any additions or changes should first be confirmed by empirical analyusis.

Users can load the default tables into the R workspace (and subsequently view them) at any time:

data(default.acylRanges)
data(default.oxyRanges)
data(default.componentCompTable)
data(default.adductHierarchies)

Customization of database inputs

Users can easily customize the values of the input parameters used to create databases in LOBSTAHS. This can be accomplished by modidying one or more of the Microsoft Excel templates that are provided with the package. The templates are automatically installed with LOBSTAHS into the /LOBSTAHS/doc/ subdirectory of your R library path. You can use .libPaths() to find the location of your R library. Alternatively, the latest versions of the templates can dopwnloaded from the LOBSTAHS GitHub repository.

What modifications must be made to which table(s) will depend on the user’s goal. For example, if the user wishes only to expand (or constrain) the range of molecular properties for which database entries are to be created within an existing lipid class, only the acylRanges and/or oxyRanges tables need be modified. However, a user seeking to add a new molecules or lipid class to the database(s) will need to first define the new molecule/class in both the componentCompTable and adductHierarchies tables. If the new entry is a single molecule (e.g., a new pigment), the new data in these two tables will be sufficient. However, if a new class of acyl lipid is defined, corresponding data must also be added to the acylRanges and (if necessary) oxyRanges tables.

Once any modificatons are made, the user should save the table(s) to text file(s) in comma-separated values (.csv) format; these can then be imported into R when the generateLOBdbase() function is called.

Database generation using generateLOBdbase

LOBSTAHS databases are generated using the generateLOBdbase() function. generateLOBdbase uses an in silico simulation to create database entries defined by parameters in the input tables. One entry is created for each possible adduct ion of a parent compound. generateLOBdbase can create databases for one or both ion modes. Paths to any .csv files containing user-customized input data should be specified during the call to generateLOBdbase; if NULL or no value is specified for any of the input tables, the defaults (see above) will be used. Finally, the user can specify whether a .csv file containing the new database should be created in addition to the LOBdbase object.

To recreate the default databases and store them to an object called “LOBdb”, the user would run the following:

LOBdb = generateLOBdbase(polarity = c("positive","negative"), gen.csv = FALSE,
                 component.defs = NULL, AIH.defs = NULL, acyl.ranges = NULL,
                 oxy.ranges = NULL)

Compound/biomarker identification and screening using doLOBscreen: The “meat” of LOBSTAHS

Once the user has created his/her database (or has decided to use the appropriate onboard default), compound identification and screening can be performed using the function doLOBscreen(). At this point, the user should have in hand an xsAnnotate object containing data which have processed in xcms and CAMERA. Working sequentially within each CAMERA pseudospectrum, doLOBscreen accomplishes the following (see also the schematic):

  1. First, if user elected remove.iso = TRUE, any secondary isotope peaks identified by CAMERA are removed from the dataset.

  2. Using a matching tolerance (match.ppm) and database specified by the user, putative (initial) compound assignments are applied to features in the dataset. The match.ppm should reflect the accuracy of the mass spectrometer used to acquire the data. If no value is given for database, the default database of the appropriate polarity will be used. polarity = positive or polarity = negative should be specified. If no polarity is given, doLOBscreen will attempt to detect it from the dataset… but detection is not perfect. At this point in the screening process, the current version of LOBSTAHS discards any features for which a match was not found in the database.6

  3. If user elected rt.restrict = TRUE, the (corrected) retention time of each feature to which an assignment has just been made is compared against the retention time range expected for the assignment’s parent lipid class. These expected retention time ranges, or “windows,” are specified in a table. The default values (specific to the chromatography under which LOBSTAHS was developed at Woods Hole Oceanographic Institution7) can be accessed in a manner similar to the database generation input defaults:

    data(default.rt.windows)

    The default retention time windows are also given in Table S2 of the electronic supplement to Collins et al. 2016.

    If the user wishes to use the retention time restriction feature with his/her own retention time data (highly recommended) a .csv table can be created from an Excel template available in the same subdirectory (/LOBSTAHS/doc/) of the R library path where the database generation templates reside. The template can also be downloaded from the LOBSTAHS GitHub repository.

    Important note: To account for shifts in retention time that occur during chromatographic alignment in xcms, LOBSTAHS automatically expands the retention time ranges given in the rt.windows table by 10% at each extreme. If xcms retention time correction results in large differences between raw and corrected retention times, the user should elect not to apply retention time restriction in LOBSTAHS (i.e., set rt.restrict = FALSE) since valid feartures could be lost. The extent of deviation between raw and corrected retention times can be diagnosed using the retention time correction profile plot, obtained with plottype = "mdevden" when calling rector in xcms. A future version of LOBSTAHS will allow the user to the set the factor by which the lipid class retention time windows should be expanded.

  4. Next, assignments with an odd total number of acyl carbon atoms can be eliminated (achieved by setting exclude.oddFA = TRUE). Applies only to acyl lipids (i.e., IP-DAG, TAG, PUA, or free fatty acids). Useful if data are (or are believed to be) of exclusively eukaryotic origin, since synthesis of fatty acids with odd numbers of carbon atoms is not known in eukaryotes.8

  5. A series of adduct ion hierarchy rules are then applied to the remaining assignments. The theory and development of these rules is described in Collins et al 2016. and summarized in the schematic above. During the adduct ion screening process, a series of codes are applied to each assignment to indicate the degree to which it satisfied the hierarchy rules Unlike the other screening features, application of the adduct ion hierarchy rules is not optional.

  6. Once the list of compound assignments has been screened by pseudospectrum according to the user’s spefifications, the assignments are then evaluated as a single group to identify possible isomers and isobars (compounds having distinct but very similar m/z). These isomers and isobars are annotated with additional codes so the user can examine them in subsequent analysis.

Once done, doLOBscreen returns a LOBSet object containing the fully screened dataset. To screen the PtH2O2lipids xsAnnotate object using the same settings that produced the results in Collins et al. 2016, the user would run:

myPtH2O2LOBSet = doLOBscreen(ptH2O2lipids$xsAnnotate, polarity = "positive",
                             database = NULL, remove.iso = TRUE,
                             rt.restrict =  TRUE, rt.windows = NULL,
                             exclude.oddFA = TRUE, match.ppm = 2.5)

In this example, the object “myPtH2O2LOBSet” would be identical to the screened LOBSet in the PtH2O2lipids package (ptH2O2lipids$LOBSet). Information about a LOBSet can be viewed by calling the object at the R prompt. For example:

ptH2O2lipids$LOBSet
#> A positive polarity "LOBSet" containing LC-MS peak data. Compound assignments and adduct ion hierarchy screening annotations applied to 16 samples using the "LOBSTAHS" package.
#> 
#> Individual peaks: 21869 
#> Peak groups: 1595 
#> Compound assignments: 1969 
#> m/z range: 551.425088845409-1269.09515435315 
#> 
#> Peak groups having possible regisomers: 556 
#> Peak groups having possible structural functional isomers: 375 
#> Peak groups having isobars indistinguishable within ppm matching tolerance: 84 
#> 
#> Restrictions applied prior to conducting adduct ion hierarchy screening: remove.iso, rt.restrict, exclude.oddFA 
#> 
#> Match tolerance used for database assignments: 2.5 ppm
#> 
#> Memory usage: 1.26 MB

More detailed diagnostic information can also be obtained from the LOBSet. The effectiveness of the various screening criteria are recorded in a data frame LOBscreen_diagnostics:

LOBscreen_diagnostics(ptH2O2lipids$LOBSet)
#>                     peakgroups  peaks assignments parent_compounds
#> initial                  18314 251545          NA               NA
#> post_remove_iso          12146 163938          NA               NA
#> initial_assignments       5077  67862       15929            14076
#> post_rt_restrict          4451  60070       13504            11779
#> post_exclude_oddFA        3871  52337        7458             6283
#> post_AIHscreen            1595  21869        2056             1969

The numbers of isomers/isboars identified, and the number of assignments/compounds affected by these identifications, are recorded in LOBisoID_diagnostics:

LOBisoID_diagnostics(ptH2O2lipids$LOBSet)
#>                      peakgroups parent_compounds assignments features
#> C3r_regio.iso               556              352         750     7591
#> C3f_funct.struct.iso        375              577         752     5057
#> C3c_isobars                  84              162         195     1137

Follow-on analysis of screened data

With the set of screened compound assignments in hand, the user has several options. Users familiar with R can extract data for further analysis or screening directly from the LOBSet object. Alternatively, the function getLOBpeaklist can be used to extract a table of results from the LOBSet. Options in getLOBpeaklist allow the user to (1) include isomer and isobar cross-references (the default; recommended) and (2) simultaneously generate a .csv file with the results. The .csv file is exported with a unique timestamp to the R working directory.

Package updates and improvements

Users are encouraged to submit issues with LOBSTAHS via the package GitHub site. Several improvements to the package are planned; the package will be updated as these are incorporated. Notice will be posted to the package releases page when updates are published.

References

Benton, H. P., Want, E. J., and Ebbels, T. M. D. 2010. Correction of mass calibration gaps in liquid chromatography-mass spectrometry metabolomics data. Bioinformatics 26: 2488-2489

Collins, J.R., Edwards, B.R., Fredricks, H.F., and Van Mooy, B.A.S. 2016. LOBSTAHS: An adduct-based lipidomics strategy for discovery and identification of oxidative stress biomarkers. Anal. Chem. 88: 7154-7162; doi:10.1021/acs.analchem.6b01260

Hummel, J., Segu, S., Li, Y., Irgang, S., Jueppner, J., and Giavalisco, P. 2011. Ultra performance liquid chromatography and high resolution mass spectrometry for the analysis of plant lipids. Front Plant Sci 2

Kuhl, C., Tautenhahn, R., Bottcher, C., Larson, T. R., and Neumann, S. 2012. CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets. Anal. Chem. 84: 283-289

Libiseller, G., Dvorzak, M., Kleb, U., Gander, E., Eisenberg, T., Madeo, F., Neumann, S., Trausinger, G., Sinner, F., Pieber, T., and Magnes, C. 2015. IPO: a tool for automated optimization of XCMS parameters. BMC Bioinformatics 16: 118

Patti, G.J., Tautenhahn, R., and Siuzdak, G. 2012. Meta-analysis of untargeted metabolomic data from multiple profiling experiments. Nat. Protocols 7: 508-516

Pearson, A. 2014. Lipidomics for geochemistry. In Treatise on Geochemistry, Holland, H. D., and Turekian, K. K. eds., 2nd Ed., pp. 291-336. Elsevier: Oxford.

Smith, C. A., Want, E. J., O’Maille, G., Abagyan, R., and Siuzdak, G. 2006. XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification. Anal. Chem. 78: 779–787

Tautenhahn, R., Boettcher, C., and Neumann, S. 2008. Highly sensitive feature detection for high resolution LC/MS. BMC Bioinformatics 9: 504

Notes

  1. Collins, J.R., Edwards, B.R., Fredricks, H.F., and Van Mooy, B.A.S., 2016, “LOBSTAHS: An adduct-based lipidomics strategy for discovery and identification of oxidative stress biomarkers,” Anal. Chem. 88: 7154-7162; doi:10.1021/acs.analchem.6b01260

  2. Smith et al., 2006, “XCMS: processing mass spectrometry data for metabolite profiling using nonlinear peak alignment, matching, and identification,” Anal. Chem. 78: 779–787; Tautenhahn et al., 2008, “Highly sensitive feature detection for high resolution LC/MS,” BMC Bioinformatics 9: 504; Benton et al., 2010, “Correction of mass calibration gaps in liquid chromatography-mass spectrometry metabolomics data,” Bioinformatics 26: 2488-2489

  3. Kuhl et al., 2012, “CAMERA: an integrated strategy for compound spectra extraction and annotation of liquid chromatography/mass spectrometry data sets,” Anal. Chem. 84: 283-289

  4. See, e.g., Patti et al., 2012, “Meta-analysis of untargeted metabolomic data from multiple profiling experiment,” Nature Protocols 7: 508-516

  5. Libiseller et al., 2015, “IPO: a tool for automated optimization of XCMS parameters,” BMC Bioinformatics 16: 118

  6. A planned improvement, currently under development, will allow the user to retain unidentified features in a separate data object.

  7. We used a modified version of the chromatography presented in Hummel et al., 2011, “Ultra performance liquid chromatography and high resolution mass spectrometry for the analysis of plant lipids,” Front Plant Sci 2; see electronic supplement to Collins et al. 2016 for full specification.

  8. See A. Pearson, 2014, “Lipidomics for geochemistry” in Treatise on Geochemistry (Holland, H. D., and Turekian, K. K. eds.), 2nd Ed., Elsevier, Oxford, pp. 291-336.