AdductomicsR workflow

Josie Hayes

April 25, 2023

Getting Started

# ensure you have mzR installed
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
# no version is pinned, so BiocManager uses the Bioconductor release
# that matches your installed R version
BiocManager::install("mzR")

# install the package directly from Github
library(devtools)
devtools::install_github("JosieLHayes/adductomicsR")

# install ExperimentHub, which is used to download the example data
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("ExperimentHub")

#or download the packages and install from source
library(devtools)
devtools::install("path_to_dir/adductomicsR")
devtools::install("path_to_dir/adductData")

After installing the adductomicsR package and all of its dependencies, attach the packages by typing (or copying and pasting) the following lines into the R console and pressing Enter:

# load the package
library(adductomicsR)
library(adductData)
library(ExperimentHub)

Two mzXML files are provided in adductData for use in this vignette.

Preparation of the data

Mass drift correction: Mass drift is usually corrected using lock masses on the mass spectrometer. If this has not been done, a Python script is provided at /inst/extdata/thermo_MassDriftCalc.py within the directory where the package is saved on your computer. It can be launched from within Python as follows (replace the path with the location of the script on your system): exec(open("thermo_MassDriftCalc.py").read())
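
As a rough illustration of what mass drift means in practice, the sketch below expresses the offset of a reference mass in parts per million (ppm) and applies a simple median correction. It does not reproduce the provided Python script; the reference m/z and the observed values are hypothetical numbers chosen only for the arithmetic.

```r
# Hypothetical values for illustration only: a reference (lock) mass and the
# m/z values observed for it in three consecutive scans.
theoretical_mz <- 500.0000
observed_mz <- c(500.0010, 500.0015, 500.0025)

# Mass error in ppm = (observed - theoretical) / theoretical * 1e6
ppm_error <- (observed_mz - theoretical_mz) / theoretical_mz * 1e6
print(round(ppm_error, 2))

# Use the median ppm error as a simple, assumed correction factor
median_ppm <- median(ppm_error)
corrected_mz <- observed_mz / (1 + median_ppm / 1e6)
print(round(corrected_mz, 4))
```

A drift of a few ppm is comparable to the 5 ppm mass tolerance used later in this vignette, which is why uncorrected drift can cause peaks to be missed.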

Retention time correction

Each sample is corrected for retention time drift using the rtDevModelling function. To run it with the default parameters, supply the path of the directory containing your mzXML files and the run order file (the order in which the samples were run). For further information on the parameters, see ??rtDevModelling. An example run order file is available in inst/extdata (within the directory where the package is saved on your computer), and 2 mzXML files are available from adductData via ExperimentHub. These files are used automatically in this vignette.

Download the mzXML files from ExperimentHub for use in this vignette. The package only recognizes files with the .mzXML extension, so the cached files are also renamed.

eh  = suppressMessages(suppressWarnings(ExperimentHub::ExperimentHub()))
temp = suppressMessages(suppressWarnings(
AnnotationHub::query(eh, 'adductData')))
suppressMessages(suppressWarnings(temp[['EH1957']])) #first mzXML file
##                                                       EH1957 
## "/home/biocbuild/.cache/R/ExperimentHub/1b12302eb6ed28_1957"
file.rename(cache(temp["EH1957"]), file.path(hubCache(temp),
                                             'ORB35017.mzXML'))
## [1] TRUE
temp[['EH1958']] #second mzXML file
## see ?adductData and browseVignettes('adductData') for documentation
## downloading 1 resources
## retrieving 1 resource
## loading from cache
##                                                       EH1958 
## "/home/biocbuild/.cache/R/ExperimentHub/1b12307490d887_1958"
file.rename(cache(temp["EH1958"]), file.path(hubCache(temp), 'ORB35022.mzXML'))
## [1] TRUE
rtDevModelling(
  MS2Dir = hubCache(temp),
  nCores=4,
  runOrder =paste0(system.file("extdata", 
                               package ="adductomicsR"),'/runOrder.csv')
  )
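
Because only files ending in .mzXML are recognized, it can be worth checking a directory before passing it as MS2Dir. The following is a minimal sketch using base R only; the directory and file names are placeholders created in a temporary folder, not files from adductData.

```r
# Hypothetical directory standing in for the folder that holds your data
ms2_dir <- file.path(tempdir(), "mzXML_check")
dir.create(ms2_dir, showWarnings = FALSE)

# Create empty placeholder files: two with the expected extension, one without
file.create(file.path(ms2_dir, c("sampleA.mzXML", "sampleB.mzXML", "sampleC")))

# Anchor the pattern so that only names ending in ".mzXML" match
mzxml_files <- list.files(ms2_dir, pattern = "\\.mzXML$", full.names = TRUE)
all_files <- list.files(ms2_dir, full.names = TRUE)

# Files that lack the extension and would therefore need renaming
not_matched <- setdiff(all_files, mzxml_files)
print(basename(mzxml_files))
print(basename(not_matched))
```

Any file listed in not_matched would need to be renamed, as was done with file.rename() for the two ExperimentHub files above.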

Identify adducts

The specSimPepId function detects adducts present on the peptide. To run it with the default parameters (the largest triply charged peptide of human serum albumin), supply the path of the directory containing your mzXML files and the path of the rtDevModels object. For further information on running it with different peptides, see ??specSimPepId. The function produces MS2 spectra plots, saved in a separate directory for each sample. A plot of the model spectrum is also saved in the mzXML files directory for comparison. The spectra are grouped based on the m/z and RT windows, and plots of these groups are provided for both the raw RT and the adjusted RT. These plots can be used to determine whether multiple groups pertain to the same peak.

specSimPepId(
  MS2Dir = hubCache(temp),
  nCores=4, 
  rtDevModels =paste0(hubCache(temp),'/rtDevModels.RData')
  )

Generate a target table for quantification

A list of the adducts for quantification, with their monoisotopic mass (MIM), retention time (RT), peptide and charge, is generated using the following command. Replace the file path of the allResults file with the location of your own allResults file from the previous step.

generateTargTable(
  allresultsFile=paste0(system.file("extdata",package =
  "adductomicsR"),'/allResults_ALVLIAFAQYLQQCPFEDHVK_example.csv'),
  csvDir=tempdir(check = FALSE)
  )

It is recommended that the allGroups plot (m/z vs RT) is used to check that the adducts in the target table do not pertain to the same peak, because the quantification step can be computationally intensive and duplicate targets add unnecessary work.
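
A numerical check can complement the visual inspection of the plot. The sketch below flags pairs of targets that are close in both m/z and RT; the column names, values and tolerances are all hypothetical and should be adapted to your own target table.

```r
# Hypothetical target table; column names and values are illustrative only
targets <- data.frame(
  mass = c(800.4000, 800.4005, 820.1000),
  RT   = c(10.20, 10.25, 12.80)
)

max_ppm <- 5        # assumed m/z tolerance in ppm
max_rt_diff <- 0.1  # assumed RT tolerance, in the same units as the RT column

# Enumerate every pair of targets (one pair per row)
pairs <- t(combn(nrow(targets), 2))
ppm_diff <- abs(targets$mass[pairs[, 1]] - targets$mass[pairs[, 2]]) /
  targets$mass[pairs[, 1]] * 1e6
rt_diff <- abs(targets$RT[pairs[, 1]] - targets$RT[pairs[, 2]])

# Pairs that are close in both dimensions may describe the same peak
candidates <- pairs[ppm_diff <= max_ppm & rt_diff <= max_rt_diff, , drop = FALSE]
print(candidates)
```

Each row of candidates holds the row indices of two targets that merit a closer look before quantification.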

Quantify adducts

See ??adductQuant for an explanation of the parameters for this function. To use the target table produced in the previous step, change the value of the 'targTable' option to the path of your target table. Similarly, point filePaths at your own mzXML files, for example those in a directory such as "Users/Documents/mzXMLfiles" (the example below uses the ExperimentHub cache directory).

adductQuant(
  nCores=2, 
  targTable=paste0(system.file("extdata", 
                               package="adductomicsR"),
                               '/exampletargTable2.csv'), 
  intStdRtDrift=30, 
  rtDevModels= paste0(hubCache(temp),'/rtDevModels.RData'),
  filePaths=list.files(hubCache(temp),pattern="\\.mzXML$",
                       all.files=FALSE,full.names=TRUE),
  quantObject=NULL,
  indivAdduct=NULL,
  maxPpm=5,
  minSimScore=0.8,
  spikeScans=1,
  minPeakHeight=100,
  maxRtDrift=20,
  maxRtWindow=240,
  isoWindow=80,
  hkPeptide='LVNEVTEFAK', 
  gaussAlpha=16
  )

Extract the results from the AdductQuantif Object

It is recommended that spectra for each of the adducts found are checked manually using LC-MS software, either at this step or before quantification.

To load your AdductQuantif object, set the path to the file on your system. The example below loads it from the ExperimentHub cache directory used in the previous steps.

#load the adductquantif object 
load(paste0(hubCache(temp),"/adductQuantResults.Rda"))

#produce a peakTable from the Adductquantif object and save to a temporary
#directory
suppressMessages(suppressWarnings(outputPeakTable(object=
    object, outputDir=tempdir(check = FALSE))))
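
The call above passes a variable named object, which suggests that this is the name under which the results were saved. If you are unsure what a given .Rda file contains, base R's load() returns the names of the objects it restores. The sketch below demonstrates this with a toy file; the file name and contents are placeholders, not real package output.

```r
# Save a toy object to a temporary .Rda file (stand-in for a real results file)
rda_path <- file.path(tempdir(), "toyResults.Rda")
object <- list(note = "placeholder for an AdductQuantif object")
save(object, file = rda_path)

# Load into a separate environment; load() returns the names it restored
results_env <- new.env()
loaded_names <- load(rda_path, envir = results_env)
print(loaded_names)

# Retrieve the object by name without overwriting anything in the workspace
restored <- get(loaded_names[1], envir = results_env)
str(restored)
```

Loading into a dedicated environment avoids silently replacing an existing variable of the same name.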

Filter the results from the peak area table

Mass spectrometry data is inherently noisy, and the filterAdductTable() function will filter out samples and adducts based on set thresholds. It is recommended to use this filter function to remove adducts that have many missing values and samples where the housekeeping peptide is weak, suggestive of misintegration. Substitute the name of the peaklist file with the path and the name of your peaklist file produced in the previous step.

filterAdductTable(
  paste0(tempdir(check = FALSE),"/adductQuantif_peakList_", Sys.Date(), ".csv")
  )
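
To make the filtering idea concrete, the following sketch applies the two criteria described above to a toy table using base R. It is not the implementation of filterAdductTable(); the table layout, column names and thresholds are assumptions made for illustration.

```r
# Toy peak-area table: rows are samples, columns are peptides (values invented)
peak_areas <- data.frame(
  housekeeping = c(5000, 4800, 50, 5200),
  adduct_1     = c(120, 130, 10, 125),
  adduct_2     = c(NA, NA, NA, 90),
  row.names    = paste0("sample_", 1:4)
)

min_housekeeping <- 1000  # assumed threshold for a usable housekeeping signal
max_missing_frac <- 0.5   # assumed maximum fraction of missing values per adduct

# Drop samples whose housekeeping peptide signal is weak
keep_samples <- peak_areas$housekeeping >= min_housekeeping
filtered <- peak_areas[keep_samples, , drop = FALSE]

# Drop adducts that are missing in too many of the remaining samples
missing_frac <- colMeans(is.na(filtered))
filtered <- filtered[, missing_frac <= max_missing_frac, drop = FALSE]
print(filtered)
```

In this toy table, sample_3 is removed for its weak housekeeping signal and adduct_2 is removed because it is missing in two of the three remaining samples.
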
#session info
sessionInfo()
## R version 4.3.0 RC (2023-04-13 r84269)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 22.04.2 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.17-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] adductomicsR_1.16.0 ExperimentHub_2.8.0 AnnotationHub_3.8.0
## [4] BiocFileCache_2.8.0 dbplyr_2.3.2        BiocGenerics_0.46.0
## [7] adductData_1.15.0  
## 
## loaded via a namespace (and not attached):
##  [1] ade4_1.7-22                   tidyselect_1.2.0             
##  [3] dplyr_1.1.2                   blob_1.2.4                   
##  [5] filelock_1.0.2                Biostrings_2.68.0            
##  [7] bitops_1.0-7                  fastmap_1.1.1                
##  [9] RCurl_1.98-1.12               promises_1.2.0.1             
## [11] digest_0.6.31                 mime_0.12                    
## [13] lifecycle_1.0.3               cluster_2.1.4                
## [15] ellipsis_0.3.2                KEGGREST_1.40.0              
## [17] interactiveDisplayBase_1.38.0 RSQLite_2.3.1                
## [19] kernlab_0.9-32                magrittr_2.0.3               
## [21] compiler_4.3.0                rlang_1.1.0                  
## [23] sass_0.4.5                    tools_4.3.0                  
## [25] utf8_1.2.3                    yaml_2.3.7                   
## [27] knitr_1.42                    htmlwidgets_1.6.2            
## [29] bit_4.0.5                     mclust_6.0.0                 
## [31] curl_5.0.0                    xml2_1.3.3                   
## [33] plyr_1.8.8                    withr_2.5.0                  
## [35] purrr_1.0.1                   nnet_7.3-18                  
## [37] grid_4.3.0                    stats4_4.3.0                 
## [39] fansi_1.0.4                   xtable_1.8-4                 
## [41] pastecs_1.3.21                prabclus_2.3-2               
## [43] iterators_1.0.14              fpc_2.2-10                   
## [45] MASS_7.3-59                   cli_3.6.1                    
## [47] rmarkdown_2.21                crayon_1.5.2                 
## [49] generics_0.1.3                robustbase_0.95-1            
## [51] reshape2_1.4.4                httr_1.4.5                   
## [53] DBI_1.1.3                     cachem_1.0.7                 
## [55] stringr_1.5.0                 zlibbioc_1.46.0              
## [57] modeltools_0.2-23             rvest_1.0.3                  
## [59] parallel_4.3.0                AnnotationDbi_1.62.0         
## [61] BiocManager_1.30.20           XVector_0.40.0               
## [63] vctrs_0.6.2                   boot_1.3-28.1                
## [65] jsonlite_1.8.4                IRanges_2.34.0               
## [67] S4Vectors_0.38.0              bit64_4.0.5                  
## [69] diptest_0.76-0                foreach_1.5.2                
## [71] jquerylib_0.1.4               glue_1.6.2                   
## [73] DEoptimR_1.0-12               codetools_0.2-19             
## [75] DT_0.27                       stringi_1.7.12               
## [77] BiocVersion_3.17.1            later_1.3.0                  
## [79] GenomeInfoDb_1.36.0           tibble_3.2.1                 
## [81] pillar_1.9.0                  rappdirs_0.3.3               
## [83] htmltools_0.5.5               GenomeInfoDbData_1.2.10      
## [85] R6_2.5.1                      evaluate_0.20                
## [87] shiny_1.7.4                   Biobase_2.60.0               
## [89] lattice_0.21-8                png_0.1-8                    
## [91] memoise_2.0.1                 httpuv_1.6.9                 
## [93] bslib_0.4.2                   class_7.3-21                 
## [95] Rcpp_1.0.10                   flexmix_2.3-19               
## [97] xfun_0.39                     pkgconfig_2.0.3