1 Introduction

The msPurity package can perform spectral matching against the MassBank and LipidBlast libraries, which are stored in a SQLite database.

The user is free to add any other data to the SQLite database as well.
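As a rough illustration of adding data to a SQLite file from R, the sketch below uses the DBI and RSQLite packages. The temporary database, the table name `user_peaks` and its columns are hypothetical stand-ins. For new records to take part in matching they would presumably have to follow the layout of the existing library tables, which is not described here, so inspect those tables first.

```r
library(DBI)
library(RSQLite)

# Stand-in database: a temporary file, NOT the library used by msPurity.
db_path <- tempfile(fileext = ".sqlite")
con <- dbConnect(SQLite(), db_path)

# Hypothetical table name and columns, for illustration only.
dbWriteTable(con, "user_peaks",
             data.frame(spectrum_id = c(1L, 1L),
                        mz = c(85.03, 127.04),
                        i = c(40, 100)))

# Append further rows to the same table.
dbWriteTable(con, "user_peaks",
             data.frame(spectrum_id = 2L, mz = 60.05, i = 100),
             append = TRUE)

# Check how many rows the table now holds, then close the connection.
n_rows <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM user_peaks")$n
dbDisconnect(con)
```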

TODO: More detailed description to be added.

2 Spectral matching

2.1 LC-MS/MS

The spectral matching results are linked back to XCMS features, so XCMS must be run first.

(Please use settings appropriate for your data.)

library(msPurity)
## Loading required package: Rcpp
msmsPths <- list.files(system.file("extdata", "lcms", "mzML", package="msPurityData"), full.names = TRUE, pattern = "MSMS")
xset <- xcms::xcmsSet(msmsPths, BPPARAM = BiocParallel::SerialParam())
## Loading required package: xcms
## Loading required package: Biobase
## Loading required package: BiocGenerics
## Loading required package: parallel
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:parallel':
## 
##     clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
##     clusterExport, clusterMap, parApply, parCapply, parLapply,
##     parLapplyLB, parRapply, parSapply, parSapplyLB
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     Filter, Find, Map, Position, Reduce, anyDuplicated, append,
##     as.data.frame, basename, cbind, colMeans, colSums, colnames,
##     dirname, do.call, duplicated, eval, evalq, get, grep, grepl,
##     intersect, is.unsorted, lapply, lengths, mapply, match, mget,
##     order, paste, pmax, pmax.int, pmin, pmin.int, rank, rbind,
##     rowMeans, rowSums, rownames, sapply, setdiff, sort, table,
##     tapply, union, unique, unsplit, which, which.max, which.min
## Welcome to Bioconductor
## 
##     Vignettes contain introductory material; view with
##     'browseVignettes()'. To cite Bioconductor, see
##     'citation("Biobase")', and for packages 'citation("pkgname")'.
## Loading required package: BiocParallel
## Loading required package: MSnbase
## Loading required package: mzR
## Loading required package: S4Vectors
## Loading required package: stats4
## 
## Attaching package: 'S4Vectors'
## The following object is masked from 'package:base':
## 
##     expand.grid
## Loading required package: ProtGenerics
## 
## This is MSnbase version 2.8.3 
##   Visit https://lgatto.github.io/MSnbase/ to get started.
## 
## Attaching package: 'MSnbase'
## The following object is masked from 'package:stats':
## 
##     smooth
## The following object is masked from 'package:base':
## 
##     trimws
## 
## This is xcms version 3.4.2
## 
## Attaching package: 'xcms'
## The following object is masked from 'package:stats':
## 
##     sigma
xset <- xcms::group(xset)
## Processing 3163 mz slices ...
## OK
xset <- xcms::retcor(xset)
## Performing retention time correction using 351 peak groups.
xset <- xcms::group(xset)
## Processing 3163 mz slices ... OK

The purityA function is then called to calculate the precursor purity of the fragmentation results, and the frag4feature function links the fragmentation data back to the XCMS features.

A SQLite database containing all the results is also generated.

pa <- purityA(msmsPths, interpol = "linear")
pa <- frag4feature(pa, xset, create_db = TRUE)
## Creating a database of fragmentation spectra and LC features
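To see what the generated database contains, it can be opened like any other SQLite file. The helper below is a minimal sketch using DBI and RSQLite. `summarise_sqlite` is a hypothetical name, not an msPurity function, and it makes no assumptions about which tables msPurity creates.

```r
library(DBI)
library(RSQLite)

# Return the name and row count of every table in a SQLite file.
summarise_sqlite <- function(db_path) {
  con <- dbConnect(SQLite(), db_path)
  on.exit(dbDisconnect(con), add = TRUE)

  tables <- dbListTables(con)
  n_rows <- vapply(tables, function(tbl) {
    # Quote the table name so unusual names are handled safely.
    sql <- paste0("SELECT COUNT(*) AS n FROM ", dbQuoteIdentifier(con, tbl))
    as.numeric(dbGetQuery(con, sql)$n)
  }, numeric(1))

  data.frame(table = tables, rows = unname(n_rows), stringsAsFactors = FALSE)
}
```

With the objects created above, the call would be `summarise_sqlite(pa@db_path)`.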

The spectral matching is then run on all fragmentation scans collected in all files using the spectral_matching function. The function updates the previously generated database with the spectral matching annotations.

Because all the connections between fragmentation scans and XCMS features are recorded, we can now see which XCMS features have been annotated. A summary of the annotations for the XCMS grouped peaks is provided in the output.

result <- spectral_matching(pa@db_path, out_dir = tempdir())
## Running msPurity spectral matching function for LC-MS(/MS) data
## Performing spectral matching
## Warning in result_fetch(res@ptr, n = n): Column `precursor_mz`: mixed type,
## first seen values of type real, coercing other values of type string
## Warning in result_fetch(res@ptr, n = n): Column `mass_accuracy`: mixed type,
## first seen values of type real, coercing other values of type string
## Summarising LC features annotations
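To give a feel for what a spectral match score measures, the sketch below computes a cosine (normalised dot-product) similarity between two centroided spectra in base R. It is an illustration only: the function name, the `mz` and `i` column names and the 0.01 Da tolerance are assumptions, and the scoring, weighting and tolerance handling inside spectral_matching may differ.

```r
# Cosine similarity between two spectra given as data frames
# with columns `mz` (m/z) and `i` (intensity).
# `tol` is an absolute m/z tolerance in Da (illustrative default).
cosine_similarity <- function(query, reference, tol = 0.01) {
  used <- rep(FALSE, nrow(reference))
  matched_ref <- numeric(nrow(query))  # stays 0 where a query peak has no partner

  # Visit query peaks from most to least intense so strong peaks pair first.
  for (k in order(query$i, decreasing = TRUE)) {
    d <- abs(reference$mz - query$mz[k])
    d[used] <- Inf                     # each reference peak is used at most once
    if (length(d) > 0 && min(d) <= tol) {
      j <- which.min(d)
      used[j] <- TRUE
      matched_ref[k] <- reference$i[j]
    }
  }

  # Unmatched peaks add to the norms but not to the dot product.
  sum(query$i * matched_ref) / sqrt(sum(query$i^2) * sum(reference$i^2))
}

a <- data.frame(mz = c(100.00, 150.05, 200.10), i = c(10, 50, 100))
b <- data.frame(mz = c(100.00, 150.05, 200.10), i = c(20, 100, 200))

cosine_similarity(a, a)  # identical spectra
cosine_similarity(a, b)  # same peaks, intensities doubled
```

Both calls give a score of 1, because the measure depends on the relative intensity pattern and not on the absolute scale. Spectra with no peaks in common score 0.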

It should be noted that in a typical Data Dependent Acquisition (DDA) experiment, not all of the fragmentation scans collected can be linked back to an associated XCMS feature, and in some cases the percentage of XCMS features with fragmentation spectra can be quite small.
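The coverage mentioned above can be estimated with a small helper such as the one sketched here. `frag_coverage` is a hypothetical function, not part of msPurity. The commented call assumes that `pa@grped_df` holds one row per linked fragmentation scan with the XCMS feature ID in a `grpid` column, which should be checked against the installed version.

```r
# Percentage of XCMS features with at least one linked fragmentation scan.
# `linked_feature_ids`: feature IDs of the linked scans (repeats allowed).
# `n_features`: total number of XCMS features.
frag_coverage <- function(linked_feature_ids, n_features) {
  100 * length(unique(linked_feature_ids)) / n_features
}

# Toy example: 200 features, scans linked to features 3, 3, 17 and 42.
frag_coverage(c(3, 3, 17, 42), n_features = 200)  # 1.5

# With the objects from this vignette (assumed slot and column names):
# frag_coverage(pa@grped_df$grpid, nrow(xcms::groups(xset)))
```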

2.2 DI-MS/MS or DI-MSn

TODO