Compiled date: 2021-05-19

Last edited: 2021-10-05

License: GPL-3

1 Installation

Run the following code to install the Bioconductor version of the package.

# install.packages("BiocManager")
BiocManager::install("fobitools")

2 Load fobitools

library(fobitools)

You can also load some additional packages that will be very useful in this vignette.

library(dplyr)
library(kableExtra)

3 metaboliteUniverse and metaboliteList

In microarray experiments, for example, almost all the genes of an organism can be measured in a sample, so it makes sense to perform an over-representation analysis (ORA) against all the genes present in Gene Ontology (GO), since most GO terms would be represented by at least one gene on the microarray.

This is different in nutrimetabolomics. Targeted nutrimetabolomics studies quantify sets of about 200-500 diet-related metabolites, so it would not make sense to use all known metabolites (for example, those in HMDB or ChEBI) as the background of an ORA, because most of them would not have been quantified in the study.
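The effect of the background on an ORA can be illustrated with a small base R sketch. It assumes the usual hypergeometric formulation of ORA, which is not necessarily how fobitools::ora is implemented internally. The helper ora_pvalue and all the numbers below are hypothetical, and the class size is held fixed for simplicity.

```r
# Over-representation p-value: probability of observing at least `overlap`
# class members in a list of `list_size` metabolites drawn from the universe
# (hypothetical helper, standard hypergeometric upper tail)
ora_pvalue <- function(overlap, class_size, list_size, universe_size) {
  phyper(overlap - 1,
         m = class_size,                  # class members in the universe
         n = universe_size - class_size,  # non-members in the universe
         k = list_size,                   # size of the selected list
         lower.tail = FALSE)
}

# Same class, list, and overlap; only the universe size differs (toy numbers)
p_measured <- ora_pvalue(overlap = 3, class_size = 5,
                         list_size = 10, universe_size = 300)
p_database <- ora_pvalue(overlap = 3, class_size = 5,
                         list_size = 10, universe_size = 100000)

c(measured_universe = p_measured, whole_database = p_database)
```

With the inflated universe, the same overlap of 3 metabolites looks far more significant, even though nothing about the measured data has changed. This is why the universe should contain only the metabolites that were actually analysed.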

In nutrimetabolomic studies, it may be more interesting to ask which foods or food groups are enriched (over-represented) among the metabolites selected by the statistical analysis than to look for enriched metabolic pathways, which would make more sense in genomics or other metabolomics studies.

The Food-Biomarker Ontology (FOBI) provides the biological knowledge needed to conduct these enrichment analyses in nutrimetabolomic studies, as it describes the relationships between foods and their associated dietary metabolites (Castellano-Escuder et al. 2020).

Accordingly, to perform an ORA with the fobitools package, it is necessary to provide a metabolite universe (all metabolites included in the statistical analysis) and a list of selected metabolites (the subset of the universe that meets a statistical criterion).
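In a real study, both inputs would usually come from the same table of statistical results. The following base R sketch shows one way this could look. The data frame, its column names, the identifiers, and the 0.05 threshold are all hypothetical and only serve as an illustration.

```r
# Hypothetical per-metabolite test results; IDs and p-values are made up
toy_results <- data.frame(
  metabolite = c("MET_A", "MET_B", "MET_C", "MET_D", "MET_E"),
  pvalue     = c(0.001, 0.200, 0.030, 0.700, 0.049),
  stringsAsFactors = FALSE
)

# Universe: every metabolite that entered the statistical analysis
toy_universe <- toy_results$metabolite

# List: metabolites that pass the chosen criterion (here, p < 0.05)
toy_list <- toy_results$metabolite[toy_results$pvalue < 0.05]

toy_list
```

By construction, the selected list is always a subset of the universe, which is what an ORA expects.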

Here is an example that uses random selections from FOBI in place of real study results:

# select 300 random metabolites from FOBI
idx_universe <- sample(nrow(fobitools::idmap), 300, replace = FALSE)
metaboliteUniverse <- fobitools::idmap %>%
  dplyr::slice(idx_universe) %>%
  pull(FOBI)

# select 10 random metabolites from metaboliteUniverse that are associated with
# 'Red meat' (FOBI:0193), 'Lean meat' (FOBI:0185), 'egg food product' (FOODON:00001274),
# or 'grape (whole, raw)' (FOODON:03301702)
fobi_subset <- fobitools::parse_fobi() %>%
  filter(FOBI %in% metaboliteUniverse) %>%
  filter(id_BiomarkerOf %in% c("FOBI:0193", "FOBI:0185", "FOODON:00001274", "FOODON:03301702")) %>%
  dplyr::slice(sample(nrow(.), 10, replace = FALSE))

metaboliteList <- fobi_subset %>%
  pull(FOBI)

# run the over-representation analysis on the food classes
fobitools::ora(metaboliteList = metaboliteList,
               metaboliteUniverse = metaboliteUniverse,
               subOntology = "food",
               pvalCutoff = 0.01)
className           classSize  overlap  pval       padj       overlapMetabolites
dairy food product  9          6        0.0000000  0.0000002  FOBI:08823, FOBI:030701, FOBI:030696, FOBI:030705, FOBI:030699, FOBI:030703
egg food product    7          5        0.0000001  0.0000020  FOBI:030701, FOBI:030696, FOBI:030705, FOBI:030699, FOBI:030703
meat food product   10         5        0.0000015  0.0000158  FOBI:030701, FOBI:030696, FOBI:030705, FOBI:030699, FOBI:030703
Red meat            4          3        0.0000743  0.0005199  FOBI:030706, FOBI:08823, FOBI:030707
soybean (whole)     20         5        0.0000839  0.0005199  FOBI:030701, FOBI:030696, FOBI:030705, FOBI:030699, FOBI:030703
Lean meat           3          2        0.0023703  0.0122466  FOBI:030706, FOBI:030707
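Because the example above draws its metabolites with sample(), the resulting table will differ from run to run and will generally not match the one shown here. If a reproducible selection is wanted, one option is to fix the random seed before sampling. The sketch below uses a hypothetical pool of identifiers and an arbitrary seed. It does not reproduce the table above.

```r
# Hypothetical pool of identifiers standing in for the rows of a mapping table
pool <- sprintf("M%03d", 1:300)

set.seed(1234)  # arbitrary seed, chosen only for illustration
first_draw <- sample(pool, 10, replace = FALSE)

set.seed(1234)  # resetting the same seed replays the same draw
second_draw <- sample(pool, 10, replace = FALSE)

identical(first_draw, second_draw)
```

The same idea applies to any call to sample(): setting the seed once at the top of a script makes every later random selection repeatable.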

4 Network visualization of metaboliteList terms

The fobi_graph function can then be used to visualize the metaboliteList terms together with their corresponding FOBI relationships.

terms <- fobi_subset %>%
  pull(id_code)

# create the associated graph
fobitools::fobi_graph(terms = terms, 
                      get = "anc",
                      labels = TRUE,
                      legend = TRUE)
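The call above uses get = "anc", which is assumed here to request the ancestors of each term, that is, the broader classes reached by following the ontology upwards. The base R sketch below illustrates that idea on a toy hierarchy. The terms, the parent_of map, and the get_ancestors helper are hypothetical and are not part of fobitools. Each toy term has a single parent, whereas a real ontology may allow several.

```r
# Hypothetical child -> parent ("is_a") map; these are not real FOBI terms
parent_of <- c(
  "apple"      = "fruit",
  "fruit"      = "plant food",
  "plant food" = "food",
  "beef"       = "red meat",
  "red meat"   = "meat",
  "meat"       = "food"
)

# Walk upwards from a term until a term without a parent is reached
get_ancestors <- function(term, parent_of) {
  ancestors <- character(0)
  current <- term
  while (current %in% names(parent_of)) {
    current <- parent_of[[current]]
    ancestors <- c(ancestors, current)
  }
  ancestors
}

get_ancestors("apple", parent_of)
get_ancestors("beef", parent_of)
```

In this toy hierarchy both terms end at the same root, so a graph of their ancestors would join the two branches at "food".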

5 Session Information

sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 20.04.2 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.13-bioc/R/lib/libRblas.so
#> LAPACK: /home/biocbuild/bbs-3.13-bioc/R/lib/libRlapack.so
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] parallel  stats4    stats     graphics  grDevices utils     datasets 
#> [8] methods   base     
#> 
#> other attached packages:
#>  [1] SummarizedExperiment_1.22.0  Biobase_2.52.0              
#>  [3] GenomicRanges_1.44.0         GenomeInfoDb_1.28.0         
#>  [5] IRanges_2.26.0               S4Vectors_0.30.0            
#>  [7] BiocGenerics_0.38.0          MatrixGenerics_1.4.0        
#>  [9] matrixStats_0.58.0           metabolomicsWorkbenchR_1.2.0
#> [11] POMA_1.2.0                   ggrepel_0.9.1               
#> [13] rvest_1.0.0                  kableExtra_1.3.4            
#> [15] forcats_0.5.1                stringr_1.4.0               
#> [17] dplyr_1.0.6                  purrr_0.3.4                 
#> [19] readr_1.4.0                  tidyr_1.1.3                 
#> [21] tibble_3.1.2                 ggplot2_3.3.3               
#> [23] tidyverse_1.3.1              fobitools_1.0.0             
#> [25] BiocStyle_2.20.0            
#> 
#> loaded via a namespace (and not attached):
#>   [1] utf8_1.2.1                  tidyselect_1.1.1           
#>   [3] RSQLite_2.2.7               grid_4.1.0                 
#>   [5] BiocParallel_1.26.0         gmp_0.6-2                  
#>   [7] pROC_1.17.0.1               munsell_0.5.0              
#>   [9] codetools_0.2-18            preprocessCore_1.54.0      
#>  [11] withr_2.4.2                 colorspace_2.0-1           
#>  [13] highr_0.9                   knitr_1.33                 
#>  [15] rstudioapi_0.13             mzID_1.30.0                
#>  [17] labeling_0.4.2              GenomeInfoDbData_1.2.6     
#>  [19] polyclip_1.10-0             bit64_4.0.5                
#>  [21] farver_2.1.0                vctrs_0.3.8                
#>  [23] generics_0.1.0              ipred_0.9-11               
#>  [25] xfun_0.23                   randomForest_4.6-14        
#>  [27] R6_2.5.0                    doParallel_1.0.16          
#>  [29] clue_0.3-59                 graphlayouts_0.7.1         
#>  [31] syuzhet_1.0.6               MsCoreUtils_1.4.0          
#>  [33] DelayedArray_0.18.0         bitops_1.0-7               
#>  [35] cachem_1.0.5                fgsea_1.18.0               
#>  [37] assertthat_0.2.1            scales_1.1.1               
#>  [39] vroom_1.4.0                 ggraph_2.0.5               
#>  [41] nnet_7.3-16                 gtable_0.3.0               
#>  [43] Cairo_1.5-12.2              affy_1.70.0                
#>  [45] tidygraph_1.2.0             timeDate_3043.102          
#>  [47] tictoc_1.0.1                rlang_0.4.11               
#>  [49] clisymbols_1.2.0            systemfonts_1.0.2          
#>  [51] mzR_2.26.0                  GlobalOptions_0.1.2        
#>  [53] splines_4.1.0               ModelMetrics_1.2.2.2       
#>  [55] impute_1.66.0               selectr_0.4-2              
#>  [57] broom_0.7.6                 RecordLinkage_0.4-12.1     
#>  [59] reshape2_1.4.4              BiocManager_1.30.15        
#>  [61] yaml_2.2.1                  modelr_0.1.8               
#>  [63] backports_1.2.1             caret_6.0-88               
#>  [65] tools_4.1.0                 lava_1.6.9                 
#>  [67] bookdown_0.22               affyio_1.62.0              
#>  [69] ellipsis_0.3.2              jquerylib_0.1.4            
#>  [71] ff_4.0.4                    RColorBrewer_1.1-2         
#>  [73] proxy_0.4-25                MSnbase_2.18.0             
#>  [75] MultiAssayExperiment_1.18.0 Rcpp_1.0.6                 
#>  [77] plyr_1.8.6                  zlibbioc_1.38.0            
#>  [79] RCurl_1.98-1.3              ps_1.6.0                   
#>  [81] rpart_4.1-15                GetoptLong_1.0.5           
#>  [83] viridis_0.6.1               haven_2.4.1                
#>  [85] cluster_2.1.2               fs_1.5.0                   
#>  [87] magrittr_2.0.1              RSpectra_0.16-0            
#>  [89] data.table_1.14.0           magick_2.7.2               
#>  [91] circlize_0.4.12             reprex_2.0.0               
#>  [93] pcaMethods_1.84.0           ProtGenerics_1.24.0        
#>  [95] hms_1.1.0                   patchwork_1.1.1            
#>  [97] evaluate_0.14               xtable_1.8-4               
#>  [99] XML_3.99-0.6                readxl_1.3.1               
#> [101] gridExtra_2.3               shape_1.4.6                
#> [103] compiler_4.1.0              ellipse_0.4.2              
#> [105] ncdf4_1.17                  crayon_1.4.1               
#> [107] htmltools_0.5.1.1           mgcv_1.8-35                
#> [109] corpcor_1.6.9               qdapRegex_0.7.2            
#> [111] lubridate_1.7.10            DBI_1.1.1                  
#> [113] tweenr_1.0.2                dbplyr_2.1.1               
#> [115] ComplexHeatmap_2.8.0        MASS_7.3-54                
#> [117] Matrix_1.3-3                permute_0.9-5              
#> [119] cli_2.5.0                   vsn_3.60.0                 
#> [121] gower_0.2.2                 textclean_0.9.3            
#> [123] evd_2.3-3                   RankProd_3.18.0            
#> [125] igraph_1.2.6                pkgconfig_2.0.3            
#> [127] lexicon_1.2.1               recipes_0.1.16             
#> [129] MALDIquant_1.19.3           xml2_1.3.2                 
#> [131] foreach_1.5.1               rARPACK_0.11-0             
#> [133] svglite_2.0.0               ggcorrplot_0.1.3           
#> [135] bslib_0.2.5.1               XVector_0.32.0             
#> [137] webshot_0.5.2               prodlim_2019.11.13         
#> [139] ada_2.0-5                   digest_0.6.27              
#> [141] vegan_2.5-7                 rmarkdown_2.8              
#> [143] cellranger_1.1.0            fastmatch_1.1-0            
#> [145] curl_4.3.1                  rjson_0.2.20               
#> [147] glasso_1.11                 nlme_3.1-152               
#> [149] lifecycle_1.0.0             jsonlite_1.7.2             
#> [151] mixOmics_6.16.0             viridisLite_0.4.0          
#> [153] limma_3.48.0                fansi_0.4.2                
#> [155] pillar_1.6.1                ontologyIndex_2.7          
#> [157] lattice_0.20-44             fastmap_1.1.0              
#> [159] httr_1.4.2                  survival_3.2-11            
#> [161] glue_1.4.2                  png_0.1-7                  
#> [163] iterators_1.0.13            glmnet_4.1-1               
#> [165] bit_4.0.4                   ggforce_0.3.3              
#> [167] class_7.3-19                stringi_1.6.2              
#> [169] sass_0.4.0                  struct_1.4.0               
#> [171] blob_1.2.1                  memoise_2.0.0              
#> [173] Rmpfr_0.8-4                 e1071_1.7-6

References

Castellano-Escuder, Pol, Raúl González-Domínguez, David S Wishart, Cristina Andrés-Lacueva, and Alex Sánchez-Pla. 2020. “FOBI: An Ontology to Represent Food Intake Data and Associate It with Metabolomic Data.” Database 2020.