Introduction

This package is designed for Reactome pathway-based analysis. Reactome is an open-source, open-access, manually curated and peer-reviewed pathway database.

Citation

If you use ReactomePA (Yu and He 2016) in published research, please cite:

G Yu, QY He*. ReactomePA: an R/Bioconductor package for reactome pathway analysis and visualization. Molecular BioSystems 2016, 12(2):477-479. doi: 10.1039/C5MB00663E

Supported organisms

ReactomePA currently supports several model organisms: ‘celegans’, ‘fly’, ‘human’, ‘mouse’, ‘rat’, ‘yeast’ and ‘zebrafish’. Input gene IDs must be Entrez gene IDs. We recommend clusterProfiler::bitr for converting other biological identifiers. For more detail, please refer to the clusterProfiler documentation section "bitr: Biological Id TranslatoR".
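As an illustration of the ID conversion step, the sketch below maps a few gene symbols to Entrez gene IDs with clusterProfiler::bitr. It assumes a human dataset and that the org.Hs.eg.db annotation package is installed; the symbols are arbitrary examples taken from the enrichment output later in this document.

```r
library(clusterProfiler)
library(org.Hs.eg.db)   # assumption: human annotation database is installed

# A handful of gene symbols chosen only for illustration
symbols <- c("CDC45", "CDC20", "CCNB2", "NDC80", "PLK1")

# Translate SYMBOL -> ENTREZID; the result is a data.frame with one column per ID type
ids <- bitr(symbols, fromType = "SYMBOL", toType = "ENTREZID", OrgDb = org.Hs.eg.db)
ids
```

The ENTREZID column of the result can then be passed as the gene argument of enrichPathway.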

Pathway Enrichment Analysis

Enrichment analysis is a widely used approach to identify biological themes. Here, we implement a hypergeometric model to assess whether the number of selected genes associated with a Reactome pathway is larger than expected. The p values are calculated based on the hypergeometric model (Boyle et al. 2004).
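The output that follows is consistent with a call along these lines. This is a sketch: it assumes the example geneList dataset shipped with DOSE, and the fold-change cutoff of 1.5 used to pick the selected genes is an assumption rather than something stated in the text.

```r
library(ReactomePA)

# Example ranked fold-change vector, named by Entrez gene ID (from the DOSE package)
data(geneList, package = "DOSE")

# Assumed cutoff: treat genes with |fold change| > 1.5 as the selected genes
de <- names(geneList)[abs(geneList) > 1.5]
head(de)

# Over-representation test against Reactome pathways;
# readable = TRUE maps Entrez IDs to gene symbols in the geneID column
x <- enrichPathway(gene = de, pvalueCutoff = 0.05, readable = TRUE)
head(as.data.frame(x))
```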

## [1] "4312"  "8318"  "10874" "55143" "55388" "991"
##                          ID
## R-HSA-69620     R-HSA-69620
## R-HSA-2500257 R-HSA-2500257
## R-HSA-141424   R-HSA-141424
## R-HSA-141444   R-HSA-141444
## R-HSA-69618     R-HSA-69618
## R-HSA-68877     R-HSA-68877
##                                                                                        Description
## R-HSA-69620                                                                 Cell Cycle Checkpoints
## R-HSA-2500257                                              Resolution of Sister Chromatid Cohesion
## R-HSA-141424                                         Amplification of signal from the kinetochores
## R-HSA-141444  Amplification  of signal from unattached  kinetochores via a MAD2  inhibitory signal
## R-HSA-69618                                                             Mitotic Spindle Checkpoint
## R-HSA-68877                                                                   Mitotic Prometaphase
##               GeneRatio   BgRatio       pvalue     p.adjust       qvalue
## R-HSA-69620      37/322 293/10654 9.962958e-14 7.173330e-11 6.355319e-11
## R-HSA-2500257    23/322 126/10654 2.824834e-12 1.016940e-09 9.009735e-10
## R-HSA-141424     20/322  96/10654 6.054136e-12 1.089744e-09 9.654753e-10
## R-HSA-141444     20/322  96/10654 6.054136e-12 1.089744e-09 9.654753e-10
## R-HSA-69618      21/322 112/10654 1.508315e-11 2.171973e-09 1.924292e-09
## R-HSA-68877      26/322 200/10654 2.966075e-10 3.559290e-08 3.153406e-08
##                                                                                                                                                                                                                                 geneID
## R-HSA-69620   CDC45/CDCA8/MCM10/CDC20/CENPE/CCNB2/NDC80/UBE2C/SKA1/CENPM/CENPN/CCNA2/CDK1/ERCC6L/MAD2L1/KIF18A/BIRC5/AURKB/CHEK1/CCNB1/MCM5/MCM2/KIF2C/CDC25A/CDC6/PLK1/BUB1B/GTSE1/EXO1/ZWINT/CENPU/SPC25/CENPI/CCNE1/ORC6/ORC1/TAOK1
## R-HSA-2500257                                                                                CDCA8/CDC20/CENPE/CCNB2/NDC80/SKA1/CENPM/CENPN/CDK1/ERCC6L/MAD2L1/KIF18A/BIRC5/AURKB/CCNB1/KIF2C/PLK1/BUB1B/ZWINT/CENPU/SPC25/CENPI/TAOK1
## R-HSA-141424                                                                                                  CDCA8/CDC20/CENPE/NDC80/SKA1/CENPM/CENPN/ERCC6L/MAD2L1/KIF18A/BIRC5/AURKB/KIF2C/PLK1/BUB1B/ZWINT/CENPU/SPC25/CENPI/TAOK1
## R-HSA-141444                                                                                                  CDCA8/CDC20/CENPE/NDC80/SKA1/CENPM/CENPN/ERCC6L/MAD2L1/KIF18A/BIRC5/AURKB/KIF2C/PLK1/BUB1B/ZWINT/CENPU/SPC25/CENPI/TAOK1
## R-HSA-69618                                                                                             CDCA8/CDC20/CENPE/NDC80/UBE2C/SKA1/CENPM/CENPN/ERCC6L/MAD2L1/KIF18A/BIRC5/AURKB/KIF2C/PLK1/BUB1B/ZWINT/CENPU/SPC25/CENPI/TAOK1
## R-HSA-68877                                                                 CDCA8/CDC20/CENPE/CCNB2/NDC80/NCAPH/SKA1/NEK2/CENPM/CENPN/CDK1/ERCC6L/MAD2L1/KIF18A/BIRC5/NCAPG/AURKB/CCNB1/KIF2C/PLK1/BUB1B/ZWINT/CENPU/SPC25/CENPI/TAOK1
##               Count
## R-HSA-69620      37
## R-HSA-2500257    23
## R-HSA-141424     20
## R-HSA-141444     20
## R-HSA-69618      21
## R-HSA-68877      26

For calculation and parameter details, please refer to the vignette of DOSE (Yu et al. 2015).
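As a worked illustration of the hypergeometric test, the first row of the table above (R-HSA-69620, Cell Cycle Checkpoints) can be recomputed in base R. Reading GeneRatio as k/n and BgRatio as M/N is our interpretation of the columns; this is a sketch of the test, not the package's internal code.

```r
k <- 37      # selected genes annotated to the pathway (GeneRatio numerator)
n <- 322     # selected genes with any annotation (GeneRatio denominator)
M <- 293     # background genes annotated to the pathway (BgRatio numerator)
N <- 10654   # all background genes (BgRatio denominator)

# Overlap expected if the n selected genes were drawn at random from the background
expected <- n * M / N

# Upper-tail probability P(X >= k) for X ~ Hypergeometric(M, N - M, n).
# phyper() with lower.tail = FALSE returns P(X > q), hence q = k - 1.
p <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)

expected
p
```

About 9 genes would be expected by chance, so observing 37 yields a very small p value, which should be on the order of the pvalue column reported for this pathway.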

Pathway analysis of NGS data

Pathway analysis of NGS data (e.g., RNA-Seq and ChIP-Seq) can be performed by linking coding and non-coding regions to coding genes via the ChIPseeker package, which can annotate genomic regions to their nearest genes, host genes, and flanking genes, respectively. In addition, it provides a function, seq2gene, that simultaneously considers host genes, promoter regions and flanking genes from intergenic regions that may be under control via cis-regulation. This function maps genomic regions to genes in a many-to-many manner and facilitates functional analysis. For more details, please refer to ChIPseeker (Yu, Wang, and He 2015).
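The workflow described above might look as follows. This sketch assumes human hg19 coordinates, the TxDb.Hsapiens.UCSC.hg19.knownGene annotation package, and one of the sample peak files bundled with ChIPseeker; the promoter window and flanking distance are illustrative choices, not recommendations.

```r
library(ChIPseeker)
library(ReactomePA)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)  # assumption: hg19 gene models

txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

# Load an example peak set shipped with ChIPseeker as a GRanges object
peak <- readPeakFile(getSampleFiles()[[4]])

# Map genomic regions to genes (many-to-many): host genes, promoters within
# +/- 1 kb of the TSS, and flanking genes within 3 kb of intergenic regions
genes <- seq2gene(peak, tssRegion = c(-1000, 1000),
                  flankDistance = 3000, TxDb = txdb)

# The resulting Entrez gene IDs feed directly into the enrichment test
pathways <- enrichPathway(genes)
head(as.data.frame(pathways))
```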

Visualize enrichment result

We implement barplot, dotplot, enrichment map and category-gene-network for visualization. It is very common to visualize enrichment results as a bar or pie chart. We believe the pie chart is misleading and provide only the bar chart.
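A minimal sketch of the bar and dot plots, assuming the DOSE example geneList and an assumed fold-change cutoff of 1.5 for the selected genes; the showCategory values are arbitrary.

```r
library(ReactomePA)

data(geneList, package = "DOSE")
de <- names(geneList)[abs(geneList) > 1.5]   # assumed cutoff
x <- enrichPathway(gene = de, pvalueCutoff = 0.05, readable = TRUE)

# Bar chart of the top enriched pathways
p_bar <- barplot(x, showCategory = 8)

# Dot plot: dot size and color encode gene count and adjusted p value
p_dot <- dotplot(x, showCategory = 15)

p_bar
p_dot
```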

An enrichment map can be visualized with enrichMap.
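The sketch below draws an enrichment map for the same kind of result object. It rests on two assumptions: the DOSE example geneList with a fold-change cutoff of 1.5, and a recent installation in which the enrichment map is provided by enrichplot as emapplot (after computing term similarities with pairwise_termsim). On older releases that still export enrichMap, the commented call applies instead.

```r
library(ReactomePA)

data(geneList, package = "DOSE")
de <- names(geneList)[abs(geneList) > 1.5]   # assumed cutoff
x <- enrichPathway(gene = de, pvalueCutoff = 0.05, readable = TRUE)

# Older releases, using the function named in the text:
# enrichMap(x, layout = igraph::layout.kamada.kawai, vertex.label.cex = 1)

# Recent releases (assumption: enrichplot is installed):
# pathways sharing many genes are connected and cluster together
x_sim <- enrichplot::pairwise_termsim(x)
p_map <- enrichplot::emapplot(x_sim)
p_map
```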

To account for the biological complexity in which a gene may belong to multiple annotation categories, we developed the cnetplot function to extract the complex associations between genes and pathways.
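A category-gene network can be sketched as follows, again assuming the DOSE example geneList and a fold-change cutoff of 1.5. Passing the fold changes is optional; here they are used to color the gene nodes.

```r
library(ReactomePA)

data(geneList, package = "DOSE")
de <- names(geneList)[abs(geneList) > 1.5]   # assumed cutoff
x <- enrichPathway(gene = de, pvalueCutoff = 0.05, readable = TRUE)

# Network linking enriched pathways to their member genes;
# a gene annotated to several pathways appears once with several edges
p_net <- cnetplot(x, foldChange = geneList)
p_net
```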

Comparing enriched reactome pathways among gene clusters with clusterProfiler

We have developed an R package, clusterProfiler (Yu et al. 2012), for comparing biological themes among gene clusters. ReactomePA works with clusterProfiler and can compare biological themes from a Reactome pathway perspective.
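As a sketch of such a comparison, the example below runs the Reactome enrichment test on each cluster of the gcSample dataset that ships with clusterProfiler. The choice of dataset is an assumption made for illustration.

```r
library(ReactomePA)
library(clusterProfiler)

# Example list of gene clusters (named list of Entrez gene ID vectors)
data(gcSample)

# Apply enrichPathway to every cluster and collect the results side by side
res <- compareCluster(gcSample, fun = "enrichPathway")

# One column per cluster, one row per enriched pathway
dotplot(res)
```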

Gene Set Enrichment Analysis

A common approach to analyzing gene expression profiles is to identify differentially expressed genes that are deemed interesting. The enrichPathway function demonstrated above is based on such differentially expressed genes. This approach finds genes where the difference is large, but it does not detect situations where the difference is small yet coordinated across a set of related genes. Gene Set Enrichment Analysis (GSEA) (Subramanian et al. 2005) directly addresses this limitation. All genes can be used in GSEA; it aggregates the per-gene statistics across genes within a gene set, making it possible to detect situations where all genes in a predefined set change in a small but coordinated way. For algorithm details, please refer to the vignette of DOSE (Yu et al. 2015).
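The table that follows is consistent with a call of this shape. This is a sketch assuming the DOSE example geneList, which is already a decreasingly sorted numeric vector named by Entrez gene ID; the cutoff values are assumptions, and results vary between runs because significance is estimated by permutation.

```r
library(ReactomePA)

# Full ranked list: every gene is used, not only the differentially expressed ones
data(geneList, package = "DOSE")

# GSEA against Reactome pathways
y <- gsePathway(geneList,
                pvalueCutoff  = 0.2,    # assumed, deliberately permissive
                pAdjustMethod = "BH",
                verbose       = FALSE)

res <- as.data.frame(y)
head(res)
```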

##                          ID                                 Description setSize
## R-HSA-1474244 R-HSA-1474244           Extracellular matrix organization     266
## R-HSA-1474290 R-HSA-1474290                          Collagen formation      74
## R-HSA-3000178 R-HSA-3000178                           ECM proteoglycans      74
## R-HSA-216083   R-HSA-216083          Integrin cell surface interactions      80
## R-HSA-3000171 R-HSA-3000171      Non-integrin membrane-ECM interactions      56
## R-HSA-1650814 R-HSA-1650814 Collagen biosynthesis and modifying enzymes      53
##               enrichmentScore       NES       pvalue    p.adjust     qvalues
## R-HSA-1474244      -0.4576106 -1.932999 0.0001337077 0.003113242 0.002260658
## R-HSA-1474290      -0.5097483 -1.820540 0.0001538462 0.003113242 0.002260658
## R-HSA-3000178      -0.6262504 -2.236622 0.0001538462 0.003113242 0.002260658
## R-HSA-216083       -0.5123103 -1.849922 0.0001546073 0.003113242 0.002260658
## R-HSA-3000171      -0.5863352 -1.997771 0.0001574803 0.003113242 0.002260658
## R-HSA-1650814      -0.5915513 -1.993844 0.0001575796 0.003113242 0.002260658
##               rank                   leading_edge
## R-HSA-1474244 1943 tags=33%, list=16%, signal=29%
## R-HSA-1474290 1897 tags=43%, list=15%, signal=37%
## R-HSA-3000178 1890 tags=46%, list=15%, signal=39%
## R-HSA-216083  1890 tags=39%, list=15%, signal=33%
## R-HSA-3000171 2538 tags=45%, list=20%, signal=36%
## R-HSA-1650814 1890 tags=47%, list=15%, signal=40%
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                       core_enrichment
## R-HSA-1474244 825/8038/11132/4017/1288/4811/3910/3371/1291/3791/831/1301/4238/7450/3685/80781/1280/1306/4314/3675/8425/977/4054/7837/7042/3912/4322/1278/1511/4060/30008/1277/164656/22795/10516/81578/1293/2247/1295/58494/8076/5118/2192/1281/83700/50509/4319/1290/1513/11096/2202/4313/2199/3693/10536/1294/11117/3339/1462/1289/1292/3908/4016/3909/4053/6678/1296/633/5654/2331/63923/7043/3913/1300/2200/1634/7177/1287/3679/4680/2006/7373/1307/1311/1308/652/4148/54829/4239
## R-HSA-1474290                                                                                                                                                                                                                                                                                                      4017/1288/1291/1301/80781/1280/1306/4314/977/7837/4322/1278/1277/81578/1293/1295/5118/1281/50509/1290/10536/1294/1289/1292/4016/3909/1296/1300/1287/7373/1307/1308
## R-HSA-3000178                                                                                                                                                                                                                                                                                             1288/3910/3371/1291/3685/1280/7042/3912/1278/4060/1277/1293/1281/50509/1290/3693/3339/1462/1289/1292/3908/3909/6678/633/2331/63923/7043/3913/1634/1287/3679/1311/4148/54829
## R-HSA-216083                                                                                                                                                                                                                                                                                                           1288/3371/1291/3791/7450/3685/80781/1280/3675/1278/4060/1277/1293/1295/58494/1281/83700/50509/1290/3693/1294/3339/1289/1292/1296/1300/2200/1287/3679/1307/1311
## R-HSA-3000171                                                                                                                                                                                                                                                                                                                                           7057/3915/6385/4921/1288/3910/3371/1301/3685/1280/3912/1278/1277/2247/1281/50509/1290/3693/3339/1289/3908/3909/3913/1300/1287
## R-HSA-1650814                                                                                                                                                                                                                                                                                                                                        1288/1291/1301/80781/1280/1306/1278/1277/81578/1293/1295/5118/1281/50509/1290/10536/1294/1289/1292/1296/1300/1287/7373/1307/1308
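To make the aggregation step concrete, the running-sum enrichment score described by Subramanian et al. (2005) can be sketched in base R. The function name enrichment_score and the toy data are hypothetical; this is not the code path used by gsePathway, and it omits the permutation test and the normalization that produce the pvalue and NES columns.

```r
# Weighted Kolmogorov-Smirnov-like running sum over a ranked gene list.
# stats:    named numeric vector of per-gene statistics (e.g. fold changes)
# gene_set: character vector of gene names in the set
enrichment_score <- function(stats, gene_set, exponent = 1) {
  stats  <- sort(stats, decreasing = TRUE)
  in_set <- names(stats) %in% gene_set
  stopifnot(any(in_set), any(!in_set))

  # Step up at each hit, weighted by the magnitude of the statistic
  hit_weight <- abs(stats)^exponent * in_set
  p_hit  <- cumsum(hit_weight) / sum(hit_weight)

  # Step down by a constant amount at each miss
  p_miss <- cumsum(!in_set) / sum(!in_set)

  # The score is the maximum deviation of the running sum from zero
  running <- unname(p_hit - p_miss)
  running[which.max(abs(running))]
}

# Toy ranked list: g1 is the most up-regulated, g6 the most down-regulated
toy <- c(g1 = 3, g2 = 2, g3 = 1, g4 = -1, g5 = -2, g6 = -3)

es_top    <- enrichment_score(toy, c("g1", "g2"))  # set at the top of the list
es_bottom <- enrichment_score(toy, c("g5", "g6"))  # set at the bottom of the list
```

A set concentrated at the top of the ranking gets a positive score and a set concentrated at the bottom gets a negative one, which is why the pathways in the table above, all with negative enrichmentScore values, are interpreted as coordinately down-regulated.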

Pathway Visualization

In ReactomePA, we also implemented viewPathway to visualize a pathway.
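A minimal sketch of viewPathway, assuming the DOSE example geneList for coloring nodes by fold change. The pathway name is an arbitrary example and must match a Reactome pathway name for the chosen organism.

```r
library(ReactomePA)

data(geneList, package = "DOSE")

# Draw the pathway as a gene network; readable = TRUE shows gene symbols,
# and foldChange colors each gene by its value in geneList
p_path <- viewPathway("E2F mediated regulation of DNA replication",
                      readable   = TRUE,
                      foldChange = geneList)
p_path
```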

Need help?

If you have questions or issues, please visit the ReactomePA homepage first; most common problems are documented there. If you think you have found a bug, please follow the guide and provide a reproducible example on the GitHub issue tracker. For questions, please post to the Bioconductor support site and tag your post with ReactomePA.

References

Boyle, Elizabeth I, Shuai Weng, Jeremy Gollub, Heng Jin, David Botstein, J Michael Cherry, and Gavin Sherlock. 2004. “GO::TermFinder–open Source Software for Accessing Gene Ontology Information and Finding Significantly Enriched Gene Ontology Terms Associated with a List of Genes.” Bioinformatics (Oxford, England) 20 (18):3710–5. https://doi.org/10.1093/bioinformatics/bth456.

Subramanian, Aravind, Pablo Tamayo, Vamsi K. Mootha, Sayan Mukherjee, Benjamin L. Ebert, Michael A. Gillette, Amanda Paulovich, et al. 2005. “Gene Set Enrichment Analysis: A Knowledge-Based Approach for Interpreting Genome-Wide Expression Profiles.” Proceedings of the National Academy of Sciences of the United States of America 102 (43):15545–50. https://doi.org/10.1073/pnas.0506580102.

Yu, Guangchuang, and Qing-Yu He. 2016. “ReactomePA: An R/Bioconductor Package for Reactome Pathway Analysis and Visualization.” Molecular BioSystems 12 (2):477–79. https://doi.org/10.1039/C5MB00663E.

Yu, Guangchuang, Li-Gen Wang, Yanyan Han, and Qing-Yu He. 2012. “clusterProfiler: an R Package for Comparing Biological Themes Among Gene Clusters.” OMICS: A Journal of Integrative Biology 16 (5):284–87. https://doi.org/10.1089/omi.2011.0118.

Yu, Guangchuang, Li-Gen Wang, and Qing-Yu He. 2015. “ChIPseeker: An R/Bioconductor Package for ChIP Peak Annotation, Comparison and Visualization.” Bioinformatics 31 (14):2382–3. https://doi.org/10.1093/bioinformatics/btv145.

Yu, Guangchuang, Li-Gen Wang, Guang-Rong Yan, and Qing-Yu He. 2015. “DOSE: An R/Bioconductor Package for Disease Ontology Semantic and Enrichment Analysis.” Bioinformatics 31 (4):608–9. https://doi.org/10.1093/bioinformatics/btu684.