regionReport (Collado-Torres, Jaffe, and Leek, 2015) creates HTML reports styled with knitrBootstrap (Hester, 2013) for a set of regions such as derfinder (Collado-Torres, Frazee, Love, Irizarry, et al., 2015) results.
This package includes a basic exploration for a general set of genomic regions which can be easily customized to include the appropriate conclusions and/or further exploration of the results. Such a report can be generated using renderReport(). regionReport has a separate template for running a basic exploration analysis of derfinder results by using derfinderReport(). Both reports are written in R Markdown format and include all the code for making the plots and explorations in the report itself. For both reports, regionReport relies on knitr (Xie, 2014), rmarkdown (Allaire, Cheng, Xie, McPherson, et al., 2015), and knitrBootstrap (Hester, 2013) for generating the report.
The plots in regionReport are powered by derfinderPlot (Collado-Torres, Jaffe, and Leek, 2015), ggbio (Yin, Cook, and Lawrence, 2012), and ggplot2 (Wickham, 2009).
The regionReport supplementary website regionReportSupp has examples of using regionReport with results from DiffBind and derfinder. Included as a vignette, this package also has an example using a small data set derived from bumphunter. These represent different uses of regionReport for results from ChIP-seq, methylation, and RNA-seq data. In particular, the DiffBind example illustrates how to expand a basic report created with renderReport().
For a general use case, you first have to identify a set of genomic regions of interest and store it as a GRanges object. In a typical workflow you will have some variables measured for each of the regions, such as p-values and scores. renderReport() uses the set of regions and three main arguments:
pvalueVars: a character vector (optionally named) with the names of the variables that are bounded between 0 and 1, such as p-values. For each of these variables, renderReport() explores the distribution by chromosome and the overall distribution, and makes a table with commonly used cutoffs.
densityVars: another character vector (optionally named) with another set of variables you wish to explore by making density graphs. This is commonly used for scores and other similar numerical variables.
significantVar: a logical vector separating the regions by whether they are statistically significant. For example, this information is used to explore the width of all the regions and compare it to the width of the significant ones.
Other parameters control the name of the report, where it will be located, the transcripts database used to annotate the nearest genes, graphical parameters, etc.
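The relationship between these three arguments and a GRanges object can be sketched as follows. This is only an illustration: the region coordinates and the metadata column names pval and score are made up, and the 0.05 threshold is an arbitrary choice.

```r
## Assumes GenomicRanges (which also provides IRanges()) is installed
library('GenomicRanges')

## Toy regions: five regions on chr21 with made-up p-values and scores
toy <- GRanges(
    seqnames = 'chr21',
    ranges = IRanges(start = seq(1000, 5000, by = 1000),
        width = c(50, 120, 80, 200, 30)),
    pval = c(0.001, 0.2, 0.04, 0.6, 0.03),
    score = c(12.5, 3.1, 8.4, 1.2, 9.9)
)

## Named character vectors: names are display labels, values are the
## metadata column names in the GRanges object
pvalueVars <- c('P-values' = 'pval')
densityVars <- c('Score' = 'score')

## Logical vector with one entry per region
significantVar <- toy$pval <= 0.05

## Sanity checks before handing everything to renderReport()
stopifnot(all(c(pvalueVars, densityVars) %in% colnames(mcols(toy))))
stopifnot(length(significantVar) == length(toy))
```

Checking that every name in pvalueVars and densityVars is an actual metadata column catches typos before a potentially long report compilation starts.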
Here is a short example of how to use renderReport(). Note that we are using regions produced by derfinder just for convenience. You can also run this example by using example('renderReport', 'regionReport', ask=FALSE).
## Load derfinder
library('derfinder')
regions <- genomeRegions$regions
## Assign chr length
library('GenomicRanges')
seqlengths(regions) <- c('chr21' = 48129895)
## The output will be saved in the 'renderReport-example' directory
dir.create('renderReport-example', showWarnings = FALSE, recursive = TRUE)
## Load regionReport, then generate the HTML report
library('regionReport')
report <- renderReport(regions, 'Example run', pvalueVars = c(
'Q-values' = 'qvalues', 'P-values' = 'pvalues'), densityVars = c(
'Area' = 'area', 'Mean coverage' = 'meanCoverage'),
significantVar = regions$qvalues <= 0.05, nBestRegions = 20,
outdir = 'renderReport-example')
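To get a feel for the kind of summaries described earlier for pvalueVars and significantVar, the following base R sketch builds a cutoff table and compares region widths. It is not the code regionReport runs internally: the values and the list of cutoffs are made up for illustration.

```r
## Made-up values standing in for metadata columns of a set of regions
pval <- c(0.0005, 0.2, 0.04, 0.6, 0.03, 0.008)
width <- c(50, 120, 80, 200, 30, 95)

## Number and percent of regions at or below some illustrative cutoffs
cutoffs <- c(0.0001, 0.001, 0.01, 0.025, 0.05, 0.1)
cutoff_table <- data.frame(
    cutoff = cutoffs,
    count = vapply(cutoffs, function(x) sum(pval <= x), integer(1)),
    percent = vapply(cutoffs, function(x) 100 * mean(pval <= x), numeric(1))
)
print(cutoff_table)

## Width of all regions versus the significant subset
significantVar <- pval <= 0.05
print(summary(width))
print(summary(width[significantVar]))
```

The same logical vector used for subsetting here is what you would pass as significantVar.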
Prior to using regionReport::derfinderReport() you must use derfinder to analyze a specific data set. While there are many ways to do so, we recommend using analyzeChr() with the same prefix argument for each chromosome and then merging the results with mergeResults().
Below, we run derfinder for the example data included in the package. The steps are:
## Load derfinder
library('derfinder')
## The output will be saved in the 'report' directory
dir.create('report', showWarnings = FALSE, recursive = TRUE)
The following code runs derfinder.
## Save the current path
initialPath <- getwd()
setwd(file.path(initialPath, 'report'))
## Generate output from derfinder
## Collapse the coverage information
collapsedFull <- collapseFullCoverage(list(genomeData$coverage),
verbose=TRUE)
## Calculate library size adjustments
sampleDepths <- sampleDepth(collapsedFull, probs=c(0.5), nonzero=TRUE,
verbose=TRUE)
## Build the models
group <- genomeInfo$pop
adjustvars <- data.frame(genomeInfo$gender)
models <- makeModels(sampleDepths, testvars=group, adjustvars=adjustvars)
## Analyze chromosome 21
analysis <- analyzeChr(chr='21', coverageInfo=genomeData, models=models,
cutoffFstat=1, cutoffType='manual', seeds=20140330, groupInfo=group,
mc.cores=1, writeOutput=TRUE, returnOutput=TRUE)
## Save the stats options for later
optionsStats <- analysis$optionsStats
## Change the directory back to the original one
setwd(initialPath)
For convenience, we have included the derfinder results as part of regionReport. Note that the above functions are routinely checked as part of derfinder.
## Copy previous results
file.copy(system.file(file.path('extdata', 'chr21'), package='derfinder',
mustWork=TRUE), 'report', recursive=TRUE)
## [1] TRUE
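The derfinder run shown earlier changes the working directory with setwd() and restores it at the end, which leaves R in the 'report' directory if any step in between fails. One possible safeguard is sketched below using a hypothetical helper, with_dir(), that is not part of derfinder or regionReport.

```r
## Hypothetical helper: evaluate `code` inside `dir`, then restore the
## previous working directory even if `code` throws an error
with_dir <- function(dir, code) {
    old <- setwd(dir)
    on.exit(setwd(old), add = TRUE)
    ## `code` is a lazily evaluated argument, so it runs only here,
    ## after the working directory has been changed
    force(code)
}

## Usage with a temporary directory
tmp <- file.path(tempdir(), 'report')
dir.create(tmp, showWarnings = FALSE, recursive = TRUE)
before <- getwd()
inside <- with_dir(tmp, basename(getwd()))

## The directory is restored even after an error
try(with_dir(tmp, stop('simulated failure')), silent = TRUE)
stopifnot(identical(getwd(), before))
```

With such a helper, the analyzeChr() call could be wrapped as with_dir('report', analyzeChr(...)) instead of being surrounded by two setwd() calls.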
Next, proceed to merging the results.
## Merge the results from the different chromosomes. In this case, there's
## only one: chr21
mergeResults(chrs = 'chr21', prefix = 'report',
genomicState = genomicState$fullGenome)
## 2015-11-02 20:42:49 mergeResults: Saving options used
## 2015-11-02 20:42:49 Loading chromosome chr21
## Neither 'cutoffFstatUsed' nor 'optionsStats' were supplied, so the FWER calculation step will be skipped.
## 2015-11-02 20:42:49 mergeResults: Saving fullNullSummary
## 2015-11-02 20:42:49 mergeResults: Re-calculating the p-values
## 2015-11-02 20:42:49 mergeResults: Saving fullRegions
## 2015-11-02 20:42:49 mergeResults: assigning genomic states
## 2015-11-02 20:42:49 annotateRegions: counting
## 2015-11-02 20:42:49 annotateRegions: annotating
## 2015-11-02 20:42:49 mergeResults: Saving fullAnnotatedRegions
## 2015-11-02 20:42:49 mergeResults: Saving fullFstats
## 2015-11-02 20:42:49 mergeResults: Saving fullTime
Once the derfinder output has been generated and merged, use derfinderReport() to create the HTML report.
## Load regionReport
library('regionReport')
## Generate the HTML report
report <- derfinderReport(prefix='report', browse=FALSE,
nBestRegions=15, makeBestClusters=TRUE, outdir='html',
fullCov=list('21'=genomeDataRaw$coverage), optionsStats=optionsStats)
Once the output is generated, you can browse the report from R using browseURL() as shown below.
## Browse the report
browseURL(report)
You can compare the resulting report with the pre-compiled report using the following code.
browseURL(system.file(file.path('basicExploration', 'basicExploration.html'),
package = 'regionReport', mustWork = TRUE))
Note that the reports require an active Internet connection to render correctly.
The report is self-explanatory and will change some of the text depending on the input options.
If the report is taking too long to compile (say more than 3 hours), you might want to consider setting nBestClusters to a small number or even setting makeBestClusters to FALSE.
If you are interested in using the advanced arguments, use derfinder::advancedArg() as shown below:
## URLs to advanced arguments
derfinder::advancedArg('derfinderReport', package = 'regionReport',
browse = FALSE)
## Set browse = TRUE if you want to open them in your browser
In particular, you might be interested in specifying the output_format argument in either renderReport() or derfinderReport(). For example, setting output_format = 'pdf_document' will generate a PDF file instead. However, you will lose interactivity for toggling hiding/showing code and the tables will be static.
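As a sketch of how this could look, the wrapper below repeats the earlier renderReport() call with output_format = 'pdf_document' added. The function name render_pdf_report() and the output directory are hypothetical, the regions object is assumed to have the qvalues, pvalues, area, and meanCoverage columns used earlier, and PDF output through rmarkdown assumes a working LaTeX installation.

```r
## Hypothetical wrapper: same call as the renderReport() example, but
## requesting a PDF via the advanced argument output_format
render_pdf_report <- function(regions, outdir = 'renderReport-pdf') {
    dir.create(outdir, showWarnings = FALSE, recursive = TRUE)
    regionReport::renderReport(regions, 'Example run (PDF)',
        pvalueVars = c('Q-values' = 'qvalues', 'P-values' = 'pvalues'),
        densityVars = c('Area' = 'area', 'Mean coverage' = 'meanCoverage'),
        significantVar = regions$qvalues <= 0.05, nBestRegions = 20,
        outdir = outdir, output_format = 'pdf_document')
}

## Usage (not run here): render_pdf_report(regions)
```

Keeping the PDF in a separate output directory avoids overwriting the HTML version of the same report.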
This package was made possible thanks to:
Code for creating the vignette
## Create the vignette
library('rmarkdown')
system.time(render('regionReport.Rmd', 'BiocStyle::html_document'))
## Extract the R code
library('knitr')
knit('regionReport.Rmd', tangle = TRUE)
## Copy report output to be distributed with the package for comparison
## purposes
if(gsub('.*/', '', getwd()) == 'realVignettes') {
file.copy(file.path('report', 'html', 'basicExploration.html'),
file.path('..', '..', 'inst', 'basicExploration',
'basicExploration.html'), overwrite=TRUE)
} else {
file.copy(file.path('report', 'html', 'basicExploration.html'),
file.path('..', 'inst', 'basicExploration', 'basicExploration.html'),
overwrite=TRUE)
}
## Clean up
file.remove('regionReportRef.bib')
## [1] TRUE
#unlink('regionReport_files', recursive=TRUE)
unlink('report', recursive = TRUE)
Date the vignette was generated.
## [1] "2015-11-02 20:43:45 PST"
Wallclock time spent generating the vignette.
## Time difference of 57.029 secs
R session information.
## Session info -----------------------------------------------------------------------------------------------------------
## setting value
## version R version 3.2.2 (2015-08-14)
## system x86_64, linux-gnu
## ui X11
## language en_US:
## collate C
## tz <NA>
## date 2015-11-02
## Packages ---------------------------------------------------------------------------------------------------------------
## package * version date source
## AnnotationDbi 1.32.0 2015-11-03 Bioconductor
## BSgenome 1.38.0 2015-11-03 Bioconductor
## Biobase 2.30.0 2015-11-03 Bioconductor
## BiocGenerics * 0.16.0 2015-11-03 Bioconductor
## BiocInstaller 1.20.0 2015-11-03 Bioconductor
## BiocParallel 1.4.0 2015-11-03 Bioconductor
## BiocStyle * 1.8.0 2015-11-03 Bioconductor
## Biostrings 2.38.0 2015-11-03 Bioconductor
## DBI 0.3.1 2014-09-24 CRAN (R 3.2.2)
## Formula 1.2-1 2015-04-07 CRAN (R 3.2.2)
## GGally 0.5.0 2014-12-02 CRAN (R 3.2.2)
## GenomeInfoDb * 1.6.1 2015-11-03 Bioconductor
## GenomicAlignments 1.6.1 2015-11-03 Bioconductor
## GenomicFeatures 1.22.2 2015-11-03 Bioconductor
## GenomicFiles 1.6.0 2015-11-03 Bioconductor
## GenomicRanges * 1.22.0 2015-11-03 Bioconductor
## Hmisc 3.17-0 2015-09-21 CRAN (R 3.2.2)
## IRanges * 2.4.1 2015-11-03 Bioconductor
## MASS 7.3-44 2015-08-30 CRAN (R 3.2.2)
## Matrix 1.2-2 2015-07-08 CRAN (R 3.2.2)
## OrganismDbi 1.12.0 2015-11-03 Bioconductor
## R6 2.1.1 2015-08-19 CRAN (R 3.2.2)
## RBGL 1.46.0 2015-11-03 Bioconductor
## RColorBrewer 1.1-2 2014-12-07 CRAN (R 3.2.2)
## RCurl 1.95-4.7 2015-06-30 CRAN (R 3.2.2)
## RJSONIO 1.3-0 2014-07-28 CRAN (R 3.2.2)
## RSQLite 1.0.0 2014-10-25 CRAN (R 3.2.2)
## Rcpp 0.12.1 2015-09-10 CRAN (R 3.2.2)
## RefManageR 0.8.63 2015-06-09 CRAN (R 3.2.2)
## Rsamtools 1.22.0 2015-11-03 Bioconductor
## S4Vectors * 0.8.0 2015-11-03 Bioconductor
## SummarizedExperiment 1.0.0 2015-11-03 Bioconductor
## VariantAnnotation 1.16.3 2015-11-03 Bioconductor
## XML 3.98-1.3 2015-06-30 CRAN (R 3.2.2)
## XVector 0.10.0 2015-11-03 Bioconductor
## acepack 1.3-3.3 2014-11-24 CRAN (R 3.2.2)
## bibtex 0.4.0 2014-12-31 CRAN (R 3.2.2)
## biomaRt 2.26.0 2015-11-03 Bioconductor
## biovizBase 1.18.0 2015-11-03 Bioconductor
## bitops 1.0-6 2013-08-17 CRAN (R 3.2.2)
## bumphunter * 1.10.0 2015-11-03 Bioconductor
## cluster 2.0.3 2015-07-21 CRAN (R 3.2.2)
## codetools 0.2-14 2015-07-15 CRAN (R 3.2.2)
## colorspace 1.2-6 2015-03-11 CRAN (R 3.2.2)
## derfinder * 1.4.1 2015-11-03 Bioconductor
## derfinderHelper 1.4.1 2015-11-03 Bioconductor
## derfinderPlot 1.4.1 2015-11-03 Bioconductor
## devtools * 1.9.1 2015-09-11 CRAN (R 3.2.2)
## dichromat 2.0-0 2013-01-24 CRAN (R 3.2.2)
## digest 0.6.8 2014-12-31 CRAN (R 3.2.2)
## doRNG 1.6 2014-03-07 CRAN (R 3.2.2)
## evaluate 0.8 2015-09-18 CRAN (R 3.2.2)
## foreach * 1.4.3 2015-10-13 CRAN (R 3.2.2)
## foreign 0.8-66 2015-08-19 CRAN (R 3.2.2)
## formatR 1.2.1 2015-09-18 CRAN (R 3.2.2)
## futile.logger 1.4.1 2015-04-20 CRAN (R 3.2.2)
## futile.options 1.0.0 2010-04-06 CRAN (R 3.2.2)
## ggbio 1.18.0 2015-11-03 Bioconductor
## ggplot2 1.0.1 2015-03-17 CRAN (R 3.2.2)
## graph 1.48.0 2015-11-03 Bioconductor
## gridExtra 2.0.0 2015-07-14 CRAN (R 3.2.2)
## gtable 0.1.2 2012-12-05 CRAN (R 3.2.2)
## htmltools 0.2.6 2014-09-08 CRAN (R 3.2.2)
## httr 1.0.0 2015-06-25 CRAN (R 3.2.2)
## iterators * 1.0.8 2015-10-13 CRAN (R 3.2.2)
## knitcitations * 1.0.7 2015-10-28 CRAN (R 3.2.2)
## knitr 1.11 2015-08-14 CRAN (R 3.2.2)
## knitrBootstrap 0.9.0 2013-10-17 CRAN (R 3.2.2)
## lambda.r 1.1.7 2015-03-20 CRAN (R 3.2.2)
## lattice 0.20-33 2015-07-14 CRAN (R 3.2.2)
## latticeExtra 0.6-26 2013-08-15 CRAN (R 3.2.2)
## limma 3.26.1 2015-11-03 Bioconductor
## locfit * 1.5-9.1 2013-04-20 CRAN (R 3.2.2)
## lubridate 1.3.3 2013-12-31 CRAN (R 3.2.2)
## magrittr 1.5 2014-11-22 CRAN (R 3.2.2)
## markdown 0.7.7 2015-04-22 CRAN (R 3.2.2)
## matrixStats 0.15.0 2015-10-27 CRAN (R 3.2.2)
## memoise 0.2.1 2014-04-22 CRAN (R 3.2.2)
## mgcv 1.8-9 2015-10-30 CRAN (R 3.2.2)
## munsell 0.4.2 2013-07-11 CRAN (R 3.2.2)
## nlme 3.1-122 2015-08-19 CRAN (R 3.2.2)
## nnet 7.3-11 2015-08-30 CRAN (R 3.2.2)
## pkgmaker 0.22 2014-05-14 CRAN (R 3.2.2)
## plyr 1.8.3 2015-06-12 CRAN (R 3.2.2)
## proto 0.3-10 2012-12-22 CRAN (R 3.2.2)
## qvalue 2.2.0 2015-11-03 Bioconductor
## regionReport * 1.4.1 2015-11-03 Bioconductor
## registry 0.3 2015-07-08 CRAN (R 3.2.2)
## reshape 0.8.5 2014-04-23 CRAN (R 3.2.2)
## reshape2 1.4.1 2014-12-06 CRAN (R 3.2.2)
## rmarkdown 0.8.1 2015-10-10 CRAN (R 3.2.2)
## rngtools 1.2.4 2014-03-06 CRAN (R 3.2.2)
## rpart 4.1-10 2015-06-29 CRAN (R 3.2.2)
## rstudioapi 0.3.1 2015-04-07 CRAN (R 3.2.2)
## rtracklayer 1.30.1 2015-11-03 Bioconductor
## scales 0.3.0 2015-08-25 CRAN (R 3.2.2)
## stringi 1.0-1 2015-10-22 CRAN (R 3.2.2)
## stringr 1.0.0 2015-04-30 CRAN (R 3.2.2)
## survival 2.38-3 2015-07-02 CRAN (R 3.2.2)
## whisker 0.3-2 2013-04-28 CRAN (R 3.2.2)
## xtable 1.7-4 2014-09-12 CRAN (R 3.2.2)
## yaml 2.1.13 2014-06-12 CRAN (R 3.2.2)
## zlibbioc 1.16.0 2015-11-03 Bioconductor
This vignette was generated using BiocStyle (Morgan, Oleś, and Huber, 2015) with knitr (Xie, 2014) and rmarkdown (Allaire, Cheng, Xie, McPherson, et al., 2015) running behind the scenes.
Citations made with knitcitations (Boettiger, 2015).
[1] J. Allaire, J. Cheng, Y. Xie, J. McPherson, et al. rmarkdown: Dynamic Documents for R. R package version 0.8.1. 2015. URL: http://CRAN.R-project.org/package=rmarkdown.
[2] S. Arora, M. Morgan, M. Carlson and H. Pages. GenomeInfoDb: Utilities for manipulating chromosome and other ‘seqname’ identifiers. R package version 1.6.1. 2015.
[3] B. Auguie. gridExtra: Miscellaneous Functions for “Grid” Graphics. R package version 2.0.0. 2015. URL: http://CRAN.R-project.org/package=gridExtra.
[4] C. Boettiger. knitcitations: Citations for ‘Knitr’ Markdown Files. R package version 1.0.7. 2015. URL: http://CRAN.R-project.org/package=knitcitations.
[5] M. Carlson and B. P. Maintainer. TxDb.Hsapiens.UCSC.hg19.knownGene: Annotation package for TxDb object(s). R package version 3.2.2. 2015.
[6] L. Collado-Torres, A. C. Frazee, M. I. Love, R. A. Irizarry, et al. “derfinder: Software for annotation-agnostic RNA-seq differential expression analysis”. In: bioRxiv (2015). DOI: 10.1101/015370. URL: http://www.biorxiv.org/content/early/2015/02/19/015370.abstract.
[7] L. Collado-Torres, A. E. Jaffe and J. T. Leek. derfinderPlot: Plotting functions for derfinder. https://github.com/leekgroup/derfinderPlot - R package version 1.4.1. 2015. URL: http://www.bioconductor.org/packages/release/bioc/html/derfinderPlot.html.
[8] L. Collado-Torres, A. E. Jaffe and J. T. Leek. regionReport: Generate HTML reports for exploring a set of regions. https://github.com/leekgroup/regionReport - R package version 1.4.1. 2015. URL: http://www.bioconductor.org/packages/release/bioc/html/regionReport.html.
[9] J. Hester. knitrBootstrap: Knitr Bootstrap framework. R package version 0.9.0. 2013. URL: http://CRAN.R-project.org/package=knitrBootstrap.
[10] A. E. Jaffe, P. Murakami, H. Lee, J. T. Leek, et al. “Bump hunting to identify differentially methylated regions in epigenetic epidemiology studies”. In: International journal of epidemiology 41.1 (2012), pp. 200–209. DOI: 10.1093/ije/dyr238.
[11] M. Lawrence, W. Huber, H. Pagès, P. Aboyoun, et al. “Software for Computing and Annotating Genomic Ranges”. In: PLoS Computational Biology 9 (8 2013). DOI: 10.1371/journal.pcbi.1003118. URL: http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1003118.
[12] M. Morgan, A. Oleś and W. Huber. BiocStyle: Standard styles for vignettes and other Bioconductor documents. R package version 1.8.0. 2015. URL: https://github.com/Bioconductor/BiocStyle.
[13] E. Neuwirth. RColorBrewer: ColorBrewer Palettes. R package version 1.1-2. 2014. URL: http://CRAN.R-project.org/package=RColorBrewer.
[14] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2015. URL: https://www.R-project.org/.
[15] S. Urbanek and J. Horner. Cairo: R graphics device using cairo graphics library for creating high-quality bitmap (PNG, JPEG, TIFF), vector (PDF, SVG, PostScript) and display (X11 and Win32) output. R package version 1.5-9. 2015. URL: http://CRAN.R-project.org/package=Cairo.
[16] H. Wickham. ggplot2: elegant graphics for data analysis. Springer New York, 2009. ISBN: 978-0-387-98140-6. URL: http://had.co.nz/ggplot2/book.
[17] H. Wickham and W. Chang. devtools: Tools to Make Developing R Packages Easier. R package version 1.9.1. 2015. URL: http://CRAN.R-project.org/package=devtools.
[18] Y. Xie. “knitr: A Comprehensive Tool for Reproducible Research in R”. In: Implementing Reproducible Computational Research. Ed. by V. Stodden, F. Leisch and R. D. Peng. ISBN 978-1466561595. Chapman and Hall/CRC, 2014. URL: http://www.crcpress.com/product/isbn/9781466561595.
[19] T. Yin, D. Cook and M. Lawrence. “ggbio: an R package for extending the grammar of graphics for genomic data”. In: Genome Biology 13.8 (2012), p. R77.
[20] T. Yin, M. Lawrence and D. Cook. biovizBase: Basic graphic utilities for visualization of genomic data. R package version 1.18.0. 2015.