In this vignette, we demonstrate the unsegmented block bootstrap functionality implemented in nullranges. “Unsegmented” refers to the fact that this implementation does not consider segmentation of the genome for sampling of blocks, see the segmented block bootstrap vignette for the alternative implementation.
First we use the DNase hypersensitivity peaks in A549 downloaded from AnnotationHub, and pre-processed as described in the nullrangesOldData package.
## see ?nullrangesData and browseVignettes('nullrangesData') for documentation
## loading from cache
The following chunk of code evaluates various types of bootstrap/permutation schemes, first within chromosome, and then across chromosome (the default). The default type
is bootstrap, and the default for withinChrom
is FALSE
(bootstrapping with blocks moving across chromosomes).
set.seed(5) # reproducibility
library(microbenchmark)
blockLength <- 5e5
microbenchmark(
list=alist(
p_within=bootRanges(dhs, blockLength=blockLength,
type="permute", withinChrom=TRUE),
b_within=bootRanges(dhs, blockLength=blockLength,
type="bootstrap", withinChrom=TRUE),
p_across=bootRanges(dhs, blockLength=blockLength,
type="permute", withinChrom=FALSE),
b_across=bootRanges(dhs, blockLength=blockLength,
type="bootstrap", withinChrom=FALSE)
), times=10)
## Unit: milliseconds
## expr min lq mean median uq max neval cld
## p_within 1304.0599 1422.1930 2711.4735 1884.6968 4062.0629 5822.6951 10 b
## b_within 1074.0883 1320.6994 2201.2710 1626.2640 3378.6092 4579.0371 10 b
## p_across 260.4065 297.4865 416.6162 339.2431 523.9431 839.5706 10 a
## b_across 274.5905 281.1801 465.6156 487.3540 605.5588 730.5026 10 a
We create some synthetic ranges in order to visualize the different options of the unsegmented bootstrap implemented in nullranges.
library(GenomicRanges)
seq_nms <- rep(c("chr1","chr2","chr3"),c(4,5,2))
gr <- GRanges(seqnames=seq_nms,
IRanges(start=c(1,101,121,201,
101,201,216,231,401,
1,101),
width=c(20, 5, 5, 30,
20, 5, 5, 5, 30,
80, 40)),
seqlengths=c(chr1=300,chr2=450,chr3=200),
chr=factor(seq_nms))
The following function uses functionality from plotgardener to plot the ranges. Note in the plotting helper function that chr
will be used to color ranges by chromosome of origin.
suppressPackageStartupMessages(library(plotgardener))
plotGRanges <- function(gr) {
pageCreate(width = 5, height = 2, xgrid = 0,
ygrid = 0, showGuides = FALSE)
for (i in seq_along(seqlevels(gr))) {
chrom <- seqlevels(gr)[i]
chromend <- seqlengths(gr)[[chrom]]
suppressMessages({
p <- pgParams(chromstart = 0, chromend = chromend,
x = 0.5, width = 4*chromend/500, height = 0.5,
at = seq(0, chromend, 50),
fill = colorby("chr", palette=palette.colors))
prngs <- plotRanges(data = gr, params = p,
chrom = chrom,
y = 0.25 + (i-1)*.7,
just = c("left", "bottom"))
annoGenomeLabel(plot = prngs, params = p, y = 0.30 + (i-1)*.7)
})
}
}
Visualizing two permutations of blocks within chromosome:
for (i in 1:2) {
gr_prime <- bootRanges(gr, blockLength=100, type="permute", withinChrom=TRUE)
plotGRanges(gr_prime)
}
Visualizing two bootstraps within chromosome:
for (i in 1:2) {
gr_prime <- bootRanges(gr, blockLength=100, withinChrom=TRUE)
plotGRanges(gr_prime)
}
Visualizing two permutations of blocks across chromosome. Here we use larger blocks than previously.
for (i in 1:2) {
gr_prime <- bootRanges(gr, blockLength=200, type="permute", withinChrom=FALSE)
plotGRanges(gr_prime)
}
Visualizing two bootstraps across chromosome:
for (i in 1:2) {
gr_prime <- bootRanges(gr, blockLength=200, withinChrom=FALSE)
plotGRanges(gr_prime)
}
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.3 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.14-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.14-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] grid stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] microbenchmark_1.4-7 excluderanges_0.99.6
## [3] EnsDb.Hsapiens.v86_2.99.0 ensembldb_2.18.0
## [5] AnnotationFilter_1.18.0 GenomicFeatures_1.46.0
## [7] AnnotationDbi_1.56.0 patchwork_1.1.1
## [9] plyranges_1.14.0 nullrangesData_0.99.2
## [11] ExperimentHub_2.2.0 AnnotationHub_3.2.0
## [13] BiocFileCache_2.2.0 dbplyr_2.1.1
## [15] ggplot2_3.3.5 plotgardener_1.0.0
## [17] nullranges_1.0.0 InteractionSet_1.22.0
## [19] SummarizedExperiment_1.24.0 Biobase_2.54.0
## [21] MatrixGenerics_1.6.0 matrixStats_0.61.0
## [23] GenomicRanges_1.46.0 GenomeInfoDb_1.30.0
## [25] IRanges_2.28.0 S4Vectors_0.32.0
## [27] BiocGenerics_0.40.0
##
## loaded via a namespace (and not attached):
## [1] plyr_1.8.6 RcppHMM_1.2.2
## [3] lazyeval_0.2.2 splines_4.1.1
## [5] BiocParallel_1.28.0 TH.data_1.1-0
## [7] digest_0.6.28 yulab.utils_0.0.4
## [9] htmltools_0.5.2 fansi_0.5.0
## [11] magrittr_2.0.1 memoise_2.0.0
## [13] ks_1.13.2 Biostrings_2.62.0
## [15] sandwich_3.0-1 prettyunits_1.1.1
## [17] colorspace_2.0-2 blob_1.2.2
## [19] rappdirs_0.3.3 xfun_0.27
## [21] dplyr_1.0.7 crayon_1.4.1
## [23] RCurl_1.98-1.5 jsonlite_1.7.2
## [25] survival_3.2-13 zoo_1.8-9
## [27] glue_1.4.2 gtable_0.3.0
## [29] zlibbioc_1.40.0 XVector_0.34.0
## [31] strawr_0.0.9 DelayedArray_0.20.0
## [33] scales_1.1.1 mvtnorm_1.1-3
## [35] DBI_1.1.1 Rcpp_1.0.7
## [37] xtable_1.8-4 progress_1.2.2
## [39] gridGraphics_0.5-1 bit_4.0.4
## [41] mclust_5.4.7 httr_1.4.2
## [43] RColorBrewer_1.1-2 speedglm_0.3-3
## [45] ellipsis_0.3.2 pkgconfig_2.0.3
## [47] XML_3.99-0.8 farver_2.1.0
## [49] sass_0.4.0 utf8_1.2.2
## [51] DNAcopy_1.68.0 ggplotify_0.1.0
## [53] tidyselect_1.1.1 labeling_0.4.2
## [55] rlang_0.4.12 later_1.3.0
## [57] munsell_0.5.0 BiocVersion_3.14.0
## [59] tools_4.1.1 cachem_1.0.6
## [61] generics_0.1.1 RSQLite_2.2.8
## [63] ggridges_0.5.3 evaluate_0.14
## [65] stringr_1.4.0 fastmap_1.1.0
## [67] yaml_2.2.1 knitr_1.36
## [69] bit64_4.0.5 purrr_0.3.4
## [71] KEGGREST_1.34.0 mime_0.12
## [73] pracma_2.3.3 xml2_1.3.2
## [75] biomaRt_2.50.0 compiler_4.1.1
## [77] filelock_1.0.2 curl_4.3.2
## [79] png_0.1-7 interactiveDisplayBase_1.32.0
## [81] tibble_3.1.5 bslib_0.3.1
## [83] stringi_1.7.5 highr_0.9
## [85] lattice_0.20-45 ProtGenerics_1.26.0
## [87] Matrix_1.3-4 vctrs_0.3.8
## [89] pillar_1.6.4 lifecycle_1.0.1
## [91] BiocManager_1.30.16 jquerylib_0.1.4
## [93] data.table_1.14.2 bitops_1.0-7
## [95] httpuv_1.6.3 rtracklayer_1.54.0
## [97] R6_2.5.1 BiocIO_1.4.0
## [99] promises_1.2.0.1 KernSmooth_2.23-20
## [101] codetools_0.2-18 MASS_7.3-54
## [103] assertthat_0.2.1 rjson_0.2.20
## [105] withr_2.4.2 GenomicAlignments_1.30.0
## [107] Rsamtools_2.10.0 multcomp_1.4-17
## [109] GenomeInfoDbData_1.2.7 parallel_4.1.1
## [111] hms_1.1.1 rmarkdown_2.11
## [113] shiny_1.7.1 restfulr_0.0.13