In this vignette, we demonstrate the unsegmented block bootstrap functionality implemented in nullranges. “Unsegmented” refers to the fact that this implementation does not consider segmentation of the genome for sampling of blocks, see the segmented block bootstrap vignette for the alternative implementation.

Timing on DHS peaks

First we use the DNase hypersensitivity peaks in A549 downloaded from AnnotationHub, and pre-processed as described in the nullrangesOldData package.

library(nullrangesData)
dhs <- DHSA549Hg38()
## see ?nullrangesData and browseVignettes('nullrangesData') for documentation
## loading from cache
library(nullranges)

The following chunk of code evaluates various types of bootstrap/permutation schemes, first within chromosome, and then across chromosome (the default). The default type is bootstrap, and the default for withinChrom is FALSE (bootstrapping with blocks moving across chromosomes).

set.seed(5) # reproducibility
library(microbenchmark)
blockLength <- 5e5
microbenchmark(
  list=alist(
    p_within=bootRanges(dhs, blockLength=blockLength,
                        type="permute", withinChrom=TRUE),
    b_within=bootRanges(dhs, blockLength=blockLength,
                        type="bootstrap", withinChrom=TRUE),
    p_across=bootRanges(dhs, blockLength=blockLength,
                        type="permute", withinChrom=FALSE),
    b_across=bootRanges(dhs, blockLength=blockLength,
                        type="bootstrap", withinChrom=FALSE)
  ), times=10)
## Unit: milliseconds
##      expr      min       lq     mean   median       uq      max neval cld
##  p_within 829.1908 853.9015 860.3320 859.5251 871.7421 881.8746    10   c
##  b_within 718.6603 742.7654 761.3850 758.4495 787.5324 799.4478    10  b 
##  p_across 163.9868 176.0295 187.9694 192.4362 196.7239 201.8509    10 a  
##  b_across 185.1014 195.6303 207.4733 210.2374 212.6999 239.3761    10 a

Visualize on synthetic data

We create some synthetic ranges in order to visualize the different options of the unsegmented bootstrap implemented in nullranges.

library(GenomicRanges)
seq_nms <- rep(c("chr1","chr2","chr3"),c(4,5,2))
gr <- GRanges(seqnames=seq_nms,
              IRanges(start=c(1,101,121,201,
                              101,201,216,231,401,
                              1,101),
                      width=c(20, 5, 5, 30,
                              20, 5, 5, 5, 30,
                              80, 40)),
              seqlengths=c(chr1=300,chr2=450,chr3=200),
              chr=factor(seq_nms))

The following function uses functionality from plotgardener to plot the ranges. Note in the plotting helper function that chr will be used to color ranges by chromosome of origin.

suppressPackageStartupMessages(library(plotgardener))
plotGRanges <- function(gr) {
  pageCreate(width = 5, height = 2, xgrid = 0,
                ygrid = 0, showGuides = FALSE)
  for (i in seq_along(seqlevels(gr))) {
    chrom <- seqlevels(gr)[i]
    chromend <- seqlengths(gr)[[chrom]]
    suppressMessages({
      p <- pgParams(chromstart = 0, chromend = chromend,
                    x = 0.5, width = 4*chromend/500, height = 0.5,
                    at = seq(0, chromend, 50),
                    fill = colorby("chr", palette=palette.colors))
      prngs <- plotRanges(data = gr, params = p,
                          chrom = chrom,
                          y = 0.25 + (i-1)*.7,
                          just = c("left", "bottom"))
      annoGenomeLabel(plot = prngs, params = p, y = 0.30 + (i-1)*.7)
    })
  }
}
plotGRanges(gr)

Within chromosome

Visualizing two permutations of blocks within chromosome:

for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=100, type="permute", withinChrom=TRUE)
  plotGRanges(gr_prime)
}

Visualizing two bootstraps within chromosome:

for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=100, withinChrom=TRUE)
  plotGRanges(gr_prime)
}

Across chromosome

Visualizing two permutations of blocks across chromosome. Here we use larger blocks than previously.

for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=200, type="permute", withinChrom=FALSE)
  plotGRanges(gr_prime)
}

Visualizing two bootstraps across chromosome:

for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=200, withinChrom=FALSE)
  plotGRanges(gr_prime)
}

Session information

sessionInfo()
## R version 4.1.1 (2021-08-10)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows Server x64 (build 17763)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=C                          
## [2] LC_CTYPE=English_United States.1252   
## [3] LC_MONETARY=English_United States.1252
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.1252    
## 
## attached base packages:
## [1] grid      stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] microbenchmark_1.4-7        excluderanges_0.99.6       
##  [3] EnsDb.Hsapiens.v86_2.99.0   ensembldb_2.17.4           
##  [5] AnnotationFilter_1.17.1     GenomicFeatures_1.45.2     
##  [7] AnnotationDbi_1.55.2        patchwork_1.1.1            
##  [9] plyranges_1.13.1            nullrangesData_0.99.2      
## [11] ExperimentHub_2.1.4         AnnotationHub_3.1.7        
## [13] BiocFileCache_2.1.1         dbplyr_2.1.1               
## [15] ggplot2_3.3.5               plotgardener_0.99.16       
## [17] nullranges_0.99.19          InteractionSet_1.21.1      
## [19] SummarizedExperiment_1.23.5 Biobase_2.53.0             
## [21] MatrixGenerics_1.5.4        matrixStats_0.61.0         
## [23] GenomicRanges_1.45.0        GenomeInfoDb_1.29.10       
## [25] IRanges_2.27.2              S4Vectors_0.31.5           
## [27] BiocGenerics_0.39.2        
## 
## loaded via a namespace (and not attached):
##   [1] plyr_1.8.6                    RcppHMM_1.2.2                
##   [3] lazyeval_0.2.2                splines_4.1.1                
##   [5] BiocParallel_1.27.17          TH.data_1.1-0                
##   [7] digest_0.6.28                 yulab.utils_0.0.4            
##   [9] htmltools_0.5.2               fansi_0.5.0                  
##  [11] magrittr_2.0.1                memoise_2.0.0                
##  [13] ks_1.13.2                     Biostrings_2.61.2            
##  [15] sandwich_3.0-1                prettyunits_1.1.1            
##  [17] colorspace_2.0-2              blob_1.2.2                   
##  [19] rappdirs_0.3.3                xfun_0.27                    
##  [21] dplyr_1.0.7                   crayon_1.4.1                 
##  [23] RCurl_1.98-1.5                jsonlite_1.7.2               
##  [25] survival_3.2-13               zoo_1.8-9                    
##  [27] glue_1.4.2                    gtable_0.3.0                 
##  [29] zlibbioc_1.39.0               XVector_0.33.0               
##  [31] strawr_0.0.9                  DelayedArray_0.19.4          
##  [33] scales_1.1.1                  mvtnorm_1.1-3                
##  [35] DBI_1.1.1                     Rcpp_1.0.7                   
##  [37] xtable_1.8-4                  progress_1.2.2               
##  [39] gridGraphics_0.5-1            bit_4.0.4                    
##  [41] mclust_5.4.7                  httr_1.4.2                   
##  [43] RColorBrewer_1.1-2            speedglm_0.3-3               
##  [45] ellipsis_0.3.2                pkgconfig_2.0.3              
##  [47] XML_3.99-0.8                  farver_2.1.0                 
##  [49] sass_0.4.0                    utf8_1.2.2                   
##  [51] DNAcopy_1.67.0                ggplotify_0.1.0              
##  [53] tidyselect_1.1.1              labeling_0.4.2               
##  [55] rlang_0.4.12                  later_1.3.0                  
##  [57] munsell_0.5.0                 BiocVersion_3.14.0           
##  [59] tools_4.1.1                   cachem_1.0.6                 
##  [61] generics_0.1.0                RSQLite_2.2.8                
##  [63] ggridges_0.5.3                evaluate_0.14                
##  [65] stringr_1.4.0                 fastmap_1.1.0                
##  [67] yaml_2.2.1                    knitr_1.36                   
##  [69] bit64_4.0.5                   purrr_0.3.4                  
##  [71] KEGGREST_1.33.0               mime_0.12                    
##  [73] pracma_2.3.3                  xml2_1.3.2                   
##  [75] biomaRt_2.49.7                compiler_4.1.1               
##  [77] filelock_1.0.2                curl_4.3.2                   
##  [79] png_0.1-7                     interactiveDisplayBase_1.31.2
##  [81] tibble_3.1.5                  bslib_0.3.1                  
##  [83] stringi_1.7.5                 highr_0.9                    
##  [85] lattice_0.20-45               ProtGenerics_1.25.1          
##  [87] Matrix_1.3-4                  vctrs_0.3.8                  
##  [89] pillar_1.6.4                  lifecycle_1.0.1              
##  [91] BiocManager_1.30.16           jquerylib_0.1.4              
##  [93] data.table_1.14.2             bitops_1.0-7                 
##  [95] httpuv_1.6.3                  rtracklayer_1.53.1           
##  [97] R6_2.5.1                      BiocIO_1.3.0                 
##  [99] promises_1.2.0.1              KernSmooth_2.23-20           
## [101] codetools_0.2-18              MASS_7.3-54                  
## [103] assertthat_0.2.1              rjson_0.2.20                 
## [105] withr_2.4.2                   GenomicAlignments_1.29.0     
## [107] Rsamtools_2.9.1               multcomp_1.4-17              
## [109] GenomeInfoDbData_1.2.7        parallel_4.1.1               
## [111] hms_1.1.1                     rmarkdown_2.11               
## [113] shiny_1.7.1                   restfulr_0.0.13