An introduction to the bambu package using NanoporeRNASeq data

Introduction

NanoporeRNASeq contains RNA-Seq data from the K562 and MCF7 cell lines, generated by the SG-NEx project (https://github.com/GoekeLab/sg-nex-data). Each cell line has three replicates: one sequenced with direct RNA sequencing and two with cDNA sequencing. The files contain reads aligned to chromosome 22 (positions 1 to 25,500,000) of the human genome (GRCh38).

Accessing NanoporeRNASeq data

Load the NanoporeRNASeq package
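
Assuming the package has already been installed from Bioconductor (the commented line shows one way to do that with BiocManager), attaching it is a single call:

```r
# One-time installation, if needed (assumes BiocManager is installed):
# BiocManager::install("NanoporeRNASeq")

# Attach the data package
library(NanoporeRNASeq)
```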

List the samples
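
The sample description table ships with the package as a data object. A minimal sketch, assuming the object is named `SGNexSamples` as in the package's data index:

```r
library(NanoporeRNASeq)

# Load the sample description table shipped with the package
data("SGNexSamples")

# Print the table; it is expected to describe the K562 and MCF7 replicates
SGNexSamples
```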

List the available BAM files
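
The aligned reads are hosted on ExperimentHub rather than inside the package itself. The sketch below assumes that the BAM resources can be found with the search terms "NanoporeRNA", "GRCh38" and "Bam", and that every hit is a BAM resource; it is worth checking the printed query result before downloading. The name `bamFiles` matches the object used by the plotting code in this vignette.

```r
library(NanoporeRNASeq)
library(ExperimentHub)
library(Rsamtools)

# Search ExperimentHub for the NanoporeRNASeq BAM resources
# (the search terms are an assumption; inspect the result to confirm)
NanoporeData <- query(ExperimentHub(), c("NanoporeRNA", "GRCh38", "Bam"))
NanoporeData

# Retrieve each resource (downloaded once, then served from the local cache)
# and collect them into a single BamFileList
bamFiles <- do.call(
    BamFileList,
    lapply(names(NanoporeData), function(id) NanoporeData[[id]])
)
length(bamFiles)
```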

Get the annotation GRangesList
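
The annotation used throughout this vignette is the `HsChr22BambuAnnotation` object referenced in the plotting code. A minimal sketch, assuming it is provided as a data object of the NanoporeRNASeq package:

```r
library(NanoporeRNASeq)
library(GenomicRanges)

# Load the chromosome 22 annotation shipped with the package
data("HsChr22BambuAnnotation")

# A GRangesList with one element per transcript, each holding its exons
HsChr22BambuAnnotation
```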

Visualizing a gene of interest from a single BAM file

We can visualize the read coverage of one sample over a single transcript, ENST00000215832 (MAPK1).

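library(ggbio)
# exon ranges of transcript ENST00000215832 (MAPK1);
# assumes HsChr22BambuAnnotation and bamFiles are already loaded
txRange <- HsChr22BambuAnnotation$ENST00000215832
# GRCh38 genome sequence (not used by the two tracks below)
library(BSgenome.Hsapiens.NCBI.GRCh38)
# plot annotation track
tx <- autoplot(txRange, aes(col = strand), group.selfish = TRUE)
# plot coverage track for the first BAM file, restricted to the transcript
coverage <- autoplot(bamFiles[[1]], aes(col = coverage), which = txRange)

# merge the tracks into one plot
tracks(annotation = tx, coverage = coverage, heights = c(1, 3)) + theme_minimal()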

Running bambu with NanoporeRNASeq data

Load the bambu package

Run bambu

Applying bambu to bamFiles

bambu returns a SummarizedExperiment object
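
The whole run can be sketched as below. This is an illustration under stated assumptions rather than the exact call used to build this vignette: the ExperimentHub search terms are assumed, and the BSgenome object loaded earlier is assumed to serve as the reference genome (a FASTA file can be supplied instead). The run downloads data and may take a few minutes.

```r
library(NanoporeRNASeq)
library(ExperimentHub)
library(Rsamtools)
library(BSgenome.Hsapiens.NCBI.GRCh38)
library(bambu)

# Inputs: reference annotation and aligned reads
data("HsChr22BambuAnnotation")
NanoporeData <- query(ExperimentHub(), c("NanoporeRNA", "GRCh38", "Bam"))
bamFiles <- do.call(
    BamFileList,
    lapply(names(NanoporeData), function(id) NanoporeData[[id]])
)

# Transcript discovery and quantification across all samples
# (assumption: the BSgenome object is used as the genome sequence)
se <- bambu(
    reads = bamFiles,
    annotations = HsChr22BambuAnnotation,
    genome = BSgenome.Hsapiens.NCBI.GRCh38
)

# Inspect the result: assays hold the expression estimates,
# rowRanges holds the annotated and newly discovered transcripts
se
SummarizedExperiment::assayNames(se)
head(SummarizedExperiment::assays(se)$counts)
SummarizedExperiment::rowRanges(se)
```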

Visualizing gene examples

We can visualize the annotated and novel isoforms identified in this gene example using the plotting functions from bambu.
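
A minimal sketch of such a plot, wrapped in a small helper. The helper name `plotGeneIsoforms` is hypothetical, `se` stands for the SummarizedExperiment returned by `bambu()`, and the gene ID in the commented call is only an illustration (the Ensembl gene ID of MAPK1), not necessarily the gene shown in the original figure.

```r
library(bambu)

# Hypothetical helper: draw the annotated and novel isoforms of one gene.
# `se` is the SummarizedExperiment returned by bambu();
# `gene_id` must match a gene ID present in that object.
plotGeneIsoforms <- function(se, gene_id) {
    plotBambu(se, type = "annotation", gene_id = gene_id)
}

# Example call (not run here); ENSG00000100030 is the Ensembl ID of MAPK1
# plotGeneIsoforms(se, "ENSG00000100030")
```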

sessionInfo()
##> R version 4.2.1 (2022-06-23)
##> Platform: x86_64-pc-linux-gnu (64-bit)
##> Running under: Ubuntu 20.04.5 LTS
##> 
##> Matrix products: default
##> BLAS:   /home/biocbuild/bbs-3.16-bioc/R/lib/libRblas.so
##> LAPACK: /home/biocbuild/bbs-3.16-bioc/R/lib/libRlapack.so
##> 
##> locale:
##>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##>  [3] LC_TIME=en_GB              LC_COLLATE=C              
##>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
##> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
##> 
##> attached base packages:
##> [1] stats4    stats     graphics  grDevices utils     datasets  methods  
##> [8] base     
##> 
##> other attached packages:
##>  [1] bambu_3.0.0                           
##>  [2] SummarizedExperiment_1.28.0           
##>  [3] Biobase_2.58.0                        
##>  [4] MatrixGenerics_1.10.0                 
##>  [5] matrixStats_0.62.0                    
##>  [6] BSgenome.Hsapiens.NCBI.GRCh38_1.3.1000
##>  [7] BSgenome_1.66.0                       
##>  [8] rtracklayer_1.58.0                    
##>  [9] ggbio_1.46.0                          
##> [10] ggplot2_3.3.6                         
##> [11] Rsamtools_2.14.0                      
##> [12] Biostrings_2.66.0                     
##> [13] XVector_0.38.0                        
##> [14] GenomicRanges_1.50.0                  
##> [15] GenomeInfoDb_1.34.0                   
##> [16] IRanges_2.32.0                        
##> [17] S4Vectors_0.36.0                      
##> [18] NanoporeRNASeq_1.8.0                  
##> [19] ExperimentHub_2.6.0                   
##> [20] AnnotationHub_3.6.0                   
##> [21] BiocFileCache_2.6.0                   
##> [22] dbplyr_2.2.1                          
##> [23] BiocGenerics_0.44.0                   
##> 
##> loaded via a namespace (and not attached):
##>   [1] backports_1.4.1               Hmisc_4.7-1                  
##>   [3] plyr_1.8.7                    lazyeval_0.2.2               
##>   [5] splines_4.2.1                 BiocParallel_1.32.0          
##>   [7] digest_0.6.30                 ensembldb_2.22.0             
##>   [9] htmltools_0.5.3               fansi_1.0.3                  
##>  [11] magrittr_2.0.3                checkmate_2.1.0              
##>  [13] memoise_2.0.1                 cluster_2.1.4                
##>  [15] prettyunits_1.1.1             jpeg_0.1-9                   
##>  [17] colorspace_2.0-3              blob_1.2.3                   
##>  [19] rappdirs_0.3.3                xfun_0.34                    
##>  [21] dplyr_1.0.10                  crayon_1.5.2                 
##>  [23] RCurl_1.98-1.9                jsonlite_1.8.3               
##>  [25] graph_1.76.0                  survival_3.4-0               
##>  [27] VariantAnnotation_1.44.0      glue_1.6.2                   
##>  [29] gtable_0.3.1                  zlibbioc_1.44.0              
##>  [31] DelayedArray_0.24.0           scales_1.2.1                 
##>  [33] DBI_1.1.3                     GGally_2.1.2                 
##>  [35] Rcpp_1.0.9                    xtable_1.8-4                 
##>  [37] progress_1.2.2                htmlTable_2.4.1              
##>  [39] foreign_0.8-83                bit_4.0.4                    
##>  [41] OrganismDbi_1.40.0            Formula_1.2-4                
##>  [43] htmlwidgets_1.5.4             httr_1.4.4                   
##>  [45] RColorBrewer_1.1-3            ellipsis_0.3.2               
##>  [47] farver_2.1.1                  pkgconfig_2.0.3              
##>  [49] reshape_0.8.9                 XML_3.99-0.12                
##>  [51] nnet_7.3-18                   sass_0.4.2                   
##>  [53] deldir_1.0-6                  utf8_1.2.2                   
##>  [55] labeling_0.4.2                tidyselect_1.2.0             
##>  [57] rlang_1.0.6                   reshape2_1.4.4               
##>  [59] later_1.3.0                   AnnotationDbi_1.60.0         
##>  [61] munsell_0.5.0                 BiocVersion_3.16.0           
##>  [63] tools_4.2.1                   cachem_1.0.6                 
##>  [65] xgboost_1.6.0.1               cli_3.4.1                    
##>  [67] generics_0.1.3                RSQLite_2.2.18               
##>  [69] evaluate_0.17                 stringr_1.4.1                
##>  [71] fastmap_1.1.0                 yaml_2.3.6                   
##>  [73] knitr_1.40                    bit64_4.0.5                  
##>  [75] purrr_0.3.5                   KEGGREST_1.38.0              
##>  [77] AnnotationFilter_1.22.0       RBGL_1.74.0                  
##>  [79] mime_0.12                     formatR_1.12                 
##>  [81] xml2_1.3.3                    biomaRt_2.54.0               
##>  [83] compiler_4.2.1                rstudioapi_0.14              
##>  [85] filelock_1.0.2                curl_4.3.3                   
##>  [87] png_0.1-7                     interactiveDisplayBase_1.36.0
##>  [89] tibble_3.1.8                  bslib_0.4.0                  
##>  [91] stringi_1.7.8                 highr_0.9                    
##>  [93] GenomicFeatures_1.50.1        lattice_0.20-45              
##>  [95] ProtGenerics_1.30.0           Matrix_1.5-1                 
##>  [97] vctrs_0.5.0                   pillar_1.8.1                 
##>  [99] lifecycle_1.0.3               BiocManager_1.30.19          
##> [101] jquerylib_0.1.4               data.table_1.14.4            
##> [103] bitops_1.0-7                  httpuv_1.6.6                 
##> [105] R6_2.5.1                      BiocIO_1.8.0                 
##> [107] latticeExtra_0.6-30           promises_1.2.0.1             
##> [109] gridExtra_2.3                 codetools_0.2-18             
##> [111] dichromat_2.0-0.1             assertthat_0.2.1             
##> [113] rjson_0.2.21                  withr_2.5.0                  
##> [115] GenomicAlignments_1.34.0      GenomeInfoDbData_1.2.9       
##> [117] parallel_4.2.1                hms_1.1.2                    
##> [119] grid_4.2.1                    rpart_4.1.19                 
##> [121] tidyr_1.2.1                   rmarkdown_2.17               
##> [123] biovizBase_1.46.0             shiny_1.7.3                  
##> [125] base64enc_0.1-3               interp_1.1-3                 
##> [127] restfulr_0.0.15