1 Purpose

biodbMirbase is a biodb extension package that implements a connector to the miRBase mature database (Griffiths-Jones et al. 2006; Griffiths-Jones et al. 2007; Kozomara and Griffiths-Jones 2010, 2013).

2 Installation

Install using Bioconductor:

if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install('biodbMirbase')

3 Initialization

The first step in using biodbMirbase is to create an instance of the biodb class BiodbMain from the main biodb package. This is done by calling the constructor of the class:

mybiodb <- biodb::newInst()

During this step the configuration is set up, the cache system is initialized and extension packages are loaded.

As shown at the end of this vignette, the biodb instance must be terminated with a call to its terminate() method.

4 Creating a connector to the miRBase mature database

In biodb, the connection to a database is handled by a connector instance that you get from the factory. biodbMirbase implements a connector to a remote database. Here is the code to instantiate a connector:

conn <- mybiodb$getFactory()$createConn('mirbase.mature')
## Loading required package: biodbMirbase

5 Accessing entries

To get some of the first entry IDs (accession numbers) from the database, run:

ids <- conn$getEntryIds(2)
## INFO  [16:06:03.158] Create cache folder "/home/biocbuild/.cache/R/biodb/mirbase.mature-ecbae556d51779d9d48125ba5a617e19" for "mirbase.mature-ecbae556d51779d9d48125ba5a617e19". 
## INFO  [16:06:03.160] Downloading whole database of mirbase.mature. 
## INFO  [16:06:06.940] Extract whole database of mirbase.mature.
ids
## [1] "MIMAT0000001" "MIMAT0000002"
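
The two accession numbers returned above follow the miRBase mature accession pattern: the prefix MIMAT followed by seven digits. The following is a minimal base R sketch for checking that a vector of strings matches this pattern before passing it to the connector. The helper name isMatureAccession is hypothetical and is not part of biodb or biodbMirbase.

```r
# Hypothetical helper (not part of biodb): TRUE for strings that look like
# miRBase mature accessions, i.e. "MIMAT" followed by exactly seven digits.
isMatureAccession <- function(x) {
    grepl("^MIMAT[0-9]{7}$", x)
}

# The last candidate lacks the "MIMAT" prefix and is therefore rejected.
candidates <- c("MIMAT0000001", "MIMAT0000002", "MI0000001")
isMatureAccession(candidates)
```

A check of this kind only validates the format of an ID; it says nothing about whether the accession actually exists in the database.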

To retrieve entries, use:

entries <- conn$getEntry(ids)
entries
## [[1]]
## Biodb miRBase mature database entry instance MIMAT0000001.
## 
## [[2]]
## Biodb miRBase mature database entry instance MIMAT0000002.
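
Individual field values can also be read directly from the entry objects. The sketch below assumes that each entry exposes the getFieldValue() method of biodb entry classes, and wraps it in a hypothetical helper, fieldValues(), that collects one single-valued text field across a list of entries. The helper is for illustration only and is not part of biodb.

```r
# Hypothetical helper (not part of biodb): collect one character field
# from every entry in a list. Assumes each entry has a getFieldValue()
# method that returns a single string for the requested field.
fieldValues <- function(entries, field) {
    vapply(entries, function(e) e$getFieldValue(field), character(1))
}

# With the entries retrieved above, this would be called as:
#   fieldValues(entries, 'name')
```

Using vapply() with character(1) makes the sketch fail loudly if an entry returns something other than a single string for the field.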

To convert a list of entries into a data frame, run:

x <- mybiodb$entriesToDataframe(entries)
x
##      accession                     description         name
## 1 MIMAT0000001 Caenorhabditis elegans let-7-5p cel-let-7-5p
## 2 MIMAT0000002 Caenorhabditis elegans lin-4-5p cel-lin-4-5p
##                   aa.seq mirbase.mature.id
## 1 UGAGGUAGUAGGUUGUAUAGUU      MIMAT0000001
## 2  UCCCUGAGACCUCAAGUGUGA      MIMAT0000002
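
Once the entries are in a data frame, ordinary R tools apply. As an illustration, the sketch below rebuilds by hand three columns of the two rows printed above (so that it runs without a network connection) and adds a column holding the length of each mature sequence. The column name seq.length is our own choice, not a biodb field.

```r
# Rebuild by hand part of the data frame printed above (normally it is
# returned by mybiodb$entriesToDataframe(entries)).
x <- data.frame(
    accession = c("MIMAT0000001", "MIMAT0000002"),
    name = c("cel-let-7-5p", "cel-lin-4-5p"),
    aa.seq = c("UGAGGUAGUAGGUUGUAUAGUU", "UCCCUGAGACCUCAAGUGUGA"),
    stringsAsFactors = FALSE
)

# Add the sequence length in nucleotides (illustrative column name).
x$seq.length <- nchar(x$aa.seq)
x[, c("name", "seq.length")]
```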

6 Closing biodb instance

When done with your biodb instance, you have to terminate it to ensure that resources (file handles, database connections, etc.) are released:

mybiodb$terminate()
## INFO  [16:06:10.656] Closing BiodbMain instance... 
## INFO  [16:06:10.658] Connector "mirbase.mature" deleted.
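
Because terminate() should run even when an earlier step fails, it can help to wrap the whole session in a function that registers the call with on.exit(). The following is a minimal sketch under that assumption. withBiodb() is a hypothetical helper, not a biodb function, and its newInst argument exists only so that the constructor can be swapped out.

```r
# Hypothetical helper (not part of biodb): run `f` with a fresh biodb
# instance and always terminate that instance afterwards, even if `f`
# raises an error.
withBiodb <- function(f, newInst = biodb::newInst) {
    mybiodb <- newInst()
    on.exit(mybiodb$terminate(), add = TRUE)
    f(mybiodb)
}

# Intended usage with the calls shown in this vignette:
#   ids <- withBiodb(function(mybiodb) {
#       conn <- mybiodb$getFactory()$createConn('mirbase.mature')
#       conn$getEntryIds(2)
#   })
```

The default value of newInst is only evaluated when no constructor is supplied, so the helper itself can be defined without biodb being loaded.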

7 Session information

sessionInfo()
## R version 4.2.0 RC (2022-04-19 r82224)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.4 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.15-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.15-bioc/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] biodbMirbase_1.0.0 BiocStyle_2.24.0  
## 
## loaded via a namespace (and not attached):
##  [1] progress_1.2.2      tidyselect_1.1.2    xfun_0.30          
##  [4] bslib_0.3.1         purrr_0.3.4         vctrs_0.4.1        
##  [7] generics_0.1.2      htmltools_0.5.2     BiocFileCache_2.4.0
## [10] yaml_2.3.5          utf8_1.2.2          blob_1.2.3         
## [13] XML_3.99-0.9        rlang_1.0.2         jquerylib_0.1.4    
## [16] pillar_1.7.0        withr_2.5.0         glue_1.6.2         
## [19] DBI_1.1.2           rappdirs_0.3.3      bit64_4.0.5        
## [22] dbplyr_2.1.1        lifecycle_1.0.1     plyr_1.8.7         
## [25] stringr_1.4.0       memoise_2.0.1       evaluate_0.15      
## [28] knitr_1.38          fastmap_1.1.0       curl_4.3.2         
## [31] fansi_1.0.3         biodb_1.4.0         Rcpp_1.0.8.3       
## [34] openssl_2.0.0       filelock_1.0.2      BiocManager_1.30.17
## [37] cachem_1.0.6        jsonlite_1.8.0      bit_4.0.4          
## [40] hms_1.1.1           chk_0.8.0           askpass_1.1        
## [43] digest_0.6.29       stringi_1.7.6       bookdown_0.26      
## [46] dplyr_1.0.8         cli_3.3.0           tools_4.2.0        
## [49] magrittr_2.0.3      sass_0.4.1          RSQLite_2.2.12     
## [52] tibble_3.1.6        crayon_1.5.1        pkgconfig_2.0.3    
## [55] ellipsis_0.3.2      prettyunits_1.1.1   assertthat_0.2.1   
## [58] rmarkdown_2.14      httr_1.4.2          lgr_0.4.3          
## [61] R6_2.5.1            compiler_4.2.0

References

Griffiths-Jones, Sam, Russell J. Grocock, Stijn van Dongen, Alex Bateman, and Anton J. Enright. 2006. “MiRBase: MicroRNA Sequences, Targets and Gene Nomenclature.” Nucleic Acids Research 34 (suppl_1): D140–D144. https://doi.org/10.1093/nar/gkj112.

Griffiths-Jones, Sam, Harpreet Kaur Saini, Stijn van Dongen, and Anton J. Enright. 2007. “MiRBase: Tools for microRNA Genomics.” Nucleic Acids Research 36 (November): D154–D158. https://doi.org/10.1093/nar/gkm952.

Kozomara, Ana, and Sam Griffiths-Jones. 2010. “MiRBase: Integrating microRNA Annotation and Deep-Sequencing Data.” Nucleic Acids Research 39 (suppl_1): D152–D157. https://doi.org/10.1093/nar/gkq1027.

Kozomara, Ana, and Sam Griffiths-Jones. 2013. “MiRBase: Annotating High Confidence microRNAs Using Deep Sequencing Data.” Nucleic Acids Research 42 (D1): D68–D73. https://doi.org/10.1093/nar/gkt1181.