Authors: Koki Tsuyuzaki [aut, cre], Manabu Ishii [aut], Itoshi Nikaido [aut]
Last modified: 2021-05-06 15:58:59
Compiled: Thu May 6 16:00:07 2021

1 Installation

To install this package, start R (>= 4.1.0) and enter:

if(!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("AHPubMedDbs")

2 Fetch PubMed tibble datasets from AnnotationHub

The AHPubMedDbs package provides the metadata for all PubMed datasets, which are preprocessed into tibble format and stored in AnnotationHub. First, we load (or update) the AnnotationHub resource.

library(AnnotationHub)
ah <- AnnotationHub()

Next we list all PubMed entries from AnnotationHub.

query(ah, "PubMed")
## AnnotationHub with 21 records
## # snapshotDate(): 2021-05-06
## # $dataprovider: NCBI
## # $species: NA
## # $rdataclass: data.table, Tibble, SQLiteFile
## # additional mcols(): taxonomyid, genome, description,
## #   coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## #   rdatapath, sourceurl, sourcetype 
## # retrieve records with, e.g., 'object[["AH91771"]]' 
## 
##             title                                   
##   AH91771 | SQLite for PubMed ID                    
##   AH91772 | SQLite for PubMed Abstract              
##   AH91773 | SQLite for PubMed Author Information    
##   AH91774 | SQLite for PMC                          
##   AH91775 | SQLite for MeSH (Descriptor)            
##   ...       ...                                     
##   AH91787 | Data.table for PubMed Author Information
##   AH91788 | Data.table for PMC                      
##   AH91789 | Data.table for MeSH (Descriptor)        
##   AH91790 | Data.table for MeSH (Qualifier)         
##   AH91791 | Data.table for MeSH (SCR)
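
The listing above mixes three storage classes (SQLiteFile, Tibble, data.table). As a minimal sketch, assuming the `rdataclass` values shown in that output are still current in your snapshot, the records can be tallied by class and narrowed to a single class with `subset()`. The variable names `qr_all` and `qr_tbl` are illustrative only.

```r
library(AnnotationHub)

ah <- AnnotationHub()
qr_all <- query(ah, "PubMed")

# Count how many records are stored in each class
print(table(qr_all$rdataclass))

# Keep only the records stored as tibbles
# (assumes the class label is spelled "Tibble", as in the listing above)
qr_tbl <- subset(qr_all, rdataclass == "Tibble")
print(qr_tbl$title)
```

The same pattern should work for the other two classes by changing the string compared against `rdataclass`.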

We can inspect the metadata of these AnnotationHub records, which are hosted in the Bioconductor S3 bucket, with mcols().

mcols(query(ah, "PubMed"))
## DataFrame with 21 rows and 15 columns
##                          title dataprovider     species taxonomyid      genome
##                    <character>  <character> <character>  <integer> <character>
## AH91771   SQLite for PubMed ID         NCBI          NA         NA          NA
## AH91772 SQLite for PubMed Ab..         NCBI          NA         NA          NA
## AH91773 SQLite for PubMed Au..         NCBI          NA         NA          NA
## AH91774         SQLite for PMC         NCBI          NA         NA          NA
## AH91775 SQLite for MeSH (Des..         NCBI          NA         NA          NA
## ...                        ...          ...         ...        ...         ...
## AH91787 Data.table for PubMe..         NCBI          NA         NA          NA
## AH91788     Data.table for PMC         NCBI          NA         NA          NA
## AH91789 Data.table for MeSH ..         NCBI          NA         NA          NA
## AH91790 Data.table for MeSH ..         NCBI          NA         NA          NA
## AH91791 Data.table for MeSH ..         NCBI          NA         NA          NA
##                    description coordinate_1_based             maintainer
##                    <character>          <integer>            <character>
## AH91771                   PMID                  1 Koki Tsuyuzaki <k.t...
## AH91772 Correspondence table..                  1 Koki Tsuyuzaki <k.t...
## AH91773 Correspondence table..                  1 Koki Tsuyuzaki <k.t...
## AH91774 Correspondence table..                  1 Koki Tsuyuzaki <k.t...
## AH91775 Correspondence table..                  1 Koki Tsuyuzaki <k.t...
## ...                        ...                ...                    ...
## AH91787 Correspondence table..                  1 Koki Tsuyuzaki <k.t...
## AH91788 Correspondence table..                  1 Koki Tsuyuzaki <k.t...
## AH91789 Correspondence table..                  1 Koki Tsuyuzaki <k.t...
## AH91790 Correspondence table..                  1 Koki Tsuyuzaki <k.t...
## AH91791 Correspondence table..                  1 Koki Tsuyuzaki <k.t...
##         rdatadateadded preparerclass                           tags  rdataclass
##            <character>   <character>                         <list> <character>
## AH91771     2021-04-19   AHPubMedDbs         NCBI,PubMed,SQLite,...  SQLiteFile
## AH91772     2021-04-19   AHPubMedDbs         NCBI,PubMed,SQLite,...  SQLiteFile
## AH91773     2021-04-19   AHPubMedDbs         NCBI,PubMed,SQLite,...  SQLiteFile
## AH91774     2021-04-19   AHPubMedDbs            NCBI,PMC,SQLite,...  SQLiteFile
## AH91775     2021-04-19   AHPubMedDbs       Descriptor,MeSH,NCBI,...  SQLiteFile
## ...                ...           ...                            ...         ...
## AH91787     2021-04-19   AHPubMedDbs     data.table,NCBI,PubMed,...  data.table
## AH91788     2021-04-19   AHPubMedDbs        data.table,NCBI,PMC,...  data.table
## AH91789     2021-04-19   AHPubMedDbs data.table,Descriptor,MeSH,...  data.table
## AH91790     2021-04-19   AHPubMedDbs       data.table,MeSH,NCBI,...  data.table
## AH91791     2021-04-19   AHPubMedDbs       data.table,MeSH,NCBI,...  data.table
##                      rdatapath              sourceurl  sourcetype
##                    <character>            <character> <character>
## AH91771 AHPubMedDbs/v001/pub.. https://github.com/r..         XML
## AH91772 AHPubMedDbs/v001/abs.. https://github.com/r..         XML
## AH91773 AHPubMedDbs/v001/aut.. https://github.com/r..         XML
## AH91774 AHPubMedDbs/v001/pmc.. https://github.com/r..         XML
## AH91775 AHPubMedDbs/v001/des.. https://github.com/r..         XML
## ...                        ...                    ...         ...
## AH91787 AHPubMedDbs/v001/aut.. https://github.com/r..         XML
## AH91788 AHPubMedDbs/v001/pmc.. https://github.com/r..         XML
## AH91789 AHPubMedDbs/v001/des.. https://github.com/r..         XML
## AH91790 AHPubMedDbs/v001/qua.. https://github.com/r..         XML
## AH91791 AHPubMedDbs/v001/scr.. https://github.com/r..         XML
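
The printed DataFrame truncates long fields such as the description and the source URL. A small sketch of how a few columns could be selected and a full, untruncated value read back is shown here; the choice of columns and the variable name `md` are assumptions made for illustration.

```r
library(AnnotationHub)

ah <- AnnotationHub()
md <- mcols(query(ah, "PubMed"))

# mcols() returns a DataFrame, so ordinary column selection applies
print(md[, c("title", "rdataclass", "rdatadateadded")])

# Read the complete description and source URL of the first record, if any
if (nrow(md) > 0) {
    print(md[1, "description"])
    print(md[1, "sourceurl"])
}
```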

We can retrieve only the PubMedDb tibble files as follows.

qr <- query(ah, c("PubMedDb"))
# Uncomment to download and load the first matching record
# pubmed_tibble <- qr[[1]]
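
Because retrieving a record triggers a download, it can help to look up the record identifier first and fetch only on request. The following is a sketch under stated assumptions: it combines the search terms "PubMed" and "Tibble" (both must match), and `download_now` is a hypothetical switch introduced here, not part of AnnotationHub.

```r
library(AnnotationHub)

ah <- AnnotationHub()

# Records must match every pattern supplied to query()
qr_tibble <- query(ah, c("PubMed", "Tibble"))

# names() returns the AnnotationHub identifiers (e.g. "AH...") of the matches
ids <- names(qr_tibble)
print(ids)

download_now <- FALSE  # hypothetical switch; set to TRUE to fetch the data
if (download_now && length(ids) > 0) {
    # The file is cached locally after the first download
    pubmed_tibble <- ah[[ids[1]]]
    print(class(pubmed_tibble))
    print(dim(pubmed_tibble))
}
```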

Session information

## R Under development (unstable) (2021-01-20 r79850)
## Platform: x86_64-apple-darwin17.7.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
## 
## Matrix products: default
## BLAS:   /Users/ka36530_ca/R-stuff/bin/R-devel/lib/libRblas.dylib
## LAPACK: /Users/ka36530_ca/R-stuff/bin/R-devel/lib/libRlapack.dylib
## 
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
## [1] AnnotationHub_2.99.5 BiocFileCache_1.99.7 dbplyr_2.1.1        
## [4] BiocGenerics_0.37.5  BiocStyle_2.19.2    
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.6                    png_0.1-7                    
##  [3] Biostrings_2.59.2             assertthat_0.2.1             
##  [5] digest_0.6.27                 utf8_1.2.1                   
##  [7] mime_0.10                     R6_2.5.0                     
##  [9] stats4_4.1.0                  RSQLite_2.2.7                
## [11] evaluate_0.14                 httr_1.4.2                   
## [13] pillar_1.6.0                  zlibbioc_1.37.0              
## [15] rlang_0.4.11                  curl_4.3.1                   
## [17] jquerylib_0.1.4               blob_1.2.1                   
## [19] S4Vectors_0.29.18             rmarkdown_2.7                
## [21] stringr_1.4.0                 bit_4.0.4                    
## [23] shiny_1.6.0                   compiler_4.1.0               
## [25] httpuv_1.6.0                  xfun_0.22                    
## [27] pkgconfig_2.0.3               htmltools_0.5.1.1            
## [29] tidyselect_1.1.1              KEGGREST_1.31.2              
## [31] tibble_3.1.1                  interactiveDisplayBase_1.29.0
## [33] bookdown_0.22                 IRanges_2.25.11              
## [35] fansi_0.4.2                   withr_2.4.2                  
## [37] crayon_1.4.1                  dplyr_1.0.6                  
## [39] later_1.2.0                   rappdirs_0.3.3               
## [41] jsonlite_1.7.2                xtable_1.8-4                 
## [43] lifecycle_1.0.0               DBI_1.1.1                    
## [45] magrittr_2.0.1                stringi_1.5.3                
## [47] cachem_1.0.4                  XVector_0.31.1               
## [49] promises_1.2.0.1              bslib_0.2.4                  
## [51] ellipsis_0.3.2                filelock_1.0.2               
## [53] generics_0.1.0                vctrs_0.3.8                  
## [55] tools_4.1.0                   bit64_4.0.5                  
## [57] Biobase_2.51.0                glue_1.4.2                   
## [59] purrr_0.3.4                   BiocVersion_3.13.1           
## [61] fastmap_1.1.0                 yaml_2.2.1                   
## [63] AnnotationDbi_1.53.1          BiocManager_1.30.12          
## [65] memoise_2.0.0                 knitr_1.33                   
## [67] sass_0.3.1