GEOfastq can be installed from Bioconductor as follows:
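A minimal sketch of the usual Bioconductor installation route, assuming BiocManager is the installer in use (it is fetched from CRAN first if absent):

```r
# Install BiocManager from CRAN if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

# Install GEOfastq from Bioconductor
BiocManager::install("GEOfastq")
```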
The NCBI Gene Expression Omnibus (GEO) offers a convenient interface to explore high-throughput experimental data such as RNA-seq. GEO deposits RNA-seq data as sra files in the Sequence Read Archive (SRA), and these can be converted to fastq files using fastq-dump. This conversion can be quite slow, so it is usually more convenient to download the fastq files that the European Nucleotide Archive (ENA) generates for a GEO accession. GEOfastq crawls GEO to retrieve metadata and ENA fastq URLs, and then downloads the fastq files.
To get fastq data for a GEO series, we first retrieve the metadata for a GEO accession:
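A sketch of this step using the package's crawl_gse function. The accession below is an assumption chosen for illustration, since the text does not name one. Note that the sample index 182 used in the next step presumes a series with at least that many samples.

```r
library(GEOfastq)

# GEO series accession to look up (illustrative assumption; substitute your own)
gse_name <- "GSE133758"

# Crawl the GEO series page; the result is what extract_gsms() consumes next
gse_text <- crawl_gse(gse_name)
```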
Next, we extract the sample accessions for this study and retrieve the GEO metadata and ENA fastq URL for one example sample:
gsm_names <- extract_gsms(gse_text)
gsm_name <- gsm_names[182]
srp_meta <- crawl_gsms(gsm_name)
#> 1 GSMs to process
Now that we have retrieved the necessary metadata, we are ready to download the fastq files for this sample:
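A sketch of the download call, assuming srp_meta is the object returned by crawl_gsms above. The use of tempdir() as the destination is an assumption for illustration; a persistent directory is more sensible for real data, since fastq files for a full run can be large.

```r
library(GEOfastq)

# Destination directory (tempdir() is an illustrative assumption)
data_dir <- tempdir()

# Download the fastq files for the run(s) described in srp_meta
res <- get_fastqs(srp_meta, data_dir)
```

After the call finishes, list.files(data_dir) shows what was written to the destination directory.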
The following packages and versions were used to produce this vignette.
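A listing in this format is what base R's sessionInfo() prints; a minimal call that produces an equivalent report for the current session:

```r
# Print the R version, platform, and attached/loaded package versions
sessionInfo()
```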
#> R version 4.1.1 Patched (2021-08-22 r80813)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Mojave 10.14.6
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] GEOfastq_1.2.0
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.7 knitr_1.36 magrittr_2.0.1 doParallel_1.0.16
#> [5] R6_2.5.1 rlang_0.4.12 fastmap_1.1.0 foreach_1.5.1
#> [9] stringr_1.4.0 plyr_1.8.6 tools_4.1.1 parallel_4.1.1
#> [13] xfun_0.27 jquerylib_0.1.4 htmltools_0.5.2 iterators_1.0.13
#> [17] yaml_2.2.1 digest_0.6.28 sass_0.4.0 bitops_1.0-7
#> [21] codetools_0.2-18 RCurl_1.98-1.5 evaluate_0.14 rmarkdown_2.11
#> [25] stringi_1.7.5 compiler_4.1.1 bslib_0.3.1 jsonlite_1.7.2