download.SRA.metadata {ORFik} | R Documentation |
Given an experiment identifier, query different locations of SRA to build a complete metadata table for the experiment. The function first finds the run info for each library, then the sample info. If no PubMed ID is found, it searches for one, and then searches for the author through PubMed. A common problem is that the project is not linked to an article; in that case you will not get a PubMed ID.
download.SRA.metadata( SRP, outdir = tempdir(), remove.invalid = TRUE, auto.detect = FALSE, abstract = "printsave" )
SRP |
a string, a study ID given as the PRJ, SRP, ERP, DRP or GSE of the study; examples are "SRP226389" or "ERP116106". If a GSE is given, it will try to convert it to the SRP to find the files. The call works as long as the runs are registered on the efetch server and there is a linked SRP from the BioProject or GSE. An example that fails is "PRJNA449388", which has no such link. |
outdir |
directory to save the file in, default: tempdir(). The file will be called "SraRunInfo_SRP.csv", where SRP is the SRP argument. We advise using BioProject IDs ("PRJNA..."). The directory will be created if it does not exist. |
remove.invalid |
logical, default TRUE. Remove Runs with 0 reads (spots) |
auto.detect |
logical, default FALSE. If TRUE, ORFik will add additional columns: |
abstract |
character, default "printsave". If an abstract exists for the project,
print it and save it (the file is saved to the same directory as the runinfo).
Alternatives: "print": only print the abstract the first time it is downloaded;
it cannot be printed again later. |
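The file naming rule described for outdir can be sketched in base R. This is a minimal illustration only: runinfo_path is a hypothetical helper, not an ORFik function, and it assumes nothing beyond the documented "SraRunInfo_SRP.csv" naming scheme.

```r
# Hypothetical helper (not part of ORFik): build the expected location of the
# run info file from the documented naming scheme "SraRunInfo_<SRP>.csv".
runinfo_path <- function(SRP, outdir = tempdir()) {
  file.path(outdir, paste0("SraRunInfo_", SRP, ".csv"))
}

path <- runinfo_path("SRP226389", outdir = tempdir())
basename(path)

# TRUE only if a run info file with this name already exists in outdir
file.exists(path)
```

Checking for the file this way may help explain why a fresh output directory is needed to force a new download, as the examples for this function suggest.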
a data.table of the metadata, 1 row per sample, SRR run number defined in Run column.
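To illustrate the shape of the returned table and the effect of remove.invalid, here is a small sketch on made-up data. It uses a base R data.frame as a stand-in for the data.table, and it assumes the read count is stored in a column named spots; the run accessions and counts are invented for illustration and do not come from any real study.

```r
# Toy stand-in for the returned metadata (the real table has many more columns).
# Run accessions and spot counts are made up.
runinfo <- data.frame(
  Run = c("SRR0000001", "SRR0000002", "SRR0000003"),
  spots = c(1200000, 0, 850000),
  stringsAsFactors = FALSE
)

# Mimic remove.invalid = TRUE: drop runs with 0 reads (spots)
valid <- runinfo[runinfo$spots > 0, ]

# SRR run numbers of the remaining libraries, taken from the Run column
valid$Run
```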
doi: 10.1093/nar/gkq1019
Other sra: download.SRA(), download.ebi(), install.sratoolkit(), rename.SRA.files()
## Originally on SRA
download.SRA.metadata("SRP226389")
## Now try with auto detection (guessing additional library info)
## Need to specify output dir as tempfile() to re-download
#download.SRA.metadata("SRP226389", tempfile(), auto.detect = TRUE)
## Originally on ENA (RCP-seq data)
# download.SRA.metadata("ERP116106")
## Originally on GEO (GSE) (save to directory to keep info with fastq files)
# download.SRA.metadata("GSE61011", "/path/to/fastq.folder/")