bulk_windows_pipeline_setup {FLAMES} | R Documentation |
An implementation of the FLAMES pipeline designed to run on Windows, or any OS without access to minimap2 for read alignment. Read alignment and realignment must be performed externally, in between pipeline calls.
bulk_windows_pipeline_setup( annot, fastq, in_bam = NULL, outdir, genome_fa, downsample_ratio = 1, config_file )
annot |
gene annotations file in gff3 format |
fastq |
file path to input fastq file |
in_bam |
optional bam file to use instead of fastq files (skips read alignment step) |
outdir |
directory to store all output files. |
genome_fa |
genome fasta file. |
downsample_ratio |
downsampling ratio if performing downsampling analysis. |
config_file |
JSON configuration file. If specified, |
This function, bulk_windows_pipeline_setup, is the first step in the three-step Windows FLAMES bulk pipeline. Run it first and perform read alignment externally; then run windows_pipeline_isoforms and perform read realignment externally; finally, run windows_pipeline_quantification.
Every step before windows_pipeline_quantification, including bulk_windows_pipeline_setup, returns a list pipeline_variables, which contains the information required to continue the pipeline. This list should be passed into each subsequent function and replaced with the list that function returns. See the vignette 'Vignette for FLAMES bulk on Windows' for more details.
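The hand-off described above can be sketched without FLAMES installed. The three functions below (setup_step, isoforms_step, quantification_step) and the isoforms_done field are hypothetical stand-ins, not part of the package; they only mimic how pipeline_variables is returned, extended with the externally produced BAM paths, and passed on. The genome_bam and realign_bam field names follow the example at the end of this page.

```r
# Hypothetical stand-ins for the three pipeline steps. They are NOT the FLAMES
# functions; they only illustrate how the list is threaded through the steps.
setup_step <- function(outdir) {
  list(outdir = outdir)
}

isoforms_step <- function(pipeline_variables) {
  # the genome BAM path must have been added after external alignment
  stopifnot(!is.null(pipeline_variables$genome_bam))
  pipeline_variables$isoforms_done <- TRUE
  pipeline_variables
}

quantification_step <- function(pipeline_variables) {
  # the realigned BAM path must have been added after external realignment
  stopifnot(isTRUE(pipeline_variables$isoforms_done),
            !is.null(pipeline_variables$realign_bam))
  "quantified"
}

# step 1, then record where the externally aligned BAM was placed
pipeline_variables <- setup_step(tempdir())
pipeline_variables$genome_bam <- file.path(pipeline_variables$outdir, "align2genome.bam")

# step 2, then record where the externally realigned BAM was placed
pipeline_variables <- isoforms_step(pipeline_variables)
pipeline_variables$realign_bam <- file.path(pipeline_variables$outdir, "realign2genome.bam")

# step 3 consumes the list and returns the final result
result <- quantification_step(pipeline_variables)
```

The point of the sketch is the reassignment after each step: dropping the returned list, or forgetting to add a BAM path before the next call, leaves the following step without the information it needs.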
A list pipeline_variables with the required variables for execution of later Windows pipeline steps. File paths required to perform minimap2 alignment are given in pipeline_variables$return_files. This list should be given as input for windows_pipeline_isoforms after minimap2 alignment has taken place; windows_pipeline_isoforms is the continuation of this pipeline.
## example windows pipeline for BULK data. See Vignette for single cell data.

# download the two fastq files, move them to a folder to be merged together
temp_path <- tempfile()
bfc <- BiocFileCache::BiocFileCache(temp_path, ask=FALSE)
file_url <- "https://raw.githubusercontent.com/OliverVoogd/FLAMESData/master/data"

# download the required fastq files, and move them to new folder
fastq1 <- bfc[[names(BiocFileCache::bfcadd(bfc, "Fastq1", paste(file_url, "fastq/sample1.fastq.gz", sep="/")))]]
fastq2 <- bfc[[names(BiocFileCache::bfcadd(bfc, "Fastq2", paste(file_url, "fastq/sample2.fastq.gz", sep="/")))]]
fastq_dir <- paste(temp_path, "fastq_dir", sep="/") # the downloaded fastq files need to be in a directory to be merged together
dir.create(fastq_dir)
file.copy(c(fastq1, fastq2), fastq_dir)
unlink(c(fastq1, fastq2)) # the original files can be deleted

# run the FLAMES bulk pipeline setup
#pipeline_variables <- bulk_windows_pipeline_setup(annot=system.file("extdata/SIRV_anno.gtf", package="FLAMES"),
#    fastq=fastq_dir,
#    outdir=tempdir(), genome_fa=system.file("extdata/SIRV_genomefa.fasta", package="FLAMES"),
#    config_file=system.file("extdata/SIRV_config_default.json", package="FLAMES"))

# read alignment is handled externally (below downloads aligned bam for example)
# genome_bam <- paste0(temp_path, "/align2genome.bam")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Genome BAM", paste(file_url, "align2genome.bam", sep="/")))]], genome_bam)
#
# genome_index <- paste0(temp_path, "/align2genome.bam.bai")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Genome BAM Index", paste(file_url, "align2genome.bam.bai", sep="/")))]], genome_index)
# pipeline_variables$genome_bam = genome_bam
#
# # run the FLAMES bulk pipeline find isoforms step
# pipeline_variables <- windows_pipeline_isoforms(pipeline_variables)
#
# # read realignment is handled externally
# realign_bam <- paste0(temp_path, "/realign2genome.bam")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Realign BAM", paste(file_url, "realign2transcript.bam", sep="/")))]], realign_bam)
#
# realign_index <- paste0(temp_path, "/realign2genome.bam.bai")
# file.rename(bfc[[names(BiocFileCache::bfcadd(bfc, "Realign BAM Index", paste(file_url, "realign2transcript.bam.bai", sep="/")))]], realign_index)
# pipeline_variables$realign_bam <- realign_bam
#
# # finally, quantification, which returns a Summarized Experiment object
# se <- windows_pipeline_quantification(pipeline_variables)