getBamInfo {SGSeq} | R Documentation
Obtain paired-end status, median aligned read length, median aligned insert size and library size from BAM files.
getBamInfo(sample_info, yieldSize = NULL, cores = 1)
sample_info | Data frame with sample information, including the mandatory columns “sample_name” and “file_bam”. Column “sample_name” must be a character vector. Column “file_bam” can be a character vector or a BamFileList.
yieldSize | Number of records used for obtaining library information, or NULL to use all records.
cores | Number of cores available for parallel processing.
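As a minimal sketch of the input described above, the following base R code builds a sample_info data frame and checks the two mandatory columns before any BAM file is read. The sample names, the file paths and the helper check_sample_info() are hypothetical and not part of SGSeq.

```r
# Hypothetical sample names and BAM paths, for illustration only.
sample_info <- data.frame(
  sample_name = c("N1", "T1"),
  file_bam = c("bams/N1.bam", "bams/T1.bam"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of SGSeq): stop early if a mandatory
# column is missing or "sample_name" is not a character vector.
check_sample_info <- function(x) {
  required <- c("sample_name", "file_bam")
  missing_cols <- setdiff(required, names(x))
  if (length(missing_cols) > 0) {
    stop("missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  if (!is.character(x$sample_name)) {
    stop("column 'sample_name' must be a character vector")
  }
  invisible(TRUE)
}

check_sample_info(sample_info)
```

Setting stringsAsFactors = FALSE keeps “sample_name” a character vector on R versions before 4.0, where data.frame() converted strings to factors by default.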
BAM files must have been generated with a splice-aware alignment program that outputs the custom tag ‘XS’ for spliced reads, indicating the direction of transcription. BAM files must be indexed.
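Since the BAM files must be indexed, a quick pre-check can catch missing indexes before calling getBamInfo. The sketch below uses base R only and assumes that each index sits next to its BAM file under the name "<file>.bam.bai"; the paths are hypothetical. Missing indexes could then be created with a tool such as Rsamtools::indexBam().

```r
# Hypothetical BAM paths, for illustration only.
file_bam <- c("bams/N1.bam", "bams/T1.bam")

# Assumption: the index is stored alongside the BAM file as "<file>.bam.bai".
file_bai <- paste0(file_bam, ".bai")
has_index <- file.exists(file_bai)

# BAM files that still need an index
file_bam[!has_index]
```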
Library information can be inferred from a subset of BAM records by setting the number of records via argument yieldSize. Note that library size is only obtained if yieldSize is NULL.
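The effect of yieldSize can be sketched with the example data shipped with SGSeq. The value 1000 is an arbitrary choice for illustration, and the code is guarded so that it only runs when SGSeq is installed. Per the note above, the result of the second call is expected to lack library size, since yieldSize is not NULL.

```r
if (requireNamespace("SGSeq", quietly = TRUE)) {
  library(SGSeq)

  # Point the example sample information at the BAM files in the package
  path <- system.file("extdata", package = "SGSeq")
  si$file_bam <- file.path(path, "bams", si$file_bam)
  si <- si[, c("sample_name", "file_bam")]

  # Use all records (yieldSize = NULL, the default)
  si_all <- getBamInfo(si)

  # Infer library information from a subset of records;
  # 1000 is an arbitrary illustrative value
  si_subset <- getBamInfo(si, yieldSize = 1000)

  # Compare the columns returned by the two calls
  print(names(si_all))
  print(names(si_subset))
}
```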
sample_info with additional columns “paired_end”, “read_length” and “frag_length”, plus “lib_size” if yieldSize is NULL.
Leonard Goldstein
## attach SGSeq, which provides the example sample information si
library(SGSeq)

path <- system.file("extdata", package = "SGSeq")
si$file_bam <- file.path(path, "bams", si$file_bam)

## data.frame as sample_info and character vector as file_bam
si <- si[, c("sample_name", "file_bam")]
si_complete <- getBamInfo(si)

## DataFrame as sample_info and BamFileList as file_bam
DF <- DataFrame(si)
DF$file_bam <- BamFileList(DF$file_bam)
DF_complete <- getBamInfo(DF)