predictCoverage {alpine} | R Documentation |
Predict coverage for a single-isoform gene given fitted bias parameters in a set of models, and compare to the observed fragment coverage.
predictCoverage(gene, bam.files, fitpar, genome, model.names)
gene |
a GRangesList with the exons of different genes |
bam.files |
a character string pointing to indexed BAM files |
fitpar |
the output of running fitBiasModels |
genome |
a BSgenome object |
model.names |
a character vector listing the models, see same argument in estimateAbundance |
Note that if the range between minsize
and maxsize
does not cover most of the fragment length distribution, the
predicted coverage will underestimate the observed coverage.
a list with elements frag.cov, the observed fragment coverage
from the bam.files
and pred.cov, a list with the predicted
fragment coverage for each of the models
.
# these next lines just write out a BAM file from R # typically you would already have a BAM file library(alpineData) library(GenomicAlignments) library(rtracklayer) gap <- ERR188088() dir <- system.file(package="alpineData", "extdata") bam.file <- c("ERR188088" = file.path(dir,"ERR188088.bam")) export(gap, con=bam.file) data(preprocessedData) library(BSgenome.Hsapiens.NCBI.GRCh38) model.names <- c("fraglen","fraglen.vlmm","GC","all") pred.cov <- predictCoverage(gene=ebt.fit[["ENST00000379660"]], bam.files=bam.file, fitpar=fitpar.small, genome=Hsapiens, model.names=model.names) # plot the coverage: # note that, because [125,175] bp range specified in fitpar.small # does not cover the fragment width distribution, the predicted curves # will underestimate the observed. we correct here post-hoc frag.cov <- pred.cov[["ERR188088"]][["frag.cov"]] plot(frag.cov, type="l", lwd=3, ylim=c(0,max(frag.cov)*1.5)) for (i in seq_along(model.names)) { m <- model.names[i] pred <- pred.cov[["ERR188088"]][["pred.cov"]][[m]] lines(pred/mean(pred)*mean(frag.cov), col=i+1, lwd=3) } legend("topright", legend=c("observed",model.names), col=seq_len(length(model.names)+1), lwd=3)