TargetExperiment-class {TarSeqQC} | R Documentation |
This S4 class represents a Targeted Sequencing Experiment in R. Targeted Sequencing Experiments are characterized by a 'bed file' that contains the specification of the explored 'features' as a 'panel'. This features could be amplicons, exons, transcripts, among others. In general each feature is associated to one gene. A gene could be related to many features. This class allows the representation and quality control of a Targeted Sequencing Experiment.
scanBamP
ScanBamParam containing the information to scan the BAM file.
pileupP
PileupParam containing the information to build the pileup.
bedFile
GRanges object that models the bed file.
bamFile
BamFile object that is a reference to the BAM file.
fastaFile
FaFile object that is a reference to the reference sequence.
featurePanel
GRanges object that models the feature panel and related statistics.
genePanel
GRanges object that models the analyzed panel and related statistics at a gene level.
attribute
character indicates which attribute 'coverage' or 'medianCounts' will be used to the analysis.
feature
character indicates the name of the analyzed features. E.g 'amplicon', 'exon', 'transcript'.
Model Targeted Sequencing Experiments in R.
Obtain coverage and read counts per sequenced feature.
Evaluate the performance of a targeted sequencing experiment using coverage/read counts information.
Detect in early stage sequencing or library preparation errors.
Explore read profiles for particular features or genomic regions.
Explore any kind of experiment in which 'feature' definition is possible for several genes. E.g RNA-seq experiments in which transcripts could be the 'features'.
Report quality control results.
TargetExperiment S4 class includes the following functions:
calculate pileup statistics for the BAM file
build and model a feature panel as a GRanges object and compute read statistics
summarize the feature panel to a gene panel and compute read statistics
constructor of TargetExperiment to generate the feature and gene panels starting from an alignment BAM file and the bed file
return the respective TargetExperiment slot
set the respective TargetExperiment slots
generic output of the object
generic output of the object
print statistics summary for the set attribute
build a frequency table of the attribute occurrence in user configured intervals
plot a summarized view of the feature panel performance
plot the density and distribution of the attribute
plot the sequencing performance for each feature and/or gene
plot the reads profile for a particular feature
plot the explored attribute for each feature of a particular gene
plot nucleotide percentages for each position of a particular feature
plot the reads profile for a particular genomic region
calculate frequencies of reads fall in and out of targeted regions
plot frequencies of reads fall in and out of targeted regions
plot attribute distributions along groups of bias sources
plot density and box plots or frequency bar plot of metadata columns
internal function to add the first sheet of xlsx reports
build the experiment report as an xlsx file
Gabriela A. Merino gmerino@bdmg.com.ar, Cristobal Fresno cfresno@bdmg.com.ar, Yanina Murua ymurua@leloir.org.ar, Andrea S. Llera allera@leloir.org.ar and Elmer A. Fernandez efernandez@bdmg.com.ar
Rsamtools
Other TargetExperiment: TargetExperiment
,
ampliPanel2
, ampliPanel
,
initialize
, myCounts
## Defining bam file, bed file and fasta file names and paths bamFile<-system.file("extdata", "mybam.bam", package="TarSeqQC", mustWork=TRUE) bedFile<-system.file("extdata", "mybed.bed", package="TarSeqQC", mustWork=TRUE) fastaFile<-system.file("extdata", "myfasta.fa", package="TarSeqQC", mustWork=TRUE) ## Creating a TargetExperiment object # Defining feature parameter feature<-"amplicon" # Defining attribute parameter attribute<-"coverage" ampliPanel<-TargetExperiment(bedFile, bamFile, fastaFile, attribute=attribute, feature=feature) ## Alternative object creation # Creating the TargetExperiment object ampliPanel<-TargetExperiment(bedFile, bamFile, fastaFile) # Set feature slot value setFeature(ampliPanel)<-"amplicon" # Set attribute slot value setAttribute(ampliPanel)<-"coverage" # Set pileupP slot value in order to set the maximum depth at 1000 setPileupP(ampliPanel)<-PileupParam(max_depth=1000) # Set the featurePanel slot but now using the new pileupP definition setFeaturePanel(ampliPanel)<-buildFeaturePanel(ampliPanel) ## Early exploration # show/print ampliPanel # summary summary(ampliPanel) # summary at feature level summaryFeatureLev(ampliPanel) # summary at gene level summaryGeneLev(ampliPanel) # attribute boxplot and density plot exploration g<-plotAttrExpl(ampliPanel,level="feature",join=TRUE, log=FALSE, color="blue") if(interactive()){ x11(type="cairo");g } # explore amplicon length distribution g<-plotMetaDataExpl(ampliPanel, "length", log=FALSE, join=FALSE, color= "blueviolet") if(interactive()){ g } # explore gene's relative frequencies g<-plotMetaDataExpl(ampliPanel, "gene", abs=FALSE) if(interactive()){ g } ## Deep exploration and Quality Control myfrequencies<-readFrequencies(ampliPanel) g<-plotInOutFeatures(readFrequencies(ampliPanel)) if(interactive()){ g } # definition of the interval extreme values attributeThres<-c(0,1,50,200,500, Inf) # plot panel overview g<-plot(ampliPanel, attributeThres, chrLabels =TRUE) if(interactive()){ x11(type="cairo");g } # plot panel overview in a feature performance plot g<-plotFeatPerform(ampliPanel, attributeThres, complete=TRUE, log=FALSE, featureLabs=TRUE, sepChr=TRUE, legend=TRUE) if(interactive()){ g } # explore possible attribute bias g<-biasExploration(ampliPanel, source="gc", dens=TRUE) if(interactive()){ x11(type="cairo");g } ## Controlling low counts features # Do a frequency table for the attribute intervals summaryIntervals(ampliPanel, attributeThres) #plotting attribute intervals g<-plotAttrPerform(ampliPanel) if(interactive()){ g } # getting low counts features at gene level getLowCtsFeatures(ampliPanel, level="gene", threshold=50) # getting low counts features at feature level getLowCtsFeatures(ampliPanel, level="feature", threshold=50) # exploring amplicon attribute values for a particular gene g<-plotGeneAttrPerFeat(ampliPanel, geneID="gene4") # adjust text size g<-g+theme(title=element_text(size=16), axis.title=element_text(size=16), legend.text=element_text(size=14)) if(interactive()){ g } ##Obtain the pileup matrix for the first amplicon bed<-getBedFile(ampliPanel)[1] ## extracting the pileup matrix myCounts<-pileupCounts(bed, bamFile, fastaFile) head(myCounts) # getting and exploring a sequenced region of a particular gene getRegion(ampliPanel, level="gene", ID="gene7", collapse=FALSE) # plot a particular genomic region g<-plotRegion(ampliPanel,region=c(4500,6800), seqname="chr10", SNPs=TRUE, xlab="", title="gene7 amplicons",size=0.5) if(interactive()){ x11(type="cairo");g } # exploring the read count profile for a particular amplicon g<-plotFeature(ampliPanel, featureID="AMPL20") if(interactive()){ x11(type="cairo");g } # exploring the nucleotide percentages compositions of the read counts for a # particular amplicon g<-plotNtdPercentage(ampliPanel,featureID="AMPL20") if(interactive()){ g } ## Building the XLSX report imageFile<-system.file("extdata", "plot.pdf", package="TarSeqQC", mustWork=TRUE) buildReport(ampliPanel, attributeThres, imageFile ,file="Results.xlsx")