dba {DiffBind} | R Documentation |
Construct a DBA object
Description
Constructs a new DBA object from a sample sheet,
or based on an existing DBA object.
Usage
dba(DBA,mask, minOverlap=2,
sampleSheet="dba_samples.csv",
config=data.frame(AnalysisMethod=DBA_DESEQ2,th=0.05,
DataType=DBA_DATA_GRANGES, RunParallel=TRUE,
minQCth=15, fragmentSize=125,
bCorPlot=FALSE, reportInit="DBA",
bUsePval=FALSE, design=TRUE,
doBlacklist=TRUE, doGreylist=TRUE),
peakCaller="raw", peakFormat, scoreCol, bLowerScoreBetter,
filter, skipLines=0,
bAddCallerConsensus=FALSE,
bRemoveM=TRUE, bRemoveRandom=TRUE,
bSummarizedExperiment=FALSE,
attributes, dir)
Arguments
DBA |
existing DBA object – if present, will return a fully-constructed DBA object
based on the passed one,
using criteria specified in the mask and/or minOverlap parameters.
If missing, will create a new DBA object based on the sampleSheet.
|
mask |
logical or numerical vector indicating which peaksets to include
in the resulting model if basing DBA object on an existing one.
See dba.mask.
|
minOverlap |
only include peaks in at least this many peaksets in the main binding matrix
if basing DBA object on an existing one.
If minOverlap is between zero and one, peaks will be included if they occur in at
least this proportion of peaksets.
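The two readings of minOverlap (a count, or a proportion between zero and one) can be illustrated with a small base R sketch. The occupancy matrix is made-up data, and the code only mirrors the rule as described here; it is not DiffBind's internal implementation.

```r
# Hypothetical occupancy matrix: rows are candidate peaks, columns are peaksets.
# TRUE means the peak was called in that peakset.
occupancy <- matrix(
  c(TRUE,  TRUE,  TRUE,
    TRUE,  FALSE, FALSE,
    TRUE,  TRUE,  FALSE,
    FALSE, FALSE, TRUE),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("peak", 1:4), paste0("set", 1:3))
)

# minOverlap as a count: keep peaks found in at least 2 peaksets.
keep_by_count <- rowSums(occupancy) >= 2

# minOverlap as a proportion: keep peaks found in at least half of the peaksets.
keep_by_prop <- rowMeans(occupancy) >= 0.5

rownames(occupancy)[keep_by_count]
rownames(occupancy)[keep_by_prop]
```

In this toy case both thresholds select the same peaks, because two of three peaksets is already more than half.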
|
sampleSheet |
data frame containing sample sheet, or file name of sample sheet to load
(ignored if DBA is specified).
Column names in sample sheet may include:
-
SampleID: Identifier string for sample.
Must be unique for each sample.
-
Tissue: Identifier string for tissue type
-
Factor: Identifier string for factor
-
Condition: Identifier string for condition
-
Treatment: Identifier string for treatment
-
Replicate: Replicate number of sample
-
bamReads: file path for bam file containing aligned reads for ChIP sample
-
bamControl: file path for bam file containing aligned reads for control sample
-
Spikein: file path for bam file containing aligned spike-in reads
-
ControlID: Identifier string for control sample
-
Peaks: path for file containing peaks for sample.
Format determined by PeakCaller field or peakCaller parameter
-
PeakCaller: Identifier string for peak caller used.
If Peaks is not a bed file, this will determine how the Peaks file is parsed.
If missing, will use default peak caller specified in peakCaller parameter. Possible values:
-
“raw”: text file; peak score is in fourth column
-
“bed”: .bed file; peak score is in fifth column
-
“narrow”: default peak.format: narrowPeaks file
-
“macs”: MACS .xls file
-
“swembl”: SWEMBL .peaks file
-
“bayes”: bayesPeak file
-
“peakset”: peakset written out using pv.writepeakset
-
“fp4”: FindPeaks v4
-
PeakFormat: string indicating format for peak files;
see PeakCaller and dba.peakset
-
ScoreCol: column in peak files that contains peak scores
-
LowerBetter: logical indicating that lower scores signify better peaks
-
Counts: file path for externally computed read counts;
see dba.peakset
(counts parameter)
For sample sheets loaded from a file, the accepted formats are comma-separated values
(column headers, followed by one line per sample),
or Excel-formatted spreadsheets (.xls
or .xlsx extension).
Leading and trailing white space will be removed from all values, with a warning.
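As a rough sketch of the layout described above, a sample sheet can be assembled as a data frame and written out as CSV. Every identifier and file path below is a hypothetical placeholder, and only a subset of the possible columns is shown.

```r
# Minimal sample sheet as a data frame. All identifiers and file paths are
# hypothetical placeholders; replace them with your own.
samples <- data.frame(
  SampleID   = c("ctrl_1", "ctrl_2", "treat_1", "treat_2"),
  Tissue     = "MCF7",
  Factor     = "ER",
  Condition  = c("control", "control", "treated", "treated"),
  Replicate  = c(1, 2, 1, 2),
  bamReads   = file.path("reads", c("ctrl_1.bam", "ctrl_2.bam",
                                    "treat_1.bam", "treat_2.bam")),
  Peaks      = file.path("peaks", c("ctrl_1.bed", "ctrl_2.bed",
                                    "treat_1.bed", "treat_2.bed")),
  PeakCaller = "bed",
  stringsAsFactors = FALSE
)

# SampleID must be unique for each sample.
stopifnot(anyDuplicated(samples$SampleID) == 0)

# Write as CSV: column headers, followed by one line per sample.
sheet_file <- file.path(tempdir(), "dba_samples.csv")
write.csv(samples, sheet_file, row.names = FALSE)
```

Either the data frame (`samples`) or the file name (`sheet_file`) could then be passed as sampleSheet, provided the referenced peak and read files actually exist.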
|
config |
data frame containing configuration options,
or file name of config file to load when constructing a new DBA object from a sample sheet.
NULL indicates no config file.
Relevant fields include:
-
AnalysisMethod: either DBA_DESEQ2 or DBA_EDGER.
-
th: default threshold for reporting and
plotting analysis results.
-
DataType: default class for peaks and reports
(DBA_DATA_GRANGES, DBA_DATA_RANGEDDATA, or DBA_DATA_FRAME).
-
RunParallel: logical indicating if counting and analysis
operations should be run in parallel using multicore by default.
-
minQCth: numeric, for filtering reads based on mapping
quality score; only reads with a mapping quality score
greater than or equal to this will be counted.
-
fragmentSize: numeric with mean fragment size.
Reads will be extended to this length before counting overlaps.
May be a vector of lengths, one for each sample.
-
bCorPlot: logical indicating that a correlation heatmap
should be plotted automatically
-
reportInit: string to prepend to saved
report file names.
-
bUsePval: logical, default indicating whether to use FDR
(FALSE) or p-values (TRUE) when thresholding results.
-
doBlacklist: logical, whether to attempt to find and apply
a blacklist if none is present when running dba.analyze.
-
doGreylist: logical, whether to attempt to generate and apply
a greylist if none is present when running dba.analyze.
|
peakCaller |
if a sampleSheet is specified, the default peak caller that will be used
if the PeakCaller column is absent.
|
peakFormat |
if a sampleSheet is specified, the default peak file format
that will be used if the PeakFormat column is absent.
|
scoreCol |
if a sampleSheet is specified, the default column
in the peak files that will be used
for scoring if the ScoreCol column is absent.
|
bLowerScoreBetter |
if a sampleSheet is specified, the sort order for peak scores
if the LowerBetter column is absent.
|
filter |
if a sampleSheet is specified, a filter value if the
Filter column is absent.
Peaks with scores lower than this value
(or higher if bLowerScoreBetter or LowerBetter is
TRUE) will be removed.
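The filtering rule can be sketched in base R as follows. `filter_peaks` is a hypothetical helper written for illustration and is not part of DiffBind; the peak table is made-up data.

```r
# Hypothetical helper, not part of DiffBind: drop peaks that fail a score filter.
filter_peaks <- function(peaks, filter, lower_better = FALSE) {
  keep <- if (lower_better) {
    peaks$score <= filter   # lower is better: remove scores above the filter
  } else {
    peaks$score >= filter   # higher is better: remove scores below the filter
  }
  peaks[keep, , drop = FALSE]
}

# Made-up peaks with a score column.
peaks <- data.frame(
  chr   = c("chr1", "chr1", "chr2"),
  start = c(100, 500, 900),
  end   = c(200, 650, 1000),
  score = c(0.2, 5.0, 12.5),
  stringsAsFactors = FALSE
)

high_is_better <- filter_peaks(peaks, filter = 1)
low_is_better  <- filter_peaks(peaks, filter = 1, lower_better = TRUE)
```

With higher scores treated as better, the peak scoring 0.2 is dropped; with lower scores treated as better, it is the only peak retained.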
|
skipLines |
if a sampleSheet is specified, the number of lines (i.e., header lines)
at the beginning of each peak file to skip.
|
bAddCallerConsensus |
add a consensus peakset for each sample with more than one peakset
(i.e. different peak callers) when constructing a new DBA object from a
sampleSheet.
|
bRemoveM |
logical indicating whether to remove peaks on chrM (mitochondria)
when constructing a new DBA object from a sample sheet.
|
bRemoveRandom |
logical indicating whether to remove peaks on chrN_random when
constructing a new DBA object from a sample sheet.
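The effect of bRemoveM and bRemoveRandom can be pictured with a short base R sketch. The peak table is made-up, and the exact matching rules used here (an exact match on `chrM`, and a `_random` suffix) are assumptions for illustration rather than DiffBind's documented behaviour.

```r
# Hypothetical peak table with UCSC-style chromosome names.
peaks <- data.frame(
  chr   = c("chr1", "chrM", "chr1_random", "chr2"),
  start = c(100, 50, 10, 900),
  end   = c(200, 150, 60, 1000),
  stringsAsFactors = FALSE
)

# Assumed matching rules, for illustration only.
is_mito   <- peaks$chr == "chrM"            # mitochondrial peaks
is_random <- grepl("_random$", peaks$chr)   # peaks on chrN_random contigs

# Keep only peaks on neither kind of chromosome.
kept <- peaks[!(is_mito | is_random), , drop = FALSE]
kept$chr
```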
|
bSummarizedExperiment |
logical indicating whether to return resulting object as a SummarizedExperiment.
|
bCorPlot |
logical indicating that a correlation heatmap should be plotted before returning.
If DBA is NULL (a new DBA object is being created),
and bCorPlot is missing, then this will take the default value (FALSE).
However if DBA is NULL (a new DBA object is being created),
and bCorPlot is specified, then the specified value will become the
default value of bCorPlot for the resultant DBA object.
|
attributes |
vector of attributes to use subsequently as defaults when generating
labels in plotting functions:
-
DBA_ID
-
DBA_TISSUE
-
DBA_FACTOR
-
DBA_CONDITION
-
DBA_TREATMENT
-
DBA_REPLICATE
-
DBA_CONSENSUS
-
DBA_CALLER
-
DBA_CONTROL
|
dir |
Directory path.
If supplied, files referenced in the sampleSheet will have
this path prepended.
Applies to Peaks, bamReads, bamControl,
and Spikein, if present.
If sampleSheet is a file path, this will be prepended to that as well.
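A minimal sketch of what prepending dir amounts to, using base R's `file.path`. The directory and file names are hypothetical, and this only mimics the described behaviour on a plain data frame.

```r
# Hypothetical base directory and sample sheet with relative paths.
dir <- "/data/chipseq"
samples <- data.frame(
  SampleID = c("s1", "s2"),
  Peaks    = c("peaks/s1.bed", "peaks/s2.bed"),
  bamReads = c("reads/s1.bam", "reads/s2.bam"),
  stringsAsFactors = FALSE
)

# Prepend dir to whichever path columns are present in the sheet.
path_cols <- intersect(c("Peaks", "bamReads", "bamControl", "Spikein"),
                       names(samples))
samples[path_cols] <- lapply(samples[path_cols],
                             function(p) file.path(dir, p))

samples$Peaks
```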
|
Details
MODE: Construct a new DBA object from a sample sheet:
dba(sampleSheet, config,
bAddCallerConsensus, bRemoveM, bRemoveRandom,
attributes)
MODE: Construct a DBA object based on an existing one:
dba(DBA, mask, attributes)
MODE: Convert a DBA object to a SummarizedExperiment object:
dba(DBA, bSummarizedExperiment=TRUE)
Value
DBA object
Author(s)
Rory Stark and Gordon Brown
See Also
dba.peakset
, dba.show
, DBA.config
.
Examples
# Create DBA object from a sample sheet
## Not run:
basedir <- system.file("extra", package="DiffBind")
tamoxifen <- dba(sampleSheet="tamoxifen.csv", dir=basedir)
tamoxifen
tamoxifen <- dba(sampleSheet="tamoxifen_allfields.csv")
tamoxifen
tamoxifen <- dba(sampleSheet="tamoxifen_allfields.csv",config="config.csv")
tamoxifen
## End(Not run)
# Create a DBA object with a subset of samples
data(tamoxifen_peaks)
Responsive <- dba(tamoxifen,tamoxifen$masks$Responsive)
Responsive
# change peak caller but leave peak format the same
basedir <- system.file("extra", package="DiffBind")
tamoxifen <- dba(sampleSheet="tamoxifen.csv", dir=basedir,
peakCaller="macs", peakFormat="raw", scoreCol=5)
dba.show(tamoxifen, attributes=c(DBA_TISSUE,DBA_CONDITION,DBA_REPLICATE,DBA_CALLER))
# Convert DBA object to SummarizedExperiment
data(tamoxifen_counts)
sset <- dba(tamoxifen,bSummarizedExperiment=TRUE)
sset
[Package DiffBind version 3.4.0]