make_DBscores {universalmotif} | R Documentation |
Create P-value databases.
Description
Generate data used by compare_motifs() for P-value calculations. By default,
compare_motifs() uses an internal database based on the JASPAR2018 core motifs
(Khan et al. 2018). Parameters for distributions are estimated for every
combination of motif widths.
Usage
make_DBscores(db.motifs, method = c("PCC", "EUCL", "SW", "KL", "WEUCL",
"ALLR", "BHAT", "HELL", "WPCC", "SEUCL", "MAN", "ALLR_LL"),
shuffle.db = TRUE, shuffle.k = 3, shuffle.method = "linear",
rand.tries = 1000, widths = 5:30, min.position.ic = 0,
normalise.scores = c(FALSE, TRUE), min.overlap = 6, min.mean.ic = 0.25,
progress = TRUE, nthreads = 1, tryRC = TRUE, score.strat = c("sum",
"a.mean", "g.mean", "median", "wa.mean", "wg.mean", "fzt"))
Arguments
db.motifs |
list Database motifs.
|
method |
character(1) One of PCC, EUCL, SW, KL, ALLR, BHAT, HELL,
SEUCL, MAN, ALLR_LL, WEUCL, WPCC. See details.
|
shuffle.db |
logical(1) Deprecated. Does nothing.
generate random motifs with create_motif().
|
shuffle.k |
numeric(1) See shuffle_motifs().
|
shuffle.method |
character(1) See shuffle_motifs().
|
rand.tries |
numeric(1) Approximate number of comparisons
to perform for every combination of widths.
|
widths |
numeric Motif widths to use in P-value database calculation.
|
min.position.ic |
numeric(1) Minimum information content required between
individual alignment positions for it to be counted in the final alignment
score. It is recommended to use this together with normalise.scores = TRUE,
as this will help punish scores resulting from only a fraction of an
alignment.
|
normalise.scores |
logical(1) Favour alignments which leave fewer
unaligned positions, as well as alignments between motifs of similar length.
Similarity scores are multiplied by the ratio of
aligned positions to the total number of positions in the larger motif,
and the inverse for distance scores.
|
min.overlap |
numeric(1) Minimum overlap required when aligning the
motifs. Setting this to a number higher than the width of the motifs
will not allow any overhangs. Can also be a number between 0 and 1,
representing the minimum fraction that the motifs must overlap.
|
min.mean.ic |
numeric(1) Minimum mean information content between the
two motifs for an alignment to be scored. This helps prevent scoring
alignments between low information content regions of two motifs. Note that
this can result in some comparisons failing if no alignment passes the
mean IC threshold. Use average_ic() to filter out low IC motifs to get around
this if you want to avoid getting NAs in your output.
|
progress |
logical(1) Show progress.
|
nthreads |
numeric(1) Run compare_motifs() in parallel with nthreads
threads. nthreads = 0 uses all available threads.
|
tryRC |
logical(1) Try the reverse complement of the motifs as well,
and report the best score.
|
score.strat |
character(1) How to handle column scores calculated from
motif alignments. "sum": add up all scores. "a.mean": take the arithmetic
mean. "g.mean": take the geometric mean. "median": take the median.
"wa.mean", "wg.mean": weighted arithmetic/geometric mean. "fzt": Fisher
Z-transform. Weights are the
total information content shared between aligned columns.
|
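As a rough illustration of the score.strat options described above, the following base R sketch aggregates a vector of per-column scores in several ways. The column scores, the weights and the helper aggregate_scores() are hypothetical and use textbook definitions of each mean; the package's internal implementation may differ (for example in how it handles non-positive values). The "fzt" strategy is left out because its exact form is not given here.

```r
# Hypothetical per-column similarity scores from one alignment, and
# hypothetical weights (e.g. information content shared by each column pair).
col_scores <- c(0.9, 0.8, 0.4, 0.7)
weights    <- c(2.0, 1.5, 0.5, 1.0)

# Illustrative helper, not part of universalmotif.
aggregate_scores <- function(x, w, strat) {
  switch(strat,
    "sum"     = sum(x),
    "a.mean"  = mean(x),
    "g.mean"  = exp(mean(log(x))),              # assumes all x > 0
    "median"  = median(x),
    "wa.mean" = sum(w * x) / sum(w),
    "wg.mean" = exp(sum(w * log(x)) / sum(w)),  # assumes all x > 0
    stop("unknown strategy: ", strat)
  )
}

strategies <- c("sum", "a.mean", "g.mean", "median", "wa.mean", "wg.mean")
result <- sapply(strategies,
                 function(s) aggregate_scores(col_scores, weights, s))
print(round(result, 3))
```

Note that "sum" grows with the number of aligned columns, while the mean-based strategies do not, which is one reason the choice of strategy interacts with normalise.scores.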
Details
See compare_motifs() for more info on comparison parameters.
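The effect of the normalise.scores argument can be sketched in a few lines of base R. This is one reading of the argument description, not the package's code: the helper normalise_score() and its parameter names are hypothetical, and "the inverse for distance scores" is interpreted here as dividing by the same ratio.

```r
# Illustrative helper, not part of universalmotif.
# ratio = aligned positions / width of the larger motif
normalise_score <- function(score, n_aligned, width1, width2,
                            type = c("similarity", "distance")) {
  type <- match.arg(type)
  ratio <- n_aligned / max(width1, width2)
  if (type == "similarity") {
    score * ratio   # fewer aligned positions lowers a similarity score
  } else {
    score / ratio   # fewer aligned positions raises a distance score
  }
}

# Two motifs of widths 8 and 12 with 6 aligned positions: ratio = 6 / 12
sim  <- normalise_score(0.9, n_aligned = 6, width1 = 8, width2 = 12)
dist <- normalise_score(0.9, n_aligned = 6, width1 = 8, width2 = 12,
                        type = "distance")
print(c(similarity = sim, distance = dist))
```

Under this reading, a full-length alignment between motifs of equal width has a ratio of 1 and leaves the score unchanged.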
To replicate the internal universalmotif DB scores, run make_DBscores()
with the default settings. Note that this will be a slow process.
Arguments widths, method, normalise.scores and score.strat are
vectorized; all combinations will be attempted.
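To get a feel for how much work the default settings imply, the sketch below enumerates the combinations of method, normalise.scores and score.strat taken from the Usage section, using only base R. It only counts settings and does not call make_DBscores(); the variable name grid is arbitrary, and each setting is additionally evaluated across the requested widths.

```r
# Default vectors copied from the Usage section.
methods <- c("PCC", "EUCL", "SW", "KL", "WEUCL", "ALLR",
             "BHAT", "HELL", "WPCC", "SEUCL", "MAN", "ALLR_LL")
strats  <- c("sum", "a.mean", "g.mean", "median",
             "wa.mean", "wg.mean", "fzt")

# One row per combination of settings.
grid <- expand.grid(
  method           = methods,
  normalise.scores = c(FALSE, TRUE),
  score.strat      = strats,
  stringsAsFactors = FALSE
)

print(nrow(grid))  # 12 methods x 2 normalisation settings x 7 strategies
print(head(grid, 3))
```

Restricting any of these arguments to a single value, as the Examples section does with method = "PCC", shrinks the grid accordingly.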
Value
A DataFrame with score distributions for the input database. If more than
one make_DBscores() run occurs (i.e. args method, normalise.scores or
score.strat are longer than 1), then the function args are included in the
metadata slot.
Author(s)
Benjamin Jean-Marie Tremblay, benjamin.tremblay@uwaterloo.ca
References
Khan A, Fornes O, Stigliani A, Gheorghe M, Castro-Mondragon JA,
van der Lee R, Bessy A, Cheneby J, Kulkarni SR, Tan G, Baranasic
D, Arenillas DJ, Sandelin A, Vandepoele K, Lenhard B, Ballester B,
Wasserman WW, Parcy F, Mathelier A (2018). “JASPAR 2018: update of
the open-access database of transcription factor binding profiles
and its web framework.” Nucleic Acids Research, 46, D260-D266.
See Also
compare_motifs()
Examples
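## Not run:
library(universalmotif)
library(MotifDb)
## Convert the first 100 MotifDb entries to universalmotif objects
motifs <- convert_motifs(MotifDb[1:100])
## Build a P-value database for a single comparison method
scores <- make_DBscores(motifs, method = "PCC")
## Use the custom database in place of the internal one
compare_motifs(motifs, 1:100, db.scores = scores)
## End(Not run)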
[Package universalmotif version 1.11.17 Index]