merge_similar {universalmotif} | R Documentation |
Identify and merge similar motifs within a collection of motifs.
Description
Given a list of motifs, merge_similar() will identify similar motifs with
compare_motifs(), and merge similar ones with merge_motifs().
Usage
merge_similar(motifs, threshold = 0.95, threshold.type = "score.abs",
method = "PCC", use.type = "PPM", min.overlap = 6, min.mean.ic = 0,
tryRC = TRUE, relative_entropy = FALSE, normalise.scores = FALSE,
min.position.ic = 0, score.strat.compare = "a.mean",
score.strat.merge = "sum", nthreads = 1)
Arguments
motifs |
See convert_motifs() for acceptable motif formats.
|
threshold |
numeric(1) The minimum (for similarity metrics) or maximum (for
distance metrics) threshold score for merging.
|
threshold.type |
character(1) Type of score used for thresholding.
Currently unused.
|
method |
character(1) One of PCC, EUCL, SW, KL, BHAT, HELL,
SEUCL, MAN, WEUCL, WPCC. See compare_motifs(). (The ALLR and ALLR_LL
methods cannot be used for distance matrix construction.)
|
use.type |
character(1) One of 'PPM' and 'ICM'.
The latter allows for taking into account the background
frequencies if relative_entropy = TRUE. Note that 'ICM' is not
allowed when method = c("ALLR", "ALLR_LL").
|
min.overlap |
numeric(1) Minimum overlap required when aligning the
motifs. Setting this to a number higher than the width of the motifs
will not allow any overhangs. Can also be a number between 0 and 1,
representing the minimum fraction that the motifs must overlap.
|
min.mean.ic |
numeric(1) Minimum mean information content between the
two motifs for an alignment to be scored. This helps prevent scoring
alignments between low information content regions of two motifs. Note that
this can result in some comparisons failing if no alignment passes the
mean IC threshold. To avoid getting NAs in your output, use average_ic()
to filter out low IC motifs beforehand.
|
tryRC |
logical(1) Also try the reverse complement of the motifs,
and report the best score.
|
relative_entropy |
logical(1) Change the ICM calculation affecting
min.position.ic and min.mean.ic. See convert_type().
|
normalise.scores |
logical(1) Favour alignments which leave fewer
unaligned positions, as well as alignments between motifs of similar length.
Similarity scores are multiplied by the ratio of
aligned positions to the total number of positions in the larger motif,
and the inverse for distance scores.
|
min.position.ic |
numeric(1) Minimum information content required between
individual alignment positions for them to be counted in the final alignment
score. It is recommended to use this together with normalise.scores = TRUE,
as this will help penalise scores resulting from only a fraction of an
alignment.
|
score.strat.compare |
character(1) The score.strat parameter used
by compare_motifs(). For clustering purposes, the "sum" option cannot
be used.
|
score.strat.merge |
character(1) The score.strat parameter used
by merge_motifs(). As discussed in merge_motifs(), the "sum" option
is recommended over "a.mean" to maximize the overlap between motifs.
|
nthreads |
numeric(1) Run compare_motifs() in parallel with nthreads
threads. nthreads = 0 uses all available threads.
|
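The arithmetic described for normalise.scores can be illustrated with a small base R sketch. This is not the package's code: normalise_score() and its argument names are hypothetical, and the sketch only follows the rule stated above (similarity scores multiplied by the ratio of aligned positions to positions in the larger motif, distance scores divided by it).

```r
# Hypothetical helper, for illustration only (not part of universalmotif).
# n_aligned: number of aligned positions
# n_larger:  total number of positions in the larger motif
normalise_score <- function(score, n_aligned, n_larger,
                            type = c("similarity", "distance")) {
  type <- match.arg(type)
  ratio <- n_aligned / n_larger
  if (type == "similarity") {
    score * ratio   # fewer aligned positions lowers a similarity score
  } else {
    score / ratio   # the inverse: fewer aligned positions raises a distance
  }
}

# An alignment covering 6 of the 10 positions of the larger motif
sim_norm  <- normalise_score(0.9, n_aligned = 6, n_larger = 10, type = "similarity")
dist_norm <- normalise_score(0.2, n_aligned = 6, n_larger = 10, type = "distance")
print(c(similarity = sim_norm, distance = dist_norm))
```

In both cases a partial alignment ends up looking worse than a full-length alignment with the same raw score.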
Details
See compare_motifs() for more info on comparison parameters, and
merge_motifs() for more info on motif merging.
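To make the role of threshold more concrete, here is a minimal base R sketch of one way pairwise similarity scores could be grouped before merging. It assumes hierarchical clustering on 1 - similarity with a cut at 1 - threshold. The toy matrix and the clustering choice are assumptions made for illustration, and may differ from what merge_similar() does internally.

```r
# Toy pairwise similarity matrix (made-up values standing in for PCC scores)
ids <- c("m1", "m2", "m3")
sim <- matrix(c(1.00, 0.97, 0.20,
                0.97, 1.00, 0.25,
                0.20, 0.25, 1.00),
              nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

threshold <- 0.95  # minimum similarity for two motifs to be grouped

# Convert similarities to distances, cluster, and cut the tree at the
# distance that corresponds to the similarity threshold
d <- as.dist(1 - sim)
clusters <- cutree(hclust(d), h = 1 - threshold)

# Motifs that share a cluster id would be candidates for merging
groups <- split(names(clusters), clusters)
print(groups)
```

Here m1 and m2 fall into the same group because their similarity exceeds the threshold, while m3 stays on its own. For a distance metric the threshold would be used directly as the maximum cut height.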
Value
See convert_motifs() for available output formats.
Author(s)
Benjamin Jean-Marie Tremblay, benjamin.tremblay@uwaterloo.ca
See Also
compare_motifs(), merge_motifs()
Examples
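## Not run:
library(universalmotif)
library(MotifDb)
## Take the first 50 bHLH motifs from MotifDb
motifs <- filter_motifs(MotifDb, family = "bHLH")[1:50]
length(motifs)
## Merge similar motifs, then compare the size of the collection
motifs <- merge_similar(motifs)
length(motifs)
## End(Not run)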
[Package universalmotif version 1.12.0]