packSearch {packFinder}    R Documentation

packFinder Algorithm Pipeline

Description

General-use pipeline function for the Pack-TYPE transposon finding algorithm.

Usage

packSearch(
  tirSeq,
  Genome,
  mismatch = 0,
  elementLength,
  tsdLength,
  tsdMismatch = 0
)

Arguments

tirSeq

A DNAString object containing the TIR sequence to be searched for.

Genome

A DNAStringSet object to be searched.

mismatch

The maximum edit distance to be considered for TIR matches (indels + substitutions). See matchPattern for details.

elementLength

The range of element lengths to be considered, as a vector of two integers giving the minimum and maximum length. E.g. c(300, 3500)

tsdLength

Integer referring to the length of the flanking TSD region.

tsdMismatch

An integer referring to the allowable mismatch (substitutions or indels) between a transposon's TSD sequences. matchPattern from Biostrings is used for pattern matching.
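As a rough illustration of what an edit-distance threshold such as mismatch or tsdMismatch means, the sketch below uses base R's adist() on whole strings. This is an assumption made for illustration only: packSearch itself relies on matchPattern from Biostrings, which searches for approximate occurrences inside a longer subject rather than comparing two complete strings. The candidate sequences are invented.

```r
# Base-R illustration of edit distance; not packFinder's own matching code.
tir <- "CACTACAA"

# Hypothetical candidate sequences: one exact copy, one with a substitution,
# one with a single-base deletion.
candidates <- c(
  exact        = "CACTACAA",
  substitution = "CACTACAT",
  deletion     = "CACTCAA"
)

# utils::adist() computes the Levenshtein (edit) distance between strings.
distances <- drop(adist(tir, candidates))
names(distances) <- names(candidates)

# A threshold of 0 keeps only the exact copy; a threshold of 1 keeps all three.
passes_at_0 <- distances <= 0
passes_at_1 <- distances <= 1
```

Raising the threshold admits more divergent matches, at the cost of more spurious hits for a short TIR or TSD.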

Details

Finds potential pack-TYPE elements based on:

The algorithm finds potential forward and reverse TIR sequences using identifyTirMatches and their associated TSD sequences via getTsds. The main filtering stage, identifyPotentialPackElements, filters matches to obtain a dataframe of potential PACK elements. Note that this pipeline does not consider the possibility of discovered elements being autonomous elements, so it is recommended to cluster and/or BLAST elements for further analysis. Furthermore, with the default tsdMismatch = 0 only exact TSD matches are considered, so supplying a large tsdLength may lead to false-negative results.
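The search idea described above can be sketched on a toy sequence in base R. This is not packFinder's implementation: the helper names (rev_comp, find_all, pack_sketch), the toy genome, and the simplifications (exact matching only, a single strand, TSDs taken as the bases immediately flanking the element) are all assumptions made for illustration.

```r
# Toy sketch of the search idea in base R; NOT packFinder's implementation.

# Reverse complement of a DNA string.
rev_comp <- function(s) {
  paste(rev(strsplit(chartr("ACGT", "TGCA", s), "")[[1]]), collapse = "")
}

# Start positions of every exact (possibly overlapping) occurrence.
find_all <- function(pattern, subject) {
  hits <- gregexpr(paste0("(?=", pattern, ")"), subject, perl = TRUE)[[1]]
  if (hits[1] == -1) integer(0) else as.integer(hits)
}

# Pair forward TIRs with downstream reverse-complement TIRs, keep pairs whose
# span lies within element_length and whose flanking TSDs are identical.
pack_sketch <- function(genome, tir, element_length, tsd_length) {
  fwd <- find_all(tir, genome)
  rev_starts <- find_all(rev_comp(tir), genome)
  out <- data.frame(start = integer(0), end = integer(0), width = integer(0))
  for (s in fwd) {
    for (r in rev_starts) {
      e <- r + nchar(tir) - 1L          # last base of the reverse TIR
      w <- e - s + 1L                   # full element width
      if (w < element_length[1] || w > element_length[2]) next
      left  <- substr(genome, s - tsd_length, s - 1L)
      right <- substr(genome, e + 1L, e + tsd_length)
      if (nchar(left) == tsd_length && left == right) {
        out <- rbind(out, data.frame(start = s, end = e, width = w))
      }
    }
  }
  out
}

# Toy genome: filler, TSD, TIR, internal sequence, reverse TIR, TSD, filler.
tir <- "CACTACAA"
genome <- paste0("TTTTT", "GAT", tir, strrep("AG", 10), rev_comp(tir), "GAT", "CCCCC")

hits <- pack_sketch(genome, tir, element_length = c(20, 100), tsd_length = 3)
```

On this toy genome the sketch reports one candidate spanning the forward TIR through the reverse TIR. Narrowing element_length so that the span falls outside it removes the candidate, which mirrors the role of elementLength in the filtering stage.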

Value

A dataframe containing elements identified by the algorithm. These may be autonomous or pack-TYPE elements. Will contain the following features:

This dataframe is in the format produced by coercing a GRanges object to a dataframe: data.frame(GRanges). Downstream functions, such as packClust, use this dataframe to manipulate predicted transposable elements.
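Coercing a GRanges object to a dataframe gives the columns seqnames, start, end, width and strand, so results in this layout can be handled with ordinary dataframe operations. The sketch below builds such a dataframe by hand in base R; the rows are invented values, not real packSearch output, and the width cut-off is an arbitrary choice for illustration.

```r
# Hypothetical rows in the column layout that as.data.frame() gives for a
# GRanges object. The values are invented for illustration.
elements <- data.frame(
  seqnames = factor(c("Chr1", "Chr1", "Chr3")),
  start    = c(1200L, 48000L, 7300L),
  end      = c(2499L, 48899L, 10299L),
  strand   = factor(c("+", "-", "+"), levels = c("+", "-", "*"))
)

# Width is inclusive of both end points, as in GRanges.
elements$width <- elements$end - elements$start + 1L
elements <- elements[, c("seqnames", "start", "end", "width", "strand")]

# Keep only elements of at least 1000 bp.
long_elements <- elements[elements$width >= 1000, ]

# Count predicted elements per sequence.
per_sequence <- table(elements$seqnames)
```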

Note

This algorithm does not consider:

Pattern matching is done via matchPattern.

Author(s)

Jack Gisby

See Also

identifyTirMatches, getTsds, identifyPotentialPackElements, packClust, packMatches, DNAStringSet, DNAString, matchPattern

Examples

data(arabidopsisThalianaRefseq)

packMatches <- packSearch(
    Biostrings::DNAString("CACTACAA"),
    arabidopsisThalianaRefseq,
    elementLength = c(300, 3500),
    tsdLength = 3
)


[Package packFinder version 1.5.2 Index]