encodeKMerSeq {DNAshapeR} | R Documentation
DNAshapeR can generate feature vectors for a user-defined model, and that model can include k-mer sequence features. For 1-mers, the sequence is encoded as four binary features at each nucleotide position: 0001 for adenine, 0010 for cytosine, 0100 for guanine, and 1000 for thymine (Zhou et al., 2015). The function also supports 2-mer and 3-mer encodings, which use 16 and 64 binary features at each position, respectively.
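The encoding described above can be illustrated with a minimal base R sketch. This is not the DNAshapeR implementation: the helper name `encode_kmer_sketch` is hypothetical, and several details are assumptions made for illustration, namely that input sequences are plain character strings of equal length, that k-mers are taken from overlapping windows, and that the 4^k k-mers are ordered lexicographically. Only the 1-mer bit patterns (A = 0001, C = 0010, G = 0100, T = 1000) are taken from the description.

```r
# Hypothetical helper, NOT part of DNAshapeR: one-hot encode k-mers in base R.
# Assumes equal-length sequences containing only A, C, G, T.
encode_kmer_sketch <- function(k, seqs) {
  bases <- c("A", "C", "G", "T")
  # All 4^k possible k-mers, in lexicographic order (an assumed ordering).
  grid <- expand.grid(rep(list(bases), k), stringsAsFactors = FALSE)
  kmers <- sort(apply(grid, 1, paste, collapse = ""))
  n <- length(kmers)

  encode_one <- function(s) {
    # Overlapping windows of width k (an assumption about positions).
    starts <- seq_len(nchar(s) - k + 1)
    windows <- substring(s, starts, starts + k - 1)
    idx <- match(windows, kmers)
    # One block of n bits per window. The set bit is counted from the right,
    # so for k = 1: A = 0001, C = 0010, G = 0100, T = 1000.
    bits <- vapply(idx, function(i) {
      v <- integer(n)
      v[n - i + 1] <- 1L
      v
    }, integer(n))
    as.vector(bits)  # concatenate the per-window blocks in order
  }

  n_features <- n * (nchar(seqs[1]) - k + 1)
  # Rows are sequences, columns are binary features.
  t(vapply(seqs, encode_one, integer(n_features), USE.NAMES = FALSE))
}

one_mer <- encode_kmer_sketch(1, "ACGT")
two_mer <- encode_kmer_sketch(2, c("ACGT", "TTGA"))
```

Here `one_mer` has 4 features per position and `two_mer` has 16 features per window, matching the feature counts given in the description. How the package itself orders 2-mer and 3-mer features is not stated in this page.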
encodeKMerSeq(k, dnaStringSet)
k: An integer specifying the k-mer length used for sequence encoding (for example, 1, 2, or 3)
dnaStringSet: A DNAStringSet object containing the sequences read from the input FASTA file
featureVector: A matrix containing the encoded features. Each sequence feature is represented as a binary (0/1) value.
Tsu-Pei Chiu