generate_kmers {transite} | R Documentation |
Counts occurrences of k-mers of length k in the given set of sequences. Corrects for homopolymeric stretches.
generate_kmers(sequences, k)
sequences | character vector of DNA or RNA sequences |
k | length of k-mer, either 6 (hexamers) or 7 (heptamers) |
Returns a named numeric vector, where the elements are k-mer counts and the names are DNA k-mers.
generate_kmers always returns DNA k-mers, even if sequences contains RNA sequences. RNA sequences are internally converted to DNA sequences. It is not allowed to mix DNA and RNA sequences.
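To make the RNA-to-DNA conversion and the shape of the return value concrete, here is a minimal base R sketch of a naive k-mer counter. It is not transite's implementation: the function name count_kmers_naive is hypothetical, and the sketch does not apply the homopolymer correction that generate_kmers performs, so its counts may differ for sequences with homopolymeric stretches.

```r
# Naive k-mer counter in base R (illustrative only; hypothetical helper,
# NOT part of transite and WITHOUT homopolymer correction).
count_kmers_naive <- function(sequences, k) {
  # Convert RNA to DNA so that all resulting names are DNA k-mers.
  dna <- chartr("U", "T", toupper(sequences))
  # Sequences shorter than k contain no k-mers.
  dna <- dna[nchar(dna) >= k]
  # Extract every overlapping window of width k from each sequence.
  kmers <- unlist(lapply(dna, function(s) {
    starts <- seq_len(nchar(s) - k + 1)
    substring(s, starts, starts + k - 1)
  }))
  # Nothing to count: return an empty named numeric vector.
  if (length(kmers) == 0) {
    return(setNames(numeric(0), character(0)))
  }
  # Tabulate the windows and return a named numeric vector.
  tab <- table(kmers)
  setNames(as.numeric(tab), names(tab))
}

# Toy RNA input; the names of the result are DNA 3-mers.
toy_counts <- count_kmers_naive(c("AUGGA", "AUGGAU"), 3)
toy_counts["ATG"]  # look up the count of a single k-mer by name
```

As with the vector returned by generate_kmers, individual counts are looked up by k-mer name, and the names use T rather than U even though the input was RNA.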
Other k-mer functions: calculate_kmer_enrichment(), check_kmers(), compute_kmer_enrichment(), count_homopolymer_corrected_kmers(), draw_volcano_plot(), estimate_significance_core(), estimate_significance(), generate_permuted_enrichments(), run_kmer_spma(), run_kmer_tsma()
# count hexamers in set of RNA sequences
rna_sequences <- c(
  "CAACAGCCUUAAUU", "CAGUCAAGACUCC", "CUUUGGGGAAU",
  "UCAUUUUAUUAAA", "AAUUGGUGUCUGGAUACUUCCCUGUACAU",
  "AUCAAAUUA", "AGAU", "GACACUUAAAGAUCCU",
  "UAGCAUUAACUUAAUG", "AUGGA", "GAAGAGUGCUCA",
  "AUAGAC", "AGUUC", "CCAGUAA",
  "UUAUUUA", "AUCCUUUACA", "UUUUUUU",
  "UUUCAUCAUU", "CCACACAC", "CUCAUUGGAG",
  "ACUUUGGGACA", "CAGGUCAGCA"
)
hexamer_counts <- generate_kmers(rna_sequences, 6)

# count heptamers in set of DNA sequences
dna_sequences <- c(
  "CAACAGCCTTAATT", "CAGTCAAGACTCC", "CTTTGGGGAAT",
  "TCATTTTATTAAA", "AATTGGTGTCTGGATACTTCCCTGTACAT",
  "ATCAAATTA", "AGAT", "GACACTTAAAGATCCT",
  "TAGCATTAACTTAATG", "ATGGA", "GAAGAGTGCTCA",
  "ATAGAC", "AGTTC", "CCAGTAA",
  "TTATTTA", "ATCCTTTACA", "TTTTTTT",
  "TTTCATCATT", "CCACACAC", "CTCATTGGAG",
  "ACTTTGGGACA", "CAGGTCAGCA"
)
heptamer_counts <- generate_kmers(dna_sequences, 7)