process_spikes {spiky} | R Documentation |
Sequence feature verification: never trust anyone, least of all yourself.
process_spikes(fasta, methylated = 0, ...)
fasta |
fasta file (or GRanges or DataFrame) w/spike sequences |
methylated |
whether CpGs in each are methylated (0 or 1, default 0) |
... |
additional arguments, e.g. kernels (currently unused) |
GCfrac is the GC content of spikes as a proportion instead of a percent. OECpG is (observed/expected) CpGs (expectation is 25% of GC dinucleotides).
a DataFrame suitable for downstream processing
kmers
data(spike) spikes <- system.file("extdata", "spikes.fa", package="spiky", mustWork=TRUE) spikemeth <- spike$methylated process_spikes(spikes, spikemeth) data(phage) phages <- system.file("extdata", "phages.fa", package="spiky", mustWork=TRUE) identical(process_spikes(phage), phage) identical(phage, process_spikes(phage)) data(genbank_mito) (mt <- process_spikes(genbank_mito)) # see also genbank_mito.R gb_mito <- system.file("extdata", "genbank_mito.R", package="spiky") library(kebabs) CpGmotifs <- c("[AT]CG[AT]","C[ATC]G", "CCGG", "CGCG") mot <- motifKernel(CpGmotifs, normalized=FALSE) km <- getKernelMatrix(mot, subset(phage, methylated == 0)$sequence) heatmap(km, symm=TRUE) #' # refactor this out mt <- process_spikes(genbank_mito) mtiles <- unlist(tileGenome(seqlengths(mt$sequence), tilewidth=100)) bymito <- split(mtiles, seqnames(mtiles)) binseqs <- getSeq(mt$sequence, bymito[["Homo sapiens"]]) rCRS6mers <- kmers(binseqs, k=6) # plot binned Pr(kmer): library(ComplexHeatmap) Heatmap(kmax(rCRS6mers), name="Pr(kmer)") # not run library(kebabs) kernels <- list( k6f=spectrumKernel(k=6, r=1), k6r=spectrumKernel(k=6, r=1, revComp=TRUE) ) kms <- lapply(kernels, getKernelMatrix, x=mt["Human", "sequence"]) library(ComplexHeatmap)