IsomirDataSeq is a subclass of SummarizedExperiment::SummarizedExperiment used to store the raw data, intermediate calculations and results of an miRNA/isomiR analysis. For each sample, this class stores the raw isomiR data, the processed information, a summary for each isomiR type, the raw counts, the normalized counts, and a table with experimental information.
IsomirDataSeqFromFiles creates this object using seqbuster output files.
Methods for this object are isomiRs::counts() to get the count matrix and isomiRs::isoSelect() for miRNA/isomiR selection. Functions available for this object are isomiRs::isoCounts() for count matrix creation, isomiRs::isoNorm() for normalization, isomiRs::isoDE() for differential expression and isomiRs::isoPLSDA() for clustering. isomiRs::isoPlot() helps with basic expression plots.
metadata contains two lists: rawList is a list with the same length as the number of samples and stores the input files for each sample; isoList is a list with the same length as the number of samples and stores information for each isomiR type, summarizing the different changes for the different isomiRs (trimming at 3', trimming at 5', addition and substitution). For instance, you can get the data stored in isoList for sample 1 and 5' changes with this code: metadata(ids)[['isoList']][[1]]$t5sum
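The nested layout described above can be illustrated with a plain R list. This is only a toy stand-in built by hand, not real isomiRs output: the sample names, the contents of each element and the column names inside t5sum are all hypothetical. It only shows that both lists have one element per sample and that the access pattern is the same as for metadata(ids).

```r
# Toy stand-in for the list returned by metadata(ids); NOT real isomiRs output.
# Sample names, table contents and column names are hypothetical.
samples <- c("f1", "f2")

toy_metadata <- list(
  # one (here empty) table of input data per sample
  rawList = setNames(
    lapply(samples, function(s) data.frame(seq = character(0), freq = integer(0))),
    samples
  ),
  # one list of per-isomiR-type summaries per sample
  isoList = setNames(
    lapply(samples, function(s) {
      list(t5sum = data.frame(mir = "hsa-let-7a-5p", t5 = "0", freq = 10L))
    }),
    samples
  )
)

# Both lists have the same length as the number of samples.
stopifnot(length(toy_metadata[["rawList"]]) == length(samples))
stopifnot(length(toy_metadata[["isoList"]]) == length(samples))

# Same access pattern as metadata(ids)[['isoList']][[1]]$t5sum
first_sample_t5 <- toy_metadata[["isoList"]][[1]]$t5sum
print(first_sample_t5)
```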
The naming of isomiRs follows these rules:
miRNA name.
type: ref if the sequence is the same as the miRNA reference; iso if the sequence has variations.
t5 tag: indicates variations at the 5' position. The naming contains two words: direction - nucleotides, where direction can be UPPER CASE NT (changes upstream of the 5' reference position) or LOWER CASE NT (changes downstream of the 5' reference position). 0 indicates no variation, meaning the 5' position is the same as the reference. After direction come the nucleotide/s that are added (for upstream changes) or deleted (for downstream changes).
t3 tag: indicates variations at the 3' position. The naming contains two words: direction - nucleotides, where direction can be LOWER CASE NT (upstream of the 3' reference position) or UPPER CASE NT (downstream of the 3' reference position). 0 indicates no variation, meaning the 3' position is the same as the reference. After direction come the nucleotide/s that are added (for downstream changes) or deleted (for upstream changes).
ad tag: indicates nucleotide additions at the 3' position. The naming contains two words: direction - nucleotides, where direction is UPPER CASE NT (upstream of the 5' reference position). 0 indicates no variation, meaning the 3' position has no additions. After direction come the nucleotide/s that are added.
mm tag: indicates nucleotide substitutions along the sequence. The naming contains three words: position-nucleotideATsequence-nucleotideATreference.
seed tag: same as the mm tag, but only if the change happens between nucleotides 2 and 8.
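The mm and seed rules can be sketched in base R. The helper below is hypothetical and not part of isomiRs. It assumes the three words are written together without separators, as position, then the nucleotide in the sequence, then the nucleotide in the reference (for example "7AG"), and that "between nucleotide 2 and 8" includes both ends. The exact string produced by the package may differ.

```r
# Hypothetical helper (not part of isomiRs).
# Assumed format: <position><nt in sequence><nt in reference>, e.g. "7AG".
parse_substitution <- function(value) {
  m <- regmatches(value, regexec("^([0-9]+)([ACGTUN])([ACGTUN])$", value))[[1]]
  if (length(m) == 0) {
    stop("unexpected substitution format: ", value)
  }
  position <- as.integer(m[2])
  list(
    position     = position,
    sequence_nt  = m[3],
    reference_nt = m[4],
    # seed tag: the change falls between nucleotides 2 and 8
    # (inclusive bounds are an assumption)
    in_seed      = position >= 2 && position <= 8
  )
}

parse_substitution("7AG")$in_seed   # TRUE: position 7 is inside the seed
parse_substitution("11TC")$in_seed  # FALSE: position 11 is outside the seed
```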
In general, nucleotides in UPPER case mean insertions with respect to the reference sequence, and nucleotides in LOWER case mean deletions with respect to the reference sequence.
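As a minimal sketch of this general rule, the following base R helper classifies the nucleotide part of a t5, t3 or ad tag by letter case. The function name is hypothetical and not part of isomiRs, and it assumes the value passed in is either "0" or a run of nucleotide letters with no other prefix.

```r
# Hypothetical helper (not part of isomiRs): classify the nucleotide part of a
# t5, t3 or ad tag by letter case. Assumes the value is "0" or nucleotides only.
classify_change <- function(value) {
  if (value == "0") {
    return("no variation")
  }
  if (grepl("^[ACGTUN]+$", value)) {
    return("insertion")  # UPPER case: nucleotides added relative to the reference
  }
  if (grepl("^[acgtun]+$", value)) {
    return("deletion")   # lower case: nucleotides missing relative to the reference
  }
  "unrecognized"         # mixed case or unexpected characters
}

vapply(c("0", "TT", "a", "Tt"), classify_change, character(1))
```

Mixed-case input is reported as unrecognized instead of being guessed, since the rules above do not describe that case.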
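library(isomiRs)
# locate the example seqbuster output files shipped with the package
path <- system.file("extra", package="isomiRs")
fn_list <- list.files(path, full.names = TRUE)
# experimental information: one row per sample
de <- data.frame(row.names=c("f1" , "f2"), condition = c("newborn", "newborn"))
ids <- IsomirDataSeqFromFiles(fn_list, coldata=de)
# first rows of the count matrix
head(counts(ids))
#>               f1 f2
#> hsa-let-7a-5p  0 18
#> hsa-let-7b-5p  0  6
#> hsa-let-7c-5p  0 61
#> hsa-let-7f-5p  8  7
#> hsa-let-7g-5p 74  3
#> hsa-let-7i-5p  0  2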