IsomirDataSeq is a subclass of SummarizedExperiment::SummarizedExperiment used to store the raw data, intermediate calculations, and results of a miRNA/isomiR analysis. The class stores the raw isomiR data for each sample, processed information, a summary for each isomiR type, raw counts, normalized counts, and a table with experimental information for each sample.

Details

IsomirDataSeqFromFiles creates this object using seqbuster output files.

Methods for this object are isomiRs::counts(), which returns the count matrix, and isomiRs::isoSelect(), which selects miRNAs/isomiRs. Functions available for this object are isomiRs::isoCounts() for count matrix creation, isomiRs::isoNorm() for normalization, isomiRs::isoDE() for differential expression, and isomiRs::isoPLSDA() for clustering. isomiRs::isoPlot() draws basic expression plots.

metadata contains two lists, each with one element per sample: rawList stores the input files for each sample; isoList stores, for each isomiR type, a summary of the changes observed across the isomiRs (trimming at 3', trimming at 5', addition, and substitution). For instance, you can get the data stored in isoList for sample 1 and 5' changes with this code: metadata(ids)[['isoList']][[1]]$t5sum.
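The nested layout of metadata described above can be sketched with a plain R list. This is only a toy stand-in, not the object returned by the package: the sample names, the column names (mir, size, freq), and the values are invented for illustration, and the real tables may look different.

```r
# Toy stand-in for metadata(ids): two lists, one element per sample.
# Sample names, column names, and values are hypothetical.
samples <- c("f1", "f2")

md <- list(
  # rawList: one entry per sample holding that sample's input data
  rawList = setNames(
    lapply(samples, function(s) data.frame(seq = character(0))),
    samples
  ),
  # isoList: one entry per sample holding a summary table per isomiR type
  isoList = setNames(
    lapply(samples, function(s) {
      list(t5sum = data.frame(mir = "hsa-let-7a-5p", size = 0L, freq = 1L))
    }),
    samples
  )
)

# Same access pattern as metadata(ids)[['isoList']][[1]]$t5sum:
# first sample, summary of 5' changes.
t5_first <- md[["isoList"]][[1]]$t5sum
```

The point of the sketch is the indexing order: list name first, then sample position, then the summary table for the isomiR type of interest.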

The naming of isomiRs follows these rules:

  • miRNA name

  • type: ref if the sequence is the same as the miRNA reference, iso if the sequence has variations.

  • t5 tag: indicates variations at the 5' position. The naming contains two words: direction - nucleotides, where direction can be UPPER CASE NT (changes upstream of the 5' reference position) or LOWER CASE NT (changes downstream of the 5' reference position). 0 indicates no variation, meaning the 5' position is the same as in the reference. The direction is followed by the nucleotide(s) that are added (for upstream changes) or deleted (for downstream changes).

  • t3 tag: indicates variations at the 3' position. The naming contains two words: direction - nucleotides, where direction can be LOWER CASE NT (upstream of the 3' reference position) or UPPER CASE NT (downstream of the 3' reference position). 0 indicates no variation, meaning the 3' position is the same as in the reference. The direction is followed by the nucleotide(s) that are added (for downstream changes) or deleted (for upstream changes).

  • ad tag: indicates nucleotide additions at the 3' position. The naming contains two words: direction - nucleotides, where direction is UPPER CASE NT (upstream of the 5' reference position). 0 indicates no variation, meaning the 3' position has no additions. The direction is followed by the nucleotide(s) that are added.

  • mm tag: indicates nucleotide substitutions along the sequence. The naming contains three words: position - nucleotide at sequence - nucleotide at reference.

  • seed tag: same as the mm tag, but used only if the change happens between nucleotides 2 and 8.
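The mm and seed rules can be illustrated with a small base R helper. This is a sketch under assumptions: the function name parse_mm is hypothetical, the three words are assumed to be written together as in "11TC" (position, nucleotide in the sequence, nucleotide in the reference), and the seed range 2 to 8 is assumed to be inclusive. The separators used in real isomiR names may differ.

```r
# Hypothetical helper: split a substitution value into its three words.
# ASSUMPTION: compact layout <position><nt in sequence><nt in reference>,
# e.g. "11TC". Real names produced by the package may use another layout.
parse_mm <- function(value) {
  m <- regmatches(
    value,
    regexec("^([0-9]+)([ACGTUN])([ACGTUN])$", value, ignore.case = TRUE)
  )[[1]]
  if (length(m) == 0) {
    stop("unrecognized substitution value: ", value)
  }
  pos <- as.integer(m[2])
  list(
    position     = pos,
    in_sequence  = m[3],
    in_reference = m[4],
    # ASSUMPTION: "between nucleotides 2 and 8" is treated as inclusive
    in_seed      = pos >= 2 && pos <= 8
  )
}

mm_outside <- parse_mm("11TC")  # position 11, outside the seed
mm_seed    <- parse_mm("4GA")   # position 4, inside the seed
```

A change for which in_seed is TRUE is the kind of substitution the seed tag would report.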

In general, nucleotides in UPPER case mean insertions with respect to the reference sequence, and nucleotides in LOWER case mean deletions with respect to the reference sequence.
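The case convention can be expressed as a short base R check. This is a minimal sketch: classify_change is a hypothetical helper, not part of isomiRs, and the tag values passed to it are made up for illustration.

```r
# Hypothetical helper: read the case convention for a t5, t3 or ad value.
# "0" means no variation, UPPER case means insertion, lower case means deletion.
classify_change <- function(value) {
  if (value == "0") {
    return("none")
  }
  if (grepl("^[ACGTUN]+$", value)) {
    return("insertion")
  }
  if (grepl("^[acgtun]+$", value)) {
    return("deletion")
  }
  stop("unrecognized tag value: ", value)
}

# Made-up tag values for a single isomiR
tags <- c(t5 = "0", t3 = "gt", ad = "TT")
changes <- vapply(tags, classify_change, character(1))
```

Here changes is a named character vector with one label per tag, which makes it easy to tabulate change types across many isomiR names.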


Examples

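library(isomiRs)
# Locate the example seqbuster files shipped with the package
path <- system.file("extra", package="isomiRs")
fn_list <- list.files(path, full.names = TRUE)
# Experimental design: one row per sample
de <- data.frame(row.names=c("f1" , "f2"), condition = c("newborn", "newborn"))
ids <- IsomirDataSeqFromFiles(fn_list, coldata=de)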
#> Total samples filtered due to low number of hits: 0
head(counts(ids))
#>               f1 f2
#> hsa-let-7a-5p  0 18
#> hsa-let-7b-5p  0  6
#> hsa-let-7c-5p  0 61
#> hsa-let-7f-5p  8  7
#> hsa-let-7g-5p 74  3
#> hsa-let-7i-5p  0  2