Abstract
isomiRs package version: 1.14.0Lorena Pantano Harvard TH Chan School of Public Health, Boston, US
miRNAs are small RNA fragments (18-23 nt long) that influence gene expression during development and cell stability. Morin et al (Morin et al. 2008), discovered isomiRs first time after sequencing human stem cells.
IsomiRs are miRNAs that vary slightly in sequence, which result from variations in the cleavage site during miRNA biogenesis (5’-trimming and 3’-trimming variants), nucleotide additions to the 3’-end of the mature miRNA (3’-addition variants) and nucleotide modifications (substitution variants)(Martí et al. 2010).
There are many tools designed for isomiR detection, however the majority are web application where user can not control the analysis. The two main command tools for isomiRs mapping are SeqBuster and sRNAbench (Guillermo et al. 2014). isomiRs. package is designed to analyze the output of SeqBuster tool or any other tool after converting to the desire format.
If you use the package, please cite this paper (Pantano L 2010).
The input should be the output of SeqBuster-miraligner tool (*.mirna files). It is compatible with mirTOP tool as well, which parses BAM files with alignments against miRNA precursors.
For each sample the file should have the following format:
seq name freq mir start end mism add t5 t3
TGTAAACATCCTACACTCAGCT seq_100014_x23 23 hsa-miR-30b-5p 17 40 0 0 0 GT
TGTAAACATCCCTGACTGGAA seq_100019_x4 4 hsa-miR-30d-5p 6 26 13TC 0 0 g
TGTAAACATCCCTGACTGGAA seq_100019_x4 4 hsa-miR-30e-5p 17 37 12CT 0 0 g
CAAATTCGTATCTAGGGGATT seq_100049_x1 1 hsa-miR-10a-3p 63 81 0 TT 0 ata
TGACCTAGGAATTGACAGCCAGT seq_100060_x1 1 hsa-miR-192-5p 25 47 8GT 0 c agt
This is the standard output of SeqBuster-miraligner tool, but can be converted from any other tool having the mapping information on the precursors. Read more on miraligner manual.
This object will store all raw data from the input files and some processed information used for visualization and statistical analysis. It is a subclass of SummarizedExperiment
with colData
and counts
methods. Beside that, the object contains raw and normalized counts from miraligner allowing to update the summarization of miRNA expression.
The user can access the normalized count matrix with counts(object, norm=TRUE)
.
You can browse for the same miRNA or isomiRs in all samples with isoSelect
method.
## DataFrame with 6 rows and 14 columns
## cc1 cc2 cc3 cc4
## <numeric> <numeric> <numeric> <numeric>
## hsa-let-7a-5p 235825 171354 149541 180654
## hsa-let-7a-5p;iso_3p:T 8806 5478 5427 6249
## hsa-let-7a-5p;iso_3p:gtt 1173 884 751 1010
## hsa-let-7a-5p;iso_3p:t 50920 52403 35525 53213
## hsa-let-7a-5p;iso_3p:tt 6041 5467 3817 7491
## hsa-let-7a-5p;iso_add3p:A 9616 4742 5437 6714
## cc5 cc6 cc7 ct1
## <numeric> <numeric> <numeric> <numeric>
## hsa-let-7a-5p 168884 107430 153061 143030
## hsa-let-7a-5p;iso_3p:T 5538 3734 5482 4825
## hsa-let-7a-5p;iso_3p:gtt 963 580 863 927
## hsa-let-7a-5p;iso_3p:t 51499 30659 45492 42878
## hsa-let-7a-5p;iso_3p:tt 6394 3984 5940 7107
## hsa-let-7a-5p;iso_add3p:A 5846 3799 5532 5214
## ct2 ct3 ct4 ct5
## <numeric> <numeric> <numeric> <numeric>
## hsa-let-7a-5p 163569 114028 123454 133092
## hsa-let-7a-5p;iso_3p:T 6045 4626 4181 4505
## hsa-let-7a-5p;iso_3p:gtt 756 495 719 682
## hsa-let-7a-5p;iso_3p:t 35795 24993 34512 36973
## hsa-let-7a-5p;iso_3p:tt 4828 2858 5087 5652
## hsa-let-7a-5p;iso_add3p:A 7326 4790 4430 5010
## ct6 ct7
## <numeric> <numeric>
## hsa-let-7a-5p 158909 140272
## hsa-let-7a-5p;iso_3p:T 5134 4670
## hsa-let-7a-5p;iso_3p:gtt 1048 860
## hsa-let-7a-5p;iso_3p:t 47873 39585
## hsa-let-7a-5p;iso_3p:tt 6589 5447
## hsa-let-7a-5p;iso_add3p:A 6034 5081
metadata(mirData)
contains two lists: rawList
is a list with same length than number of samples and stores the input files for each sample; isoList
is a list with same length than number of samples and stores information for each isomiR type summarizing the different changes for the different isomiRs (trimming at 3’, trimming a 5’, addition and substitution). For instance, you can get the data stored in isoList
for sample 1 and 5’ changes with this code metadata(ids)[["isoList"]][[1]]["t5sum"]
.
IsomiR names follows this structure:
iso
if the sequence has variations. t5 tag: indicates variations at 5’ position. The naming contains two words: direction - nucleotides
, where direction can be UPPER CASE NT (changes upstream of the 5’ reference position) or LOWER CASE NT (changes downstream of the 5’ reference position). 0
indicates no variation, meaning the 5’ position is the same than the reference. After direction
, it follows the nucleotide/s that are added (for upstream changes) or deleted (for downstream changes). t3 tag: indicates variations at 3’ position. The naming contains two words: direction - nucleotides
, where direction can be LOWER CASE NT (upstream of the 3’ reference position) or UPPER CASE NT (downstream of the 3’ reference position). 0
indicates no variation, meaning the 3’ position is the same than the reference. After direction
, it follows the nucleotide/s that are added (for downstream changes) or deleted (for upstream chanes). ad tag: indicates nucleotides additions at 3’ position. The naming contains two words: direction - nucleotides
, where direction is UPPER CASE NT (upstream of the 5’ reference position). 0
indicates no variation, meaning the 3’ position has no additions. After direction
, it follows the nucleotide/s that are added. mm tag: indicates nucleotides substitutions along the sequences. The naming contains three words: position-nucleotideATsequence-nucleotideATreference
. *seed tag: same than mm
tag, but only if the change happens between nucleotide 2 and 8.In general nucleotides in UPPER case mean insertions respect to the reference sequence, and nucleotides in LOWER case mean deletions respect to the reference sequence.
We are going to use a small RNAseq data from human brain samples (Pantano et al. 2015) to give some basic examples of isomiRs analyses.
In this data set we will find two groups:
pc: 7 control individuals pt: 7 patients with Parkinson’s Disease in early stage.
The function IsomirDataSeqFromFiles
needs a vector with the paths for each file and a data frame with the design experiment similar to the one used for a mRNA differential expression analysis. Row names of the data frame should be the names for each sample in the same order than the list of files.
You can plot isomiRs expression with isoPlot
. In this figure you will see how abundant is each type of isomiRs at different positions considering the total abundance and the total number of sequences. The type
parameter controls what type of isomiRs to show. It can be trimming (iso5 and iso3), addition (add) or substitution (subs) changes.
isoCounts
gets the count matrix that can be used for many different downstream analyses changing the way isomiRs are collapsed. The following command will merge all isomiRs into one feature: the reference miRNA.
## cc1 cc2 cc3 cc4 cc5 cc6 cc7 ct1
## hsa-let-7a-2-3p 5 13 4 9 11 6 7 0
## hsa-let-7a-3p 767 707 630 609 731 389 681 258
## hsa-let-7a-5p 317730 244832 203888 260931 244784 153049 220563 208511
## hsa-let-7b-3p 1037 1949 679 1385 1884 697 1499 338
## hsa-let-7b-5p 69671 64702 42347 65322 60767 34975 56875 22215
## hsa-let-7c-3p 36 16 25 40 26 23 35 3
## ct2 ct3 ct4 ct5 ct6 ct7
## hsa-let-7a-2-3p 4 2 12 12 8 3
## hsa-let-7a-3p 496 343 635 622 452 519
## hsa-let-7a-5p 222066 154246 175523 189733 230690 199923
## hsa-let-7b-3p 931 494 1149 1194 1431 988
## hsa-let-7b-5p 45069 28312 41214 43058 61258 46600
## hsa-let-7c-3p 30 28 26 24 22 17
The normalization uses rlog
from DESeq2 package and allows quick integration to another analyses like heatmap, clustering or PCA.
To get a detail description for each isomiR, the function isoAnnotate
can return the naming, sequence and importance value for each sample and isomiR. The importance is calculated by:
\[importance = \frac{isomiR\_reads}{miRNA\_reads}\]
The columns are:
position:nt_ref:nt_isomiR
.## seq
## 1 AAAAACCGTAGTTACTTTTGCAT
## 2 AAAAACTGAGACTACTTTTG
## 3 AAAAACTGAGACTACTTTTGC
## 4 AAAAACTGAGACTACTTTTGCA
## 5 AAAAACTGAGACTACTTTTGCAA
## 6 AAAAACTGCAGTTACTTTTGC
## uid cc1 cc2
## 1 hsa-miR-548bb-3p;iso_snp:8GA;iso_add:T;iso_5p:c;iso_3p:A NA NA
## 2 hsa-miR-548e-3p;iso_3p:ca 9.375 10.71429
## 3 hsa-miR-548e-3p;iso_3p:a 34.375 10.71429
## 4 hsa-miR-548e-3p;ref 21.875 64.28571
## 5 hsa-miR-548e-3p;iso_add:A 34.375 NA
## 6 hsa-miR-548ah-3p;iso_5p:c 15.000 11.76471
## cc3 cc4 cc5 cc6 cc7 ct1 ct2 ct3 ct4
## 1 NA 100.000000 NA NA NA NA NA NA NA
## 2 15.15152 6.896552 NA NA NA 19.04762 NA 8.333333 20
## 3 30.30303 41.379310 28 66.66667 25 19.04762 47.61905 29.166667 20
## 4 27.27273 17.241379 44 33.33333 60 38.09524 23.80952 16.666667 30
## 5 21.21212 13.793103 12 NA 15 23.80952 19.04762 25.000000 30
## 6 23.07692 NA NA NA NA NA NA NA NA
## ct5 ct6 ct7 edit_mature_position
## 1 NA NA NA 9:AG
## 2 8.695652 NA NA NA
## 3 30.434783 13.04348 46.15385 NA
## 4 30.434783 34.78261 30.76923 NA
## 5 17.391304 43.47826 23.07692 NA
## 6 NA NA NA NA
The isoDE
uses functions from DESeq2 package. This function has parameters to create a matrix using only the reference miRNAs, all isomiRs, or some of them. This matrix and the design matrix are the inputs for DESeq2. The output will be a DESeqDataSet object, allowing to generate any plot or table explained in DESeq2 package vignette.
## log2 fold change (MLE): condition ct vs cc
## Wald test p-value: condition ct vs cc
## DataFrame with 6 rows and 6 columns
## baseMean log2FoldChange lfcSE
## <numeric> <numeric> <numeric>
## hsa-let-7a-2-3p 6.60005242130612 -0.391808034451503 0.609492802121228
## hsa-let-7a-3p 546.805469613238 -0.22568581262064 0.224797224087315
## hsa-let-7a-5p 220625.551423747 0.113470209115607 0.270505352954979
## hsa-let-7b-3p 1096.78693283053 -0.391274509777867 0.329460850713291
## hsa-let-7b-5p 47415.9668329051 -0.235214733416519 0.226064546587921
## hsa-let-7c-3p 23.1177591558883 -0.17443807888616 0.302732297642984
## stat pvalue padj
## <numeric> <numeric> <numeric>
## hsa-let-7a-2-3p -0.642842758910175 0.52032613478516 0.989527585749851
## hsa-let-7a-3p -1.00395284477792 0.315401343211601 0.989527585749851
## hsa-let-7a-5p 0.419474911960399 0.674869086101294 0.989527585749851
## hsa-let-7b-3p -1.18762065031626 0.234982898335358 0.989527585749851
## hsa-let-7b-5p -1.04047599221862 0.2981188122644 0.989527585749851
## hsa-let-7c-3p -0.576212317761606 0.564471680375384 0.989527585749851
You can differentiate between reference sequences and isomiRs at 5’ end with this command:
## row baseMean log2FoldChange lfcSE
## 1 hsa-let-7a-2-3p 2.4059793 -0.8425278 1.0440442
## 2 hsa-let-7a-2-3p;iso_5p:TAA 0.1285419 1.1051761 3.0845751
## 3 hsa-let-7a-2-3p;ref 4.0407844 -0.2219555 0.6581561
## 4 hsa-let-7a-3p 376.7216287 -0.1663188 0.2027009
## 5 hsa-let-7a-3p;iso_5p:A 0.6517381 1.0223121 1.7714731
## 6 hsa-let-7a-3p;iso_5p:c 84.6025648 -0.3165339 0.2538101
## stat pvalue padj
## 1 -0.8069848 0.4196753 0.9865695
## 2 0.3582912 0.7201254 0.9865695
## 3 -0.3372384 0.7359372 0.9865695
## 4 -0.8205135 0.4119234 0.9865695
## 5 0.5770972 0.5638738 0.9865695
## 6 -1.2471289 0.2123502 0.9865695
Alternative, for more complicated cases or if you want to control more the differential expression analysis paramters you can use directly DESeq2 package feeding it with the output of counts(ids)
and colData(ids)
like this:
Partial Least Squares Discriminant Analysis (PLS-DA) is a technique specifically appropriate for analysis of high dimensionality data sets and multicollineality (Pérez-Enciso and Tenenhaus 2003). PLS-DA is a supervised method (i.e. makes use of class labels) with the aim to provide a dimension reduction strategy in a situation where we want to relate a binary response variable (in our case young or old status) to a set of predictor variables. Dimensionality reduction procedure is based on orthogonal transformations of the original variables (isomiRs) into a set of linearly uncorrelated latent variables (usually termed as components) such that maximizes the separation between the different classes in the first few components (Xia and Wishart 2011). We used sum of squares captured by the model (R2) as a goodness of fit measure. We implemented this method using the DiscriMiner
into isoPLSDA
function. The output p-value of this function will tell about the statistical significant of the group separation using miRNA expression data. Moreover, the function isoPLSDAplot
helps to visualize the results. It will plot the samples using the significant components (t1, t2, t3 …) from the PLS-DA analysis and the samples distribution along the components.
ids = isoCounts(ids, iso5=TRUE, minc=10, mins=6)
ids = isoNorm(ids, formula = ~ condition)
pls.ids = isoPLSDA(ids, "condition", nperm = 2)
df = isoPLSDAplot(pls.ids)
The analysis can be done again using only the most important discriminant isomiRS from the PLS-DA models based on the analysis. We used Variable Importance for the Projection (VIP) criterion to select the most important features, since takes into account the contribution of a specific predictor for both the explained variability on the response and the explained variability on the predictors.
The package offers a correlation analysis of miRNA and gene expression data. Having two SummarizedExperiments objects with their expression, the target prediction for each miRNA, the function isoNetwork
and isoPlotNetwork
can generate a summarized figure showing the relationship between expression profile, miRNA repression and enrichment analysis:
# library(org.Mm.eg.db)
# library(clusterProfiler)
data(isoExample)
# ego <- enrichGO(row.names(assay(gene_ex_rse, "norm")),
# org.Mm.eg.db, "ENSEMBL", ont = "BP")
data = isoNetwork(mirna_ex_rse, gene_ex_rse, target = ma_ex,
enrich = ego, summarize = "group")
## Dimmension of cor matrix: 20 25
As an option, org
can be org.Mm.eg.db
and genename
can be ENSEMBL
and it will run enrcihGO
internally.
To create the ma_ex matrix, the user can use findTargets
:
mirna_ma <- data.frame(gene = names(gene_ex_rse)[1:20],
mir = names(mirna_ex_rse))
ma_ex <- findTargets(mirna_ex_rse, gene_ex_rse, mirna_ma)
## Dimmension of cor matrix: 20 20
## mmu-miR-101a-3p mmu-miR-107-3p mmu-miR-142a-3p
## ENSMUSG00000004530 0 0 0
## ENSMUSG00000005397 0 0 0
## ENSMUSG00000006386 0 0 0
## ENSMUSG00000016494 0 0 0
## ENSMUSG00000019997 0 0 0
## ENSMUSG00000021822 0 0 0
## mmu-miR-144-3p
## ENSMUSG00000004530 0
## ENSMUSG00000005397 0
## ENSMUSG00000006386 0
## ENSMUSG00000016494 0
## ENSMUSG00000019997 0
## ENSMUSG00000021822 0
And to get the mirna_ma
data.frame with the miRNA-target information, the user can use mirna2targetscan
function:
## miRFamily Seedmatch PCT gene targetscan
## 1 miR-34ac/34bc-5p/449abc/449c-5p 8mer 0.29 79719 hsa-miR-34c-5p
## 2 miR-34ac/34bc-5p/449abc/449c-5p 8mer 0.76 215 hsa-miR-34c-5p
## 3 miR-34ac/34bc-5p/449abc/449c-5p 7mer-1a 0.58 3983 hsa-miR-34c-5p
## 4 miR-34ac/34bc-5p/449abc/449c-5p 8mer 0.73 29 hsa-miR-34c-5p
## 5 miR-34ac/34bc-5p/449abc/449c-5p 8mer 0.67 64746 hsa-miR-34c-5p
## 6 miR-34ac/34bc-5p/449abc/449c-5p 8mer 0.75 40 hsa-miR-34c-5p
## mir ext num letter
## 1 hsa-miR-34c-5p hsa-miR-34c hsa-miR-34c hsa-miR-34
## 2 hsa-miR-34c-5p hsa-miR-34c hsa-miR-34c hsa-miR-34
## 3 hsa-miR-34c-5p hsa-miR-34c hsa-miR-34c hsa-miR-34
## 4 hsa-miR-34c-5p hsa-miR-34c hsa-miR-34c hsa-miR-34
## 5 hsa-miR-34c-5p hsa-miR-34c hsa-miR-34c hsa-miR-34
## 6 hsa-miR-34c-5p hsa-miR-34c hsa-miR-34c hsa-miR-34
Here is the output of sessionInfo
on the system on which this document was compiled:
## R version 3.6.1 (2019-07-05)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.3 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.10-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.10-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] targetscan.Hs.eg.db_0.6.1 AnnotationDbi_1.48.0
## [3] DESeq2_1.26.0 pheatmap_1.0.12
## [5] isomiRs_1.14.0 SummarizedExperiment_1.16.0
## [7] DelayedArray_0.12.0 BiocParallel_1.20.0
## [9] matrixStats_0.55.0 Biobase_2.46.0
## [11] GenomicRanges_1.38.0 GenomeInfoDb_1.22.0
## [13] IRanges_2.20.0 S4Vectors_0.24.0
## [15] BiocGenerics_0.32.0 DiscriMiner_0.1-29
## [17] BiocStyle_2.14.0
##
## loaded via a namespace (and not attached):
## [1] assertive.base_0.0-7 colorspace_1.4-1
## [3] rjson_0.2.20 ellipsis_0.3.0
## [5] circlize_0.4.8 htmlTable_1.13.2
## [7] XVector_0.26.0 ggdendro_0.1-20
## [9] GlobalOptions_0.1.1 base64enc_0.1-3
## [11] clue_0.3-57 rstudioapi_0.10
## [13] assertive.sets_0.0-3 ggrepel_0.8.1
## [15] bit64_0.9-7 splines_3.6.1
## [17] logging_0.10-108 mnormt_1.5-5
## [19] geneplotter_1.64.0 knitr_1.25
## [21] zeallot_0.1.0 Formula_1.2-3
## [23] Nozzle.R1_1.1-1 broom_0.5.2
## [25] annotate_1.64.0 cluster_2.1.0
## [27] png_0.1-7 readr_1.3.1
## [29] BiocManager_1.30.9 compiler_3.6.1
## [31] backports_1.1.5 assertthat_0.2.1
## [33] Matrix_1.2-17 lazyeval_0.2.2
## [35] limma_3.42.0 lasso2_1.2-20
## [37] acepack_1.4.1 htmltools_0.4.0
## [39] tools_3.6.1 gtable_0.3.0
## [41] glue_1.3.1 GenomeInfoDbData_1.2.2
## [43] reshape2_1.4.3 dplyr_0.8.3
## [45] Rcpp_1.0.2 vctrs_0.2.0
## [47] gdata_2.18.0 nlme_3.1-141
## [49] psych_1.8.12 xfun_0.10
## [51] stringr_1.4.0 DEGreport_1.22.0
## [53] lifecycle_0.1.0 gtools_3.8.1
## [55] XML_3.98-1.20 edgeR_3.28.0
## [57] MASS_7.3-51.4 zlibbioc_1.32.0
## [59] scales_1.0.0 hms_0.5.1
## [61] RColorBrewer_1.1-2 ComplexHeatmap_2.2.0
## [63] yaml_2.2.0 memoise_1.1.0
## [65] gridExtra_2.3 ggplot2_3.2.1
## [67] rpart_4.1-15 reshape_0.8.8
## [69] latticeExtra_0.6-28 stringi_1.4.3
## [71] RSQLite_2.1.2 genefilter_1.68.0
## [73] checkmate_1.9.4 caTools_1.17.1.2
## [75] shape_1.4.4 rlang_0.4.1
## [77] pkgconfig_2.0.3 bitops_1.0-6
## [79] evaluate_0.14 lattice_0.20-38
## [81] purrr_0.3.3 labeling_0.3
## [83] htmlwidgets_1.5.1 cowplot_1.0.0
## [85] bit_1.1-14 tidyselect_0.2.5
## [87] GGally_1.4.0 plyr_1.8.4
## [89] magrittr_1.5 R6_2.4.0
## [91] gplots_3.0.1.1 generics_0.0.2
## [93] Hmisc_4.2-0 DBI_1.0.0
## [95] pillar_1.4.2 foreign_0.8-72
## [97] survival_2.44-1.1 RCurl_1.95-4.12
## [99] nnet_7.3-12 tibble_2.1.3
## [101] crayon_1.3.4 KernSmooth_2.23-16
## [103] rmarkdown_1.16 GetoptLong_0.1.7
## [105] locfit_1.5-9.1 grid_3.6.1
## [107] data.table_1.12.6 blob_1.2.0
## [109] ConsensusClusterPlus_1.50.0 digest_0.6.22
## [111] xtable_1.8-4 tidyr_1.0.0
## [113] munsell_0.5.0
Guillermo, Barturen, Rueda Antonio, Hamberg Maarten, Alganza Angel, Lebron Ricardo, Kotsyfakis Michalis, Shi BuJun, KoppersLalic Danijela, and Hackenberg Michael. 2014. “sRNAbench: profiling of small RNAs and its sequence variants in single or multi-species high-throughput experiments.” Methods in Next Generation Sequencing 1 (1):2084–7173. https://doi.org/10.2478/mngs-2014-0001.
Martí, Eulàlia, Lorena Pantano, Mónica Bañez-Coronel, Franc Llorens, Elena Miñones-Moyano, Sílvia Porta, Lauro Sumoy, Isidre Ferrer, and Xavier Estivill. 2010. “A myriad of miRNA variants in control and Huntington’s disease brain regions detected by massively parallel sequencing.” Nucleic Acids Res. 38:7219–35. https://doi.org/10.1093/nar/gkq575.
Morin, R. D., M. D. O’Connor, M. Griffith, F. Kuchenbauer, A. Delaney, A.-L. Prabhu, Y. Zhao, et al. 2008. “Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells.” Genome Res. 18:610–21. https://doi.org/10.1101/gr.7179508.
Pantano L, Estivil X, Marti E. 2010. “SeqBuster.” Nucleic Acids Res. 38:e34. https://doi.org/10.1093/nar/gkp1127.
Pantano, Lorena, Marc R Friedlander, Georgia Escaramis, Esther Lizano, Joan Pallares-Albanell, Isidre Ferrer, Xavier Estivill, and Eulalia Marti. 2015. “Specific small-RNA signatures in the amygdala at premotor and motor stages of Parkinson’s disease revealed by deep sequencing analysis.” Bioinformatics (Oxford, England), November. https://doi.org/10.1093/bioinformatics/btv632.
Pérez-Enciso, Miguel, and Michel Tenenhaus. 2003. “Prediction of clinical outcome with microarray data: a partial least squares discriminant analysis (PLS-DA) approach.” Human Genetics 112:581–92. https://doi.org/10.1007/s00439-003-0921-9.
Xia, Jianguo, and David S Wishart. 2011. “Web-based inference of biological patterns, functions and pathways from metabolomic data using MetaboAnalyst.” Nature Protocols 6:743–60. https://doi.org/10.1038/nprot.2011.319.