Created by the compcodeR package, version 0.99.1
Date: Wed Apr 2 11:00:28 2014
Data set:
B_625_625
Number of samples per condition:
5
Included replicates (for repeated simulated data sets):
1
Differential expression methods included in the comparison:
edgeR.3.4.2.exact.TMM.movingave.tagwise
ttest.1.44.0.TMM
voom.3.18.11.limma.TMM
Parameter values:
                    value
fdr.threshold       0.05
tpr.threshold       0.05
typeI.threshold     0.05
ma.threshold        0.05
fdc.maxvar          1500
overlap.threshold   0.05
fracsign.threshold  0.05
signal.measure      mean
Gene score vs 'signal' for genes expressed in only one condition, all replicates
Overlap between sets of differentially expressed genes, single replicate
Sorensen index between sets of differentially expressed genes, single replicate
A receiver operating characteristic (ROC) curve is a way to summarize the ability of a test or ranking procedure to rank truly positive (i.e., truly differentially expressed) genes ahead of truly negative (i.e., truly non-differentially expressed) genes. To create the ROC curve for a given differential expression method, the genes are ranked in decreasing order by the score, which is assigned to them during the differential expression analysis and quantifies the degree of statistical significance or association with the predictor (the condition). For a given threshold, all genes with scores above the threshold are classified as 'positive' and all genes with scores below the threshold are classified as 'negative'. Comparing these assignments to the true differential expression status, a true positive rate and a false positive rate can be computed and marked in a plot. As the threshold is changed, these pairs of values trace out the ROC curve. A good test procedure gives a ROC curve which passes close to the upper left corner of the plot, while a poor test corresponds to a ROC curve close to the diagonal.
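The thresholding procedure described above can be sketched in a few lines of Python. This is only an illustration, not the code compcodeR uses: the function name `roc_points`, the toy scores, and the truth labels are all made up for the example, and tied scores are not treated specially.

```python
def roc_points(scores, is_de):
    """Return (false positive rate, true positive rate) pairs for every threshold."""
    n_pos = sum(is_de)                # truly differentially expressed genes
    n_neg = len(is_de) - n_pos        # truly non-differentially expressed genes
    # Rank genes in decreasing order of score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]             # threshold above every score: nothing is called
    for i in order:                   # lower the threshold one gene at a time
        if is_de[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical toy data: five genes, three of them truly differentially expressed.
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
is_de = [True, True, False, True, False]
points = roc_points(scores, is_de)
```

Plotting the second coordinate against the first gives the ROC curve; a curve that climbs to a true positive rate of 1 before moving right corresponds to a perfect ranking.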
The area under the ROC curve (AUC) summarizes the performance of the ranking procedure. A good method gives an AUC close to 1, while a poor method gives an AUC closer to 0.5. Each boxplot below summarizes the AUCs across all data set replicates included in the comparison.
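One way to see why 0.5 marks a poor method: the AUC equals the probability that a randomly chosen truly differentially expressed gene receives a higher score than a randomly chosen non-differentially expressed gene, with ties counted as one half. The sketch below computes the AUC through this pairwise comparison; the function name and toy data are illustrative assumptions, and the quadratic loop is meant for clarity rather than speed.

```python
def auc(scores, is_de):
    """AUC as the fraction of (DE, non-DE) gene pairs that are ranked correctly."""
    pos = [s for s, d in zip(scores, is_de) if d]
    neg = [s for s, d in zip(scores, is_de) if not d]
    # A correctly ordered pair counts 1, a tied pair counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one non-DE gene is ranked above one DE gene.
auc_value = auc([0.9, 0.8, 0.7, 0.4, 0.2], [True, True, False, True, False])
# A ranking that puts all DE genes first reaches the maximum of 1.
auc_perfect = auc([0.9, 0.8, 0.7, 0.4, 0.2], [True, True, True, False, False])
```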
A false discovery curve depicts the number of false positives encountered while stepping through a list of genes ranked by a score representing their statistical significance or the degree of association with a predictor. The truly differentially expressed genes are considered the 'true positives', and the truly non-differentially expressed genes the 'true negatives'. Hence, at a given position in the ranked list (shown on the x-axis) the value on the y-axis represents the number of truly non-differentially expressed genes that are ranked above that position. A good ranking method puts few true negatives among the top-ranked genes, and hence the false discovery curve rises slowly. A poor ranking method is recognized by a steeply increasing false discovery curve.
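As a rough sketch of how such a curve can be computed, the function below walks down the ranked list and keeps a running count of truly non-differentially expressed genes. The names are hypothetical, and the `max_rank` argument is an assumption about how a cap such as the `fdc.maxvar` value of 1500 listed among the parameters might be applied.

```python
def false_discovery_curve(scores, is_de, max_rank=None):
    """Number of false positives among the top k genes, for k = 1, 2, ..."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if max_rank is not None:
        order = order[:max_rank]      # only follow the top of the ranked list
    curve = []
    false_positives = 0
    for i in order:
        if not is_de[i]:
            false_positives += 1
        curve.append(false_positives)
    return curve


# Hypothetical toy data: the third and fifth ranked genes are not truly DE.
curve = false_discovery_curve([0.9, 0.8, 0.7, 0.4, 0.2],
                              [True, True, False, True, False])
top3 = false_discovery_curve([0.9, 0.8, 0.7, 0.4, 0.2],
                             [True, True, False, True, False], max_rank=3)
```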
An MA plot depicts the average expression level of the genes ('A', shown on the x-axis) and their log-fold change between two conditions ('M', shown on the y-axis). The genes called differentially expressed at an adjusted p-value threshold of 0.05 are marked in color.
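For a single gene, the two coordinates can be illustrated as follows. The exact normalization and pseudocount used in the report are not stated here, so the formula below (log2 of the condition means plus a pseudocount of 1) is an assumption chosen only to make the definitions of M and A concrete.

```python
import math


def ma_values(counts_cond1, counts_cond2, pseudocount=1.0):
    """Return (M, A) for one gene from its counts in two conditions."""
    # Assumed definition: log2 of the mean count in each condition, plus a pseudocount.
    log1 = math.log2(sum(counts_cond1) / len(counts_cond1) + pseudocount)
    log2_ = math.log2(sum(counts_cond2) / len(counts_cond2) + pseudocount)
    m = log2_ - log1                  # log-fold change between the conditions (y-axis)
    a = (log1 + log2_) / 2            # average log expression level (x-axis)
    return m, a


# Hypothetical counts: means of 7 and 31, so the log2 values are 3 and 5.
m, a = ma_values([7, 7, 7], [31, 31, 31])
```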
In the figures below the gene score, which is computed in the differential expression analysis (and stored in the 'score' field of the result object), is plotted against the average expression level of the genes ('A', shown on the x-axis). A high value of the score signifies a 'more significant' gene. The colored line represents a loess fit to the data.
In the figures below the gene score, which is computed in the differential expression analysis (and stored in the 'score' field of the result object), is plotted against the 'signal', for genes that are expressed in only one of the two conditions. The signal (shown on the x-axis) is defined by computing the logarithm (base 2) of the normalized pseudo-counts for all samples in the condition where the gene is expressed, and averaging these values. A high value of the score (shown on the y-axis) signifies a 'more significant' gene. We expect the methods to give higher scores to genes with stronger signal.
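The signal definition amounts to a mean of log2 values, which matches the `signal.measure` value `mean` in the parameter list. A minimal sketch, assuming the normalized pseudo-counts for the expressing condition are already available as positive numbers (the function name and input values are hypothetical):

```python
import math


def signal(normalized_pseudocounts):
    """Mean of the log2 normalized pseudo-counts in the condition where the gene is expressed."""
    logs = [math.log2(x) for x in normalized_pseudocounts]
    return sum(logs) / len(logs)


# Hypothetical gene with pseudo-counts 4, 16 and 8: log2 values are 2, 4 and 3.
gene_signal = signal([4, 16, 8])
```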
The violin plots below show the distribution of the score assigned to the genes by the differential expression methods, as a function of the number of 'outlier counts' (that is, extremely high or low counts introduced artificially in the data and not generated by the underlying statistical distribution) for the genes. All types of outliers are summed. This allows an investigation of the sensitivity of a differential expression method to outlier counts (deviations from the underlying statistical model). A method that is sensitive to outliers shows a different score distribution for genes with outlier counts than for genes without outlier counts. When interpreting the figures below, pay attention to the number of genes generating each distribution (indicated below the figure), since an empirical distribution based on only a few values may not be completely representative of the true distribution.
The figures below indicate the fraction of genes in the data set that are called significant at an adjusted p-value threshold of 0.05. Only differential expression methods returning corrected p-values or FDR estimates are included in the figure. Each boxplot summarizes the values obtained across all data set replicates included in the comparison.
The nominal p-value returned by a statistical test indicates the probability of obtaining a value of the test statistic which is at least as extreme as the one observed, given that the null hypothesis is true, e.g., that the gene is not differentially expressed between the compared conditions. Classifying a gene as statistically significantly differentially expressed although it is not truly differentially expressed is referred to as a type I error. For a good statistical test, we expect that the observed type I error rate at a given nominal p-value threshold (that is, the fraction of truly non-differentially expressed genes with a nominal p-value below this threshold) does not exceed the threshold. The figures below show the observed type I error rate at a nominal p-value threshold of 0.05. Only differential expression methods returning nominal p-values are included in the comparison. Each boxplot summarizes the values obtained across all data set replicates included in the comparison.
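The observed type I error rate defined in parentheses above is a simple fraction over the truly non-differentially expressed genes. The sketch below uses made-up p-values and truth labels; the function name is hypothetical.

```python
def type1_error_rate(pvalues, is_de, threshold=0.05):
    """Fraction of truly non-DE genes whose nominal p-value is below the threshold."""
    null_pvalues = [p for p, d in zip(pvalues, is_de) if not d]
    return sum(p < threshold for p in null_pvalues) / len(null_pvalues)


# Hypothetical data: four truly non-DE genes, two of them with p < 0.05.
rate = type1_error_rate([0.001, 0.03, 0.2, 0.04, 0.6, 0.9],
                        [True, False, False, False, False, True])
```

In this toy case the rate is far above 0.05, which is the pattern the figures are meant to reveal for a badly calibrated test.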
The false discovery rate (FDR) indicates the fraction of truly non-differentially expressed genes that we expect to find among the genes that we consider to be differentially expressed. For high-dimensional problems, where many statistical tests are performed simultaneously (such as gene expression studies), it is more relevant to attempt to control the FDR than to control the gene-wise type I error rate, since it is almost certain that at least one gene will show a low nominal p-value even if the null hypothesis is true. To control the FDR, the nominal p-values are typically adjusted for the large number of tests that are performed. The figures below indicate the observed rate of false discoveries (i.e., the fraction of truly non-differentially expressed genes among the genes that are considered significant) at an adjusted p-value threshold of 0.05. Only methods returning corrected p-values or FDR estimates are included. Each boxplot summarizes the values obtained across all data set replicates that are included in the comparison. For a good method, the observed FDR should not be much higher than the imposed adjusted p-value threshold (indicated by a dashed vertical line). If the observed FDR is much larger than the imposed adjusted p-value threshold, the fraction of false discoveries is not controlled at the claimed level.
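To make the two steps concrete, the sketch below adjusts nominal p-values and then computes the observed FDR among the genes called significant. The Benjamini-Hochberg procedure is used here as one common adjustment; the report does not say which adjustment each method applies, so treat that choice, the function names, and the toy data as assumptions.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (one common FDR adjustment)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def observed_fdr(adjusted_pvalues, is_de, threshold=0.05):
    """Fraction of truly non-DE genes among the genes called significant."""
    called = [d for p, d in zip(adjusted_pvalues, is_de) if p < threshold]
    if not called:
        return 0.0                    # convention used here when nothing is called
    return sum(not d for d in called) / len(called)


# Hypothetical data: three genes pass the 0.05 cutoff, one of them is not truly DE.
adjusted = bh_adjust([0.001, 0.008, 0.012, 0.5, 0.9])
fdr = observed_fdr(adjusted, [True, True, False, False, False])
```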
The figures below show the observed false discovery rate as a function of the average expression level of the genes ('A', shown on the x-axis). The average expression levels in a given data set are binned into 10 quantiles (each containing 1/10 of the values) and the false discovery rate at an imposed adjusted p-value cutoff of 0.05 is estimated for each bin. Each boxplot summarizes the values obtained across all data set replicates included in the comparison.
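The binning step can be sketched as follows: genes are sorted by average expression, split into equally sized groups, and the observed FDR is computed within each group. The function name, the handling of bins without any significant gene, and the toy data are assumptions; the example uses two bins instead of ten to keep it small.

```python
def fdr_by_expression_bin(a_values, adjusted_pvalues, is_de, n_bins=10, threshold=0.05):
    """Observed FDR within each quantile bin of the average expression level."""
    order = sorted(range(len(a_values)), key=lambda i: a_values[i])
    n = len(order)
    result = []
    for b in range(n_bins):
        # Each bin holds (about) 1/n_bins of the genes, from low to high expression.
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        called = [i for i in members if adjusted_pvalues[i] < threshold]
        if called:
            result.append(sum(not is_de[i] for i in called) / len(called))
        else:
            result.append(None)       # FDR is undefined when no gene is called
    return result


# Hypothetical data for four genes, split into a low and a high expression bin.
per_bin = fdr_by_expression_bin(a_values=[1.0, 2.0, 3.0, 4.0],
                                adjusted_pvalues=[0.01, 0.2, 0.01, 0.02],
                                is_de=[False, False, True, False],
                                n_bins=2)
```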
The true positive rate (TPR) indicates the fraction of truly differentially expressed genes that are indeed considered significant by a method at a given significance threshold. A good method gives a high true positive rate, while at the same time keeping the false discovery rate under control. The figures below show the observed rate of true positives at an adjusted p-value threshold of 0.05. Only methods returning corrected p-values or FDR estimates are included. Each boxplot summarizes the values obtained across all data set replicates included in the comparison.
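A minimal sketch of the computation, taking the TPR to be the fraction of truly differentially expressed genes that are called significant. The function name and the example values are hypothetical.

```python
def true_positive_rate(adjusted_pvalues, is_de, threshold=0.05):
    """Fraction of truly DE genes with an adjusted p-value below the threshold."""
    de_pvalues = [p for p, d in zip(adjusted_pvalues, is_de) if d]
    return sum(p < threshold for p in de_pvalues) / len(de_pvalues)


# Hypothetical data: three truly DE genes, two of which are detected.
tpr = true_positive_rate([0.01, 0.2, 0.03, 0.5], [True, True, True, False])
```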
The table below shows, for each pair of differential expression methods, the number of genes that are considered statistically significant by both of them at an adjusted p-value threshold of 0.05. Only methods returning corrected p-values or FDR estimates are included in the comparison. Note that the size of the overlap between two sets naturally depends on the number of genes in each of the sets (indicated along the diagonal of the table).
                                         edgeR.3.4.2.exact.TMM.movingave.tagwise  ttest.1.44.0.TMM  voom.3.18.11.limma.TMM
edgeR.3.4.2.exact.TMM.movingave.tagwise                                      448               172                     330
ttest.1.44.0.TMM                                                             172               203                     189
voom.3.18.11.limma.TMM                                                       330               189                     361
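A table of this kind can be built from the sets of significant genes with pairwise set intersections; the diagonal then holds the size of each set. The gene identifiers and short method labels below are invented for the illustration and are not taken from the data set.

```python
def overlap_table(significant):
    """Map each (method, method) pair to the number of shared significant genes."""
    names = list(significant)
    return {(a, b): len(significant[a] & significant[b]) for a in names for b in names}


# Hypothetical sets of genes called significant by three methods.
significant = {
    "edgeR": {"g1", "g2", "g3", "g4"},
    "ttest": {"g2", "g3"},
    "voom": {"g2", "g3", "g4", "g5"},
}
table = overlap_table(significant)
```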
The table below shows, for each pair of differential expression methods, the Sorensen index, which is a way of quantifying the overlap between the collections of differentially expressed genes found by the two methods at an adjusted p-value threshold of 0.05. Only methods returning corrected p-values or FDR estimates are included. The Sorensen index is defined as the ratio between the number of genes shared by the two sets and the average number of genes in the two sets. Hence, it always attains values between 0 and 1. A larger Sorensen index implies better overlap between the two sets, and hence that the two compared methods give similar differential expression results. The values of the Sorensen index for all pairs of compared methods are also visualized in a 'heatmap', where the color corresponds to the Sorensen index.
                                         edgeR.3.4.2.exact.TMM.movingave.tagwise  ttest.1.44.0.TMM  voom.3.18.11.limma.TMM
edgeR.3.4.2.exact.TMM.movingave.tagwise                                   1.0000            0.5284                  0.8158
ttest.1.44.0.TMM                                                          0.5284            1.0000                  0.6702
voom.3.18.11.limma.TMM                                                    0.8158            0.6702                  1.0000
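Dividing the shared count by the average set size is the same as 2 * shared / (size of first set + size of second set). Applying this to the counts in the overlap table reproduces the Sorensen values reported here; the function name is an illustrative choice.

```python
def sorensen(n_shared, n_a, n_b):
    """Sorensen index: shared genes divided by the average size of the two sets."""
    return 2 * n_shared / (n_a + n_b)


# Set sizes (448, 203, 361) and overlaps (172, 330, 189) from the overlap table.
edger_ttest = round(sorensen(172, 448, 203), 4)
edger_voom = round(sorensen(330, 448, 361), 4)
ttest_voom = round(sorensen(189, 203, 361), 4)
```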
The table below shows, for each pair of compared differential expression methods, the Spearman correlation between the scores that they assign to the genes. The value of the correlation is always between -1 and 1, and a high positive value of the Spearman correlation indicates that the compared methods rank the genes in a similar fashion. The results are also shown in a 'heatmap', where the color indicates the Spearman correlation. Finally, the methods are clustered using hierarchical clustering, with a dissimilarity measure defined as 1 - Spearman correlation. This visualizes the relationships among the compared differential expression methods, and groups together methods that rank the genes similarly.
                                         edgeR.3.4.2.exact.TMM.movingave.tagwise  ttest.1.44.0.TMM  voom.3.18.11.limma.TMM
edgeR.3.4.2.exact.TMM.movingave.tagwise                                   1.0000            0.9347                  0.8369
ttest.1.44.0.TMM                                                          0.9347            1.0000                  0.8603
voom.3.18.11.limma.TMM                                                    0.8369            0.8603                  1.0000
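The Spearman correlation is the Pearson correlation of the ranks, and the dissimilarity 1 - correlation decides which methods a hierarchical clustering joins first. The sketch below implements the correlation with average ranks for ties and then applies the dissimilarity to the values in the table above. The function names, the toy score vectors, and the shortened method labels are assumptions made for the example.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        for k in range(pos, end + 1):
            ranks[order[k]] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores from two methods for five genes.
rho = spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])

# Correlations from the table above, with shortened method labels.
reported = {("edgeR", "ttest"): 0.9347, ("edgeR", "voom"): 0.8369, ("ttest", "voom"): 0.8603}
# The pair with the smallest dissimilarity (1 - correlation) is merged first.
first_merge = min(reported, key=lambda pair: 1 - reported[pair])
```

With the reported values, the edgeR and t-test rankings are the most similar pair, so an agglomerative clustering on this dissimilarity would join them before voom.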