Instructions on how to re-create results from Korthauer and Kimes et al. (2019) on benchmarking methods that control the false discovery rate (FDR).
benchmarkfdrData2019 1.4.0
This package provides an R / Bioconductor resource to re-create plots and extend the analyses of Korthauer and Kimes et al. (2019). In this paper, methods controlling the False Discovery Rate (FDR) were applied to a collection of simulated and biological data sets to generate the benchmarking summaries provided with this package. Here, we give an example of how to load summary objects, plot results, and apply a new method to the dataset.
The package can be installed from R (version >= 3.6) using the BiocManager package, available on CRAN.
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("benchmarkfdrData2019")
suppressPackageStartupMessages({
library(ExperimentHub)
library(benchmarkfdrData2019)
library(SummarizedBenchmark)
library(dplyr)
library(ggplot2)
library(rlang)
})
In addition to the ExperimentHub and benchmarkfdrData2019 packages, we also load the SummarizedBenchmark package. The benchmarking results made available with this package for all case studies and simulations described in Korthauer and Kimes et al. (2019) were created using the SummarizedBenchmark package and are stored as SummarizedBenchmark objects. Note, however, that the objects were generated using the fdrbenchmark branch of the corresponding SummarizedBenchmark GitHub repository and do not include all of the features described in newer versions of the package (e.g. the version available on Bioconductor). In this vignette, we use the release version of the SummarizedBenchmark package available on Bioconductor. If needed, the fdrbenchmark version of the SummarizedBenchmark package can be installed from GitHub, again using the BiocManager package.
BiocManager::install("areyesq89/SummarizedBenchmark", ref = "fdrbenchmark")
The data are available for download from the Bioconductor ExperimentHub web resource. The complete list of resources available with the benchmarkfdrData2019 package can be queried using the following command.
hub <- ExperimentHub()
bfdrData <- query(hub, "benchmarkfdrData2019")
bfdrData
## ExperimentHub with 171 records
## # snapshotDate(): 2020-10-02
## # $dataprovider: NA, Geoffrey J. Barton lab, University of Dundee, Dundee, U...
## # $species: Saccharomyces cerevisiae, Homo sapiens, human gut metagenome, Mu...
## # $rdataclass: SummarizedBenchmark
## # additional mcols(): taxonomyid, genome, description,
## # coordinate_1_based, maintainer, rdatadateadded, preparerclass, tags,
## # rdatapath, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH2267"]]'
##
## title
## EH2267 | h3k4me3-promoters-benchmark
## EH2268 | h3k4me3-csaw-benchmark
## EH2269 | h3k4me3-csaw-uninf-benchmark
## EH2270 | h3k4me3-csaw-cov-benchmark
## EH2271 | cbp-csaw-benchmark
## ... ...
## EH2433 | varyingpi0-benchmark-nullprop70
## EH2434 | varyingpi0-benchmark-nullprop80
## EH2435 | varyingpi0-benchmark-nullprop90
## EH2436 | varyingpi0-benchmark-nullprop95
## EH2437 | varyingpi0-benchmark-nullprop99
The above command only returns the metadata associated with each data object available on ExperimentHub. Individual resources must be retrieved from ExperimentHub before they can be loaded in R. Here, we retrieve and load two resource objects to illustrate the analyses that can be performed using the data available with this package.
First, we load the benchmark results for a ChIP-seq case study where differential binding was tested using the csaw package, with region width used as the independent covariate. The resource is stored with the title "cbp-csaw-benchmark". We begin by determining the corresponding ExperimentHub ID for the resource.
cbp_id <- bfdrData$ah_id[bfdrData$title == "cbp-csaw-benchmark"]
cbp_id
## [1] "EH2271"
Using the ID, we can now access the metadata associated with the resource by subsetting bfdrData using single brackets ([). Using double brackets ([[) will retrieve the resource from the ExperimentHub server.
bfdrData[cbp_id]
## ExperimentHub with 1 record
## # snapshotDate(): 2020-10-02
## # names(): EH2271
## # package(): benchmarkfdrData2019
## # $dataprovider: Biochemistry, St Jude Children's Research Hospital
## # $species: Mus musculus
## # $rdataclass: SummarizedBenchmark
## # $rdatadateadded: 2019-04-23
## # $title: cbp-csaw-benchmark
## # $description: Differential peak calling in CREB-binding protein (CBP) knoc...
## # $taxonomyid: 10090
## # $genome: mm10
## # $sourcetype: FASTQ
## # $sourceurl: https://www.ebi.ac.uk/ena/data/view/PRJNA236594
## # $sourcesize: NA
## # $tags: c("SingleCellData", "ExperimentData", "RNASeqData",
## # "ExpressionData", "ExperimentHub")
## # retrieve record with 'object[["EH2271"]]'
chipres <- bfdrData[[cbp_id]]
chipres
## class: SummarizedBenchmark
## dim: 59596 21
## metadata(0):
## assays(1): bench
## rownames: NULL
## rowData names(2): bench ind_covariate
## colnames(21): unadjusted bonf ... lfdr adapt-glm
## colData names(24): bfunc bpost ... param.alphas blabel
Next, we load the benchmark results from a yeast in silico RNA-seq experiment where differential expression was tested using DESeq2 with a strong (simulated) independent and informative covariate. Unlike in the ChIP-seq analysis above, ground truth is known for the in silico experiment, so we can evaluate FDR control as well as the true positive rate (TPR) at nominal FDR significance thresholds.
Since the in silico experiments were repeated 100 times, the data object is a list of 100 SummarizedBenchmark objects, one for each replication. The resource is stored with the title "yeast-results-de5". Here, we demonstrate an alternative approach to retrieving the resource from the ExperimentHub server. Rather than subset bfdrData using double brackets ([[), we retrieve the resource by calling the resource name as a function (`yeast-results-de5`(); the backticks are required because the name contains hyphens). This functionality is available for all resources provided with this package (including the ChIP-seq resource loaded above).
yeast_id <- bfdrData$ah_id[bfdrData$title == "yeast-results-de5"]
bfdrData[yeast_id]
## ExperimentHub with 1 record
## # snapshotDate(): 2020-10-02
## # names(): EH2320
## # package(): benchmarkfdrData2019
## # $dataprovider: Geoffrey J. Barton lab, University of Dundee, Dundee, UK
## # $species: Saccharomyces cerevisiae
## # $rdataclass: SummarizedBenchmark
## # $rdatadateadded: 2019-04-23
## # $title: yeast-results-de5
## # $description: Yeast RNA-seq 48 sample simulation study: Unimodal alternati...
## # $taxonomyid: 4932
## # $genome: Ensembl release 68
## # $sourcetype: tar.gz
## # $sourceurl: https://github.com/bartongroup/profDGE48
## # $sourcesize: NA
## # $tags: c("SingleCellData", "ExperimentData", "RNASeqData",
## # "ExpressionData", "ExperimentHub")
## # retrieve record with 'object[["EH2320"]]'
yeastres <- `yeast-results-de5`()
length(yeastres)
## [1] 100
yeastres[[1]]
## class: SummarizedBenchmark
## dim: 6544 23
## metadata(0):
## assays(1): qvalue
## rownames: NULL
## rowData names(3): qvalue ind_covariate log2FC
## colnames(23): unadjusted bonf ... fdrreg-t fdrreg-e
## colData names(28): bfunc bpost ... param.control blabel
To be able to work with the latest release of the SummarizedBenchmark package, we must fill in a missing slot of the SummarizedBenchmark objects.
## Insert an empty BenchDesign into the slot expected by the release version
chipres@BenchDesign <- BenchDesign()
yeastres <- lapply(yeastres, function(x) { x@BenchDesign <- BenchDesign(); x })
The SummarizedBenchmark objects include the original p-values, informative covariate, and corrected significance values for the various methods compared in Korthauer and Kimes et al. (2019).
SummarizedBenchmark objects are an extension of the Bioconductor SummarizedExperiment class, with results organized as a rectangular “assay”, with associated row and column metadata. Here, the rows of the objects correspond to individual hypothesis tests and the columns correspond to the approaches used for multiple testing correction.
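For example, the per-test row metadata includes the independent covariate (ind_covariate) used by the covariate-aware methods. A quick sketch of how one might inspect it (output omitted):

## Peek at the per-test metadata and the distribution of the covariate
head(rowData(chipres))
summary(rowData(chipres)$ind_covariate)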
We can take a look at the names of the methods included in the ChIP-seq results object.
colnames(chipres)
## [1] "unadjusted" "bonf" "bh" "qvalue" "ihw-a01"
## [6] "ihw-a02" "ihw-a03" "ihw-a04" "ihw-a05" "ihw-a06"
## [11] "ihw-a07" "ihw-a08" "ihw-a09" "ihw-a10" "ashq"
## [16] "bl-df02" "bl-df03" "bl-df04" "bl-df05" "lfdr"
## [21] "adapt-glm"
Notice that the results include the IHW and BL methods multiple times. These ihw- and bl- columns correspond to separate runs of the methods with different parameter settings. Briefly, the IHW method requires specifying an alpha FDR threshold while running the method. Here, the method was run with alpha values of 0.01, 0.02, ..., 0.10. The BL method was run with spline degrees of freedom 2, 3, 4, 5.
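The parameter settings used for each run are recorded in the column metadata. As a quick sketch (assuming, as the tidy output further below suggests, that the colData includes param.alpha and param.smooth.df columns):

## Alpha thresholds recorded for the IHW runs
colData(chipres)[grep("^ihw", colnames(chipres)), "param.alpha"]
## Spline degrees of freedom recorded for the BL runs
colData(chipres)[grep("^bl", colnames(chipres)), "param.smooth.df"]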
The corrected significance values returned by each method are included in the single assay, "bench" (corresponding to the benchmarked results).
dim(assay(chipres, "bench"))
## [1] 59596 21
head(assay(chipres, "bench"))
## unadjusted bonf bh qvalue ihw-a01 ihw-a02 ihw-a03
## [1,] 0.09367441 1 0.2750488 0.1587219 0.4153819 1.0000000 1.0000000
## [2,] 0.07635065 1 0.2456775 0.1417726 0.2592850 0.2359257 0.2113001
## [3,] 0.44641540 1 0.6193447 0.3574041 0.6874034 0.6697854 0.4831826
## [4,] 0.54230545 1 0.6976696 0.4026029 0.7825177 1.0000000 1.0000000
## [5,] 0.11280772 1 0.2936016 0.1694281 0.3009025 0.3865540 0.2590641
## [6,] 0.31152661 1 0.5027725 0.2901340 0.5194227 0.4229486 0.3428025
## ihw-a04 ihw-a05 ihw-a06 ihw-a07 ihw-a08 ihw-a09 ihw-a10 ashq
## [1,] 0.2383437 0.2010557 0.1859702 0.1894541 0.1837203 0.2021730 0.1998073 NA
## [2,] 0.2529069 0.1952980 0.2451944 0.1935890 0.1943609 0.1971796 0.1879095 NA
## [3,] 0.6147271 0.4718577 0.4405994 0.4508408 0.4687339 0.4390055 0.4804587 NA
## [4,] 0.8947457 0.9276785 1.0000000 0.9881300 0.6758374 0.8118452 0.7374463 NA
## [5,] 0.2836033 0.2245577 0.2129845 0.2218124 0.2323623 0.2743020 0.2317344 NA
## [6,] 0.3358033 0.2741951 0.3035988 0.3531403 0.3852453 0.3926378 0.4048568 NA
## bl-df02 bl-df03 bl-df04 bl-df05 lfdr adapt-glm
## [1,] 0.09501728 0.1242898 0.1359421 0.1404518 0.05770709 0.1246816
## [2,] 0.09334559 0.1185802 0.1286718 0.1325802 0.05362979 0.1142933
## [3,] 0.17635627 0.2454214 0.2725921 0.2830917 0.24715606 0.2476961
## [4,] 0.35015194 0.4105930 0.4347989 0.4441300 0.38512849 Inf
## [5,] 0.11155442 0.1417115 0.1537717 0.1584425 0.06377030 0.1327596
## [6,] 0.10655455 0.1638255 0.1859225 0.1944646 0.12164067 0.1999106
The ASH (ashq) results are NA as the method was not applied to the data.
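A quick way to confirm which methods returned results is to count missing values per method (a minimal sketch; the ashq column should be entirely NA):

## Count NA corrected values per method
colSums(is.na(assay(chipres, "bench")))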
Given the multiple-testing-corrected results provided in the "bench" assay of the SummarizedBenchmark objects, we can compute several performance metrics to compare the various methods. For the ChIP-seq case study, we can examine the number of rejections at various significance cutoffs. With the in silico yeast experiments, since truth is known, we can also look at FDR and TPR, as well as other related metrics.

SummarizedBenchmark objects include functionality to easily add and evaluate metrics for data stored as assays. This is performed by first adding performance metrics with addPerformanceMetric, followed by a call to estimatePerformanceMetrics. While custom performance metrics can be defined by users, the package also includes several default metrics that can be added by name.
availableMetrics()
## functions description
## 1 rejections Number of rejections
## 2 TPR True Positive Rate
## 3 TNR True Negative Rate
## 4 FDR False Discovery Rate (estimated)
## 5 FNR False Negative Rate
## 6 correlation Pearson correlation
## 7 sdad Standard Deviation of the Absolute Difference
## 8 hamming Hamming distance
## 9 LPnorm L_{p} norm
## 10 adjustedRandIndex Adjusted Rand Index
## requiresTruth
## 1 FALSE
## 2 TRUE
## 3 TRUE
## 4 TRUE
## 5 TRUE
## 6 TRUE
## 7 TRUE
## 8 TRUE
## 9 TRUE
## 10 TRUE
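Custom metrics can also be registered by supplying an evalFunction to addPerformanceMetric. Below is a minimal sketch that mirrors the built-in rejections metric; the metric name myRejections is hypothetical, and the result is stored in a copy (chipres2) so the results below are unaffected.

## Register a custom metric counting rejections at a given cutoff
chipres2 <- addPerformanceMetric(chipres,
                                 evalMetric = "myRejections",
                                 assay = "bench",
                                 evalFunction = function(query, truth, alpha = 0.1) {
                                     sum(query <= alpha, na.rm = TRUE)
                                 })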
We will add the "rejections" metric to the "bench" assay and compute the number of rejections for each method at cutoffs between 0.01 and 0.10.
chipres <- addPerformanceMetric(chipres,
evalMetric = "rejections",
assay = "bench")
Next, we compute the number of rejections and organize this as a tidy data.frame.
chipdf <- estimatePerformanceMetrics(chipres,
alpha = seq(0.01, 0.10, by = .01),
tidy = TRUE)
dim(chipdf)
## [1] 210 29
head(chipdf)
## bfunc bpost bfunc_anon vers_src
## 1 function(p) {; p;} <NA> TRUE bfunc
## 2 p.adjust <NA> FALSE bfunc
## 3 p.adjust <NA> FALSE bfunc
## 4 qvalue::qvalue function(x) {; x$qvalues;} FALSE bfunc
## 5 IHW::ihw IHW::adj_pvalues FALSE bfunc
## 6 IHW::ihw IHW::adj_pvalues FALSE bfunc
## pkg_name pkg_vers param.p param.method param.pvalues param.covariates
## 1 <NA> <NA> pval <NA> <NA> <NA>
## 2 stats 3.5.0 pval "bonferroni" <NA> <NA>
## 3 stats 3.5.0 pval "BH" <NA> <NA>
## 4 qvalue 2.12.0 pval <NA> <NA> <NA>
## 5 IHW 1.8.0 <NA> <NA> pval ind_covariate
## 6 IHW 1.8.0 <NA> <NA> pval ind_covariate
## param.alpha param.betahat param.sebetahat param.pValues param.X
## 1 <NA> <NA> <NA> <NA> <NA>
## 2 <NA> <NA> <NA> <NA> <NA>
## 3 <NA> <NA> <NA> <NA> <NA>
## 4 <NA> <NA> <NA> <NA> <NA>
## 5 0.01 <NA> <NA> <NA> <NA>
## 6 0.02 <NA> <NA> <NA> <NA>
## param.smooth.df param.unadj_p param.groups param.pvals param.x
## 1 <NA> <NA> <NA> <NA> <NA>
## 2 <NA> <NA> <NA> <NA> <NA>
## 3 <NA> <NA> <NA> <NA> <NA>
## 4 <NA> <NA> <NA> <NA> <NA>
## 5 <NA> <NA> <NA> <NA> <NA>
## 6 <NA> <NA> <NA> <NA> <NA>
## param.pi_formulas param.mu_formulas param.alphas blabel label value
## 1 <NA> <NA> <NA> unadjusted unadjusted 5342
## 2 <NA> <NA> <NA> bonf bonf 0
## 3 <NA> <NA> <NA> bh bh 0
## 4 <NA> <NA> <NA> qvalue qvalue 4
## 5 <NA> <NA> <NA> ihw-a01 ihw-a01 0
## 6 <NA> <NA> <NA> ihw-a02 ihw-a02 0
## assay performanceMetric alpha
## 1 bench rejections 0.01
## 2 bench rejections 0.01
## 3 bench rejections 0.01
## 4 bench rejections 0.01
## 5 bench rejections 0.01
## 6 bench rejections 0.01
Each row in the data.frame corresponds to a method + metric + cutoff combination (e.g. "unadjusted" + "rejections" + "alpha = 0.01"). This information is stored in the "label", "performanceMetric", and "alpha" columns, with the corresponding metric value in the "value" column. All other columns contain method metadata, such as the package version used when the method was evaluated.
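For example, we can pull out the rows for a single method (a quick peek at the tidy structure; output omitted):

## Rejections for the BH method across all cutoffs
dplyr::filter(chipdf, label == "bh", performanceMetric == "rejections")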
We will now clean up the IHW and BL methods, which, as described above, include multiple parameter settings.
## subset IHW to keep only the run matching each alpha cutoff
chipdf <- dplyr::filter(chipdf, !(grepl("ihw", label) & param.alpha != alpha))
chipdf <- dplyr::mutate(chipdf, label = gsub("(ihw)-a\\d+", "\\1", label))
## subset BL to keep only the 3 degrees-of-freedom run
chipdf <- dplyr::filter(chipdf, ! label %in% paste0("bl-df0", c(2, 4, 5)))
We keep only a subset of the columns and drop NA values.
chipdf <- dplyr::select(chipdf, label, performanceMetric, alpha, value)
chipdf <- dplyr::filter(chipdf, !is.na(value))
head(chipdf)
## label performanceMetric alpha value
## 1 unadjusted rejections 0.01 5342
## 2 bonf rejections 0.01 0
## 3 bh rejections 0.01 0
## 4 qvalue rejections 0.01 4
## 5 ihw rejections 0.01 0
## 6 bl-df03 rejections 0.01 78
We now plot the number of rejections.
ggplot(chipdf, aes(x = alpha, y = value, color = label)) +
geom_point() +
geom_line() +
scale_color_viridis_d("Method") +
scale_x_continuous(breaks = seq(0, 1, .01), limits = c(0, .11)) +
ylab("Rejections") +
theme_bw() +
ggtitle("Number of rejections across multiple-testing methods",
"ChIP-seq CBP differential analysis with informative covariate")