IHWpaper 1.10.0
As IHW has been benchmarked thoroughly elsewhere (for example, the rest of this package contains many such simulations), here we restrict ourselves to a small study to illustrate the effects of censoring (as the censoring threshold \(\tau\) varies) and null-proportion adaptivity.
In particular, we consider the following example of a simple conditional two-groups model (for two different values of \(\pi_0 \in \{0.7, 0.9\})\):
\[ \begin{aligned} &X_i \sim U[0,2.5]\\ &H_i \mid X_i \sim \text{Bernoulli}(\pi_0)\\ &Z_i \mid (H_i = 0, X_i) \sim \mathcal{N}(0,1)\\ &Z_i \mid (H_i = 1, X_i) \sim \mathcal{N}(X_i,1)\\ &P_i = 1 - \Phi(X_i) \end{aligned} \] We compare the following approaches, using the nominal level \(\alpha=0.1\), \(P_i\) as the p-values and \(X_i\) as the covariates (except for BH which does not account for covariates):
The censored IHW (IHWc) variant was implemented as follows:
For the exact details of the simulation and also to reproduce it, see the folder “inst/theory_paper/ihwc_wasserman_normal_simulation.R” of this package.
library("ggplot2")
library("grid")
library("dplyr")
library("tidyr")
library("cowplot")
library("IHWpaper")
# color from https://github.com/dill/beyonce
# beyonce::beyonce_palette(30,5)
colors <- c("#040320", "#352140", "#871951", "#EB4A60", "#CFAB7A")
sim_folder <- system.file("simulation_benchmarks/theory_paper",
package = "IHWpaper")
sim_df <- readRDS(file.path(sim_folder, "ihwc_wasserman_normal_sim.Rds"))
taus_df <- expand.grid(fdr_pars = unique(sim_df$fdr_pars),
fdr_method = c("BH","IHW","IHW-Storey"),
sim_pars = unique(sim_df$sim_pars)) %>%
na.exclude()
df1 <- left_join(taus_df, select(sim_df, -fdr_pars), by=c("fdr_method","sim_pars")) %>%
select(-fdr_pars.y) %>% rename(fdr_pars = fdr_pars.x)
## Adding missing grouping variables: `fdr_pars`
## Warning: Column `fdr_method` joining factor and character vector, coercing
## into character vector
## Warning: Column `sim_pars` joining factor and character vector, coercing into
## character vector
df <- full_join(df1, filter(sim_df, fdr_method %in% c("IHWc_Storey","IHWc"))) %>%
mutate( fdr_method = ifelse(fdr_method=="IHWc_Storey","IHWc-Storey", fdr_method))
## Joining, by = c("fdr_pars", "fdr_method", "sim_pars", "alpha", "FDR", "power", "rj_ratio", "FPR", "FDR_sd", "FWER", "nsuccessful", "sim_method", "m")
pi0s <- sapply(strsplit(df$sim_pars, split="[[:punct:]]"), function(x) paste0(x[[2]],".",x[[3]]))
df$pi0s <- pi0s
fdr_plot <- ggplot(df, aes(x=fdr_pars, y=FDR, col=fdr_method)) + geom_line() +
facet_grid(.~pi0s, labeller=label_bquote(cols=pi[0] == .(pi0s))) +
scale_x_log10() +
scale_color_manual(values=colors) +
theme(legend.title=element_blank()) +
xlab(expression(tau~"(censoring threshold)")) +
ylab("FDR")
fdr_plot
power_plot <- ggplot(df, aes(x=fdr_pars, y=power, col=fdr_method)) + geom_line() +
facet_grid(.~pi0s, labeller=label_bquote(cols=pi[0] == .(pi0s))) +
scale_x_log10() +
scale_color_manual(values=colors) +
theme(legend.title=element_blank()) +
xlab(expression(tau~"(censoring threshold)")) +
ylab("Power")
power_plot
The first of the above figures shows the FDR of the procedures as \(\tau\) varies (note that BH/IHW/IHW-Storey do not depend on \(\tau\)). Formally, only IHWc and IHWc-Storey have provable finite-sample control. However, we see that all methods are below the nominal \(\alpha=0.1\). Furthermore, IHW-Storey is at essentially exactly \(0.1\), i.e. by estimating \(\pi_0\) it controls FDR at exactly the nominal level. This is also the case for IHWc-Storey when it is properly tuned. IHWc and IHWc can have FDR way below \(\alpha\) for improper choice of the tuning parameter \(\tau\).
The second plot shows the power of the procedures (defined as the expected proportion of true discoveries among all false null hypotheses). Of course for \(\pi_0 = 0.7\) all methods have a lot more power than for \(\pi_0=0.9\). We also observe the theoretically expected trade-off for the choice of censoring threshold \(\tau\): If it is chosen way too large, then we cannot estimate the underlying conditional two-groups model well, thus the weights are suboptimal and we lose power. On the other hand, if \(\tau\) is too small, then we lose a lot of power since we cannot reject the p-values below \(\tau\). BH has a lot less power compared to IHW/IHW-Storey and also IHWc/IHWc-Storey for a wide range of censoring thresholds \(\tau\).
Benjamini, Yoav, and Yosef Hochberg. 1995. “Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing.” Journal of the Royal Statistical Society. Series B (Statistical Methodology). JSTOR, 289–300.
Markitsis, Anastasios, and Yinglei Lai. 2010. “A Censored Beta Mixture Model for the Estimation of the Proportion of Non-Differentially Expressed Genes.” Bioinformatics 26 (5). Oxford Univ Press:640–46.