The function bbhw (bulk-based hypothesis weighing)
implements various approaches leveraging independent bulk RNAseq data to
increase the power of cell-type level differential expression (state)
analysis (i.e. across conditions). The general strategy is to group the
hypotheses1 into different bins based on the bulk
dataset, and then employ some False Discovery Rate (FDR) methods that
can make use of this grouping. FDR methods rely on an excess of small
p-values (compared to a uniform distribution) to control the rate of
false discoveries, and the rationale for covariate-based methods is that
power can be gained by concentrating this excess within certain bins.
This is illustrated in the following figure:
Bulk-based weighing of single-cell hypotheses. A-C: Example p-values of differential expression upon stress, from both a single-cell dataset with modest sample size (A and C) and a much larger bulk dataset (B) (data from Waag et al., biorxiv 2025). C: Splitting the single-cell p-values by significance of the respective gene in the bulk data leads to higher local excess of p-values. D: Example precision-recall curves in a different dataset with ground truth (based on downsampled data) using standard (local) FDR and the default bbhw method (see our paper for more detail).
On its own, the single-cell data shows a modest enrichment in small
p-values (panel A). However, splitting genes into bins based on their
p-value in the bulk dataset (B) leads to much stronger local
(i.e. within bin) excess of small p-values (C). Exploiting this, the
bbhw method can yield showing substantial increased in
power, as illustrated in a different dataset with ground truth (D).
As this is done at the level of multiple-testing correction,
bbhw can be executed on top of any per-celltype
differential state analysis as produced by pbDS or
mmDS (see example below), assuming that the bulk data is
from the same (or a similar) contrast, and that its samples are
independent from the single-cell samples.
For more detail about the methods and their performance, see the paper. Here, we concentrate on practical use and only briefly summarize the options.
For each of steps (bin creation and usage in FDR), different methods can be applied, which are described below.
The same bulk p-value will necessarily be more informative in a cell type that is abundant, or for a gene that is more specific to that cell type, since in both cases the cell type will contribute more to the bulk expression of the gene2. We therefore include such information (if available) in the binning of the hypotheses.
Specifically, the following options are available:
bulkDEA. nbins are created using
quantiles.bin.method="sig". Each
significance bin is then further split into genes for which the cell
type contributes much to the bulk, and genes for which the cell type
contributes little.inv.logit( logit(p) * sqrt(c) ) where p and
c are respectively the bulk p-value and the proportion of
bulk reads contributed by the cell type). We then split this covariate
into quantile bins as is done for the “sig” method. This is the
recommended method.edgeR::predFC.In all cases, if useSign=TRUE (default) and
bulkDEA contains logFC information, the concordance of the
direction of change between bulk and single-cell data will also be used
to adjust the covariate.
Once the bins are created, the following
correction.method options are available:
pmin(1,nbins/rank(p)). This results is proper FDR control
even across a large number of bins, but the method is more conservative
than others.correction.method options.To enable the Grouped Benjamini-Hochberg (gBH) methods, we
implemented the procedure from Hu, Zhao and Zhou (Hu, Zhao, and Zhou 2010) in the
gBH function, which can be used outside of the present
context. The method has two variants which differ in the way the rate of
true null hypotheses in each group/bin is estimated (see
?gBH for more detail). We recommend using
correction.method="gBH.LSL", which in our benchmark
provided the best power while appropriately controlling error.
Each method exists in two flavors: a local one, which is applied for
each cell type separately, and a global one, which is applied once
across all cell types (see the local argument). We
recommend using the local one, for the same reasons discussed above: the
excess of small p-values in affected cell types gets diluted when polled
with the unaffected cell types, and conversely conversely the FDR is
underestimated for small p-values from unaffected cell types (which are
the result of noise) due to the excess of small p-values coming from
affected cell types.
We first load an example dataset and perform a pseudo-bulk differential state analysis (for more information on this, see the vignette).
data("example_sce")
# since the dataset is a bit too small for the procedure, we'll double it:
sce <- rbind(example_sce, example_sce)
row.names(sce) <- make.unique(row.names(sce))
# pseudo-bulk aggregation
pb <- aggregateData(sce)
res <- pbDS(pb, verbose = FALSE)
# the signal is too strong in this data, so we'll reduce it
# (don't do this with real data!):
res$table$stim <- lapply(res$table$stim, \(x) {
x$p_val <- sqrt(x$p_val)
x$p_adj.loc <- p.adjust(x$p_val, method="fdr")
x
})
# we assemble the cell types in a single table for the comparison of interest:
res2 <- bind_rows(res$table$stim, .id="cluster_id")
head(res2 <- res2[order(res2$p_adj.loc),])## gene cluster_id logFC logCPM F p_val p_adj.loc
## 1835 ISG15 CD8 T cells 5.577734 11.049612 314.9321 1.368396e-07 7.522788e-05
## 1847 IFI6 CD8 T cells 5.822380 9.926318 238.6031 3.948743e-07 7.522788e-05
## 2261 ISG20 CD8 T cells 3.922602 11.043214 256.9252 3.966321e-07 7.522788e-05
## 2404 ISG15.1 CD8 T cells 5.577734 11.049612 314.9321 1.368396e-07 7.522788e-05
## 2416 IFI6.1 CD8 T cells 5.822380 9.926318 238.6031 3.948743e-07 7.522788e-05
## 2830 ISG20.1 CD8 T cells 3.922602 11.043214 256.9252 3.966321e-07 7.522788e-05
## p_adj.glb contrast
## 1835 5.005213e-11 stim
## 1847 1.401695e-10 stim
## 2261 1.401695e-10 stim
## 2404 5.005213e-11 stim
## 2416 1.401695e-10 stim
## 2830 1.401695e-10 stim
We next prepare the bulk differential expression table. Since we don’t actually have bulk data for this example dataset, we’ll just make it up (don’t do this!):
set.seed(123)
bulkDEA <- data.frame(
row.names=row.names(pb),
logFC=rnorm(nrow(pb)),
PValue=runif(nrow(pb)))
# we'll spike some signal in:
gs_by_k <- split(res2$gene, res2$cluster_id)
sel <- unique(unlist(lapply(gs_by_k, \(x) x[3:150])))
bulkDEA[sel,"PValue"] <- abs(rnorm(length(sel), sd=0.01))
head(bulkDEA)## logFC PValue
## HES4 -0.56047565 0.01610123
## ISG15 -0.23017749 0.01839477
## AURKAIP1 1.55870831 0.68257091
## MRPL20 0.07050839 0.50781291
## SSU72 0.12928774 0.55437180
## RER1 1.71506499 0.84124010
We then can apply bbhw :
# here, we manually set 'nbins'
# because the dataset is too small,
# you shouldn't normally need to do this:
res2 <- bbhw(pbDEA=res2, bulkDEA=bulkDEA, pb=pb, nbins=4, verbose=FALSE)We can see that the adjustment methods do not rank the genes in the same fashion:
## gene cluster_id p_adj.loc padj
## 2280 IFI6.1 CD8 T cells 7.522788e-05 3.067512e-05
## 2305 ISG20 CD8 T cells 7.522788e-05 3.067512e-05
## 2306 ISG20.1 CD8 T cells 7.522788e-05 3.067512e-05
## 2303 ISG15 CD8 T cells 7.522788e-05 8.165219e-05
## 2304 ISG15.1 CD8 T cells 7.522788e-05 8.165219e-05
## 2279 IFI6 CD8 T cells 7.522788e-05 1.462639e-04
## bbhw
## standard FALSE TRUE
## FALSE 5149 127
## TRUE 3 67
This results in a different number of genes with adjusted p-values < 0.05:
## standard bbhw
## 70 194
In this case the effect is very modest, chiefly because this is a small subset of data (hence few hypotheses) and the signal for this subset is already highly significant to begin with. However, this can often results in substantial improvements in both precision and recall (see Figure 1D above).
## R version 4.5.2 (2025-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] UpSetR_1.4.0 scater_1.39.2
## [3] scuttle_1.21.0 SingleCellExperiment_1.33.0
## [5] SummarizedExperiment_1.41.1 Biobase_2.71.0
## [7] GenomicRanges_1.63.1 Seqinfo_1.1.0
## [9] IRanges_2.45.0 S4Vectors_0.49.0
## [11] MatrixGenerics_1.23.0 matrixStats_1.5.0
## [13] ExperimentHub_3.1.0 AnnotationHub_4.1.0
## [15] BiocFileCache_3.1.0 dbplyr_2.5.2
## [17] BiocGenerics_0.57.0 generics_0.1.4
## [19] purrr_1.2.1 muscat_1.25.2
## [21] limma_3.67.0 ggplot2_4.0.2
## [23] dplyr_1.2.0 cowplot_1.2.0
## [25] patchwork_1.3.2 BiocStyle_2.39.0
##
## loaded via a namespace (and not attached):
## [1] RcppAnnoy_0.0.23 splines_4.5.2 filelock_1.0.3
## [4] bitops_1.0-9 tibble_3.3.1 lpsymphony_1.39.0
## [7] httr2_1.2.2 lifecycle_1.0.5 Rdpack_2.6.6
## [10] edgeR_4.9.2 doParallel_1.0.17 globals_0.19.0
## [13] lattice_0.22-9 MASS_7.3-65 backports_1.5.0
## [16] magrittr_2.0.4 sass_0.4.10 rmarkdown_2.30
## [19] jquerylib_0.1.4 yaml_2.3.12 otel_0.2.0
## [22] sctransform_0.4.3 DBI_1.2.3 buildtools_1.0.0
## [25] minqa_1.2.8 RColorBrewer_1.1-3 abind_1.4-8
## [28] EnvStats_3.1.0 glmmTMB_1.1.14 rappdirs_0.3.4
## [31] sandwich_3.1-1 circlize_0.4.17 ggrepel_0.9.6
## [34] pbkrtest_0.5.5 irlba_2.3.7 listenv_0.10.0
## [37] maketools_1.3.2 RSpectra_0.16-2 parallelly_1.46.1
## [40] codetools_0.2-20 DelayedArray_0.37.0 tidyselect_1.2.1
## [43] shape_1.4.6.1 farver_2.1.2 lme4_1.1-38
## [46] ScaledMatrix_1.19.0 viridis_0.6.5 jsonlite_2.0.0
## [49] GetoptLong_1.1.0 BiocNeighbors_2.5.4 iterators_1.0.14
## [52] foreach_1.5.2 tools_4.5.2 progress_1.2.3
## [55] Rcpp_1.1.1 blme_1.0-7 glue_1.8.0
## [58] gridExtra_2.3 SparseArray_1.11.10 xfun_0.56
## [61] mgcv_1.9-4 DESeq2_1.51.6 withr_3.0.2
## [64] numDeriv_2016.8-1.1 BiocManager_1.30.27 fastmap_1.2.0
## [67] boot_1.3-32 caTools_1.18.3 digest_0.6.39
## [70] rsvd_1.0.5 R6_2.6.1 colorspace_2.1-2
## [73] Cairo_1.7-0 gtools_3.9.5 RSQLite_2.4.6
## [76] RhpcBLASctl_0.23-42 tidyr_1.3.2 variancePartition_1.41.3
## [79] data.table_1.18.2.1 corpcor_1.6.10 httr_1.4.8
## [82] prettyunits_1.2.0 S4Arrays_1.11.1 uwot_0.2.4
## [85] pkgconfig_2.0.3 gtable_0.3.6 blob_1.3.0
## [88] ComplexHeatmap_2.27.1 S7_0.2.1 XVector_0.51.0
## [91] sys_3.4.3 remaCor_0.0.20 htmltools_0.5.9
## [94] TMB_1.9.19 clue_0.3-66 scales_1.4.0
## [97] png_0.1-8 fANCOVA_0.6-1 reformulas_0.4.4
## [100] knitr_1.51 reshape2_1.4.5 rjson_0.2.23
## [103] curl_7.0.0 nlme_3.1-168 nloptr_2.2.1
## [106] cachem_1.1.0 zoo_1.8-15 GlobalOptions_0.1.3
## [109] stringr_1.6.0 BiocVersion_3.23.1 KernSmooth_2.23-26
## [112] parallel_4.5.2 vipor_0.4.7 AnnotationDbi_1.73.0
## [115] pillar_1.11.1 grid_4.5.2 vctrs_0.7.1
## [118] gplots_3.3.0 slam_0.1-55 BiocSingular_1.27.1
## [121] IHW_1.39.0 beachmat_2.27.2 cluster_2.1.8.2
## [124] beeswarm_0.4.0 evaluate_1.0.5 mvtnorm_1.3-3
## [127] cli_3.6.5 locfit_1.5-9.12 compiler_4.5.2
## [130] rlang_1.1.7 crayon_1.5.3 future.apply_1.20.1
## [133] labeling_0.4.3 fdrtool_1.2.18 plyr_1.8.9
## [136] ggbeeswarm_0.7.3 stringi_1.8.7 viridisLite_0.4.3
## [139] BiocParallel_1.45.0 Biostrings_2.79.4 lmerTest_3.2-0
## [142] aod_1.3.3 Matrix_1.7-4 hms_1.1.4
## [145] bit64_4.6.0-1 future_1.69.0 KEGGREST_1.51.1
## [148] statmod_1.5.1 rbibutils_2.4.1 memoise_2.0.1
## [151] broom_1.0.12 bslib_0.10.0 bit_4.6.0
A hypothesis, here, is understood as the differential expression (or lack thereof) of one gene in one cell type.↩︎
We define the ‘contribution’ of the cell type to the bulk expression of the gene as the proportion of the total pseudobulk reads for that gene that is contributed by the cell type (across all samples).↩︎