CyTOFpower 1.10.0
Mass cytometry (or CyTOF) is a single-cell technology that measures up to 50 protein markers per cell. The markers are detected with antibodies labeled with stable isotopes, and they may be markers of cell types or of cell phenotypes.
CyTOF can be used to determine whether there are any differences in cell abundances (type markers) or cell phenotypes (state markers) between two experimental conditions. This package provides a tool to predict the power of a differential state test analysis.
Two packages are available on Bioconductor to perform differential state test analyses: CytoGLMM (Seiler 2020) and diffcyt (Weber 2019). Both models are available in this package, and their results, the adjusted p-value per marker, are used to compute the power of an experiment.
In silico CyTOF data are simulated using the following data generation process. We assume two conditions: a baseline (i.e. a control condition in which no marker is differentially expressed) and a condition that contains some signal (i.e. at least one marker is differentially expressed). The parameters are defined as follows:
The mean cell value of each marker is drawn from a Gamma distribution: \[\mu_{0,ij} \sim \Gamma(k, \theta)\] where \(i\) is the donor, \(j\) the marker, \(k\) the shape and \(\theta\) the scale. These last two parameters are defined using the parameters provided by the user:
The mean cell value of the differentially expressed marker(s) is then multiplied by the fold change defined earlier: \(\mu_{1,ij}= \mu_{0,ij} \cdot \rho_j\).
The cell values are drawn using a Negative Binomial:
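The data generation process described above can be sketched in a few lines of base R. This is only an illustration for a single marker, not the package's internal code: the variable names, the numeric values, and the choice of the `mu`/`size` parameterization of `rnbinom` are all assumptions made for the example.

```r
set.seed(42)

# Hypothetical settings for one marker (all values are illustrative)
n_donors <- 5     # number of donors
n_cells  <- 100   # number of cells per donor
shape_k  <- 4     # Gamma shape k
scale_th <- 2.5   # Gamma scale theta
rho      <- 1.5   # fold change of the differentially expressed marker
nb_size  <- 2     # negative binomial dispersion ("size" in rnbinom)

# Baseline mean per donor: mu_0 ~ Gamma(k, theta)
mu0 <- rgamma(n_donors, shape = shape_k, scale = scale_th)

# Mean in the condition carrying the signal: mu_1 = mu_0 * rho
mu1 <- mu0 * rho

# Draw cell values per donor from a negative binomial with the given mean.
# The result is a matrix with one row per cell and one column per donor.
sim_cells <- function(mu) {
  sapply(mu, function(m) rnbinom(n_cells, mu = m, size = nb_size))
}
y0 <- sim_cells(mu0)  # baseline condition
y1 <- sim_cells(mu1)  # condition with the fold change applied
```

With a fold change of 1 the two conditions share the same donor means, which corresponds to a marker that is not differentially expressed.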
Some of these parameters might need to be estimated using previous data or publicly available datasets. For instance, the dispersion and mean parameters of the negative binomial distribution might be estimated for each marker using fitdistr from the MASS package.
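As a rough sketch of that estimation step, the example below fits a negative binomial to the values of a single marker with `MASS::fitdistr`. The `counts` vector is simulated here as a stand-in for real data, and the sketch assumes the marker values are non-negative integers; with real data the fit would be repeated for each marker.

```r
library(MASS)

set.seed(1)

# Stand-in for the observed values of one marker (hypothetical data)
counts <- rnbinom(2000, mu = 10, size = 2)

# Maximum-likelihood fit of a negative binomial distribution
fit <- fitdistr(counts, densfun = "negative binomial")

# Named estimates: "size" (dispersion) and "mu" (mean)
fit$estimate
```

The `size` and `mu` estimates can then serve as starting values for the dispersion and mean parameters of the simulation.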
The power is computed for the differentially expressed marker(s) (i.e. those with a fold change different from 1). It is based on the adjusted p-values reported by the models, counting how many times the null hypothesis is correctly rejected across the simulations. Note that the power returned by these computations uses a significance level of \(\alpha = 0.05\).
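The counting step can be illustrated with a small example. The adjusted p-values below are made up for one differentially expressed marker across ten hypothetical simulations, and the strict inequality is an assumption about how the threshold is applied.

```r
alpha <- 0.05

# Hypothetical adjusted p-values for one differentially expressed marker,
# one value per simulated dataset
p_adj <- c(0.001, 0.20, 0.03, 0.04, 0.60, 0.01, 0.049, 0.08, 0.002, 0.30)

# Power: proportion of simulations in which the null hypothesis is rejected
power <- mean(p_adj < alpha)
power
```

Here six of the ten simulations reject the null hypothesis, giving an estimated power of 0.6.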
The shiny app is divided into two tabs: (1) the precomputed dataset tab, in which the power was pre-computed for multiple combinations of parameters and the user can search this grid of parameters; (2) the personalized dataset tab, in which the data and the power are computed on request based on the parameters chosen by the user.
This panel allows the user to search a grid of parameters for which the power has been pre-computed. Setting a parameter to “NA” displays a power curve across the different values of that parameter.
The available parameters that the user can choose from are the following:
This panel allows the user to compute the power for a chosen set of parameters. The data are generated on request, so it can take some time to get the results, especially when a high number of simulations is combined with a high number of cells.
The available parameters that the user can choose from are the following:
The shiny app is run locally by calling the following function:
library(CyTOFpower)
CyTOFpower()
sessionInfo()
#> R version 4.4.0 beta (2024-04-15 r86425)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.19-bioc/R/lib/libRblas.so
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/New_York
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] BiocStyle_2.32.0
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.35 R6_2.5.1 bookdown_0.39
#> [4] fastmap_1.1.1 xfun_0.43 cachem_1.0.8
#> [7] knitr_1.46 htmltools_0.5.8.1 rmarkdown_2.26
#> [10] lifecycle_1.0.4 cli_3.6.2 sass_0.4.9
#> [13] jquerylib_0.1.4 compiler_4.4.0 tools_4.4.0
#> [16] evaluate_0.23 bslib_0.7.0 yaml_2.3.8
#> [19] BiocManager_1.30.22 jsonlite_1.8.8 rlang_1.1.3
Seiler, C., Ferreira, A.-M., et al. 2020. “CytoGLMM: Conditional Differential Analysis for Flow and Mass Cytometry Experiments.” BMC Bioinformatics.
Weber, L. M., Nowicka, M., et al. 2019. “diffcyt: Differential Discovery in High-Dimensional Cytometry via High-Resolution Clustering.” Communications Biology.