apeglm {apeglm} | R Documentation |
apeglm provides Bayesian shrinkage estimators for effect sizes in GLM models, using approximation of the posterior for individual coefficients.
apeglm(Y, x, log.lik, param = NULL, coef = NULL, mle = NULL, no.shrink = FALSE, interval.type = c("laplace", "HPD", "credible"), interval.level = 0.95, threshold = NULL, contrasts, weights = NULL, offset = NULL, flip.sign = TRUE, prior.control, multiplier = 1, ngrid = 50, nse = 5, ngrid.nuis = 5, nse.nuis = 2, log.link = TRUE, param.sd = NULL, optim.method = "BFGS", bounds = c(-Inf, Inf))
Y |
the observations, which can be a matrix or SummarizedExperiment,
with columns for samples and rows for "features" (e.g. genes in a genomic context).
If Y is a SummarizedExperiment, |
x |
design matrix, with intercept in the first column |
log.lik |
the log of likelihood function, specified by the user.
For Negative Binomial distribution, user can use |
param |
the other parameter(s) to be used in the likelihood function, e.g. the dispersion parameter for a negative binomial distribution. this can be a vector or a matrix (with columns as parameters) |
coef |
(optional) the index of the coefficient for which to generate posterior estimates (FSR and intervals) |
mle |
(optional) a 2 column matrix giving the MLE and its standard error
of |
no.shrink |
logical, if TRUE, apeglm won't perform shrinkage (default is FALSE) |
interval.type |
(optional) can be "laplace", "HPD", or "credible", which specifies the type of Bayesian interval that the user wants to output; "laplace" represents the Laplace approximation of the posterior mode |
interval.level |
(optional) default is 0.95 |
threshold |
(optional) a threshold for integrating posterior probabilities, see details under 'Value'. Note that this should be on the natural log scale for a log link GLM. |
contrasts |
(optional) contrast matrix, same number of rows as |
weights |
(optional) weights matrix, same shape as |
offset |
(optional) offsets matrix, same shape as |
flip.sign |
whether to flip the sign of threshold value when MAP is negative, default is TRUE (threshold must then be positive) |
prior.control |
see Details |
multiplier |
a positive number, when the prior is adapted to the |
ngrid |
the number of grid points for grid integration of intervals |
nse |
the number of Laplace standard errors to set the left and right edges of the grid around the MAP |
ngrid.nuis |
the number of grid points for nuisance parameters |
nse.nuis |
the number of Laplace standard errors to set the left and right edges of the grid around the MAP of the nuisance parameters |
log.link |
whether the GLM has a log link (default = TRUE) |
param.sd |
(optional) potential uncertainty measure on the parameter |
optim.method |
the method passed to |
bounds |
the bounds for the numeric optimization |
prior.control
is a list of parameters that will be passed to determine
the prior distribution. Users are allowed to have a Normal prior on the
intercept, and a t prior on the non-intercept coefficients (similar
to bayesglm
in the arm
package. The following are defaults:
no.shrink = 1
: index of the coefficient(s) not to shrink
prior.mean = 0
: mean of t prior
prior.scale = 1
: scale of t prior
prior.df = 1
: df of t prior
prior.no.shrink.mean = 0
: mean of Normal
prior.no.shrink.scale = 15
: scale of Normal
So without specifying prior.control
, the following is set inside apeglm
:
prior.control <- list(no.shrink=1,prior.mean=0,prior.scale=1,
prior.df=1,prior.no.shrink.mean=0,prior.no.shrink.scale=15)
Note that the prior should be defined on the natural log scale for a log link GLM.
a list of matrices containing the following components:
map
: a matrix of MAP estimates, columns for coefficients and rows for features
se
: a matrix of SE estimates, same shape as map
(note: in future versions of apeglm, this slot is renamed sd
).
prior.control
: a list with details on the prior
fsr
: a vector of the false sign rate for coef
interval
: a matrix of either HPD or credible interval for coef
thresh
: a vector of the posterior probability that the estimated parameter
is smaller than the threshold value specified in threshold
when MAP is positive (or greater than
-1 * threshold value when MAP is negative and flip.sign is TRUE)
diag
: a matrix of diagnostics
contrast.map
: a vector of MAP corresponding to the contrast
when contrast
is given
contrast.se
: a vector of SE corresponding to the contrast
when contrast
is given
ranges
: a GRanges or GRangesList with the estimated coefficients,
if Y
was a SummarizedExperiment.
Note that all parameters associated with coefficients,
e.g. map
, se
, etc., are returned on the natural log scale for a log link GLM.
# Simulate RNA-Seq read counts data # 5 samples for each of the two groups # a total of 100 genes n.per.group <- 5 n <- n.per.group * 2 m <- 100 # The design matrix includes one column of intercept # and one column indicating samples that belong to the second group condition <- factor(rep(letters[1:2], each = n.per.group)) x <- model.matrix(~condition) # Specify the standard deviation of beta (LFC between groups) beta.sd <- 2 beta.cond <- rnorm(m, 0, beta.sd) beta.intercept <- runif(m, 2, 6) beta.mat <- cbind(beta.intercept, beta.cond) # Generate the read counts mu <- exp(t(x %*% t(beta.mat))) Y <- matrix(rnbinom(m*n, mu=mu, size=1/.1), ncol = n) # Here we will use the negative binomial log likelihood # which is an exported function. See 'logLikNB' for details. # For the NB: # 'param' is the dispersion estimate (1/size) # 'offset' can be used to adjust for size factors (log of size factors) param <- matrix(0.1, nrow = m, ncol = 1) offset <- matrix(0, nrow = m, ncol = n) # Shrinkage estimator of betas: # (for adaptive shrinkage, 'apeglm' requires 'mle' coefficients # estimated with another software, or by first running 'apeglm' # setting 'no.shrink=TRUE'.) res <- apeglm(Y = Y, x = x, log.lik = logLikNB, param = param, offset = offset, coef = 2) head(res$map) plot(beta.mat[,2], res$map[,2]) abline(0,1)