fmrs.tunsel {fmrs} | R Documentation |
Provides component-wise tuning parameters using BIC for Finite Mixture of Accelerated Failure Time Regression Models and Finite Mixture of Regression Models.
fmrs.tunsel(y, delta, x, nComp, ...)

## S4 method for signature 'ANY'
fmrs.tunsel(
  y, delta, x, nComp,
  disFamily = "lnorm",
  initCoeff, initDispersion, initmixProp,
  penFamily = "lasso",
  lambRidge = 0,
  nIterEM = 2000, nIterNR = 2,
  conveps = 1e-08, convepsEM = 1e-08, convepsNR = 1e-08,
  porNR = 2,
  gamMixPor = 1,
  activeset, lambMCP, lambSICA
)
y |
Responses (observations) |
delta |
Censoring indicator vector |
x |
Design matrix (covariates) |
nComp |
Order (Number of components) of mixture model |
... |
Other possible options |
disFamily |
A sub-distribution family. The options
are |
initCoeff |
Vector of initial values for regression coefficients including intercepts |
initDispersion |
Vector of initial values for standard deviations |
initmixProp |
Vector of initial values for proportion of components |
penFamily |
Penalty name that is used in variable selection method.
The available options are |
lambRidge |
A positive value for the tuning parameter in Ridge Regression or Elastic Net |
nIterEM |
Maximum number of iterations for EM algorithm |
nIterNR |
Maximum number of iterations for Newton-Raphson algorithm |
conveps |
A positive value for avoiding NaN in computing divisions |
convepsEM |
A positive value for threshold of convergence in EM algorithm |
convepsNR |
A positive value for threshold of convergence in NR algorithm |
porNR |
A positive integer for the maximum number of searches in the NR algorithm |
gamMixPor |
Proportion of mixing parameters in the penalty. The
value must be in the interval [0,1]. If |
activeset |
A matrix of zero-one that shows which intercepts and covariates are active in the fitted fmrs model |
lambMCP |
A positive number for |
lambSICA |
A positive number for |
The maximizer of the penalized log-likelihood depends on choosing a good set of tuning parameters, which is a rather thorny issue. For each component k, we choose the value from an equally spaced set of values in (0, \lambda_{max}), for a pre-specified \lambda_{max}, that maximizes the component-wise BIC,
\hat{\lambda}_{k} = \arg\max_{\lambda_{k}} BIC_k(\lambda_{k}) = \arg\max_{\lambda_{k}} \left\{ \ell^{c}_{k,n}(\hat{\boldsymbol\Psi}_{\lambda_{k},k}) - |d_{\lambda_{k},k}| \log(n) \right\},
where d_{\lambda_{k},k} = \{j : \hat{\beta}_{\lambda_{k},kj} \neq 0, j = 1, \ldots, d\} is the active set excluding the intercept and |d_{\lambda_{k},k}| is its size. This approach is much faster than using an nComp by nComp grid to select the set \boldsymbol\lambda that maximizes the penalized log-likelihood.
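The component-wise search described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: `select_lambda_componentwise` and `toy_bic` are hypothetical names, and `toy_bic` is a made-up stand-in for BIC_k, which in practice would require fitting the penalized model at each grid value.

```r
# Hypothetical helper (not part of fmrs): for each component k, pick the
# lambda on an equally spaced grid in (0, lambda_max) that maximizes a
# user-supplied component-wise criterion bic_fun(k, lambda).
select_lambda_componentwise <- function(bic_fun, nComp, lambda_max, n_grid = 50) {
  # n_grid interior points of (0, lambda_max); both endpoints are dropped
  grid <- seq(0, lambda_max, length.out = n_grid + 2)[-c(1, n_grid + 2)]
  vapply(seq_len(nComp), function(k) {
    bic <- vapply(grid, function(lam) bic_fun(k, lam), numeric(1))
    grid[which.max(bic)]
  }, numeric(1))
}

# Toy stand-in for BIC_k (an assumption for illustration only):
# concave in lambda, with a different peak for each component.
toy_bic <- function(k, lambda) -(lambda - c(0.2, 0.7)[k])^2

nComp  <- 2
n_grid <- 9
lambda_hat <- select_lambda_componentwise(toy_bic, nComp = nComp,
                                          lambda_max = 1, n_grid = n_grid)
print(lambda_hat)

# Number of criterion evaluations: one grid per component versus a joint
# grid over all components simultaneously.
n_componentwise <- nComp * n_grid
n_full_grid     <- n_grid^nComp
print(c(componentwise = n_componentwise, full_grid = n_full_grid))
```

The count comparison at the end shows why searching one component at a time is cheaper: the cost grows linearly in the number of components rather than exponentially.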
An fmrstunpar-class object that includes component-wise tuning parameter estimates, which can be used in the variable selection procedure.
Farhad Shokoohi <shokoohi@icloud.com>
Shokoohi, F., Khalili, A., Asgharian, M. and Lin, S. (2016 submitted) Variable Selection in Mixture of Survival Models for Biomedical Genomic Studies
Other lnorm, norm, weibull: fmrs.gendata(), fmrs.mle(), fmrs.varsel()
set.seed(1980)
nComp = 2
nCov = 10
nObs = 500
dispersion = c(1, 1)
mixProp = c(0.4, 0.6)
rho = 0.5
# One intercept followed by nCov slopes for each component
coeff1 = c( 2,  2, -1, -2, 1, 2, 0, 0,  0, 0,  0)
coeff2 = c(-1, -1,  1,  2, 0, 0, 0, 0, -1, 2, -2)
umax = 40

# Generate data from the two-component model
dat <- fmrs.gendata(nObs = nObs, nComp = nComp, nCov = nCov,
                    coeff = c(coeff1, coeff2), dispersion = dispersion,
                    mixProp = mixProp, rho = rho, umax = umax,
                    disFamily = 'lnorm')

# Maximum likelihood fit, used to supply initial values below
res.mle <- fmrs.mle(y = dat$y, x = dat$x, delta = dat$delta,
                    nComp = nComp, disFamily = 'lnorm',
                    initCoeff = rnorm(nComp*nCov+nComp),
                    initDispersion = rep(1, nComp),
                    initmixProp = rep(1/nComp, nComp))

# Select component-wise tuning parameters for the adaptive lasso penalty
res.lam <- fmrs.tunsel(y = dat$y, x = dat$x, delta = dat$delta,
                       nComp = nComp, disFamily = 'lnorm',
                       initCoeff = c(coefficients(res.mle)),
                       initDispersion = dispersion(res.mle),
                       initmixProp = mixProp(res.mle),
                       penFamily = 'adplasso')
show(res.lam)