lcphR: Latent Class Proportional Hazards Regression
# install.packages("devtools")
devtools::install_github("tengfei-emory/lcphR")
library(lcphR)
Currently lcphR
supports R version >= 4.1.1.
By default, the function simulation(n)
generates a dataset with n observations under the scenario 1 described by Fei, Hanfelt and Peng (2023).
# generate a dataset with 500 individuals
dat <- simulation(500)
Specifically, it returns a data frame of 2 latent classes with 2 baseline covariates (Xcov1
and Xcov2
), time-to-event (tildet
), censoring indicator (delta
) and true latent class labels (latent
).
The analysis for the dataset dat
can be conducted by running LSCA
function:
lcfit <- lcphR(dat,num_class=2,covx=c('Xcov1','Xcov2'),tolEM=1e-3,varest=T,traceplot=T,initial='kmeans')
The output list lcfit
contains the following information:
alpha
: estimated unknown parameters under the class membership probability submodel.
zeta
: estimated unknown parameters under the class-specific proportional hazards submodel.
tevent
: uncensored event times.
chaz
: estimated baseline hazard function for class 1.
ASE
: standard error estimates for alpha and zeta, by the close-form estimator derived from observed-data log likelihood.
nASESr
: standard error estimates for alpha and zeta, by the numerical estimator derived from observed-data profile log likelihood.
chaztgt
and chaztgtASE
: point estimates and corresponding standard error estimates for the baseline hazard function at specified times (under development).
tau
: posterior membership probabilities
p
: baseline membership probabilities
loglik
, obsloglik
: complete-data log likelihood and observed-data log likelihood after the last iteration of the EM algorithm.
AIC
, BIC
, CEBIC
: Akaike information criterion, Bayesian information criterion, and Classification entropy incorporated BIC.
timediff
: Total time used in point estimation and variance estimation.
diffEM
: L-2 norm of the difference of posterior membership probability vector between the last and one prior to the last iteration.
diffPAR
: L-2 norm of the difference of point estimates (alpha, beta, hazard increments) between the last and one prior to the last iteration.
difflA
: convergence criterion (Aitken acceleration).
numiter
: total number of iterations.
entropy
: standardized entropy R-square ranging from 0 to 1. A value close one indicates better separation of classes.
censor
: censoring rate of the observations.
Fei, T, Hanfelt, J, Peng, L. Latent Class Proportional Hazards Regression with Heterogeneous Survival Data. Statistics and its Interface, accepted.