ABSTRACT

We discuss the use of multiple imputation in survival data analysis. Survival data are often subject to censoring, and we describe the analogy between missing data and censored data: censoring mechanisms can be characterized in much the same way as missingness mechanisms. One application of multiple imputation is to impute censored event times. We present two imputation approaches: one is based on parametric survival analysis models such as accelerated failure time (AFT) models and Weibull proportional hazards models; the other is based on semiparametric Cox proportional hazards models and uses a dual modeling strategy together with predictive mean matching. Imputing censored event times can correct biases in marginal survival function estimates when the censoring mechanism depends on fully observed covariates. Another application of multiple imputation is to impute missing covariates in survival regression analyses. Such imputation is usually difficult because of the nonlinear structure of survival models. Both the joint modeling and the fully conditional specification approaches can be used. For the former, the imputations can be obtained rather conveniently using WinBUGS. For the latter, the key is to include the survival outcome information appropriately in the imputation model. Under the AFT model, a reasonable strategy is to set up imputation models separately by the censoring indicator and to include either the event time or the censoring time as a predictor. Under the Cox model, we show the performance of a method that uses the cumulative baseline hazard function estimate as an imputation predictor. We illustrate the ideas using cancer data from the SEER registry and a follow-up cancer study from the University of Michigan.