ABSTRACT

Missing data often occur in longitudinal studies. We present some ideas for handling them using the multiple imputation strategy. When longitudinal data are processed in the long format, the commonly used modeling framework is the mixed-effects (mixed) model. Mixed models can effectively capture both the temporal trend of longitudinal data and the correlation among repeated measurements on the same subject. Multiple imputation that ignores such correlations can lead to suboptimal inferences. We sketch the data augmentation algorithm based on multivariate linear mixed models, which is implemented in the R package pan. For categorical variables, generalized linear mixed models can be used for imputation. The imputation algorithm can become complicated, especially if both the longitudinal variable and the baseline covariates have missing values. We present some examples using data from the Panel Study of Income Dynamics. We use WinBUGS to implement the imputation. We also present an example that uses smoothing methods to model nonlinear temporal trends and create imputations. A simpler alternative to mixed models is to arrange the data in the wide format and then conduct the imputation, treating the longitudinal data as multivariate, cross-sectional data. An example comparing the two approaches is given. Longitudinal data are a special case of more general multilevel data. Multiple imputation strategies for multilevel data are evolving rapidly, and we touch on some of the main issues.
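
To make the long and wide formats concrete, the following is a minimal sketch of the reshaping step only. It is not an imputation routine and is not taken from the examples described above. The records, the field names (id, wave, y), and the helper long_to_wide are hypothetical, chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical long-format records: one row per (subject, wave).
# None marks a missing measurement; a dropout may have no row at all.
long_rows = [
    {"id": 1, "wave": 1, "y": 10.0},
    {"id": 1, "wave": 2, "y": None},
    {"id": 1, "wave": 3, "y": 12.5},
    {"id": 2, "wave": 1, "y": 8.0},
    {"id": 2, "wave": 2, "y": 9.1},
    # subject 2 has no row at wave 3
]


def long_to_wide(rows, id_key="id", time_key="wave", value_key="y"):
    """Return one row per subject with one column per wave (illustrative helper)."""
    waves = sorted({r[time_key] for r in rows})
    by_subject = defaultdict(dict)
    for r in rows:
        by_subject[r[id_key]][r[time_key]] = r[value_key]

    wide = []
    for sid in sorted(by_subject):
        row = {id_key: sid}
        for w in waves:
            # Waves with no record for this subject become None (missing).
            row[f"{value_key}_{w}"] = by_subject[sid].get(w)
        wide.append(row)
    return wide


wide_rows = long_to_wide(long_rows)
for row in wide_rows:
    print(row)
```

In the wide layout each wave becomes its own column (y_1, y_2, y_3), so the repeated measurements can be treated as ordinary multivariate, cross-sectional variables. In the long layout the within-subject correlation has to be carried by the model, for example through random effects.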