ABSTRACT

Missing data problems are common in data analysis across many types of research. This book focuses on the multiple imputation approach to missing data problems. Missing data can take a variety of forms and patterns, depending on the scientific interests and analytic goals. We use schematic tables to describe common missing data patterns, such as univariate missingness and monotone or general missingness for multivariate data. Missingness mechanisms are assumptions describing the relation between the nonresponse probability and the variables in the data. We define three types of missingness mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR), also known as not missing at random (NMAR). These mechanisms are illustrated with simulated data. We use the family income nonresponse problem in the 2016 National Health Interview Survey (NHIS) as a motivating example to start the chapter. We analyze these data to show that survey participants with lower income are more likely to have missing answers to the income question. However, after including and adjusting for additional variables, the missingness mechanism for the income variable moves closer to MAR. The NHIS income nonresponse problem is handled successfully by multiple imputation. We also provide a summary of the structure of the book.
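As a rough illustration of the three mechanisms, the following minimal sketch simulates a fully observed covariate x and a variable y subject to nonresponse. It is not the simulation used in the book: the function name `simulate`, the logistic form of the nonresponse probability, and all parameter values are assumptions chosen only for illustration.

```python
import math
import random


def simulate(mechanism, n=20000, seed=0):
    """Return (mean of observed y, mean of all y, fraction of y missing).

    Hypothetical setup: x is always observed, y may be missing.
    """
    rng = random.Random(seed)
    all_y, observed_y = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)        # fully observed covariate
        y = x + rng.gauss(0, 1)    # variable subject to nonresponse
        if mechanism == "MCAR":
            p_missing = 0.3                        # unrelated to x and y
        elif mechanism == "MAR":
            p_missing = 1 / (1 + math.exp(x))      # depends only on observed x
        elif mechanism == "MNAR":
            p_missing = 1 / (1 + math.exp(y))      # depends on y itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        all_y.append(y)
        if rng.random() >= p_missing:              # y is observed
            observed_y.append(y)
    mean_observed = sum(observed_y) / len(observed_y)
    mean_all = sum(all_y) / len(all_y)
    frac_missing = 1 - len(observed_y) / n
    return mean_observed, mean_all, frac_missing


if __name__ == "__main__":
    for mech in ("MCAR", "MAR", "MNAR"):
        obs, full, frac = simulate(mech)
        print(f"{mech}: observed mean={obs:.3f}, "
              f"full mean={full:.3f}, missing={frac:.1%}")
```

In this setup, lower values of x (MAR) or of y (MNAR) are more likely to be missing, so the unadjusted mean of the observed y values would be expected to sit above the full-data mean, while under MCAR the two should be close. Under MAR the difference can be accounted for by conditioning on x; under MNAR it generally cannot.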