ABSTRACT

This chapter describes the multiple imputation approach to measurement error problems. Under this approach, the true values of variables subject to measurement error (or misclassification) are treated as missing data, multiply imputed, and then analyzed. Most of the ideas and examples covered in this chapter assume that validation data exist that contain both the true values and the error-prone observations; such data can be used to build the imputation model. Measurement error problems take different forms depending on the study design and assumptions. For each type of problem discussed, we use schematic tables to show how it can be framed as a missing-data problem. Either the joint modeling or the fully conditional specification strategy can be used for the imputation, depending on the problem setup and the analytic interest. We illustrate with a wide variety of examples, including bias associated with self-reporting in surveys (e.g., NHIS and NHANES) or in files linked to administrative databases, underreporting of health process measures in cancer registries, mismeasured covariates in Cox models, and regression predictors subject to a detection limit. Positing an appropriate measurement error model is key to the imputation. In addition, if validation data are used, it is important that the transportability assumption be satisfied. In certain cases, the use of validation data or a bridge study is connected with the general scheme of combining information from multiple data sources. We also discuss the problem of imputing an incomplete composite variable and conclude that the correct strategy is to impute the source variables under an inclusive imputation strategy and then combine them into the composite. An example from CDC HIV surveillance data is given.