ABSTRACT

Here are some of the ways that missing data can arise in a regression setting:

Missing cases: Sometimes we fail to observe a complete case (x_i, y_i). Indeed, when we draw a sample from a population, we do not observe the unsampled cases. When missing data arise in this manner, there is no difficulty, since this is the standard situation for much of statistics. But sometimes there are cases we intended to sample but failed to observe. If the reason for this failure is unrelated to what would have been observed, then we simply have a smaller sample and can proceed as normal. But when data are not observed for reasons that have some connection to what we would have seen, then we have a biased sample. Sometimes, given enough information about the mechanism for missingness, we can make corrections and achieve valid inferences.
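
The contrast between the two kinds of missing cases can be illustrated with a small simulation. The sketch below is not taken from the text: the data are simulated, and the true slope of 2, the 50% drop rate, the cutoff of 1.0, and the helper name fit_slope are all hypothetical choices made for illustration. It fits a simple regression three times: on the full sample, on a sample where cases are lost for reasons unrelated to their values, and on a sample where a case is observed only when its response falls below a cutoff.

```python
import numpy as np

def fit_slope(x, y):
    """Least-squares slope of y on x, fitted with an intercept (hypothetical helper)."""
    slope, _intercept = np.polyfit(x, y, 1)
    return slope

# Simulated data under assumed parameters: true intercept 1, true slope 2.
rng = np.random.default_rng(0)
n = 20000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

# Mechanism A: each case is lost with probability 0.5, regardless of (x, y).
keep_random = rng.random(n) < 0.5

# Mechanism B: a case is observed only when its response is below a cutoff,
# so missingness is connected to the value we would have seen.
keep_selected = y < 1.0

slope_full = fit_slope(x, y)
slope_random = fit_slope(x[keep_random], y[keep_random])
slope_selected = fit_slope(x[keep_selected], y[keep_selected])

print("full sample slope:      ", round(slope_full, 3))
print("random loss slope:      ", round(slope_random, 3))
print("selected on y < 1 slope:", round(slope_selected, 3))
```

Under these assumptions, the slope from the randomly reduced sample should stay close to 2, only with more sampling variability, while the slope from the sample selected on the response should fall noticeably below 2. Correcting the second estimate would require a model for the missingness mechanism, which this sketch does not attempt.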