ABSTRACT

Incomplete data sets, that is, data matrices that contain missing values, may be considered the rule rather than the exception in the social, behavioral, and educational sciences. Missing data pervade these disciplines as well as many other scientific fields, and most empirical studies conducted in them, no matter how well designed, yield incomplete data sets. For these reasons, how to deal with missing data has been of special interest in these sciences for decades. In fact, over the past 30 years or so, this interest has made missing data analysis a major field of research, both in statistics and in its areas of application.

This chapter aims to introduce the reader to the problems of missing data, to indicate how some earlier methods of "handling" them can be inadequate, and to discuss a principled, modern approach to dealing with missing data, one embedded within the framework of maximum likelihood (ML). We emphasize, however, that the analysis of incomplete data is a broad research field, and a single chapter like this cannot possibly cover all relevant topics. For more comprehensive discussions, we therefore refer readers to a number of excellent and more detailed treatments of missing data, such as Allison (2001), Little and Rubin (2002), and Schafer (1997) (see also Schafer & Graham, 2002), all of which have notably influenced this chapter.