ABSTRACT

Most multivariate methods require complete data, but most multivariate data contain missing values. This problem is usually dealt with by fixing the data in some way so that they can be analyzed by methods designed for complete data. The most commonly used techniques for treating missing data are ad hoc procedures that attempt to make the best of a bad situation in ways that are seemingly plausible but have no theoretical rationale. A theory-based approach to the treatment of missing data under the assumption of multivariate normality, based on the direct maximization of the likelihood of the observed data, has long been known. The theoretical advantages of this method are widely recognized, and its applicability in principle to structural modeling has been noted. Unfortunately, theory has not had much influence on practice in the treatment of missing data. In part, the underutilization of maximum likelihood (ML) estimation in the presence of missing data may be due to the unavailability of the method as a standard option in packaged data analysis programs. There may also exist a (mistaken) belief that the benefits of using ML rather than conventional missing-data techniques will, in practice, be small.