ABSTRACT

Large sample methods in Statistics constitute the general methodology underlying fruitful, simpler statistical analyses of data sets involving a large number of observations. Drawing statistical conclusions from a given data set involves the choice of suitable statistical models relating to the observations which incorporate some random (stochastic) or chance factors, whereby convenient probability laws can be adopted in an appropriate manner. It is with respect to such postulated probability laws that the behavior of some sample statistics (typically, an estimator in an estimation problem or a test statistic in a hypothesis testing problem) needs to be studied carefully so that the conclusions can be drawn with an adequate degree of precision. If the number of observations is small and/or the underlying probability model is well specified, such stochastic behavior can be evaluated in an exact manner. However, with the exception of some simple underlying probability laws (such as the normal or Poisson distributions), the exact sampling distribution of a statistic may become very complicated as the number of observations in the sample becomes large. Moreover, if the data set actually involves a large number of observations, there may be less need to restrict oneself to a particular probability law, and general statistical conclusions can be derived as well by allowing such a law to be a member of a broader class. In other words, one may achieve more robustness with respect to the underlying probability models when the number of observations is large. On the other hand, there are some natural (and minimal) requirements for a statistical method to qualify as a valid large sample method.
For example, in the case of an estimator of a parameter, it is quite natural to expect that as the sample size increases, the estimator should be closer to the parameter in some meaningful sense; in the literature, this property is known as the consistency of estimators. Similarly, in testing a null hypothesis, a test should be able to detect the falsehood of the null hypothesis (when it is not true) with more and more confidence as the sample size becomes large; this relates to the consistency of statistical tests. In either case, there is a need to study general regularity conditions under which such stochastic convergence of sample statistics holds. A second natural requirement for a large sample procedure is to ensure that the corresponding exact sampling distribution can be adequately approximated by a simpler one (such as the normal, Poisson, or chi-squared distributions), for which extensive tables and charts are available to facilitate actual applications. In the literature, this is known as convergence in distribution or central limit theory. This alone is a very important topic of study and is saturated with applications in diverse problems of statistical inference. Third, in a given setup, there is usually more than one procedure satisfying the requirements of consistency and convergence in distribution. In choosing an appropriate one within such a class, a natural criterion is optimality in some well-defined sense. In the estimation problem, this optimality criterion may relate to minimum variance or minimum risk (with respect to a suitable loss function), and there are vital issues in choosing such an optimality criterion and assessing its adaptability to the large sample case. In the testing problem, a test should be most powerful, but often such an optimal test may not exist (especially in the multiparameter testing problem), and hence alternative optimality (or desirability) criteria are to be examined. This branch of statistical inference dealing with asymptotically optimal procedures has been a very active area of productive research during the past fifty years, and yet there is room for further related developments! Far more important is the enormous scope of applications of these asymptotically optimal procedures in various problems, the study of which constitutes a major objective of the current book.
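The two requirements above can be made concrete with a small simulation, a minimal sketch not taken from the book: the sample mean of an exponential distribution (a made-up choice of model, with rate 2 so the true mean is 0.5) illustrates consistency as the sample size grows, and the standardized sample mean illustrates convergence in distribution, since its values fall within the usual normal bounds about 95% of the time even though the underlying data are skewed.

```python
import math
import random
import statistics

random.seed(42)
rate = 2.0                # exponential rate parameter (assumed for illustration)
true_mean = 1.0 / rate    # the parameter being estimated: 0.5

# Consistency: the sample mean drifts toward the true mean as n grows.
for n in (10, 100, 10_000):
    sample = [random.expovariate(rate) for _ in range(n)]
    print(n, statistics.fmean(sample))

# Convergence in distribution: standardize the sample mean and check how
# often it lands within +/- 1.96, the central 95% region of N(0, 1).
n, reps = 200, 2_000
sd = true_mean            # for the exponential law, std. deviation = mean
hits = 0
for _ in range(reps):
    xbar = statistics.fmean(random.expovariate(rate) for _ in range(n))
    z = (xbar - true_mean) / (sd / math.sqrt(n))
    hits += abs(z) <= 1.96
print(hits / reps)        # should be near 0.95
```

The exponential law is deliberately non-normal; the point of the second loop is that the normal approximation to the sampling distribution kicks in regardless, which is exactly the kind of regularity the central limit theory delivers.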
In a data set, observations generally refer to some measurable characteristics which conform to either a continuous/discrete scale or even to a categorical setup where the categories may or may not be ordered in some well-defined manner. Statistical analysis may naturally depend on the basic nature of such observations. In particular, the analysis of categorical data models and their ramifications may require some special attention, and in the literature, analysis of qualitative data (or discrete multivariate analysis) and generalized linear models have been successfully linked to significant applications in a variety of situations. Our general objectives include the study of large sample methods pertinent to such models as well.
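A classical instance of a large sample method for categorical data is Pearson's chi-squared test of independence, whose null distribution is only approximately chi-squared, with the approximation justified in large samples. The sketch below uses a hypothetical 2x2 table (the counts are made up for illustration, not drawn from the book) and compares the statistic to 3.841, the upper 5% point of the chi-squared distribution with one degree of freedom.

```python
# Pearson's chi-squared statistic for a 2x2 contingency table.
table = [[30, 70],   # hypothetical group 1: success / failure counts
         [50, 50]]   # hypothetical group 2

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Sum of (observed - expected)^2 / expected over all cells, where the
# expected counts are computed under the null hypothesis of independence.
chi2 = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

print(chi2)
print("reject independence at the 5% level:", chi2 > 3.841)
```

With these made-up counts the statistic equals 25/3, roughly 8.33, so the null of independence is rejected at the 5% level; the validity of that comparison rests entirely on the large sample chi-squared approximation discussed above.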