ABSTRACT

Regression techniques are among the most widely used methods in applied statistics. Given a response variable $Y$ and a set of covariates $X = (X_1, X_2, \ldots, X_p)$, one is often interested in estimating an assumed functional relationship between $Y$ and $X$, and in predicting further responses for new values of the covariates. One way of modeling such a relationship is to express the expected value of $Y$ given $X$ as

$E(Y \mid X) = \mu(X)$, where, in general, $\mu(\cdot)$ is an unknown function of the covariates. In practice, however, $\mu(\cdot)$ is usually approximated by a simple parametric function $\phi(\cdot\,; \beta)$, where $\beta = (\beta_1, \ldots, \beta_r)$ denotes a vector of unknown parameters. The function $\phi(\cdot\,; \beta)$ is then treated as if it were the true underlying function $\mu(\cdot)$, so the problem reduces to that of estimating $\beta$. Furthermore, in most applications the probability distribution of the response $Y$ is assumed to belong to an exponential family. This gives rise to the important class of generalized linear models (GLM) (Nelder and Wedderburn, 1972; McCullagh and Nelder, 1989), which we shall find convenient to describe as follows.