ABSTRACT

Normal linear models (NLMs) are the workhorse of statistical analysis, consisting of the analysis of variance and regression techniques that are routinely applied to thousands of problems every year. For more details on the NLM, albeit with a focus on the classical approach to fitting, refer to a standard statistical text such as Clarke and Cooke (2004), Draper and Smith (1998), or Faraway (2004). Box and Tiao (1973) provide thorough coverage of the Bayesian analysis of NLMs, focusing exclusively on vague priors. A more modern Bayesian approach is given in Chapter 14 of Gelman et al. (2004). To illustrate the essential features of an NLM, we consider a simple analysis of covariance (ANCOVA) model that consists of one grouping factor and one covariate. A common situation where this model might be appropriate is an experiment in which some characteristic of interest is measured before (baseline) and after (response) some treatment is applied to the experimental units. If three different treatments are being compared and the measured characteristic is continuous, the ANCOVA model can be represented algebraically as:

$$y_{ij} = \mu + \alpha_i + \beta x_{ij} + e_{ij}, \qquad i = 1, 2, 3, \quad j = 1, 2, \ldots, n_i,$$

where
$y_{ij}$ is the response on the $j$th experimental unit receiving the $i$th treatment;
$\mu$ is a constant whose meaning depends upon how the treatment grouping factor is parameterized, as discussed later;
$\alpha_i$ represents the effects of the treatments (the grouping factor);
$\beta$ is the regression coefficient associated with the simple linear effect of the baseline measurement (the covariate);
$x_{ij}$ is the baseline measurement on the $j$th experimental unit receiving the $i$th treatment;
$e_{ij}$ is the residual random error associated with the $j$th experimental unit receiving the $i$th treatment;
$n_i$ is the number of experimental units receiving the $i$th treatment.

This is a Normal Model if the random errors are assumed to be Normally distributed:

$$e_{ij} \sim N(0, \sigma^2)$$
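To make the model concrete, the following minimal Python sketch simulates responses from the ANCOVA model with Normal errors. All numerical values (parameter values, group sizes, the baseline distribution) are hypothetical and chosen only for illustration. The sketch also assumes a corner constraint in which the first treatment effect is set to zero, which is just one of several possible parameterizations of the grouping factor.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical parameter values, for illustration only.
mu = 10.0                            # constant term
alpha = np.array([0.0, 2.0, -1.0])   # treatment effects; alpha_1 = 0 is an assumed constraint
beta = 0.5                           # slope for the baseline covariate
sigma = 1.5                          # residual standard deviation
n = [8, 10, 12]                      # n_i: units receiving each of the three treatments

# Treatment index (0, 1, 2) for every experimental unit.
group = np.repeat(np.arange(3), n)

# Baseline measurements x_ij (assumed distribution, not part of the model itself).
x = rng.normal(20.0, 3.0, size=group.size)

# Residual errors e_ij ~ N(0, sigma^2).
e = rng.normal(0.0, sigma, size=group.size)

# Responses y_ij = mu + alpha_i + beta * x_ij + e_ij.
y = mu + alpha[group] + beta * x + e
```

Each simulated response is the sum of a mean that depends on the treatment and the baseline, plus an independently drawn Normal error.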

It is a Linear Model since the formula given above implies that the mean response is a weighted linear combination of the unknown location parameters $\mu$, $\alpha_i$, and $\beta$. Note that the statistical definition of a linear model is not the same as the more intuitive interpretation, in which the linearity refers to the relationship between the response and a covariate. In statistics, models that include curvilinear relationships between the response and a covariate, such as polynomials, are still regarded as linear, provided that the mean can be expressed as a linear function of the unknown parameters.
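The distinction between linearity in the parameters and linearity in the covariate can be illustrated with a short Python sketch. It builds a design matrix for the ANCOVA model with an added quadratic term in the baseline, so that the mean response is curved in the covariate yet still equals a matrix product of known columns and unknown parameters. The helper `ancova_design`, the quadratic coefficient, and all numerical values are hypothetical, and the first treatment is again assumed to be the reference level.

```python
import numpy as np

def ancova_design(group, x, degree=1):
    """Columns: intercept, indicators for every group except the first,
    then x, x**2, ..., x**degree."""
    group = np.asarray(group)
    x = np.asarray(x, dtype=float)
    levels = np.unique(group)
    cols = [np.ones_like(x)]
    cols += [(group == g).astype(float) for g in levels[1:]]
    cols += [x ** p for p in range(1, degree + 1)]
    return np.column_stack(cols)

# Three treatments, four units each, with hypothetical baseline values.
group = np.repeat([0, 1, 2], 4)
x = np.tile([1.0, 2.0, 3.0, 4.0], 3)

# Hypothetical parameters: mu, alpha_2, alpha_3, beta, and a quadratic coefficient.
theta = np.array([10.0, 2.0, -1.0, 0.5, 0.25])

# The mean response is curved in x but linear in theta.
X = ancova_design(group, x, degree=2)
mean_y = X @ theta

# Because the mean is linear in theta, ordinary least squares applied to the
# noise-free means recovers the parameters.
theta_hat, *_ = np.linalg.lstsq(X, mean_y, rcond=None)
```

Least squares is used here only as a convenient check that the mean is a linear function of the parameters; it is not meant to represent the Bayesian fitting approach.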