ABSTRACT

A large number of papers on the use of Bayesian methods to select models appeared in the 1990s. There were a few such papers prior to 1990, but the only one to attract much attention was probably that of Mitchell and Beauchamp (1988), though there was also discussion by Lindley of the paper by Miller (1984).

In classical or frequentist statistics, we specify a probability model for a random variable Y as p(y|θ), where θ is a parameter or vector of parameters, and p is a probability if Y is a discrete variable, or a probability density if Y is a continuous variable. Viewed as a function of the parameter(s) θ, p(y|θ) is then the likelihood of the value y for the variable Y. If we have a sample of independent values of Y, we multiply these probabilities or probability densities together to obtain the likelihood for the sample. One way of estimating θ is to maximize the likelihood, or more usually its logarithm, as a function of θ. Confidence limits can then be placed on θ by finding those values of θ for which the observed vector of values of Y, y, is reasonably plausible.

If we are prepared to specify a prior probability p(θ) for θ, the joint probability of y and θ is p(y,θ) = p(y|θ) p(θ).
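
The quantities just described can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and does not come from the text: the sample is a hypothetical set of independent 0/1 outcomes, the model p(y|θ) is Bernoulli with success probability θ, the prior p(θ) is taken to be a Beta(2, 2) density, and the maximization uses a simple grid search rather than any particular optimizer.

```python
import math

# Hypothetical sample of independent 0/1 outcomes (not from the text).
y = [1, 0, 1, 1, 0, 1, 1, 1]


def log_likelihood(theta, sample):
    """Log of the product of Bernoulli probabilities p(y_i | theta)."""
    return sum(
        math.log(theta) if yi == 1 else math.log(1.0 - theta)
        for yi in sample
    )


def prior_density(theta):
    """Assumed Beta(2, 2) prior density: 6 * theta * (1 - theta)."""
    return 6.0 * theta * (1.0 - theta)


def joint(theta, sample):
    """Joint p(y, theta) = p(y | theta) * p(theta)."""
    return math.exp(log_likelihood(theta, sample)) * prior_density(theta)


# Maximize the log-likelihood over a grid of theta values in (0, 1).
grid = [i / 100 for i in range(1, 100)]
theta_hat = max(grid, key=lambda t: log_likelihood(t, y))
```

For this sample the grid maximizer `theta_hat` coincides with the sample proportion of ones, 6/8 = 0.75, which is the usual maximum likelihood estimate for a Bernoulli model. The function `joint` simply multiplies the sample likelihood by the assumed prior density, mirroring p(y,θ) = p(y|θ) p(θ).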