The Data Augmentation Algorithm: Theory and Methodology

doi:10.1201/b10905-11

ABSTRACT

Assume that the function fX : Rp → [0,∞) is a probability density function (pdf). Suppose that g : Rp → R is a function of interest and that we want to know the value of EfX g =

∫ Rp g(x)fX(x) dx, but that this integral cannot be computed analytically. There

are many ways of approximating such intractable integrals, including numerical integration, analytical approximations, and Monte Carlo methods. In this chapter, we will describe a Markov chain Monte Carlo (MCMC) method called the data augmentation (DA) algorithm. Here is thebasic idea. In situationswhere classicalMonteCarlomethodsarenot applicable

because it is impossible to simulate from fX directly, it is often possible to ﬁnd a joint pdf f : Rp × Rq → [0,∞) that satisﬁes two properties: (i) the x-marginal is fX , that is,

f (x, y) dy = fX(x);

and (ii) simulating from the associated conditional pdfs, fX|Y(x | y) and fY|X(y | x), is straightforward. The DA algorithm is based on this joint pdf. The ﬁrst property allows for the construction of a Markov chain that has fX as an invariant pdf, and the second property provides a means of simulating this Markov chain. As long as the resulting chain is reasonably well behaved, simulations of it can be used to consistently estimate EfX g. We now begin to ﬁll in the details, starting with the construction of the Markov chain. As usual, let fY(y) =

∫ Rp f (x, y) dx. Also, deﬁne X = {x ∈ Rp : fX(x) > 0} and Y = {y ∈ Rq :

fY(y) > 0} and assume that f (x, y) = 0 whenever (x, y) /∈ X× Y. Now deﬁne a function k : X× X → [0,∞) as follows:

k(x′ | x) = ∫ Y fX|Y(x′ | y)fY|X(y | x) dy. (10.1)

(We will not need to perform the integration in Equation 10.1-remember that we are still in the construction phase.) Since the integrand in Equation 10.1 is a product of conditional

∫ X k(x′ | x) dx′ =

[ ∫ Y fX|Y(x′ | y)fY|X(y | x) dy

] dx′

= ∫ Y fY|X(y | x)

[ ∫ X fX|Y(x′ | y) dx′

] dy

= ∫ Y fY|X(y | x) dy

= 1.