A Short History of MCMC: Subjective Recollections from Incomplete Data

doi:10.1201/b10905-3

ABSTRACT

Markov chainMonteCarlo (MCMC)methodshavebeenaround for almost as longasMonte Carlo techniques, even though their impact on statistics was not truly felt until the very early 1990s, except in the specialized ﬁelds of spatial statistics and image analysis, where those methods appeared earlier. The emergence of Markov based techniques in physics is a story that remains untold within this survey (see Landau and Binder, 2005). Also, we will not enter into a description of MCMC techniques, unless they have some historical link, as the remainder of this volume covers the technical aspects.Acomprehensive treatment with further references can also be found in Robert and Casella (2004). We will distinguish between the introduction of Metropolis-Hastings based algorithms

and those related to Gibbs sampling, since they each stem from radically different origins, even though their mathematical justiﬁcation via Markov chain theory is the same. Tracing the development of Monte Carlo methods, we will also brieﬂy mention what we might call the “second-generation MCMC revolution.” Starting in the mid to late 1990s, this includes the development of particle ﬁlters, reversible jump and perfect sampling, and concludes with more current work on population or sequential Monte Carlo and regeneration and the computing of “honest” standard errors. As mentioned above, the realization that Markov chains could be used in a wide variety

of situations only came (tomainstream statisticians)withGelfand and Smith (1990), despite earlier publications in the statistical literature such as Hastings (1970), Geman and Geman (1984), and Tanner and Wong (1987). Several reasons can be advanced: lack of computing machinery (think of the computers of 1970!), or background onMarkov chains, or hesitation to trust in the practicality of themethod. It thus required visionary researchers like Gelfand and Smith to convince the community, supported by papers that demonstrated, through a series of applications, that the method was easy to understand, easy to implement and practical (Gelfand et al., 1990, 1992; Smith and Gelfand, 1992; Wakeﬁeld et al., 1994). The rapid emergence of the dedicated BUGS (Bayesian inference using Gibbs sampling) software as early as 1991, when a paper on BUGS was presented at the Valencia meeting, was another compelling argument for adopting, at large, MCMC algorithms.∗

Monte Carlo methods were born in Los Alamos, New Mexico, during World War II, eventually resulting in theMetropolis algorithm in the early 1950s. WhileMonte Carlo methods were in use by that time, MCMC was brought closer to statistical practicality by the work of Hastings in the 1970s. What can be reasonably seen as the ﬁrst MCMC algorithm is what we now call the

Metropolis algorithm, published by Metropolis et al. (1953). It emanates from the same group of scientists who produced the Monte Carlo method, namely the research scientists of Los Alamos, mostly physicists working on mathematical physics and the atomic bomb. MCMC algorithms therefore date back to the same time as the development of regular

(MC only) Monte Carlo methods, which are usually traced to Ulam and von Neumann in the late 1940s. StanislawUlam associates the original ideawith an intractable combinatorial computation he attempted in 1946 (calculating the probability of winning at the solitaire card game). This ideawas enthusiastically adopted by John vonNeumann for implementationwith direct applications to neutron diffusion, the name “Monte Carlo” being suggested by Nicholas Metropolis. Eckhardt (1987) describes these early Monte Carlo developments, and Hitchcock (2003) gives a brief history of the Metropolis algorithm. These occurrences very closely coincide with the appearance of the very ﬁrst general-

purpose digital computer, the ENIAC,which came to life in February 1946, after three years of construction. The Monte Carlo method was set up by von Neumann, who was using it on thermonuclear and ﬁssion problems as early as 1947. That same year, Ulam and vonNeumann invented inversion andaccept-reject techniques (also recounted inEckhardt, 1987) to simulate from nonuniform distributions. Without computers, a rudimentary version invented by Fermi in the 1930s went unrecognized (Metropolis, 1987). Note also that, as early as 1949, a symposiumonMonteCarlowas supported byRand, theNational Bureau of Standards, and theOakRidge laboratory and thatMetropolis andUlam (1949) published the very ﬁrst paper about the Monte Carlo method.