Markov chain Monte Carlo

In statistics, Markov chain Monte Carlo methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain a sample of the desired distribution by recording states from the chain. The more steps that are included, the more closely the distribution of the sample matches the actual desired distribution. Various algorithms exist for constructing chains, including the Metropolis–Hastings algorithm.

Application domains

MCMC methods are primarily used for calculating numerical approximations of multi-dimensional integrals, for example in Bayesian statistics, computational physics, computational biology and computational linguistics.
In Bayesian statistics, the recent development of MCMC methods has made it possible to compute large hierarchical models that require integrations over hundreds to thousands of unknown parameters.
In rare event sampling, they are also used for generating samples that gradually populate the rare failure region.

General explanation

Markov chain Monte Carlo methods create samples from a continuous random variable, with probability density proportional to a known function. These samples can be used to evaluate an integral over that variable, as its expected value or variance.
Practically, an ensemble of chains is generally developed, starting from a set of points arbitrarily chosen and sufficiently distant from each other. These chains are stochastic processes of "walkers" which move around randomly according to an algorithm that looks for places with a reasonably high contribution to the integral to move into next, assigning them higher probabilities.
Random walk Monte Carlo methods are a kind of random simulation or Monte Carlo method. However, whereas the random samples of the integrand used in a conventional Monte Carlo integration are statistically independent, those used in MCMC are autocorrelated. Correlations of samples introduces the need to use the Markov chain central limit theorem when estimating the error of mean values.
These algorithms create Markov chains such that they have an equilibrium distribution which is proportional to the function given.

Reducing correlation

While MCMC methods were created to address multi-dimensional problems better than generic Monte Carlo algorithms, when the number of dimensions rises they too tend to suffer the curse of dimensionality: regions of higher probability tend to stretch and get lost in an increasing volume of space that contributes little to the integral. One way to address this problem could be shortening the steps of the walker, so that it doesn't continuously try to exit the highest probability region, though this way the process would be highly autocorrelated and expensive. More sophisticated methods such as Hamiltonian Monte Carlo and the Wang and Landau algorithm use various ways of reducing this autocorrelation, while managing to keep the process in the regions that give a higher contribution to the integral. These algorithms usually rely on a more complicated theory and are harder to implement, but they usually converge faster.

Examples

Random walk

Metropolis–Hastings algorithm: This method generates a Markov chain using a proposal density for new steps and a method for rejecting some of the proposed moves. It is actually a general framework which includes as special cases the very first and simpler MCMC and many more recent alternatives listed below.
*Gibbs sampling: This method requires all the conditional distributions of the target distribution to be sampled exactly. When drawing from the full-conditional distributions is not straightforward other samplers-within-Gibbs are used. Gibbs sampling is popular partly because it does not require any 'tuning'.
* Metropolis-adjusted Langevin algorithm and other methods that rely on the gradient of the log target density to propose steps that are more likely to be in the direction of higher probability density.
*Pseudo-marginal Metropolis–Hastings: This method replaces the evaluation of the density of the target distribution with an unbiased estimate and is useful when the target density is not available analytically, e.g. latent variable models.
Slice sampling: This method depends on the principle that one can sample from a distribution by sampling uniformly from the region under the plot of its density function. It alternates uniform sampling in the vertical direction with uniform sampling from the horizontal 'slice' defined by the current vertical position.
Multiple-try Metropolis: This method is a variation of the Metropolis–Hastings algorithm that allows multiple trials at each point. By making it possible to take larger steps at each iteration, it helps address the curse of dimensionality.
Reversible-jump: This method is a variant of the Metropolis–Hastings algorithm that allows proposals that change the dimensionality of the space. Markov chain Monte Carlo methods that change dimensionality have long been used in statistical physics applications, where for some problems a distribution that is a grand canonical ensemble is used. But the reversible-jump variant is useful when doing Markov chain Monte Carlo or Gibbs sampling over nonparametric Bayesian models such as those involving the Dirichlet process or Chinese restaurant process, where the number of mixing components/clusters/etc. is automatically inferred from the data.
Hamiltonian Monte Carlo : Tries to avoid random walk behaviour by introducing an auxiliary momentum vector and implementing Hamiltonian dynamics, so the potential energy function is the target density. The momentum samples are discarded after sampling. The end result of Hybrid Monte Carlo is that proposals move across the sample space in larger steps; they are therefore less correlated and converge to the target distribution more rapidly.
Interacting particle methods

Interacting MCMC methodologies are a class of mean field particle methods for obtaining random samples from a sequence of probability distributions with an increasing level of sampling complexity. These probabilistic models include path space state models with increasing time horizon, posterior distributions w.r.t. sequence of partial observations, increasing constraint level sets for conditional distributions, decreasing temperature schedules associated with some Boltzmann-Gibbs distributions, and many others. In principle, any Markov chain Monte Carlo sampler can be turned into an interacting Markov chain Monte Carlo sampler. These interacting Markov chain Monte Carlo samplers can be interpreted as a way to run in parallel a sequence of Markov chain Monte Carlo samplers. For instance, interacting simulated annealing algorithms are based on independent Metropolis-Hastings moves interacting sequentially with a selection-resampling type mechanism. In contrast to traditional Markov chain Monte Carlo methods, the precision parameter of this class of interacting Markov chain Monte Carlo samplers is only related to the number of interacting Markov chain Monte Carlo samplers. These advanced particle methodologies belong to the class of Feynman-Kac particle models, also called Sequential Monte Carlo or particle filter methods in Bayesian inference and signal processing communities. Interacting Markov chain Monte Carlo methods can also be interpreted as a mutation-selection genetic particle algorithm with Markov chain Monte Carlo mutations.

Markov Chain quasi-Monte Carlo (MCQMC).{{cite thesis |last=Tribble |first=Seth D. |title=Markov chain Monte Carlo algorithms using completely uniformly distributed driving sequences |type=Diss. |publisher=Stanford University |year=2007 |id= }}

The advantage of low-discrepancy sequences in lieu of random numbers for simple independent Monte Carlo sampling is well known. This procedure, known as Quasi-Monte Carlo method, yields an integration error that decays at a superior rate to that obtained by IID sampling, by the Koksma-Hlawka inequality. Empirically it allows the reduction of both estimation error and convergence time by an order of magnitude. The Array-RQMC method combines randomized quasi-Monte Carlo and Markov chain simulation by simulating chains simultaneously in a way that the empirical distribution of the states at any given step is a better approximation of the true distribution of the chain than with ordinary MCMC. In empirical experiments, the variance of the average of a function of the state sometimes converges at rate or even faster, instead of the Monte Carlo rate.

Convergence

Usually it is not hard to construct a Markov chain with the desired properties. The more difficult problem is to determine how many steps are needed to converge to the stationary distribution within an acceptable error. A good chain will have rapid mixing: the stationary distribution is reached quickly starting from an arbitrary position. A standard empirical method to assess convergence is to run several independent simulated Markov chains and check that the ratio of inter-chain to intra-chain variances for all the parameters sampled is close to 1.
Typically, Markov chain Monte Carlo sampling can only approximate the target distribution, as there is always some residual effect of the starting position. More sophisticated Markov chain Monte Carlo-based algorithms such as coupling from the past can produce exact samples, at the cost of additional computation and an unbounded running time.
Many random walk Monte Carlo methods move around the equilibrium distribution in relatively small steps, with no tendency for the steps to proceed in the same direction. These methods are easy to implement and analyze, but unfortunately it can take a long time for the walker to explore all of the space. The walker will often double back and cover ground already covered.
Further consideration of convergence is at Markov chain central limit theorem. See for a discussion of the theory related to convergence and stationarity of the Metropolis-Hastings algorithm.

Software

Several software programs provide MCMC sampling capabilities, for example:

, a high-performance serial / parallel software for Monte Carlo simulations, including the Delayed-Rejection Adaptive Metropolis-Hastings MCMC, available in
*,
*,
*C / C++ / Fortran on , , and .
Packages that use dialects of the BUGS model language:
*WinBUGS / OpenBUGS/
*JAGS
*
, a Bayesian statistical modeling language / R package which uses TensorFlow behind the scenes, similar to PyMC3's use of Theano as the computational back-end
MCSim
PyMC3

R with the packages adaptMCMC, atmcmc, BRugs, mcmc, MCMCpack, ramcmc, rjags, rstan, etc.
Stan
and
a general-purpose probabilistic programming library built in Java
is a pure-Python implementation of the Ensemble Slice Sampling method
, a package for general-purpose probabilistic programming in Julia
, a platform for MCMC method in Julia
Citations

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...