Stein's method

Stein's method is a general method in probability theory to obtain bounds on the distance between two probability distributions with respect to a probability metric. It was introduced by Charles Stein, who first published it in 1972, to obtain a bound between the distribution of a sum of an $m$-dependent sequence of random variables and a standard normal distribution in the Kolmogorov metric and hence to prove not only a central limit theorem, but also bounds on the rates of convergence for the given metric.

History

At the end of the 1960s, unsatisfied with the by-then known proofs of a specific central limit theorem, Charles Stein developed a new way of proving the theorem for his statistics lecture. His seminal paper was presented in 1970 at the sixth Berkeley Symposium and published in the corresponding proceedings.
Later, his Ph.D. student Louis Chen Hsiao Yun modified the method so as to obtain approximation results for the Poisson distribution; therefore the Stein method applied to the problem of Poisson approximation is often referred to as the Stein-Chen method.
Probably the most important contributions are the monograph by Stein, where he presents his view of the method and the concept of auxiliary randomisation, in particular using exchangeable pairs, and the articles by Barbour and Götze, who introduced the so-called generator interpretation, which made it possible to easily adapt the method to many other probability distributions. An important contribution was also an article by Bolthausen on the so-called combinatorial central limit theorem.
In the 1990s the method was adapted to a variety of distributions, such as Gaussian processes by Barbour, the binomial distribution by Ehm, Poisson processes by Barbour and Brown, the Gamma distribution by Luk, and many others.

The basic approach

Probability metrics

Stein's method is a way to bound the distance between two probability distributions using a specific probability metric.
Let the metric be given in the form
$$d(P,Q) = \sup_{h\in\mathcal{H}} \left| \int h\,dP - \int h\,dQ \right| = \sup_{h\in\mathcal{H}} \bigl| E h(W) - E h(Y) \bigr|. \tag{1.1}$$
Here, $P$ and $Q$ are probability measures on a measurable space $\mathcal{X}$, $W$ and $Y$ are random variables with distribution $P$ and $Q$ respectively, $E$ is the usual expectation operator and $\mathcal{H}$ is a set of functions from $\mathcal{X}$ to the set of real numbers. The set $\mathcal{H}$ has to be large enough, so that the above definition indeed yields a metric.
Important examples are the total variation metric, where we let $\mathcal{H}$ consist of all the indicator functions of measurable sets, the Kolmogorov metric for probability measures on the real numbers, where we consider all the half-line indicator functions, and the Lipschitz metric (also known as the first-order Wasserstein or Kantorovich metric), where the underlying space is itself a metric space and we take the set $\mathcal{H}$ to be all Lipschitz-continuous functions with Lipschitz-constant 1. However, note that not every metric can be represented in the form (1.1).
In what follows $P$ is a complicated distribution, which we want to approximate by a much simpler and tractable distribution $Q$.
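To make the Kolmogorov metric concrete, the following minimal sketch computes it exactly for a standardised binomial sum against the standard normal distribution. The function names and the choice of the binomial example are assumptions made for illustration only; the supremum over half-line indicators reduces to a maximum over the atoms of the discrete distribution.

```python
import math

def normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def kolmogorov_distance_binomial(n, p=0.5):
    """sup_x |P(W <= x) - Phi(x)| for W = (S - n p) / sqrt(n p (1 - p)), S ~ Bin(n, p)."""
    sd = math.sqrt(n * p * (1.0 - p))
    cdf = 0.0    # P(S <= k - 1), i.e. the CDF just below the k-th atom
    worst = 0.0
    for k in range(n + 1):
        pmf = math.comb(n, k) * p**k * (1.0 - p) ** (n - k)
        phi = normal_cdf((k - n * p) / sd)
        # A step function minus a continuous increasing function attains its
        # supremum at a jump: compare just below the atom and at the atom.
        worst = max(worst, abs(cdf - phi), abs(cdf + pmf - phi))
        cdf += pmf
    return worst

distances = {n: kolmogorov_distance_binomial(n) for n in (10, 40, 160)}
for n, d in distances.items():
    print(n, round(d, 4))
```

The printed distances shrink as $n$ grows, which is the kind of quantitative statement that Stein's method is designed to prove.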

The Stein operator

We assume now that the distribution $Q$ is a fixed distribution; in what follows we shall in particular consider the case where $Q$ is the standard normal distribution, which serves as a classical example.
First of all, we need an operator $\mathcal{A}$, which acts on functions $f$ from $\mathcal{X}$ to the set of real numbers and 'characterizes' the distribution $Q$ in the sense that the following equivalence holds:
$$E(\mathcal{A}f)(Y) = 0 \text{ for all } f \quad\iff\quad Y \text{ has distribution } Q. \tag{2.1}$$
We call such an operator the Stein operator.
For the standard normal distribution, Stein's lemma yields such an operator:
$$E\bigl(f'(Y) - Y f(Y)\bigr) = 0 \text{ for all } f \in C_b^1 \quad\iff\quad Y \text{ has the standard normal distribution.} \tag{2.2}$$
Thus, we can take
$$(\mathcal{A}f)(x) = f'(x) - x f(x). \tag{2.3}$$
There are in general infinitely many such operators, and which one to choose remains an open question. However, for many distributions there seems to be a particularly good one, such as (2.3) for the normal distribution.
There are different ways to find Stein operators.
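The characterizing property can be checked numerically. The sketch below, with helper names chosen here for illustration, integrates $f'(x) - x f(x)$ for the test function $f = \sin$ against the standard normal density and against a uniform density with the same mean and variance. Only the normal case should give zero.

```python
import math

def expect(g, density, lo, hi, steps=20000):
    """Trapezoidal approximation of the integral of g(x) * density(x) over [lo, hi]."""
    width = (hi - lo) / steps
    total = 0.5 * (g(lo) * density(lo) + g(hi) * density(hi))
    for i in range(1, steps):
        x = lo + i * width
        total += g(x) * density(x)
    return total * width

def stein_operator(f, f_prime):
    """Return x -> f'(x) - x f(x), the Stein operator of the standard normal applied to f."""
    return lambda x: f_prime(x) - x * f(x)

def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

HALF_WIDTH = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has mean 0 and variance 1

def uniform_pdf(x):
    return 1.0 / (2.0 * HALF_WIDTH)

Af = stein_operator(math.sin, math.cos)
under_normal = expect(Af, normal_pdf, -10.0, 10.0)            # numerically zero
under_uniform = expect(Af, uniform_pdf, -HALF_WIDTH, HALF_WIDTH)  # equals cos(sqrt(3)), not zero
print(under_normal, under_uniform)
```

A single test function cannot prove that a distribution is normal, of course; the equivalence requires the expectation to vanish for all admissible $f$.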

The Stein equation

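$P$ is close to $Q$ with respect to $d$ if the difference of expectations in the definition (1.1) of the metric is close to 0. We hope now that the operator $\mathcal{A}$ exhibits the same behavior: if $P = Q$ then $E(\mathcal{A}f)(W) = 0$, and hopefully if $P \approx Q$ we have $E(\mathcal{A}f)(W) \approx 0$.
It is usually possible to define a function $f = f_h$ such that
$$(\mathcal{A}f)(x) = h(x) - E[h(Y)] \quad \text{for all } x. \tag{3.1}$$
We call (3.1) the Stein equation. Replacing $x$ by $W$ and taking expectation with respect to $W$, we get
$$E(\mathcal{A}f)(W) = E[h(W)] - E[h(Y)]. \tag{3.2}$$
Now all the effort is worthwhile only if the left-hand side of (3.2) is easier to bound than the right-hand side. This is, surprisingly, often the case.
If $Q$ is the standard normal distribution and we use the operator $(\mathcal{A}f)(x) = f'(x) - x f(x)$ from (2.3), then the corresponding Stein equation is
$$f'(x) - x f(x) = h(x) - E[h(Y)] \quad \text{for all } x. \tag{3.3}$$
If the probability distribution $Q$ has an absolutely continuous density $q$, then
$$(\mathcal{A}f)(x) = f'(x) + f(x)\,\frac{q'(x)}{q(x)}. \tag{3.4}$$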
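As a quick consistency check, assuming the density form of the operator is $(\mathcal{A}f)(x) = f'(x) + f(x)\,q'(x)/q(x)$, one can verify that it reduces to the normal Stein operator when $q$ is the standard normal density:

```latex
% Standard normal density and its logarithmic derivative
\[
q(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2},
\qquad
q'(x) = -x\, q(x),
\qquad
\frac{q'(x)}{q(x)} = -x .
\]
% Substituting into the density form of the operator
\[
(\mathcal{A}f)(x) = f'(x) + f(x)\,\frac{q'(x)}{q(x)} = f'(x) - x f(x).
\]
```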

Solving the Stein equation

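Analytic methods. The Stein equation (3.3) for the standard normal distribution can be easily solved explicitly:
$$f(x) = e^{x^2/2} \int_{-\infty}^{x} \bigl[h(s) - E h(Y)\bigr]\, e^{-s^2/2}\, ds. \tag{4.1}$$
Generator method. If $\mathcal{A}$ is the generator of a Markov process $(Z_t)_{t \ge 0}$ (see the articles by Barbour and Götze), then the solution to the Stein equation (3.1) is
$$f(x) = -\int_0^\infty \bigl[E^x h(Z_t) - E h(Y)\bigr]\, dt, \tag{4.2}$$
where $E^x$ denotes expectation with respect to the process $Z$ being started in $x$. However, one still has to prove that the solution (4.2) exists for all desired functions $h \in \mathcal{H}$.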
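The analytic solution for the normal case can be checked numerically. The sketch below assumes the explicit formula $f(x) = e^{x^2/2}\int_{-\infty}^{x} [h(s) - E h(Y)]\, e^{-s^2/2}\, ds$ and the test function $h = \cos$, for which $E h(Y) = e^{-1/2}$. It truncates the integral at a hypothetical lower limit of $-10$ and then measures how far $f'(x) - x f(x)$ is from $h(x) - E h(Y)$. All names are illustrative.

```python
import math

def h(x):
    return math.cos(x)

E_H = math.exp(-0.5)  # E cos(Y) for a standard normal Y

def stein_solution(x, lower=-10.0, steps=4000):
    """f(x) = exp(x^2/2) * integral_{lower}^{x} (h(s) - E h(Y)) exp(-s^2/2) ds,
    approximated with Simpson's rule (steps must be even)."""
    if steps % 2:
        steps += 1
    width = (x - lower) / steps

    def g(s):
        return (h(s) - E_H) * math.exp(-0.5 * s * s)

    total = g(lower) + g(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lower + i * width)
    return math.exp(0.5 * x * x) * total * width / 3.0

def residual(x, eps=1e-4):
    """f'(x) - x f(x) - (h(x) - E h(Y)), with f' taken from a central difference."""
    f_prime = (stein_solution(x + eps) - stein_solution(x - eps)) / (2.0 * eps)
    return f_prime - x * stein_solution(x) - (h(x) - E_H)

residuals = [residual(x) for x in (-2.0, -1.0, 0.0, 0.5, 1.5)]
print(residuals)
```

The evaluation points are kept moderate on purpose: for large positive $x$ the factor $e^{x^2/2}$ multiplies an integral that is nearly zero, and a more careful implementation would integrate over the upper tail instead.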

Properties of the solution to the Stein equation

Usually, one tries to give bounds on $f$ and its derivatives in terms of $h$ and its derivatives, that is, inequalities of the form
$$\|D^k f\| \le C_{k,l}\, \|D^l h\|, \tag{5.1}$$
for some specific $k, l = 0, 1, 2, \dots$, where often $\|\cdot\|$ is the supremum norm. Here, $D^k$ denotes the differential operator, but in discrete settings it usually refers to a difference operator. The constants $C_{k,l}$ may contain the parameters of the distribution $Q$. If there are any, they are often referred to as Stein factors.
In the case of the explicit solution (4.1) for the standard normal distribution one can prove for the supremum norm that
$$\|f\|_\infty \le \min\Bigl\{\sqrt{\pi/2}\,\|h\|_\infty,\ 2\|h'\|_\infty\Bigr\}, \quad \|f'\|_\infty \le \min\bigl\{2\|h\|_\infty,\ 4\|h'\|_\infty\bigr\}, \quad \|f''\|_\infty \le 2\|h'\|_\infty, \tag{5.2}$$
where the last bound is of course only applicable if $h$ is differentiable. As the standard normal distribution has no extra parameters, in this specific case the constants are free of additional parameters.
If we have bounds in the general form (5.1), we usually are able to treat many probability metrics together. One can often start with the next step below, if bounds of the form (5.1) are already available.

An abstract approximation theorem

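We are now in a position to bound the expectation $E(\mathcal{A}f)(W)$ on the left-hand side of (3.2). As this step heavily depends on the form of the Stein operator, we directly regard the case of the standard normal distribution.
At this point we could directly plug in the random variable $W$, which we want to approximate, and try to find upper bounds. However, it is often fruitful to formulate a more general theorem. Consider here the case of local dependence.
Assume that $W = \sum_{i=1}^n X_i$ is a sum of random variables such that $E[W] = 0$ and $\operatorname{var}[W] = 1$. Assume that, for every $i = 1, \dots, n$, there is a set $A_i \subset \{1, 2, \dots, n\}$, such that $X_i$ is independent of all the random variables $X_j$ with $j \notin A_i$. We call this set the 'neighborhood' of $X_i$. Likewise let $B_i \subset \{1, 2, \dots, n\}$ be a set such that all $X_j$ with $j \in A_i$ are independent of all $X_k$, $k \notin B_i$. We can think of $B_i$ as the neighbors in the neighborhood of $X_i$, a second-order neighborhood, so to speak. For a set $A \subset \{1, 2, \dots, n\}$ define now the sum $X_A := \sum_{j \in A} X_j$.
Using Taylor expansion, it is possible to prove that
$$\bigl| E\bigl(f'(W) - W f(W)\bigr) \bigr| \le \|f''\|_\infty \sum_{i=1}^n \Bigl( \tfrac{1}{2} E\bigl|X_i X_{A_i}^2\bigr| + E\bigl|X_i X_{A_i} X_{B_i \setminus A_i}\bigr| + E\bigl|X_i X_{A_i}\bigr|\, E\bigl|X_{B_i}\bigr| \Bigr). \tag{6.1}$$
Note that, if we follow this line of argument, we can bound the distance (1.1) only for functions $h$ where $\|h'\|_\infty$ is bounded, because of the bound $\|f''\|_\infty \le 2\|h'\|_\infty$, the third inequality of (5.2). To obtain a bound similar to (6.1) which contains only the expressions $\|f\|_\infty$ and $\|f'\|_\infty$, the argument is much more involved and the result is not as simple as (6.1); however, it can be done.
Theorem A. If $W$ is as described above, we have for the Lipschitz metric $d_W$ that
$$d_W\bigl(\mathcal{L}(W), N(0,1)\bigr) \le 2 \sum_{i=1}^n \Bigl( \tfrac{1}{2} E\bigl|X_i X_{A_i}^2\bigr| + E\bigl|X_i X_{A_i} X_{B_i \setminus A_i}\bigr| + E\bigl|X_i X_{A_i}\bigr|\, E\bigl|X_{B_i}\bigr| \Bigr). \tag{6.2}$$
Proof. Recall that the Lipschitz metric is of the form (1.1) where the functions $h$ are Lipschitz-continuous with Lipschitz-constant 1, thus $\|h'\| \le 1$. Combining this with (6.1) and the last bound in (5.2) proves the theorem.
Thus, roughly speaking, we have proved that, to calculate the Lipschitz-distance between a $W$ with local dependence structure and a standard normal distribution, we only need to know the third moments of $X_i$ and the size of the neighborhoods $A_i$ and $B_i$.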
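The right-hand side of such a local-dependence bound can be estimated from simulated data. The sketch below assumes the bound has the three-term form $2\sum_i \bigl(\tfrac12 E|X_i X_{A_i}^2| + E|X_i X_{A_i} X_{B_i\setminus A_i}| + E|X_i X_{A_i}|\,E|X_{B_i}|\bigr)$ and replaces each expectation by a sample mean. The function name and the data layout are hypothetical choices for illustration.

```python
import random

def local_dependence_bound(samples, A, B):
    """Monte Carlo estimate of
    2 * sum_i ( 0.5 E|X_i X_A^2| + E|X_i X_A X_(B minus A)| + E|X_i X_A| E|X_B| ),
    where samples is a list of realisations (x_1, ..., x_n) and A[i], B[i] are
    the index sets (Python sets) of the first- and second-order neighborhoods of i."""
    m = len(samples)
    n = len(samples[0])
    total = 0.0
    for i in range(n):
        t1 = t2 = t3a = t3b = 0.0
        for x in samples:
            x_a = sum(x[j] for j in A[i])
            x_b = sum(x[j] for j in B[i])
            x_b_minus_a = sum(x[j] for j in B[i] - A[i])
            t1 += abs(x[i] * x_a * x_a)
            t2 += abs(x[i] * x_a * x_b_minus_a)
            t3a += abs(x[i] * x_a)
            t3b += abs(x_b)
        total += 0.5 * t1 / m + t2 / m + (t3a / m) * (t3b / m)
    return 2.0 * total

# Independent summands: random signs scaled so that var(W) = 1.
random.seed(0)
n = 25
scale = n ** -0.5
samples = [[random.choice((-scale, scale)) for _ in range(n)] for _ in range(200)]
singletons = [{i} for i in range(n)]
bound = local_dependence_bound(samples, singletons, singletons)
print(bound)
```

For random signs every absolute value in the sum is constant, so the estimate has no sampling error here and equals $3/\sqrt{n}$ up to floating-point rounding. For genuinely dependent data the neighborhoods would contain several indices and the estimate would carry Monte Carlo error.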

Application of the theorem

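We can treat the case of sums of independent and identically distributed random variables with Theorem A.
Assume that $E X_i = 0$, $\operatorname{var} X_i = 1$ and $W = n^{-1/2} \sum_{i=1}^n X_i$, so that the summands in Theorem A are $X_i/\sqrt{n}$. We can take $A_i = B_i = \{i\}$. From Theorem A we obtain that
$$d_W\bigl(\mathcal{L}(W), N(0,1)\bigr) \le \frac{5\, E|X_1|^3}{n^{1/2}}. \tag{7.1}$$
For sums of random variables another approach related to Stein's method is known as the zero bias transform.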
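The i.i.d. case can be worked through term by term. The following is a sketch under the assumption that the bound of Theorem A has the three-term form $2\sum_i \bigl(\tfrac12 E|\xi_i \xi_{A_i}^2| + E|\xi_i \xi_{A_i} \xi_{B_i\setminus A_i}| + E|\xi_i \xi_{A_i}|\,E|\xi_{B_i}|\bigr)$, applied to the scaled summands $\xi_i = X_i/\sqrt{n}$ (the symbol $\xi_i$ is introduced here only for illustration):

```latex
% Neighborhoods A_i = B_i = \{i\}, hence B_i \setminus A_i = \emptyset
\begin{align*}
\tfrac12\, E\bigl|\xi_i\, \xi_{A_i}^2\bigr| &= \tfrac12\, n^{-3/2}\, E|X_1|^3, \\
E\bigl|\xi_i\, \xi_{A_i}\, \xi_{B_i \setminus A_i}\bigr| &= 0, \\
E\bigl|\xi_i\, \xi_{A_i}\bigr|\, E\bigl|\xi_{B_i}\bigr| &= n^{-3/2}\, E[X_1^2]\, E|X_1| = n^{-3/2}\, E|X_1| .
\end{align*}
% Summing over i = 1, ..., n and multiplying by 2
\[
2 \sum_{i=1}^n \Bigl( \tfrac12\, n^{-3/2} E|X_1|^3 + n^{-3/2} E|X_1| \Bigr)
= \frac{E|X_1|^3 + 2\,E|X_1|}{\sqrt{n}}
\le \frac{3\,E|X_1|^3}{\sqrt{n}} .
\]
```

The last step uses $E|X_1| \le (E X_1^2)^{1/2} = 1 \le E|X_1|^3$, which holds because $\operatorname{var} X_1 = 1$. Any bound of the form $C\,E|X_1|^3/\sqrt{n}$ with $C \ge 3$ therefore follows from this calculation.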

Connections to other methods

Lindeberg's device. Lindeberg introduced a device, where the difference
$$E h(X_1 + \cdots + X_n) - E h(Y_1 + \cdots + Y_n)$$
is represented as a sum of step-by-step differences.

Tikhomirov's method. Clearly the approach via the metric (1.1) and the Stein equation (3.1) does not involve characteristic functions. However, Tikhomirov presented a proof of a central limit theorem based on characteristic functions and a differential operator similar to the normal Stein operator (2.3). The basic observation is that the characteristic function $\psi(t)$ of the standard normal distribution satisfies the differential equation $\psi'(t) + t\psi(t) = 0$ for all $t$. Thus, if the characteristic function $\psi_W(t)$ of $W$ is such that $\psi_W'(t) + t\psi_W(t) \approx 0$, we expect that $\psi_W(t) \approx \psi(t)$ and hence that $W$ is close to the normal distribution. Tikhomirov states in his paper that he was inspired by Stein's seminal paper.

Literature

The following text is advanced, and gives a comprehensive overview of the normal case

Another advanced book, but having some introductory character, is

A standard reference is the book by Stein,

which contains a lot of interesting material, but may be a little hard to understand at first reading.
Despite its age, there are few standard introductory books about Stein's method available. The following recent textbook has a chapter devoted to introducing Stein's method:

Although the book

is in large part about Poisson approximation, it nevertheless contains a lot of information about the generator approach, in particular in the context of Poisson process approximation.
The following textbook has a chapter devoted to introducing Stein's method of Poisson approximation:
