Likelihood principle

In statistics, the likelihood principle is the proposition that, given a statistical model, all the evidence in a sample relevant to model parameters is contained in the likelihood function.
A likelihood function arises from a probability density function considered as a function of its distributional parameterization argument. For example, consider a model which gives the probability density function ƒ_X of observable random variable X as a function of a parameter θ. Then for a specific value x of X, the function = ƒ_X is a likelihood function of θ: it gives a measure of how "likely" any particular value of θ is, if we know that X has the value x. The density function may be a density with respect to counting measure, i.e. a probability mass function.
Two likelihood functions are equivalent if one is a scalar multiple of the other. The likelihood principle is this: all information from the data that is relevant to inferences about the value of the model parameters is in the equivalence class to which the likelihood function belongs. The strong likelihood principle applies this same criterion to cases such as sequential experiments where the sample of data that is available results from applying a stopping rule to the observations earlier in the experiment.

Example

Suppose

X is the number of successes in twelve independent Bernoulli trials with probability θ of success on each trial, and
Y is the number of independent Bernoulli trials needed to get three successes, again with probability θ of success on each trial.

Then the observation that X = 3 induces the likelihood function
while the observation that Y = 12 induces the likelihood function
The likelihood principle says that, as the data are the same in both cases, the inferences drawn about the value of θ should also be the same. In addition, all the inferential content in the data about the value of θ is contained in the two likelihoods, and is the same if they are proportional to one another. This is the case in the above example, reflecting the fact that the difference between observing X = 3 and observing Y = 12 lies not in the actual data, but merely in the design of the experiment. Specifically, in one case, one has decided in advance to try twelve times; in the other, to keep trying until three successes are observed. The inference about θ should be the same, and this is reflected in the fact that the two likelihoods are proportional to each other.
This is not always the case, however. The use of frequentist methods involving p-values leads to different inferences for the two cases above, showing that the outcome of frequentist methods depends on the experimental procedure, and thus violates the likelihood principle.

The law of likelihood

A related concept is the law of likelihood, the notion that the extent to which the evidence supports one parameter value or hypothesis against another is indicated by the ratio of their likelihoods, their likelihood ratio. That is,
is the degree to which the observation x supports parameter value or hypothesis a against b. If this ratio is 1, the evidence is indifferent; if greater than 1, the evidence supports the value a against b; or if less, then vice versa.
In Bayesian statistics, this ratio is known as the Bayes factor, and Bayes' rule can be seen as the application of the law of likelihood to inference.
In frequentist inference, the likelihood ratio is used in the likelihood-ratio test, but other non-likelihood tests are used as well. The Neyman–Pearson lemma states the likelihood-ratio test is the most powerful test for comparing two simple hypotheses at a given significance level, which gives a frequentist justification for the law of likelihood.
Combining the likelihood principle with the law of likelihood yields the consequence that the parameter value which maximizes the likelihood function is the value which is most strongly supported by the evidence. This is the basis for the widely used method of maximum likelihood.

History

The likelihood principle was first identified by that name in print in 1962, but arguments for the same principle, unnamed, and the use of the principle in applications goes back to the works of R.A. Fisher in the 1920s. The law of likelihood was identified by that name by I. Hacking. More recently the likelihood principle as a general principle of inference has been championed by A. W. F. Edwards. The likelihood principle has been applied to the philosophy of science by R. Royall.
Birnbaum proved that the likelihood principle follows from two more primitive and seemingly reasonable principles, the conditionality principle and the sufficiency principle:

The conditionality principle says that if an experiment is chosen by a random process independent of the states of nature, then only the experiment actually performed is relevant to inferences about.
The sufficiency principle says that if is a sufficient statistic for, and if in two experiments with data and we have, then the evidence about given by the two experiments is the same.
Arguments for and against

Some widely used methods of conventional statistics, for example many significance tests, are not consistent with the likelihood principle.
Let us briefly consider some of the arguments for and against the likelihood principle.

The original Birnbaum argument

Birnbaum's proof of the likelihood principle has been disputed by philosophers of science, including Deborah Mayo and statisticians including Michael Evans. On the other hand, a new proof of the likelihood principle has been provided by Greg Gandenberger that addresses some of the counterarguments to the original proof.

Experimental design arguments on the likelihood principle

Unrealized events play a role in some common statistical methods. For example, the result of a significance test depends on the -value, the probability of a result as extreme or more extreme than the observation, and that probability may depend on the design of the experiment. To the extent that the likelihood principle is accepted, such methods are therefore denied.
Some classical significance tests are not based on the likelihood. The following are a simple and more complicated example of those, using a commonly cited example called the optional stopping problem.
;Example 1 – simple version:
Suppose I tell you that I tossed a coin 12 times and in the process observed 3 heads. You might make some inference about the probability of heads and whether the coin was fair.
Suppose now I tell that I tossed the coin until I observed 3 heads, and I tossed it 12 times. Will you now make some different inference?
The likelihood function is the same in both cases: It is proportional to
So according to the likelihood principle, in either case the inference should be the same.
;Example 2 – a more elaborated version of the same statistics:
Suppose a number of scientists are assessing the probability of a certain outcome in experimental trials. Conventional wisdom suggests that if there is no bias towards success or failure then the success probability would be one half. Adam, a scientist, conducted 12 trials and obtains 3 successes and 9 failures. One of those successes was the 12th and last observation. Then Adam left the lab.
Bill, a colleague in the same lab, continued Adam's work and published Adam's results, along with a significance test. He tested the null hypothesis that, the success probability, is equal to a half, versus . The probability of the observed result that out of 12 trials 3 or something fewer were successes, if is true, is
which is . Thus the null hypothesis is not rejected at the 5% significance level.
Charlotte, another scientist, reads Bill's paper and writes a letter, saying that it is possible that Adam kept trying until he obtained 3 successes, in which case the probability of needing to conduct 12 or more experiments is given by
which is . Now the result is statistically significant at the level. Note that there is no contradiction between these two analyses; both computations are correct.
To these scientists, whether a result is significant or not depends on the design of the experiment, not on the likelihood of the parameter value being .
;Summary of the illustrated issues:
Results of this kind are considered by some as arguments against the likelihood principle. For others it exemplifies the value of the likelihood principle and is an argument against significance tests.
Similar themes appear when comparing Fisher's exact test with Pearson's chi-squared test.

The voltmeter story

An argument in favor of the likelihood principle is given by Edwards in his book Likelihood. He cites the following story from J.W. Pratt, slightly condensed here. Note that the likelihood function depends only on what actually happened, and not on what could have happened.
;Throwback to Example 2 in the prior section:
This story can be translated to Adam's stopping rule above, as follows: Adam stopped immediately after 3 successes, because his boss Bill had instructed him to do so. After the publication of the statistical analysis by Bill, Adam realizes that he has missed a later instruction from Bill to instead conduct 12 trials, and that Bill's paper is based on this second instruction. Adam is very glad that he got his 3 successes after exactly 12 trials, and explains to his friend Charlotte that by coincidence he executed the second instruction. Later, Adam is astonished to hear about Charlotte's letter, explaining that now the result is significant.

Popular movies

The Hunger Games (film) - 2012 American dystopian action thriller science fiction-adventure film directed by Gary Ross and based on Suzanne Collins’s 2008 novel of the same name. It is the first insta...
untitled Captain Marvel sequel - part of Marvel Cinematic Universe....
Killers of the Flower Moon (film project) - Killers of the Flower Moon - film project in United States of America. It was presented as drama, detective fiction, thriller. The film project starred Leonardo Dicaprio, Robert De Niro. Director of...
Five Nights at Freddy's (film) - Five Nights at Freddy's - film published in 2017 in United States of America. Scenarist of the film - Scott Cawthon....

Popular books

Book of Revelation - The Book of Revelation is the final book of the New Testament, and consequently is also the final book of the Christian Bible. Its title is derived from the first word of the Koine Greek text: apok...
Book of Genesis - account of the creation of the world, the early history of humanity, Israel's ancestors and the origins...
Gospel of Matthew - The Gospel According to Matthew is the first book of the New Testament and one of the three synoptic gospels. It tells how Israel's Messiah, rejected and executed in Israel, pronounces judgement on ...
Michelin Guide - Michelin Guides are a series of guide books published by the French tyre company Michelin for more than a century. The term normally refers to the annually published Michelin Red Guide , the oldest...
Psalms - The Book of Psalms , commonly referred to simply as Psalms , the Psalter or "the Psalms", is the first book of the Ketuvim , the third section of the Hebrew Bible, and thus a book of th...
Ecclesiastes - Ecclesiastes is one of 24 books of the Tanakh , where it is classified as one of the Ketuvim . Originally written c. 450–200 BCE, it is also among the canonical Wisdom literature of the Old Tes...
The 48 Laws of Power - non-fiction book by American author Robert Greene. The book...

Popular television series

The Crown (TV series) - historical drama web television series about the reign of Queen Elizabeth II, created and principally written by Peter Morgan, and produced by Left Bank Pictures and Sony Pictures Tel...
Friends - American sitcom television series, created by David Crane and Marta Kauffman, which aired on NBC from September 22, 1994, to May 6, 2004, lasting ten seasons. With an ensemble cast sta...
Young Sheldon - spin-off prequel to The Big Bang Theory and begins with the character Sheldon...
Modern Family - American television mockumentary family sitcom created by Christopher Lloyd and Steven Levitan for the American Broadcasting Company. It ran for eleven seasons, from September 23...
Loki (TV series) - upcoming American web television miniseries created for Disney+ by Michael Waldron, based on the Marvel Comics character of the same name. It is set in the Marvel Cinematic Universe, shar...
Game of Thrones - American fantasy drama television series created by David Benioff and D. B. Weiss for HBO. It...
Shameless (American TV series) - American comedy-drama television series developed by John Wells which debuted on Showtime on January 9, 2011. It...