Checking whether a coin is fair

In statistics, the question of checking whether a coin is fair is important for two reasons: first, it provides a simple problem on which to illustrate basic ideas of statistical inference; second, it provides a simple problem that can be used to compare various competing methods of statistical inference, including decision theory. The practical problem of checking whether a coin is fair might be considered easily solved by performing a sufficiently large number of trials, but statistics and probability theory can provide guidance on two types of question: how many trials to undertake, and how accurate an estimate of the probability of turning up heads is when it is derived from a given sample of trials.
A fair coin is an idealized randomizing device with two states which are equally likely to occur. It is based on the coin flip used widely in sports and other situations where it is required to give two parties the same chance of winning. Either a specially designed chip or, more usually, a simple currency coin is used, although the latter might be slightly "unfair" due to an asymmetrical weight distribution, which might cause one state to occur more frequently than the other, giving one party an unfair advantage. So it might be necessary to test experimentally whether the coin is in fact "fair", that is, whether the probability of the coin falling on either side when it is tossed is exactly 50%. It is of course impossible to rule out arbitrarily small deviations from fairness such as might be expected to affect only one flip in a lifetime of flipping; also, it is always possible for an unfair coin to happen to turn up exactly 10 heads in 20 flips. Therefore, any fairness test can only establish a certain degree of confidence in a certain degree of fairness. In more rigorous terminology, the problem is of determining the parameters of a Bernoulli process, given only a limited sample of Bernoulli trials.

Preamble

This article describes experimental procedures for determining whether a coin is fair or unfair. There are many statistical methods for analyzing such an experimental procedure. This article illustrates two of them.
Both methods prescribe an experiment in which the coin is tossed many times and the result of each toss is recorded. The results can then be analysed statistically to decide whether the coin is "fair" or "probably not fair".

Posterior probability density function, or PDF. Initially, the true probability of obtaining a particular side when a coin is tossed is unknown, but the uncertainty is represented by the "prior distribution". The theory of Bayesian inference is used to derive the posterior distribution by combining the prior distribution and the likelihood function which represents the information obtained from the experiment. The probability that this particular coin is a "fair coin" can then be obtained by integrating the PDF of the posterior distribution over the relevant interval that represents all the probabilities that can be counted as "fair" in a practical sense.
Estimator of true probability. This method assumes that the experimenter can decide to toss the coin any number of times. The experimenter first decides on the level of confidence required and the tolerable margin of error. These parameters determine the minimum number of tosses that must be performed to complete the experiment.

An important difference between these two approaches is that the first approach gives some weight to one's prior experience of tossing coins, while the second does not. The question of how much weight to give to prior experience, depending on the quality of that experience, is discussed under credibility theory.

Posterior probability density function

One method is to calculate the posterior probability density function of Bayesian probability theory.
A test is performed by tossing the coin N times and noting the observed numbers of heads, h, and tails, t. The symbols H and T represent more generalised variables expressing the numbers of heads and tails respectively that might have been observed in the experiment. Thus N = H+T = h+t.
Next, let r be the actual probability of obtaining heads in a single toss of the coin. This is the property of the coin which is being investigated. Using Bayes' theorem, the posterior probability density of r conditional on h and t is expressed as follows:
$$f(r \mid H=h, T=t) = \frac{\Pr(H=h \mid r, N=h+t)\, g(r)}{\int_0^1 \Pr(H=h \mid p, N=h+t)\, g(p)\, dp}$$
where g(r) represents the prior probability density distribution of r, which lies in the range 0 to 1.
The prior probability density distribution summarizes what is known about the distribution of r in the absence of any observation. We will assume that the prior distribution of r is uniform over the interval [0, 1]. That is, g(r) = 1.
The probability of obtaining h heads in N tosses of a coin with a probability of heads equal to r is given by the binomial distribution:
$$\Pr(H=h \mid r, N=h+t) = \binom{N}{h} r^h (1-r)^t$$
Substituting this into the previous formula:
$$f(r \mid H=h, T=t) = \frac{\binom{N}{h} r^h (1-r)^t}{\int_0^1 \binom{N}{h} p^h (1-p)^t\, dp} = \frac{r^h (1-r)^t}{\int_0^1 p^h (1-p)^t\, dp}$$
This is in fact a beta distribution, whose denominator can be expressed in terms of the beta function:
$$f(r \mid H=h, T=t) = \frac{1}{\mathrm{B}(h+1, t+1)}\, r^h (1-r)^t$$
As a uniform prior distribution has been assumed, and because h and t are integers, this can also be written in terms of factorials:
$$f(r \mid H=h, T=t) = \frac{(h+t+1)!}{h!\, t!}\, r^h (1-r)^t$$
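As a rough illustration, the factorial form of the posterior density under a uniform prior, (h+t+1)!/(h! t!) r^h (1-r)^t, can be evaluated directly. The sketch below is not part of the original derivation; the function names `posterior_pdf` and `integrate` are hypothetical, and the midpoint-rule integration is included only as a sanity check that the density integrates to about 1.

```python
from math import factorial


def posterior_pdf(r: float, h: int, t: int) -> float:
    """Posterior density of r after h heads and t tails, assuming a uniform prior on [0, 1]."""
    if not 0.0 <= r <= 1.0:
        return 0.0
    # Normalising constant (h+t+1)! / (h! t!), which equals 1 / B(h+1, t+1).
    norm = factorial(h + t + 1) / (factorial(h) * factorial(t))
    return norm * r**h * (1.0 - r) ** t


def integrate(f, a: float, b: float, steps: int = 10_000) -> float:
    """Composite midpoint rule; used here only to check normalisation."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


# Sanity check for 7 heads and 3 tails: the density should integrate to about 1.
total = integrate(lambda r: posterior_pdf(r, 7, 3), 0.0, 1.0)
print(f"integral over [0, 1] = {total:.6f}")
```

Using exact integer factorials keeps the normalising constant free of rounding error for small h and t; for very large counts a log-gamma formulation would likely be more robust.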

Example

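For example, let N = 10, h = 7, i.e. the coin is tossed 10 times and 7 heads are obtained:
$$f(r \mid H=7, T=3) = \frac{(10+1)!}{7!\, 3!}\, r^7 (1-r)^3 = 1320\, r^7 (1-r)^3$$
The graph on the right shows the probability density function of r given that 7 heads were obtained in 10 tosses.
The probability for an unbiased coin, defined for this purpose as one whose probability of coming down heads is somewhere between 45% and 55%,
$$\Pr(0.45 < r < 0.55) = \int_{0.45}^{0.55} f(p \mid H=7, T=3)\, dp \approx 13\%$$
is small when compared with the alternative hypothesis (a biased coin). However, it is not small enough to cause us to believe that the coin has a significant bias. This probability is slightly higher than our presupposition of the probability that the coin was fair corresponding to the uniform prior distribution, which was 10%.
Using a prior distribution that reflects our prior knowledge of what a coin is and how it acts, the posterior distribution would not favor the hypothesis of bias. However, the number of trials in this example (10 tosses) is very small, and with more trials the choice of prior distribution would be somewhat less relevant.
With the uniform prior, the posterior probability distribution f(r | H = 7, T = 3) achieves its peak at r = h / (h + t) = 0.7; this value is called the maximum a posteriori (MAP) estimate of r. Also with the uniform prior, the expected value of r under the posterior distribution is
$$\operatorname{E}[r] = \int_0^1 r\, f(r \mid H=7, T=3)\, dr = \frac{h+1}{h+t+2} = \frac{2}{3}$$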
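The numbers in this example can be checked with a short calculation. The sketch below assumes that "fair in a practical sense" means 0.45 < r < 0.55 (an interval of width 0.1, which matches the 10% prior probability quoted above); that interval and the function name `posterior_cdf` are illustrative assumptions. The cumulative probability is obtained from the standard identity relating the regularized incomplete beta function with integer parameters to a binomial tail sum.

```python
from math import comb


def posterior_cdf(x: float, h: int, t: int) -> float:
    """P(r <= x | h heads, t tails) under a uniform prior, i.e. the Beta(h+1, t+1) CDF.

    For integer a and b, I_x(a, b) = P(Binomial(a + b - 1, x) >= a),
    so the CDF is a finite binomial tail sum with m = h + t + 1 trials.
    """
    m = h + t + 1
    return sum(comb(m, k) * x**k * (1.0 - x) ** (m - k) for k in range(h + 1, m + 1))


h, t = 7, 3

# Posterior probability that r lies in the assumed "fair" interval (0.45, 0.55).
p_fair = posterior_cdf(0.55, h, t) - posterior_cdf(0.45, h, t)

# Point summaries of the posterior under the uniform prior.
map_estimate = h / (h + t)              # location of the posterior peak
posterior_mean = (h + 1) / (h + t + 2)  # expected value of r

print(f"P(0.45 < r < 0.55) = {p_fair:.3f}")
print(f"MAP estimate       = {map_estimate:.3f}")
print(f"posterior mean     = {posterior_mean:.3f}")
```

The interval probability comes out at roughly 0.13, compared with 0.10 under the uniform prior alone.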

Estimator of true probability

Using this approach, to decide the number of times the coin should be tossed, two parameters are required:

The confidence level, which is denoted by Z
The maximum acceptable error, which is denoted by E

The confidence level is denoted by Z and is given by the Z-value of a standard normal distribution. This value can be read off a standard score statistics table for the normal distribution. Some examples are:

Z value	Confidence level	Comment
0.6745	50.000%	Half
1.0000	68.269%	One std dev
1.6449	90.000%	"One nine"
1.9600	95.000%	95 percent
2.0000	95.450%	Two std dev
2.5758	99.000%	"Two nines"
3.0000	99.730%	Three std dev
3.2905	99.900%	"Three nines"
3.8906	99.990%	"Four nines"
4.0000	99.994%	Four std dev
4.4172	99.999%	"Five nines"
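Instead of reading Z off a printed table, the values can be computed from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper names `z_for_confidence` and `confidence_for_z` are hypothetical, and a two-sided confidence level is assumed, as in the table.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_for_confidence(confidence: float) -> float:
    """Two-sided Z value: the central `confidence` mass lies within +/- Z."""
    return STANDARD_NORMAL.inv_cdf((1.0 + confidence) / 2.0)


def confidence_for_z(z: float) -> float:
    """Inverse direction: the probability mass within +/- z standard deviations."""
    return 2.0 * STANDARD_NORMAL.cdf(z) - 1.0


for level in (0.50, 0.90, 0.95, 0.99):
    print(f"{level:.3%} -> Z = {z_for_confidence(level):.4f}")

for z in (1.0, 2.0, 3.0):
    print(f"Z = {z:.4f} -> {confidence_for_z(z):.3%}")
```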

The maximum error E is defined by $|p - r| < E$, where p is the estimated probability of obtaining heads. Note: r is the same actual probability of obtaining heads as r in the previous section of this article.
In statistics, the estimate of a proportion of a sample, denoted by p, has a standard error given by:
$$s_p = \sqrt{\frac{p(1-p)}{n}}$$
where n is the number of trials (denoted by N in the previous section).
This standard error, as a function of p, has a maximum at p = 0.5. Further, in the case of a coin being tossed, it is likely that p will be not far from 0.5, so it is reasonable to take p = 0.5 in the following:
$$s_p = \sqrt{\frac{0.5 \times 0.5}{n}} = \frac{1}{2\sqrt{n}}$$
And hence the value of the maximum error E is given by
$$E = Z s_p = \frac{Z}{2\sqrt{n}}$$
Solving for the required number of coin tosses, n,
$$n = \frac{Z^2}{4E^2}$$
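The two relations used in this section, E = Z / (2 sqrt(n)) and n = Z^2 / (4 E^2), both taken at the worst case p = 0.5, can be sketched as a pair of small helpers. The function names are hypothetical, and rounding the toss count up to a whole number is a design choice of this sketch, not something the text prescribes.

```python
from math import ceil, sqrt


def required_tosses(z: float, max_error: float) -> int:
    """Smallest whole number of tosses n with Z / (2 * sqrt(n)) <= max_error."""
    exact = z**2 / (4.0 * max_error**2)
    # Round first so floating-point noise does not add a spurious extra toss.
    return ceil(round(exact, 6))


def max_error_for(z: float, tosses: int) -> float:
    """Worst-case (p = 0.5) maximum error E = Z / (2 * sqrt(n))."""
    return z / (2.0 * sqrt(tosses))


print(required_tosses(2.0, 0.01))  # tosses needed for E = 0.01 at Z = 2
print(max_error_for(2.0, 10_000))  # error after 10000 tosses at Z = 2
```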

Examples

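1. If a maximum error of 0.01 is desired, how many times should the coin be tossed?
$$n = \frac{Z^2}{4E^2} = \frac{Z^2}{4 \times 0.01^2} = 2500\, Z^2$$
n = 2500 at the 68.27% level of confidence (Z = 1)
n = 10000 at the 95.45% level of confidence (Z = 2)
n = 22500 at the 99.73% level of confidence (Z = 3)
2. If the coin is tossed 10000 times, what is the maximum error of the estimator p on the value of r?
$$E = \frac{Z}{2\sqrt{n}} = \frac{Z}{2\sqrt{10000}} = \frac{Z}{200}$$
E = 0.0050 at the 68.27% level of confidence (Z = 1)
E = 0.0100 at the 95.45% level of confidence (Z = 2)
E = 0.0150 at the 99.73% level of confidence (Z = 3)
3. The coin is tossed 12000 times with a result of 5961 heads (and 6039 tails). What interval does the value of r lie within if a confidence level of 99.999% is desired?
$$p = \frac{h}{h+t} = \frac{5961}{12000} = 0.49675$$
Now find the value of Z corresponding to the 99.999% level of confidence.
$$Z = 4.4172$$
Now calculate E
$$E = \frac{Z}{2\sqrt{n}} = \frac{4.4172}{2\sqrt{12000}} \approx 0.02016$$
The interval which contains r is thus:
$$p - E < r < p + E$$
$$0.4766 < r < 0.5169$$
Hence, 99.999% of the time, the interval above would contain r, which is the true probability of obtaining heads in a single toss.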
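The third example can be reproduced end to end in a few lines. This is a sketch under the same simplification used in this section (standard error taken at its worst case p = 0.5, normal approximation); the function name `conservative_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def conservative_interval(heads: int, tosses: int, confidence: float) -> tuple[float, float]:
    """Interval p - E < r < p + E with E = Z / (2 * sqrt(n)), taking p = 0.5 in the standard error."""
    p = heads / tosses                                      # observed proportion of heads
    z = NormalDist().inv_cdf((1.0 + confidence) / 2.0)      # two-sided Z value
    e = z / (2.0 * sqrt(tosses))                            # worst-case maximum error
    return p - e, p + e


low, high = conservative_interval(5961, 12000, 0.99999)
print(f"{low:.4f} < r < {high:.4f}")
print("0.5 inside interval:", low < 0.5 < high)
```

Since 0.5 lies inside the resulting interval, this experiment gives no grounds at the 99.999% level for calling the coin unfair.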

Other approaches

Other approaches to the question of checking whether a coin is fair are available using decision theory, whose application would require the formulation of a loss function or utility function which describes the consequences of making a given decision. An approach that avoids requiring either a loss function or a prior probability is that of "acceptance sampling".

Other applications

The above mathematical analysis for determining if a coin is fair can also be applied to other uses. For example:

Determining the proportion of defective items for a product subjected to a particular condition. Sometimes a product can be very difficult or expensive to produce. Furthermore, if testing such products will result in their destruction, a minimum number of items should be tested. Using a similar analysis, the probability density function of the product defect rate can be found.
Two party polling. If a small random sample poll is taken where there are only two mutually exclusive choices, then this is similar to tossing a single coin multiple times using a possibly biased coin. A similar analysis can therefore be applied to determine the confidence to be ascribed to the actual ratio of votes cast.
Determining the sex ratio in a large group of an animal species. Provided that a small random sample is taken when performing the random sampling of the population, the analysis is similar to determining the probability of obtaining heads in a coin toss.
