Pearson distribution

The Pearson distribution is a family of continuous probability distributions. It was first published by Karl Pearson in 1895 and subsequently extended by him in 1901 and 1916 in a series of articles on biostatistics.

History

The Pearson system was originally devised in an effort to model visibly skewed observations. It was well known at the time how to adjust a theoretical model to fit the first two cumulants or moments of observed data: Any probability distribution can be extended straightforwardly to form a location-scale family. Except in pathological cases, a location-scale family can be made to fit the observed mean and variance arbitrarily well. However, it was not known how to construct probability distributions in which the skewness and kurtosis could be adjusted equally freely. This need became apparent when trying to fit known theoretical models to observed data that exhibited skewness. Pearson's examples include survival data, which are usually asymmetric.
In his original paper, Pearson identified four types of distributions in addition to the normal distribution. The classification depended on whether the distributions were supported on a bounded interval, on a half-line, or on the whole real line; and whether they were potentially skewed or necessarily symmetric. A second paper fixed two omissions: it redefined the type V distribution and introduced the type VI distribution. Together the first two papers cover the five main types of the Pearson system. In a third paper, Pearson introduced further special cases and subtypes.
Rhind devised a simple way of visualizing the parameter space of the Pearson system, which was subsequently adopted by Pearson. The Pearson types are characterized by two quantities, commonly referred to as β₁ and β₂. The first is the square of the skewness: β₁ = γ₁², where γ₁ is the skewness, or third standardized moment. The second is the traditional kurtosis, or fourth standardized moment: β₂ = γ₂ + 3, where γ₂ is the excess kurtosis. The diagram on the right shows which Pearson type a given concrete distribution, identified by a point (β₁, β₂), belongs to.
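As an illustration of the two quantities just defined, the following minimal sketch estimates β₁ and β₂ from a sample. It assumes the plain moment estimators with divisor n; the function name `pearson_betas` and the sample values are hypothetical and chosen only for demonstration.

```python
def pearson_betas(data):
    """Return (beta1, beta2): squared skewness and traditional kurtosis."""
    n = len(data)
    mean = sum(data) / n
    # Central moments about the sample mean (population form, divisor n).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    gamma1 = m3 / m2 ** 1.5      # skewness (third standardized moment)
    beta1 = gamma1 ** 2          # beta1 = gamma1 squared
    beta2 = m4 / m2 ** 2         # beta2 = excess kurtosis + 3
    return beta1, beta2


# Illustrative, visibly right-skewed sample (made-up values).
sample = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 9.0]
beta1, beta2 = pearson_betas(sample)
```

The resulting pair (β₁, β₂) is the point that would be looked up in a diagram of the Pearson parameter space.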
Many of the skewed and/or non-mesokurtic distributions familiar to us today were still unknown in the early 1890s. What is now known as the beta distribution had been used by Thomas Bayes as a posterior distribution of the parameter of a Bernoulli distribution in his 1763 work on inverse probability. The beta distribution gained prominence due to its membership in Pearson's system and was known until the 1940s as the Pearson type I distribution. The gamma distribution originated from Pearson's work and was known as the Pearson type III distribution, before acquiring its modern name in the 1930s and 1940s. Pearson's 1895 paper introduced the type IV distribution, which contains Student's t-distribution as a special case, predating William Sealy Gosset's subsequent use by several years. His 1901 paper introduced the inverse-gamma distribution (type V) and the beta prime distribution (type VI).

Definition

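A Pearson density p is defined to be any valid solution to the differential equation
$$\frac{p'(x)}{p(x)} + \frac{a + (x - \lambda)}{b_2 (x - \lambda)^2 + b_1 (x - \lambda) + b_0} = 0, \qquad (1)$$
with:
$$b_0 = \frac{4\beta_2 - 3\beta_1}{10\beta_2 - 12\beta_1 - 18}\,\mu_2, \qquad a = b_1 = \sqrt{\mu_2}\,\sqrt{\beta_1}\,\frac{\beta_2 + 3}{10\beta_2 - 12\beta_1 - 18}, \qquad b_2 = \frac{2\beta_2 - 3\beta_1 - 6}{10\beta_2 - 12\beta_1 - 18},$$
where μ₂ denotes the second central moment (the variance).
According to Ord, Pearson devised the underlying form of Equation (1) on the basis of, firstly, the formula for the derivative of the logarithm of the density function of the normal distribution and, secondly, a recurrence relation for values in the probability mass function of the hypergeometric distribution.
In Equation (1), the parameter a determines a stationary point, and hence under some conditions a mode of the distribution, since
$$p'(\lambda - a) = 0$$
follows directly from the differential equation.
Since we are confronted with a first-order linear differential equation with variable coefficients, its solution is straightforward:
$$p(x) \propto \exp\left(-\int \frac{a + (x - \lambda)}{b_2 (x - \lambda)^2 + b_1 (x - \lambda) + b_0}\,dx\right).$$
The integral in this solution simplifies considerably when certain special cases of the integrand are considered. Pearson distinguished two main cases, determined by the sign of the discriminant (and hence the number of real roots) of the quadratic function
$$f(x) = b_2 x^2 + b_1 x + b_0. \qquad (2)$$
Here and in the derivations below, x is measured from λ, that is, λ = 0 is assumed.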
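The split into two main cases can be sketched numerically. The block below is a minimal illustration, not a full type classifier: it assumes the usual moment expressions for b₀, b₁ and b₂ in terms of β₁, β₂ and the variance μ₂ (restated in the code as an assumption), and the function names are hypothetical. The two test inputs are the moments of a Student's t-distribution with 5 degrees of freedom and of a Beta(2, 3) distribution.

```python
import math


def pearson_coefficients(beta1, beta2, mu2):
    """Coefficients (b0, b1, b2) of the quadratic from beta1, beta2 and variance mu2.

    Assumed moment expressions; undefined when 10*beta2 - 12*beta1 - 18 == 0.
    b1 is returned with the sign corresponding to non-negative skewness.
    """
    d = 10.0 * beta2 - 12.0 * beta1 - 18.0
    b0 = (4.0 * beta2 - 3.0 * beta1) / d * mu2
    b1 = math.sqrt(mu2) * math.sqrt(beta1) * (beta2 + 3.0) / d
    b2 = (2.0 * beta2 - 3.0 * beta1 - 6.0) / d
    return b0, b1, b2


def discriminant(b0, b1, b2):
    """Discriminant of b2*x**2 + b1*x + b0."""
    return b1 * b1 - 4.0 * b2 * b0


def main_case(beta1, beta2, mu2):
    """Name the main case selected by the sign of the discriminant."""
    disc = discriminant(*pearson_coefficients(beta1, beta2, mu2))
    if disc < 0.0:
        return "case 1 (negative discriminant)"
    return "case 2 (non-negative discriminant)"


# Student's t with 5 degrees of freedom: beta1 = 0, beta2 = 9, variance 5/3.
t5_case = main_case(0.0, 9.0, 5.0 / 3.0)
# Beta(2, 3): beta1 = 4/49, beta2 = 33/14, variance 0.04.
beta23_case = main_case(4.0 / 49.0, 33.0 / 14.0, 0.04)
```

Near the boundary between the cases (for example for the normal distribution, where b₁ = b₂ = 0) the sign test is sensitive to rounding, so a practical implementation would need a tolerance.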

Particular types of distribution

Case 1, negative discriminant

The Pearson type IV distribution

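If the discriminant of the quadratic function (2) is negative (b₁² − 4b₂b₀ < 0), it has no real roots. Then define
$$y = x + \frac{b_1}{2 b_2}, \qquad \alpha = \frac{\sqrt{4 b_2 b_0 - b_1^2}}{2 b_2}.$$
Observe that α is a well-defined real number and α ≠ 0, because by assumption 4b₂b₀ − b₁² > 0 and therefore b₂ ≠ 0. Applying these substitutions, the quadratic function (2) is transformed into
$$f(x) = b_2 \left(y^2 + \alpha^2\right).$$
The absence of real roots is obvious from this formulation, because α² is necessarily positive.
We now express the solution to the differential equation as a function of y:
$$p(y) \propto \exp\left(-\frac{1}{b_2}\int \frac{y + a - \frac{b_1}{2 b_2}}{y^2 + \alpha^2}\,dy\right).$$
Pearson called this the "trigonometrical case", because the integral
$$\int \frac{y + \frac{2 b_2 a - b_1}{2 b_2}}{y^2 + \alpha^2}\,dy = \frac{1}{2}\ln\left(y^2 + \alpha^2\right) + \frac{2 b_2 a - b_1}{2 b_2 \alpha}\arctan\left(\frac{y}{\alpha}\right) + C_0$$
involves the inverse trigonometric arctan function. Then
$$p(y) \propto \exp\left[-\frac{1}{2 b_2}\ln\left(1 + \frac{y^2}{\alpha^2}\right) - \frac{\ln\alpha}{b_2} - \frac{2 b_2 a - b_1}{2 b_2^2 \alpha}\arctan\left(\frac{y}{\alpha}\right) + C_1\right].$$
Finally, let
$$m = \frac{1}{2 b_2}, \qquad \nu = \frac{2 b_2 a - b_1}{2 b_2^2 \alpha}.$$
Applying these substitutions, we obtain the parametric function:
$$p(y) \propto \left[1 + \frac{y^2}{\alpha^2}\right]^{-m}\exp\left[-\nu\arctan\left(\frac{y}{\alpha}\right)\right].$$
This unnormalized density has support on the entire real line. It depends on a scale parameter α > 0 and shape parameters m > 1/2 and ν. One parameter was lost when we chose to find the solution to the differential equation as a function of y rather than x. We therefore reintroduce a fourth parameter, namely the location parameter λ. We have thus derived the density of the Pearson type IV distribution:
$$p(x) = \frac{\left|\dfrac{\Gamma\left(m + \frac{\nu}{2} i\right)}{\Gamma(m)}\right|^2}{\alpha\,\mathrm{B}\left(m - \frac{1}{2}, \frac{1}{2}\right)}\left[1 + \left(\frac{x - \lambda}{\alpha}\right)^2\right]^{-m}\exp\left[-\nu\arctan\left(\frac{x - \lambda}{\alpha}\right)\right].$$
The normalizing constant involves the complex Gamma function (Γ) and the Beta function (B).
Notice that the location parameter λ here is not the same as the original location parameter introduced in the general formulation, but is related via
$$\lambda = \lambda_{\text{original}} + \frac{\alpha\nu}{2(m - 1)}.$$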
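The normalizing constant of the type IV density can be checked numerically without a complex Gamma function. The sketch below assumes the unnormalized form [1 + ((x − λ)/α)²]^(−m) · exp(−ν·arctan((x − λ)/α)) and uses the substitution x = λ + α·tan θ, which maps the real line to (−π/2, π/2) and turns the integrand into α·cos(θ)^(2m−2)·exp(−νθ). Function names and the midpoint-rule step count are illustrative choices, not part of the source.

```python
import math


def type4_unnormalized(x, m, nu, alpha=1.0, lam=0.0):
    """Unnormalized Pearson type IV density (assumed form, see lead-in)."""
    z = (x - lam) / alpha
    return (1.0 + z * z) ** (-m) * math.exp(-nu * math.atan(z))


def type4_integral(m, nu, alpha=1.0, steps=20000):
    """Integral of the unnormalized density over the real line.

    Uses x = lam + alpha*tan(theta) and a midpoint rule on (-pi/2, pi/2).
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / 2.0 + (k + 0.5) * h
        total += math.cos(theta) ** (2.0 * m - 2.0) * math.exp(-nu * theta)
    return alpha * total * h


def beta_fn(a, b):
    """Beta function B(a, b), computed through log-Gamma for stability."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


# With nu = 0 the integral should reduce to alpha * B(m - 1/2, 1/2).
symmetric_integral = type4_integral(m=3.0, nu=0.0, alpha=2.0)
closed_form = 2.0 * beta_fn(2.5, 0.5)
```

The reciprocal of `type4_integral(m, nu, alpha)` plays the role of the normalizing constant; for ν = 0 it agrees with the closed form that appears in the symmetric special case.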

The Pearson type VII distribution

The shape parameter ν of the Pearson type IV distribution controls its skewness. If we fix its value at zero, we obtain a symmetric three-parameter family. This special case is known as the Pearson type VII distribution. Its density is
$$p(x) = \frac{1}{\alpha\,\mathrm{B}\left(m - \frac{1}{2}, \frac{1}{2}\right)}\left[1 + \left(\frac{x - \lambda}{\alpha}\right)^2\right]^{-m},$$
where B is the Beta function.
An alternative parameterization of the type VII distribution is obtained by letting
$$\alpha = \sigma\sqrt{2m - 3},$$
which requires m > 3/2. This entails a minor loss of generality but ensures that the variance of the distribution exists and is equal to σ². Now the parameter m only controls the kurtosis of the distribution. If m approaches infinity as λ and σ are held constant, the normal distribution arises as a special case:
$$\lim_{m\to\infty}\frac{1}{\sigma\sqrt{2m - 3}\,\mathrm{B}\left(m - \frac{1}{2}, \frac{1}{2}\right)}\left[1 + \left(\frac{x - \lambda}{\sigma\sqrt{2m - 3}}\right)^2\right]^{-m} = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{x - \lambda}{\sigma}\right)^2\right].$$
This is the density of a normal distribution with mean λ and standard deviation σ.
It is convenient to require that m > 5/2 and to let
$$m = \frac{5}{2} + \frac{3}{\gamma_2}.$$
This is another specialization, and it guarantees that the first four moments of the distribution exist. More specifically, the Pearson type VII distribution parameterized in terms of (λ, σ, γ₂) has a mean of λ, standard deviation of σ, skewness of zero, and excess kurtosis of γ₂.
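The convergence to the normal distribution can be observed numerically. The following sketch assumes the type VII density 1/(α·B(m − 1/2, 1/2)) · [1 + ((x − λ)/α)²]^(−m) with α = σ·√(2m − 3), and evaluates it in log space so that large m does not overflow. The function names are hypothetical.

```python
import math


def type7_pdf(x, lam, sigma, m):
    """Pearson type VII density in the (lam, sigma, m) parameterization, m > 3/2."""
    alpha = sigma * math.sqrt(2.0 * m - 3.0)
    # log of alpha * B(m - 1/2, 1/2), via log-Gamma to stay finite for large m
    log_norm = (math.log(alpha) + math.lgamma(m - 0.5)
                + math.lgamma(0.5) - math.lgamma(m))
    z = (x - lam) / alpha
    return math.exp(-log_norm - m * math.log1p(z * z))


def normal_pdf(x, lam, sigma):
    """Density of the normal distribution with mean lam and std. deviation sigma."""
    z = (x - lam) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# The gap to the normal density shrinks as m grows (x, lam, sigma held fixed).
gaps = [abs(type7_pdf(0.7, 0.0, 1.0, m) - normal_pdf(0.7, 0.0, 1.0))
        for m in (3.0, 30.0, 300.0, 5000.0)]
```

Small values of m give visibly heavier tails than the normal density, while for m in the thousands the two are nearly indistinguishable at moderate x.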

Student's t-distribution

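The Pearson type VII distribution is equivalent to the non-standardized Student's t-distribution with parameters ν > 0, μ, σ² by applying the following substitutions to its original parameterization:
$$\lambda = \mu, \qquad \alpha = \sqrt{\nu\sigma^2}, \qquad m = \frac{\nu + 1}{2}.$$
Observe that the constraint m > 1/2 is satisfied.
The resulting density is
$$p(x \mid \mu, \sigma^2, \nu) = \frac{1}{\sqrt{\nu\sigma^2}\,\mathrm{B}\left(\frac{\nu}{2}, \frac{1}{2}\right)}\left(1 + \frac{1}{\nu}\,\frac{(x - \mu)^2}{\sigma^2}\right)^{-\frac{\nu + 1}{2}},$$
which is easily recognized as the density of a Student's t-distribution.
This implies that the Pearson type VII distribution subsumes the standard Student's t-distribution and also the standard Cauchy distribution. In particular, the standard Student's t-distribution arises as a subcase, when μ = 0 and σ² = 1, equivalent to the following substitutions:
$$\lambda = 0, \qquad \alpha = \sqrt{\nu}, \qquad m = \frac{\nu + 1}{2}.$$
The density of this restricted one-parameter family is a standard Student's t:
$$p(x) = \frac{1}{\sqrt{\nu}\,\mathrm{B}\left(\frac{\nu}{2}, \frac{1}{2}\right)}\left(1 + \frac{x^2}{\nu}\right)^{-\frac{\nu + 1}{2}}.$$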
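The equivalence can be verified numerically. This sketch assumes the type VII density in its original (λ, α, m) parameterization, the substitutions λ = μ, α = √(νσ²), m = (ν + 1)/2, and the textbook Gamma-function form of the Student's t density; the function names are hypothetical.

```python
import math


def type7_pdf(x, lam, alpha, m):
    """Pearson type VII density in the original (lam, alpha, m) parameterization."""
    beta = math.exp(math.lgamma(m - 0.5) + math.lgamma(0.5) - math.lgamma(m))
    z = (x - lam) / alpha
    return (1.0 + z * z) ** (-m) / (alpha * beta)


def student_t_pdf(x, nu, mu=0.0, sigma=1.0):
    """Non-standardized Student's t density, textbook Gamma-function form."""
    c = math.gamma((nu + 1.0) / 2.0) / (
        math.gamma(nu / 2.0) * math.sqrt(nu * math.pi) * sigma)
    z = (x - mu) / sigma
    return c * (1.0 + z * z / nu) ** (-(nu + 1.0) / 2.0)


def type7_as_student_t(x, nu, mu=0.0, sigma=1.0):
    """Type VII density after substituting lam = mu, alpha = sqrt(nu*sigma^2), m = (nu+1)/2."""
    return type7_pdf(x, lam=mu, alpha=math.sqrt(nu * sigma ** 2), m=(nu + 1.0) / 2.0)


# Compare both forms at an arbitrary point for nu = 4, mu = 1, sigma = 2.
difference = abs(type7_as_student_t(0.3, 4.0, 1.0, 2.0)
                 - student_t_pdf(0.3, 4.0, 1.0, 2.0))
```

With ν = 1, μ = 0 and σ = 1 the same substitution yields the standard Cauchy density 1/(π(1 + x²)).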

Case 2, non-negative discriminant

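If the quadratic function (2) has a non-negative discriminant (b₁² − 4b₂b₀ ≥ 0), it has real roots a₁ and a₂ (not necessarily distinct):
$$a_1 = \frac{-b_1 - \sqrt{b_1^2 - 4 b_2 b_0}}{2 b_2}, \qquad a_2 = \frac{-b_1 + \sqrt{b_1^2 - 4 b_2 b_0}}{2 b_2}.$$
In the presence of real roots the quadratic function (2) can be written as
$$f(x) = b_2 (x - a_1)(x - a_2),$$
and the solution to the differential equation is therefore
$$p(x) \propto \exp\left(-\frac{1}{b_2}\int \frac{x + a}{(x - a_1)(x - a_2)}\,dx\right).$$
Pearson called this the "logarithmic case", because the integral
$$\int \frac{x + a}{(x - a_1)(x - a_2)}\,dx = \frac{(a_1 + a)\ln(x - a_1) - (a_2 + a)\ln(x - a_2)}{a_1 - a_2} + C$$
involves only the logarithm function and not the arctan function as in the previous case.
Using the substitution
$$\nu = \frac{1}{b_2 (a_1 - a_2)},$$
we obtain the following solution to the differential equation (1):
$$p(x) \propto (x - a_1)^{-\nu (a_1 + a)}\,(x - a_2)^{\nu (a_2 + a)}.$$
Since this density is only known up to a hidden constant of proportionality, that constant can be changed and the density written as follows:
$$p(x) \propto \left(1 - \frac{x}{a_1}\right)^{-\nu (a_1 + a)}\left(1 - \frac{x}{a_2}\right)^{\nu (a_2 + a)}.$$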

The Pearson type I distribution

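The Pearson type I distribution (a generalization of the beta distribution) arises when the roots of the quadratic equation (2) are of opposite sign, that is, a₁ < 0 < a₂. Then the solution p is supported on the interval (a₁, a₂). Apply the substitution
$$x = a_1 + y\,(a_2 - a_1),$$
where 0 < y < 1, which yields a solution in terms of y that is supported on the interval (0, 1):
$$p(y) \propto \left(\frac{a_1 - a_2}{a_1}\,y\right)^{-\nu (a_1 + a)}\left(\frac{a_2 - a_1}{a_2}\,(1 - y)\right)^{\nu (a_2 + a)}, \qquad \nu = \frac{1}{b_2 (a_1 - a_2)}.$$
One may define:
$$m_1 = -\frac{a + a_1}{b_2 (a_1 - a_2)}, \qquad m_2 = \frac{a + a_2}{b_2 (a_1 - a_2)}.$$
Regrouping constants and parameters, this simplifies to:
$$p(y) \propto y^{m_1} (1 - y)^{m_2}.$$
Thus (x − λ − a₁)/(a₂ − a₁) follows a beta distribution Beta(m₁ + 1, m₂ + 1) with λ = μ₁ − (a₂ − a₁)(m₁ + 1)/(m₁ + m₂ + 2) − a₁, where μ₁ denotes the mean of x. It turns out that m₁, m₂ > −1 is necessary and sufficient for p to be a proper probability density function.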
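A minimal sketch of the resulting density on its original support follows. It assumes the kernel y^m₁ (1 − y)^m₂ on (0, 1) with y = (x − a₁)/(a₂ − a₁), normalized by the Beta function B(m₁ + 1, m₂ + 1); the function name and the parameter values are hypothetical.

```python
import math


def type1_pdf(x, a1, a2, m1, m2):
    """Pearson type I density on (a1, a2) with exponents m1, m2 > -1.

    Assumes the kernel y**m1 * (1 - y)**m2 with y = (x - a1) / (a2 - a1).
    """
    if not a1 < x < a2:
        return 0.0
    y = (x - a1) / (a2 - a1)
    # log B(m1 + 1, m2 + 1), the normalizing constant of the kernel on (0, 1)
    log_beta = (math.lgamma(m1 + 1.0) + math.lgamma(m2 + 1.0)
                - math.lgamma(m1 + m2 + 2.0))
    log_kernel = m1 * math.log(y) + m2 * math.log(1.0 - y)
    # Dividing by (a2 - a1) accounts for the change of variable from y to x.
    return math.exp(log_kernel - log_beta) / (a2 - a1)


# Illustrative parameters: support (-1, 2), exponents m1 = 1.5 and m2 = 0.5.
value_at_one = type1_pdf(1.0, a1=-1.0, a2=2.0, m1=1.5, m2=0.5)
```

Sampling can reuse the same change of variable: draw y from a Beta(m₁ + 1, m₂ + 1) distribution, for instance with `random.betavariate`, and return a₁ + y(a₂ − a₁).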

The Pearson type II distribution

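The Pearson type II distribution is a special case of the Pearson type I family restricted to symmetric distributions.
For the Pearson Type II Curve,
$$y = y_0 \left(1 - \frac{x^2}{a^2}\right)^m,$$
where
$$x = \sum d^2/2 - (n^3 - n)/12.$$
The ordinate, y, is the frequency of Σd², where d is the difference between the two ranks of an item and n is the number of items. The Pearson Type II Curve is used in computing the table of significant correlation coefficients for Spearman's rank correlation coefficient when the number of items in a series is less than 100. After that, the distribution mimics a standard Student's t-distribution. For the table of values, certain values are used as the constants in the previous equation:
$$m = \frac{5\beta_2 - 9}{2(3 - \beta_2)}, \qquad a^2 = \frac{2\mu_2\beta_2}{3 - \beta_2}, \qquad y_0 = \frac{N\,\Gamma(2m + 2)}{a\,2^{2m + 1}\,[\Gamma(m + 1)]^2},$$
where N is the total frequency.
The moments of x used are
$$\mu_2 = (n - 1)\left[\frac{n^2 + n}{12}\right]^2, \qquad \beta_2 = \frac{3\,(25n^4 - 13n^3 - 73n^2 + 37n + 72)}{25n\,(n + 1)^2 (n - 1)}.$$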
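The constants of the Type II curve can be computed directly from n. The sketch below rests on assumptions stated here rather than taken from a verified table: the curve is y = y₀(1 − x²/a²)^m, its constants follow from matching the variance μ₂ and kurtosis β₂ of x, and μ₂ and β₂ are the null-distribution moments of x = Σd²/2 − (n³ − n)/12 for n items. The function names are hypothetical, and N = 1 normalizes the curve to unit area.

```python
import math


def spearman_type2_constants(n, total=1.0):
    """Return (m, a, y0) of the Type II curve for n ranked items.

    Assumed moments of x: mu2 = (n-1)*((n^2+n)/12)^2 and the beta2 below.
    """
    mu2 = (n - 1) * ((n * n + n) / 12.0) ** 2
    beta2 = (3.0 * (25 * n**4 - 13 * n**3 - 73 * n**2 + 37 * n + 72)
             / (25.0 * n * (n + 1) ** 2 * (n - 1)))
    m = (5.0 * beta2 - 9.0) / (2.0 * (3.0 - beta2))
    a = math.sqrt(2.0 * mu2 * beta2 / (3.0 - beta2))
    # y0 = N * Gamma(2m+2) / (a * 2^(2m+1) * Gamma(m+1)^2), evaluated in log space
    log_y0 = (math.log(total) + math.lgamma(2.0 * m + 2.0) - math.log(a)
              - (2.0 * m + 1.0) * math.log(2.0) - 2.0 * math.lgamma(m + 1.0))
    return m, a, math.exp(log_y0)


def type2_curve(x, m, a, y0):
    """Ordinate of the Type II curve; zero outside (-a, a)."""
    if abs(x) >= a:
        return 0.0
    return y0 * (1.0 - (x / a) ** 2) ** m


m10, a10, y0_10 = spearman_type2_constants(10)
peak = type2_curve(0.0, m10, a10, y0_10)
```

Because the curve is fitted by moments, it is a continuous approximation to the discrete distribution of Σd², not the exact distribution.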

The Pearson type III distribution

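Defining
$$\lambda = \mu_1 + \frac{b_0}{b_1} - (m + 1)\,b_1,$$
the quantity b₀ + b₁(x − λ) follows a gamma distribution Gamma(m + 1, b₁²) with shape m + 1 and scale b₁². The Pearson type III distribution is a gamma distribution (with a location shift) or chi-squared distribution.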

The Pearson type V distribution

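Defining new parameters:
$$C_1 = \frac{b_1}{2 b_2}, \qquad \lambda = \mu_1 - \frac{C_1 - a}{1 - 2 b_2},$$
x − λ follows an inverse-gamma distribution InverseGamma(1/b₂ − 1, (C₁ − a)/b₂). The Pearson type V distribution is an inverse-gamma distribution.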

The Pearson type VI distribution

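Defining
$$\lambda = \mu_1 + (a_2 - a_1)\,\frac{m_2 + 1}{m_2 + m_1 + 2} - a_2,$$
(x − λ − a₂)/(a₂ − a₁) follows a beta prime distribution β′(m₂ + 1, −m₂ − m₁ − 1), with m₁ and m₂ defined as for the type I distribution. The Pearson type VI distribution is a beta prime distribution or F-distribution.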

Relation to other distributions

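The Pearson family subsumes the following distributions, among others:
Beta distribution (type I)
Beta prime distribution (type VI)
Cauchy distribution (type VII)
Chi-squared distribution (type III)
F-distribution (type VI)
Gamma distribution (type III)
Inverse-gamma distribution (type V)
Normal distribution (limiting case of types I, III, IV, V, and VI)
Student's t-distribution (type VII)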

These models are used in financial markets, given their ability to be parametrised in a way that has intuitive meaning for market traders. A number of models are in current use that capture the stochastic nature of the volatility of rates, stocks, etc., and this family of distributions may prove to be one of the more important.
In the United States, the Log-Pearson III is the default distribution for flood frequency analysis.
More recently, more flexible alternatives to the Pearson distributions have been developed, notably the metalog distributions.

Secondary sources

Milton Abramowitz and Irene A. Stegun. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. National Bureau of Standards.
Eric W. Weisstein et al. . From MathWorld.
