Generalized chi-squared distribution


In probability theory and statistics, the generalized chi-squared distribution is the distribution of a linear sum of independent noncentral chi-squared variables and a normal variable or, equivalently, of a quadratic form of a multivariate normal vector. It is a generalization of the noncentral chi-squared distribution. The same term is sometimes used for several other generalizations; some of them, for example the gamma distribution, are special cases of the family discussed here.

Definition

The generalized chi-squared variable may be described in multiple ways. One is to write it as a linear sum of independent noncentral chi-squared variables $y_i$ and a normal variable $x$:

$$\xi = \sum_i w_i y_i + x, \qquad y_i \sim \chi'^2(k_i, \lambda_i), \qquad x \sim N(m, s^2).$$

Here the parameters are the weights $w_i$, the degrees of freedom $k_i$ and non-centralities $\lambda_i$ of the constituent chi-squares, and the mean $m$ and standard deviation $s$ of the normal term. Some important special cases have all weights of the same sign, omit the normal term, or have central chi-squared components.
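The linear-sum description can be turned directly into a sampler. The following is a minimal sketch using only the Python standard library, not one of the published algorithms; the function names and the parameter names `w`, `k`, `lam`, `m`, `s` are illustrative assumptions, and the degrees of freedom are restricted to positive integers so that each noncentral chi-squared term can be built from squared normals.

```python
import math
import random


def sample_noncentral_chi2(k, lam, rng):
    """One noncentral chi-squared draw with integer k >= 1 degrees of freedom
    and noncentrality lam >= 0, built as a sum of k squared unit-variance
    normals whose means have squared norm lam."""
    # Only the squared norm of the mean vector matters, so all of the
    # noncentrality is placed on the first normal.
    total = rng.gauss(math.sqrt(lam), 1.0) ** 2
    for _ in range(k - 1):
        total += rng.gauss(0.0, 1.0) ** 2
    return total


def sample_gen_chi2(w, k, lam, m, s, rng):
    """One draw of xi = sum_i w_i * y_i + x, with x ~ N(m, s^2)."""
    xi = rng.gauss(m, s)
    for w_i, k_i, lam_i in zip(w, k, lam):
        xi += w_i * sample_noncentral_chi2(k_i, lam_i, rng)
    return xi


def gen_chi2_mean(w, k, lam, m):
    """Exact mean, using E[y_i] = k_i + lam_i for a noncentral chi-squared."""
    return m + sum(w_i * (k_i + lam_i) for w_i, k_i, lam_i in zip(w, k, lam))


rng = random.Random(0)
w, k, lam, m, s = [1.0, -2.0], [2, 3], [0.5, 1.0], 0.3, 1.5
draws = [sample_gen_chi2(w, k, lam, m, s, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
exact_mean = gen_chi2_mean(w, k, lam, m)
print(sample_mean, exact_mean)
```

The example mixes a positive and a negative weight, so the resulting variable takes values on the whole real line; comparing the sample mean with the exact mean is only a basic sanity check on the sampler.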
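Another is to formulate it as a quadratic form of a normal vector $\boldsymbol{x}$:

$$\xi = q(\boldsymbol{x}) = \boldsymbol{x}' \mathbf{Q_2} \boldsymbol{x} + \boldsymbol{q_1}' \boldsymbol{x} + q_0.$$

Here $\mathbf{Q_2}$ is a matrix, $\boldsymbol{q_1}$ is a vector, and $q_0$ is a scalar. These, together with the mean $\boldsymbol{\mu}$ and covariance matrix $\mathbf{\Sigma}$ of the normal vector $\boldsymbol{x}$, parameterize the distribution. If $\mathbf{Q_2}$ in this formulation is positive-definite, all the weights $w_i$ in the linear-sum formulation will have the same sign.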
For the most general case, a reduction towards a common standard form can be made by using a representation of the following form:

$$X = (z + a)^{\mathrm T} A (z + a) + c^{\mathrm T} z = (x + b)^{\mathrm T} D (x + b) + d^{\mathrm T} x + e,$$

where $D$ is a diagonal matrix and where $x$ represents a vector of uncorrelated standard normal random variables.
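The quadratic-form description can be illustrated in the same way. The sketch below, restricted to two dimensions so that it needs only the standard library, draws a correlated normal vector, evaluates a quadratic form of it, and compares the sample mean with the standard identity E[x'Qx + q'x + c] = tr(QΣ) + μ'Qμ + q'μ + c. The names `Q2`, `q1`, `q0`, `mu` and `sigma`, and all numerical values, are assumptions made for the example.

```python
import math
import random


def cholesky_2x2(sigma):
    """Entries (l11, l21, l22) of lower-triangular L with L L^T = sigma."""
    l11 = math.sqrt(sigma[0][0])
    l21 = sigma[1][0] / l11
    l22 = math.sqrt(sigma[1][1] - l21 * l21)
    return l11, l21, l22


def sample_normal_2d(mu, chol, rng):
    """One draw from N(mu, sigma), given the Cholesky entries of sigma."""
    l11, l21, l22 = chol
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return [mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2]


def quad_form(x, Q2, q1, q0):
    """Evaluate x' Q2 x + q1' x + q0 for a 2-vector x."""
    quad = sum(x[i] * Q2[i][j] * x[j] for i in range(2) for j in range(2))
    lin = sum(q1[i] * x[i] for i in range(2))
    return quad + lin + q0


def quad_form_mean(mu, sigma, Q2, q1, q0):
    """Exact mean: trace(Q2 sigma) plus the form evaluated at mu."""
    trace_term = sum(Q2[i][j] * sigma[j][i] for i in range(2) for j in range(2))
    return trace_term + quad_form(mu, Q2, q1, q0)


mu = [1.0, -1.0]
sigma = [[2.0, 0.5], [0.5, 1.0]]
Q2 = [[1.0, 0.5], [0.5, -1.0]]   # indefinite, so weights of both signs arise
q1 = [0.5, -1.0]
q0 = 2.0

rng = random.Random(0)
chol = cholesky_2x2(sigma)
draws = [quad_form(sample_normal_2d(mu, chol, rng), Q2, q1, q0)
         for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
exact_mean = quad_form_mean(mu, sigma, Q2, q1, q0)
print(sample_mean, exact_mean)
```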

Probability density and cumulative distribution functions

The probability density and cumulative distribution functions of a generalized chi-squared variable do not have simple closed-form expressions. However, numerical algorithms and computer code for evaluating them have been published.
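When a dedicated numerical routine is not at hand, a cumulative probability can be roughly approximated by simulation. The sketch below is plain Monte Carlo, not one of the published algorithms, and its helper name `empirical_cdf` is an assumption. It is checked on a special case with a known answer: a single central chi-squared term with 2 degrees of freedom, weight 1, and no normal term, whose CDF is 1 - exp(-x/2).

```python
import bisect
import math
import random


def empirical_cdf(sorted_draws, x):
    """Fraction of draws <= x; sorted_draws must be sorted in ascending order."""
    return bisect.bisect_right(sorted_draws, x) / len(sorted_draws)


rng = random.Random(1)
# Special case: one central chi-squared term with 2 degrees of freedom.
draws = sorted(rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2
               for _ in range(200_000))

x = 3.0
estimate = empirical_cdf(draws, x)
exact = 1.0 - math.exp(-x / 2.0)   # chi-squared(2) CDF at x
print(estimate, exact)
```

The accuracy of such an estimate improves only with the square root of the number of draws, so it is poorly suited to small tail probabilities.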

Applications

The generalized chi-squared is the distribution of statistical estimates in cases where the usual statistical theory does not hold. For example, if a predictive model is fitted by least squares, but the model errors have either autocorrelation or heteroscedasticity, then alternative models can be compared by relating changes in the sum of squares to an asymptotically valid generalized chi-squared distribution.

Classifying normal samples using Gaussian discriminant analysis

If $\boldsymbol{x}$ is a normal variable, its log likelihood is a quadratic form of $\boldsymbol{x}$, and is hence distributed as a generalized chi-squared. The log likelihood ratio that arises from one normal distribution versus another is also a quadratic form, so it too is distributed as a generalized chi-squared.
In Gaussian discriminant analysis, samples from normal distributions are optimally separated by using a quadratic classifier, a boundary that is a quadratic function. The classification error rates of different types are integrals of the normal distributions within the quadratic regions defined by this classifier. Since this is mathematically equivalent to integrating a quadratic form of a normal variable, the result is an integral of a generalized-chi-squared variable.
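In the univariate case the quadratic form can be written out explicitly. The sketch below computes coefficients `q2`, `q1`, `q0` such that the log likelihood ratio of N(mu_a, sigma_a^2) versus N(mu_b, sigma_b^2) equals q2 x^2 + q1 x + q0, and then estimates by simulation how often a sample from the first distribution falls on the wrong side of the boundary. All names and numerical values are assumptions for the example, and the Monte Carlo step stands in for a proper evaluation of the generalized chi-squared CDF.

```python
import math
import random


def log_normal_pdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    return (-0.5 * ((x - mu) / sigma) ** 2
            - math.log(sigma) - 0.5 * math.log(2.0 * math.pi))


def llr_coefficients(mu_a, sigma_a, mu_b, sigma_b):
    """(q2, q1, q0) with log p_a(x) - log p_b(x) = q2*x**2 + q1*x + q0."""
    q2 = 0.5 * (1.0 / sigma_b**2 - 1.0 / sigma_a**2)
    q1 = mu_a / sigma_a**2 - mu_b / sigma_b**2
    q0 = (0.5 * (mu_b**2 / sigma_b**2 - mu_a**2 / sigma_a**2)
          + math.log(sigma_b / sigma_a))
    return q2, q1, q0


mu_a, sigma_a, mu_b, sigma_b = 0.0, 1.0, 3.0, 2.0
q2, q1, q0 = llr_coefficients(mu_a, sigma_a, mu_b, sigma_b)

# Error rate for class a under equal priors: the probability that the
# quadratic form is negative when x is drawn from distribution a.
rng = random.Random(0)
n = 100_000
errors = 0
for _ in range(n):
    x = rng.gauss(mu_a, sigma_a)
    if q2 * x * x + q1 * x + q0 < 0.0:
        errors += 1
error_rate_a = errors / n
print(q2, q1, q0, error_rate_a)
```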

In signal processing

The following application arises in the context of Fourier analysis in signal processing, renewal theory in probability theory, and multi-antenna systems in wireless communication. The common factor of these areas is that the sum of exponentially distributed variables is of importance.
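If $Z_1, \ldots, Z_k$ are k independent, circularly symmetric complex Gaussian random variables with mean 0 and variance $\sigma_i^2$, then the random variable

$$\tilde{Q} = \sum_{i=1}^k |Z_i|^2$$

has a generalized chi-squared distribution of a particular form. The difference from the standard chi-squared distribution is that the $Z_i$ are complex and can have different variances, and the difference from the more general generalized chi-squared distribution is that the relevant scaling matrix A is diagonal. If $\sigma_i^2 = \mu$ for all i, then $\tilde{Q}$, scaled down by $\mu/2$ (i.e. multiplied by $2/\mu$), has a chi-squared distribution, $\chi^2(2k)$, also known as an Erlang distribution. If the $\sigma_i^2$ have distinct values for all i, then $\tilde{Q}$ has the pdf

$$f(x; k, \sigma_1^2, \ldots, \sigma_k^2) = \sum_{i=1}^k \frac{e^{-x/\sigma_i^2}}{\sigma_i^2 \prod_{j=1, j \neq i}^k \left(1 - \frac{\sigma_j^2}{\sigma_i^2}\right)} \quad \text{for } x \geq 0.$$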
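For the case of distinct variances, the density is a signed mixture of exponentials, one per variance, and is straightforward to evaluate. The sketch below implements that sum and compares its numerical integral with a simulation of the sum of squared magnitudes. The function names, the trapezoidal integrator and the variance values are assumptions for the example.

```python
import math
import random


def pdf_distinct(x, variances):
    """pdf of sum_i |Z_i|^2 when all variances sigma_i^2 are distinct."""
    if x < 0:
        return 0.0
    total = 0.0
    for i, v_i in enumerate(variances):
        denom = v_i
        for j, v_j in enumerate(variances):
            if j != i:
                denom *= 1.0 - v_j / v_i
        total += math.exp(-x / v_i) / denom
    return total


def sample_q(variances, rng):
    """One draw of sum_i |Z_i|^2; real and imaginary parts each get half
    of the variance of the circularly symmetric complex Gaussian."""
    total = 0.0
    for v in variances:
        sd = math.sqrt(v / 2.0)
        re, im = rng.gauss(0.0, sd), rng.gauss(0.0, sd)
        total += re * re + im * im
    return total


def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal steps."""
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        s += f(a + i * h)
    return s * h


variances = [1.0, 2.0, 4.0]
rng = random.Random(0)
n = 100_000
mc_prob = sum(sample_q(variances, rng) <= 5.0 for _ in range(n)) / n
pdf_prob = trapezoid(lambda t: pdf_distinct(t, variances), 0.0, 5.0, 500)
print(mc_prob, pdf_prob)
```

The individual terms of the sum alternate in sign and grow large when two variances are close, so this direct evaluation may lose precision for nearly equal variances.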
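If there are sets of repeated variances among the $\sigma_i^2$, assume that they are divided into M sets, each representing a certain variance value. Denote by $\mathbf{r} = (r_1, r_2, \ldots, r_M)$ the number of repetitions in each group. That is, the mth set contains $r_m$ variables that have variance $\sigma_m^2$. $\tilde{Q}$ then represents an arbitrary linear combination of independent $\chi^2$-distributed random variables with different degrees of freedom:

$$\tilde{Q} = \sum_{m=1}^M \frac{\sigma_m^2}{2} Q_m, \qquad Q_m \sim \chi^2(2 r_m).$$

The pdf of $\tilde{Q}$ is

$$f(x; \mathbf{r}, \sigma_1^2, \ldots, \sigma_M^2) = \prod_{m=1}^M \frac{1}{\sigma_m^{2 r_m}} \sum_{k=1}^M \sum_{l=1}^{r_k} \frac{\Psi_{k,l,\mathbf{r}}}{(r_k - l)!} (-x)^{r_k - l} e^{-x/\sigma_k^2} \quad \text{for } x \geq 0,$$

where

$$\Psi_{k,l,\mathbf{r}} = (-1)^{r_k - 1} \sum_{\mathbf{i} \in \Omega_{k,l}} \prod_{j \neq k} \binom{i_j + r_j - 1}{i_j} \left(\frac{1}{\sigma_j^2} - \frac{1}{\sigma_k^2}\right)^{-(r_j + i_j)},$$

with $\mathbf{i} = [i_1, \ldots, i_M]^{\mathrm T}$ from the set $\Omega_{k,l}$ of all partitions of $l - 1$ (with $i_k = 0$) defined as

$$\Omega_{k,l} = \left\{ [i_1, \ldots, i_M] \in \mathbb{Z}^M ; \ \sum_{j=1}^M i_j = l - 1, \ i_k = 0, \ i_j \geq 0 \text{ for all } j \right\}.$$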
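The density for repeated variances can be evaluated by enumerating the index set directly. The sketch below is a straightforward, unoptimized transcription, assuming the form of the formula in which the coefficient for group k and order l sums over all non-negative integer vectors with entries adding to l - 1 and a zero in position k. The function names are assumptions, and the group index `k` is zero-based in the code while `l` runs from 1 to `r[k]`.

```python
import math
from itertools import product


def omega(k, l, M):
    """Non-negative integer vectors of length M with sum l - 1 and i[k] == 0."""
    target = l - 1
    return [cand for cand in product(range(target + 1), repeat=M)
            if cand[k] == 0 and sum(cand) == target]


def psi(k, l, r, variances):
    """Coefficient Psi for group k (zero-based) and order l (1..r[k])."""
    M = len(r)
    total = 0.0
    for i in omega(k, l, M):
        term = 1.0
        for j in range(M):
            if j != k:
                diff = 1.0 / variances[j] - 1.0 / variances[k]
                term *= math.comb(i[j] + r[j] - 1, i[j]) * diff ** (-(r[j] + i[j]))
        total += term
    return (-1) ** (r[k] - 1) * total


def pdf_repeated(x, r, variances):
    """pdf of the sum when group m holds r[m] variables of variance variances[m]."""
    if x < 0:
        return 0.0
    prefactor = 1.0
    for r_m, v_m in zip(r, variances):
        prefactor /= v_m ** r_m
    total = 0.0
    for k in range(len(r)):
        for l in range(1, r[k] + 1):
            total += (psi(k, l, r, variances) / math.factorial(r[k] - l)
                      * (-x) ** (r[k] - l) * math.exp(-x / variances[k]))
    return prefactor * total


# With a single group the density should reduce to an Erlang density.
x = 1.5
single_group = pdf_repeated(x, [3], [2.0])
erlang = x**2 * math.exp(-x / 2.0) / (2.0**3 * math.factorial(2))
print(single_group, erlang)
```

The enumeration in `omega` grows quickly with the number of groups and the repetition counts, so this form is intended for small cases and for checking other implementations.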