Condition number


In the field of numerical analysis, the condition number of a function measures how much the output value of the function can change for a small change in the input argument. This is used to measure how sensitive a function is to changes or errors in the input, and how much error in the output results from an error in the input. Very frequently, one is solving the inverse problem: given $f(x) = y$, one is solving for $x$, and thus the condition number of the (local) inverse must be used. In linear regression the condition number of the moment matrix can be used as a diagnostic for multicollinearity.
The condition number is an application of the derivative, and is formally defined as the value of the asymptotic worst-case relative change in output for a relative change in input. The "function" is the solution of a problem and the "arguments" are the data in the problem. The condition number is frequently applied to questions in linear algebra, in which case the derivative is straightforward but the error could be in many different directions, and is thus computed from the geometry of the matrix. More generally, condition numbers can be defined for non-linear functions in several variables.
A problem with a low condition number is said to be well-conditioned, while a problem with a high condition number is said to be ill-conditioned. In non-mathematical terms, an ill-conditioned problem is one where, for a small change in the inputs, there is a large change in the answer or dependent variable. This means that the correct solution/answer to the equation becomes hard to find. The condition number is a property of the problem. Paired with the problem are any number of algorithms that can be used to solve the problem, that is, to calculate the solution. Some algorithms have a property called backward stability. In general, a backward stable algorithm can be expected to accurately solve well-conditioned problems. Numerical analysis textbooks give formulas for the condition numbers of problems and identify known backward stable algorithms.
As a rule of thumb, if the condition number $\kappa(A) = 10^k$, then you may lose up to $k$ digits of accuracy on top of what would be lost to the numerical method due to loss of precision from arithmetic methods. However, the condition number does not give the exact value of the maximum inaccuracy that may occur in the algorithm. It generally just bounds it with an estimate.
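As a rough illustration of this rule of thumb, the following Python sketch (NumPy assumed; the Hilbert matrix is merely a convenient, famously ill-conditioned test case, not one singled out by this article) solves $Ax = b$ with a known solution and compares the digits actually lost against $\log_{10} \kappa(A)$:

```python
import numpy as np

# Solve Ax = b for an ill-conditioned Hilbert matrix and compare the
# observed error with the rule of thumb "kappa ~ 10^k costs about k digits".
n = 10
A = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
x_true = np.ones(n)
b = A @ x_true

x_hat = np.linalg.solve(A, b)
kappa = np.linalg.cond(A)  # 2-norm condition number
rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)

print(f"kappa(A)       ~ 10^{np.log10(kappa):.1f}")
print(f"relative error ~ 10^{np.log10(rel_err):.1f}")
# Double precision carries ~16 significant decimal digits; about
# log10(kappa) of them are at risk. The observed error is consistent
# with (though not exactly equal to) that bound.
```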

General definition in the context of error analysis

Given a problem $f$ and an algorithm $\tilde{f}$ with an input $x$, the absolute error is $\|f(x) - \tilde{f}(x)\|$ and the relative error is $\|f(x) - \tilde{f}(x)\| / \|f(x)\|$.
In this context, the absolute condition number of a problem $f$ is
$$\lim_{\varepsilon \to 0} \sup_{\|\delta x\| \le \varepsilon} \frac{\|\delta f\|}{\|\delta x\|}$$
and the relative condition number is
$$\lim_{\varepsilon \to 0} \sup_{\|\delta x\| \le \varepsilon} \frac{\|\delta f\| / \|f(x)\|}{\|\delta x\| / \|x\|}.$$
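The supremum over small perturbations can be approximated crudely by sampling. The helper below, `relative_condition`, is a hypothetical illustration for scalar functions (not a library routine), assuming NumPy:

```python
import numpy as np

def relative_condition(f, x, eps=1e-8, trials=1000, seed=None):
    """Monte Carlo estimate of the relative condition number of a scalar
    function f at x: the worst observed ratio of relative output change
    to relative input change over small random perturbations."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(trials):
        dx = eps * abs(x) * rng.uniform(-1.0, 1.0)
        if dx == 0.0:
            continue
        ratio = (abs(f(x + dx) - f(x)) / abs(f(x))) / (abs(dx) / abs(x))
        worst = max(worst, ratio)
    return worst

# log is ill-conditioned near its zero at x = 1: kappa = |1/log(x)|.
print(relative_condition(np.log, 1.001))  # large, about 1000
print(relative_condition(np.exp, 1.0))    # about |x| = 1
```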

Matrices

For example, the condition number associated with the linear equation
Ax = b gives a bound on how inaccurate the solution x will be after approximation. Note that this is before the effects of round-off error are taken into account; conditioning is a property of the matrix, not the algorithm or floating-point accuracy of the computer used to solve the corresponding system. In particular, one should think of the condition number as being the rate at which the solution x will change with respect to a change in b. Thus, if the condition number is large, even a small error in b may cause a large error in x. On the other hand, if the condition number is small, then the error in x will not be much bigger than the error in b.
The condition number is defined more precisely to be the maximum ratio of the relative error in x to the relative error in b.
Let $e$ be the error in $b$. Assuming that $A$ is a nonsingular matrix, the error in the solution $A^{-1}b$ is $A^{-1}e$. The ratio of the relative error in the solution to the relative error in $b$ is
$$\frac{\|A^{-1}e\| / \|A^{-1}b\|}{\|e\| / \|b\|} = \frac{\|A^{-1}e\|}{\|e\|} \cdot \frac{\|b\|}{\|A^{-1}b\|}.$$
The maximum value (for nonzero $b$ and $e$) is then seen to be the product of the two operator norms as follows:
$$\kappa(A) = \max_{e,\,b \ne 0} \left\{ \frac{\|A^{-1}e\|}{\|e\|} \cdot \frac{\|b\|}{\|A^{-1}b\|} \right\} = \max_{e \ne 0} \frac{\|A^{-1}e\|}{\|e\|} \cdot \max_{b \ne 0} \frac{\|b\|}{\|A^{-1}b\|} = \|A^{-1}\| \, \|A\|.$$
The same definition is used for any consistent norm, i.e. one that satisfies
$$\kappa(A) = \|A^{-1}\| \, \|A\| \ge \|A^{-1} A\| = 1.$$
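A quick numerical check of this identity, in Python with NumPy (the $2 \times 2$ test matrix is an arbitrary ill-conditioned example; any nonsingular matrix would do):

```python
import numpy as np

# Verify kappa(A) = ||A^-1|| * ||A|| >= 1 for several induced norms.
A = np.array([[4.1, 2.8],
              [9.7, 6.6]])
A_inv = np.linalg.inv(A)

for p in (1, 2, np.inf):
    kappa = np.linalg.norm(A, p) * np.linalg.norm(A_inv, p)
    print(f"p = {p}: kappa = {kappa:.4g}  "
          f"(np.linalg.cond: {np.linalg.cond(A, p):.4g})")
# Every value is >= 1, and the explicit product of norms matches
# NumPy's built-in condition number.
```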
When the condition number is exactly one, then a solution algorithm can find an approximation of the solution whose precision is no worse than that of the data.
However, it does not mean that the algorithm will converge rapidly to this solution, just that it will not diverge arbitrarily because of inaccuracy on the source data, provided that the forward error introduced by the algorithm does not diverge as well because of accumulating intermediate rounding errors.
The condition number may also be infinite, but this implies that the problem is ill-posed, and no algorithm can be expected to reliably find a solution.
The definition of the condition number depends on the choice of norm, as can be illustrated by two examples.
If $\|\cdot\|$ is the norm defined in the square-summable sequence space $\ell^2$ (which matches the usual Euclidean distance on $\mathbb{R}^n$), then
$$\kappa(A) = \frac{\sigma_{\max}(A)}{\sigma_{\min}(A)},$$
where $\sigma_{\max}(A)$ and $\sigma_{\min}(A)$ are the maximal and minimal singular values of $A$, respectively. Hence, if $A$ is normal, then $\kappa(A) = |\lambda_{\max}(A)| / |\lambda_{\min}(A)|$, where $\lambda_{\max}(A)$ and $\lambda_{\min}(A)$ are the maximal and minimal (by moduli) eigenvalues of $A$; and if $A$ is unitary, then $\kappa(A) = 1$.
The condition number with respect to $\ell^2$ arises so often in numerical linear algebra that it is given a name, the condition number of a matrix.
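For instance, in Python with NumPy, the 2-norm condition number can be read directly off the singular values (the random matrix below is just a placeholder input):

```python
import numpy as np

# The 2-norm condition number is the ratio of extreme singular values.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

s = np.linalg.svd(A, compute_uv=False)  # singular values, descending
print(s[0] / s[-1])                     # sigma_max / sigma_min
print(np.linalg.cond(A, 2))             # the same number
```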
If $\|\cdot\|$ is the norm defined in the sequence space $\ell^\infty$ of all bounded sequences, and $A$ is lower triangular and non-singular (i.e., $a_{ii} \ne 0$ for all $i$), then
$$\kappa(A) \ge \frac{\max_i (|a_{ii}|)}{\min_i (|a_{ii}|)}.$$
The condition number computed with this norm is generally larger than the condition number computed with square-summable sequences, but it can be evaluated more easily.
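A minimal sketch of this bound in Python with NumPy (the triangular matrix is an arbitrary example): the diagonal ratio is cheap to evaluate, needs no inverse or SVD, and never exceeds the true $\infty$-norm condition number.

```python
import numpy as np

# For a nonsingular lower-triangular A, max|a_ii| / min|a_ii| is a
# cheap lower bound on the infinity-norm condition number.
A = np.array([[2.0, 0.0,  0.0],
              [1.0, 0.5,  0.0],
              [3.0, 2.0, 10.0]])

d = np.abs(np.diag(A))
print(d.max() / d.min())          # cheap lower bound: 20
print(np.linalg.cond(A, np.inf))  # true infinity-norm condition number: 45
```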
If the condition number is not too much larger than one, the matrix is well-conditioned, which means that its inverse can be computed with good accuracy. If the condition number is very large, then the matrix is said to be ill-conditioned. Practically, such a matrix is almost singular, and the computation of its inverse or the solution of a linear system of equations is prone to large numerical errors. A matrix that is not invertible has condition number equal to infinity.

Nonlinear

Condition numbers can also be defined for nonlinear functions, and can be computed using calculus. The condition number varies with the point; in some cases one can use the maximum condition number over the domain of the function or domain of the question as an overall condition number, while in other cases the condition number at a particular point is of more interest.

One variable

The condition number of a differentiable function $f$ in one variable as a function is $\left| \frac{xf'}{f} \right|$. Evaluated at a point $x$, this is
$$\left| \frac{x f'(x)}{f(x)} \right|.$$
Most elegantly, this can be understood as (the absolute value of) the ratio of the logarithmic derivative of $f$, which is $(\log f)' = f'/f$, to the logarithmic derivative of $x$, which is $(\log x)' = 1/x$, yielding a ratio of $\frac{f'/f}{1/x} = \frac{xf'}{f}$. This is because the logarithmic derivative is the infinitesimal rate of relative change in a function: it is the derivative $f'$ scaled by the value of $f$. Note that if a function has a zero at a point, its condition number at the point is infinite, as infinitesimal changes in the input can change the output from zero to positive or negative, yielding a ratio with zero in the denominator, hence infinite relative change.
More directly, given a small change $\Delta x$ in $x$, the relative change in $x$ is $[(x + \Delta x) - x]/x = (\Delta x)/x$, while the relative change in $f(x)$ is $[f(x + \Delta x) - f(x)]/f(x)$. Taking the ratio yields
$$\frac{[f(x + \Delta x) - f(x)]/f(x)}{(\Delta x)/x} = \frac{x}{f(x)} \cdot \frac{f(x + \Delta x) - f(x)}{\Delta x}.$$
The last term is the difference quotient, and taking the limit yields the derivative.
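The analytic formula and the difference-quotient ratio can be compared numerically; a short Python sketch (NumPy assumed, with $f = \ln$ evaluated near its troublesome zero at $x = 1$ as the example):

```python
import numpy as np

# Compare the analytic condition number |x f'(x) / f(x)| with the
# relative-change ratio built from a small finite difference.
f, fprime = np.log, lambda x: 1.0 / x
x, dx = 1.01, 1e-9

analytic = abs(x * fprime(x) / f(x))
finite = abs((f(x + dx) - f(x)) / f(x)) / abs(dx / x)
print(analytic, finite)  # both about 100: ln is ill-conditioned near x = 1
```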
Condition numbers of common elementary functions are particularly important in computing significant figures and can be computed immediately from the derivative; see significance arithmetic of transcendental functions. A few important ones are given below:
Name                       | Symbol        | Condition number
Addition / subtraction     | $x + a$       | $\left| \frac{x}{x + a} \right|$
Scalar multiplication      | $ax$          | $1$
Division                   | $1/x$         | $1$
Polynomial                 | $x^n$         | $|n|$
Exponential function       | $e^x$         | $|x|$
Natural logarithm function | $\ln(x)$      | $\left| \frac{1}{\ln(x)} \right|$
Sine function              | $\sin(x)$     | $|x \cot(x)|$
Cosine function            | $\cos(x)$     | $|x \tan(x)|$
Tangent function           | $\tan(x)$     | $|x (\tan(x) + \cot(x))|$
Inverse sine function      | $\arcsin(x)$  | $\frac{x}{\sqrt{1 - x^2} \arcsin(x)}$
Inverse cosine function    | $\arccos(x)$  | $\frac{|x|}{\sqrt{1 - x^2} \arccos(x)}$
Inverse tangent function   | $\arctan(x)$  | $\frac{x}{(1 + x^2) \arctan(x)}$
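Any entry in the table can be spot-checked against a finite-difference estimate; a small Python sketch (NumPy assumed, with `cond_fd` as an illustrative helper and $x = 0.7$ an arbitrary test point):

```python
import numpy as np

def cond_fd(f, x, dx=1e-8):
    """Finite-difference estimate of the relative condition number."""
    return abs((f(x + dx) - f(x)) / f(x)) / abs(dx / x)

x = 0.7
print(cond_fd(np.exp, x), abs(x))                # e^x: kappa = |x|
print(cond_fd(np.tan, x),
      abs(x * (np.tan(x) + 1.0 / np.tan(x))))    # tan: |x(tan x + cot x)|
```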

Several variables

Condition numbers can be defined for any function mapping its data from some domain into some codomain, where both the domain and codomain are Banach spaces. They express how sensitive that function is to small changes in its arguments. This is crucial in assessing the sensitivity and potential accuracy difficulties of numerous computational problems, for example, polynomial root finding or computing eigenvalues.
The condition number of $f$ at a point $x$ (specifically, its relative condition number) is then defined to be the maximum ratio of the fractional change in $f(x)$ to any fractional change in $x$, in the limit where the change $\delta x$ in $x$ becomes infinitesimally small:
$$\kappa(x) = \lim_{\varepsilon \to 0^+} \sup_{\|\delta x\| \le \varepsilon} \left[ \left. \frac{\|f(x + \delta x) - f(x)\|}{\|f(x)\|} \right/ \frac{\|\delta x\|}{\|x\|} \right],$$
where $\|\cdot\|$ is a norm on the domain/codomain of $f$.
If $f$ is differentiable, this is equivalent to:
$$\kappa(x) = \frac{\|J(x)\|}{\|f(x)\| / \|x\|},$$
where $J(x)$ denotes the Jacobian matrix of partial derivatives of $f$ at $x$, and $\|J(x)\|$ is the induced norm on the matrix.
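A minimal sketch of this formula in Python with NumPy (the map $f$ and the evaluation point are illustrative choices, not from the source); subtraction of nearly equal inputs is the classic ill-conditioned case:

```python
import numpy as np

# Relative condition number of a differentiable map via its Jacobian,
# kappa(x) = ||J(x)|| / (||f(x)|| / ||x||), using 2-norms throughout.
def f(x):
    return np.array([x[0] - x[1]])

def jacobian(x):
    return np.array([[1.0, -1.0]])

x = np.array([3.0, 2.9])  # nearly equal inputs -> catastrophic cancellation
kappa = (np.linalg.norm(jacobian(x), 2)
         / (np.linalg.norm(f(x)) / np.linalg.norm(x)))
print(kappa)  # about 59: subtracting nearby numbers is ill-conditioned
```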