Moore–Penrose inverse

In mathematics, and in particular linear algebra, the Moore–Penrose inverse of a matrix is the most widely known generalization of the inverse matrix. It was independently described by E. H. Moore in 1920, Arne Bjerhammar in 1951, and Roger Penrose in 1955. Earlier, Erik Ivar Fredholm had introduced the concept of a pseudoinverse of integral operators in 1903. When referring to a matrix, the term pseudoinverse, without further specification, is often used to indicate the Moore–Penrose inverse. The term generalized inverse is sometimes used as a synonym for pseudoinverse.
A common use of the pseudoinverse is to compute a "best fit" solution to a system of linear equations that lacks a unique solution.
Another use is to find the minimum norm solution to a system of linear equations with multiple solutions. The pseudoinverse facilitates the statement and proof of results in linear algebra.
The pseudoinverse is defined and unique for all matrices whose entries are real or complex numbers. It can be computed using the singular value decomposition.

Notation

In the following discussion, the following conventions are adopted.

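$\mathbb{K}$ will denote one of the fields of real or complex numbers, denoted $\mathbb{R}$ and $\mathbb{C}$, respectively. The vector space of $m \times n$ matrices over $\mathbb{K}$ is denoted by $\mathbb{K}^{m \times n}$.
For $A \in \mathbb{K}^{m \times n}$, $A^\mathsf{T}$ and $A^*$ denote the transpose and Hermitian (conjugate) transpose, respectively. If $\mathbb{K} = \mathbb{R}$, then $A^* = A^\mathsf{T}$.
For $A \in \mathbb{K}^{m \times n}$, $\operatorname{ran}(A)$ denotes the column space (image) of $A$ and $\ker(A)$ denotes the kernel (null space) of $A$.
Finally, for any positive integer $n$, $I_n \in \mathbb{K}^{n \times n}$ denotes the $n \times n$ identity matrix.

Definition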

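For $A \in \mathbb{K}^{m \times n}$, a pseudoinverse of $A$ is defined as a matrix $A^+ \in \mathbb{K}^{n \times m}$ satisfying all of the following four criteria, known as the Moore–Penrose conditions:
1. $A A^+ A = A$
2. $A^+ A A^+ = A^+$
3. $(A A^+)^* = A A^+$
4. $(A^+ A)^* = A^+ A$
$A^+$ exists for any matrix $A$, but, when the latter has full rank (that is, the rank of $A$ is $\min\{m, n\}$), then $A^+$ can be expressed as a simple algebraic formula.
In particular, when $A$ has linearly independent columns (so that $A^* A$ is invertible), $A^+$ can be computed as
$A^+ = (A^* A)^{-1} A^*$.
This particular pseudoinverse constitutes a left inverse, since, in this case, $A^+ A = I_n$.
When $A$ has linearly independent rows (so that $A A^*$ is invertible), $A^+$ can be computed as
$A^+ = A^* (A A^*)^{-1}$.
This is a right inverse, as $A A^+ = I_m$.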
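The four Moore–Penrose conditions ($AXA = A$, $XAX = X$, and both $AX$ and $XA$ Hermitian) can be checked numerically. The following is a minimal sketch using NumPy; the helper name `satisfies_moore_penrose`, the tolerance, and the test matrix are illustrative assumptions rather than part of any library.

```python
import numpy as np

def satisfies_moore_penrose(A, X, tol=1e-10):
    """Return True if X satisfies the four Moore-Penrose conditions for A."""
    AX = A @ X
    XA = X @ A
    return bool(
        np.allclose(AX @ A, A, atol=tol)            # A X A = A
        and np.allclose(XA @ X, X, atol=tol)        # X A X = X
        and np.allclose(AX.conj().T, AX, atol=tol)  # (A X)* = A X
        and np.allclose(XA.conj().T, XA, atol=tol)  # (X A)* = X A
    )

# Rank-deficient example: the second row is twice the first.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

# An explicit cutoff keeps round-off-sized singular values from being inverted.
A_pinv = np.linalg.pinv(A, rcond=1e-12)

is_pinv = satisfies_moore_penrose(A, A_pinv)     # the pseudoinverse passes
is_transpose = satisfies_moore_penrose(A, A.T)   # the plain transpose does not
```

The transpose fails the first condition here because $A A^\mathsf{T} A$ is a multiple of $A$ other than $A$ itself.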

Properties

Existence and uniqueness

The pseudoinverse exists and is unique: for any matrix $A$, there is precisely one matrix $A^+$ that satisfies the four properties of the definition.
A matrix satisfying the first condition of the definition is known as a generalized inverse. If the matrix also satisfies the second condition, it is called a generalized reflexive inverse. Generalized inverses always exist but are not in general unique. Uniqueness is a consequence of the last two conditions.

Basic properties

If $A$ has real entries, then so does $A^+$.
If $A$ is invertible, its pseudoinverse is its inverse. That is, $A^+ = A^{-1}$.
The pseudoinverse of a zero matrix is its transpose.
The pseudoinverse of the pseudoinverse is the original matrix: $(A^+)^+ = A$.
Pseudoinversion commutes with transposition, conjugation, and taking the conjugate transpose:
$(A^\mathsf{T})^+ = (A^+)^\mathsf{T}$, $(\overline{A})^+ = \overline{A^+}$, $(A^*)^+ = (A^+)^*$.
The pseudoinverse of a scalar multiple of $A$ is the reciprocal multiple of $A^+$:
$(\alpha A)^+ = \alpha^{-1} A^+$ for $\alpha \neq 0$.

Identities

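The following identities can be used to cancel certain subexpressions or expand expressions involving pseudoinverses. Proofs for these properties can be found in the proofs subpage.
$A^+ = A^+ A^{+*} A^*$ and $A^+ = A^* A^{+*} A^+$
$A = A^{+*} A^* A$ and $A = A A^* A^{+*}$
$A^* = A^* A A^+$ and $A^* = A^+ A A^*$
Here $A^{+*}$ denotes $(A^+)^* = (A^*)^+$. Each identity follows from the first two Moore–Penrose conditions together with the fact that $A A^+$ and $A^+ A$ are Hermitian.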

Reduction to Hermitian case

The computation of the pseudoinverse is reducible to its construction in the Hermitian case. This is possible through the equivalences:
$A^+ = (A^* A)^+ A^*$,
$A^+ = A^* (A A^*)^+$,
as $A^* A$ and $A A^*$ are Hermitian.

Products

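If $A \in \mathbb{K}^{m \times n}$, $B \in \mathbb{K}^{n \times p}$, and if

1. $A$ has orthonormal columns (then $A^* A = A^+ A = I_n$), or
2. $B$ has orthonormal rows (then $B B^* = B B^+ = I_n$), or
3. $A$ has all columns linearly independent (full column rank) and $B$ has all rows linearly independent (full row rank), or
4. $B = A^*$ (that is, $B$ is the conjugate transpose of $A$),

then
$(AB)^+ = B^+ A^+$.
The last property yields the equalities
$(A A^*)^+ = A^{+*} A^+$ and $(A^* A)^+ = A^+ A^{+*}$.
NB: The equality $(AB)^+ = B^+ A^+$ does not hold in general.
See the counterexample:
$\left( \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix} \right)^+ = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}^+ = \begin{pmatrix} 1/2 & 0 \\ 1/2 & 0 \end{pmatrix} \neq \begin{pmatrix} 1/4 & 0 \\ 1/4 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1/2 \\ 0 & 1/2 \end{pmatrix} \begin{pmatrix} 1/2 & 0 \\ 1/2 & 0 \end{pmatrix}$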

Projectors

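$P = A A^+$ and $Q = A^+ A$ are orthogonal projection operators, that is, they are Hermitian ($P = P^*$, $Q = Q^*$) and idempotent ($P^2 = P$, $Q^2 = Q$). The following hold:

$P A = A Q = A$ and $A^+ P = Q A^+ = A^+$
$P$ is the orthogonal projector onto the range of $A$ (which equals the orthogonal complement of the kernel of $A^*$).
$Q$ is the orthogonal projector onto the range of $A^*$ (which equals the orthogonal complement of the kernel of $A$).
$I - Q = I - A^+ A$ is the orthogonal projector onto the kernel of $A$.
$I - P = I - A A^+$ is the orthogonal projector onto the kernel of $A^*$.

The last two properties imply the following identities:
$A (I - A^+ A) = (I - A A^+) A = 0$
$A^* (I - A A^+) = (I - A^+ A) A^* = 0$

Another property is the following: if $A \in \mathbb{K}^{n \times n}$ is Hermitian and idempotent, then, for any matrix $B \in \mathbb{K}^{m \times n}$ the following equation holds:
$A (B A)^+ = (B A)^+$
This can be proven by defining matrices $C = BA$, $D = A (BA)^+$, and checking that $D$ is indeed a pseudoinverse for $C$ by verifying that the defining properties of the pseudoinverse hold, when $A$ is Hermitian and idempotent.
From the last property it follows that, if $A \in \mathbb{K}^{n \times n}$ is Hermitian and idempotent, for any matrix $B \in \mathbb{K}^{n \times m}$
$(A B)^+ A = (A B)^+$
Finally, if $A$ is an orthogonal projection matrix, then its pseudoinverse trivially coincides with the matrix itself, that is, $A^+ = A$.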
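The projector properties of $P = A A^+$ and $Q = A^+ A$ can be illustrated numerically. This is a small sketch with NumPy on an arbitrarily chosen rank-deficient matrix; the matrix and the explicit cutoff `rcond=1e-12` are assumptions made for the illustration.

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])     # rank 2: third row = first row + second row
A_pinv = np.linalg.pinv(A, rcond=1e-12)

P = A @ A_pinv      # orthogonal projector onto ran(A)
Q = A_pinv @ A      # orthogonal projector onto ran(A*)
I = np.eye(3)

hermitian = np.allclose(P, P.conj().T) and np.allclose(Q, Q.conj().T)
idempotent = np.allclose(P @ P, P) and np.allclose(Q @ Q, Q)
fixes_A = np.allclose(P @ A, A) and np.allclose(A @ Q, A)

# I - Q projects onto ker(A), so A annihilates everything in its image.
kills_kernel_part = np.allclose(A @ (I - Q), 0.0)
```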

Geometric construction

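If we view the matrix as a linear map $A : \mathbb{K}^n \to \mathbb{K}^m$ over the field $\mathbb{K}$, then $A^+ : \mathbb{K}^m \to \mathbb{K}^n$ can be decomposed as follows. We write $\oplus$ for the direct sum, $\perp$ for the orthogonal complement, $\ker$ for the kernel of a map, and $\operatorname{ran}$ for the image of a map. Notice that $\mathbb{K}^n = (\ker A)^\perp \oplus \ker A$ and $\mathbb{K}^m = \operatorname{ran} A \oplus (\operatorname{ran} A)^\perp$. The restriction $A : (\ker A)^\perp \to \operatorname{ran} A$ is then an isomorphism. This implies that $A^+$ on $\operatorname{ran} A$ is the inverse of this isomorphism, and is zero on $(\operatorname{ran} A)^\perp$.
In other words: To find $A^+ b$ for given $b$ in $\mathbb{K}^m$, first project $b$ orthogonally onto the range of $A$, finding a point $p(b)$ in the range. Then form $A^{-1}(\{p(b)\})$, that is, find those vectors in $\mathbb{K}^n$ that $A$ sends to $p(b)$. This will be an affine subspace of $\mathbb{K}^n$ parallel to the kernel of $A$. The element of this subspace that has the smallest length (that is, is closest to the origin) is the answer $A^+ b$ we are looking for. It can be found by taking an arbitrary member of $A^{-1}(\{p(b)\})$ and projecting it orthogonally onto the orthogonal complement of the kernel of $A$.
This description is closely related to the minimum norm solution to a linear system.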

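Subspaces

The pseudoinverse has the same kernel and range as the conjugate transpose: $\ker(A^+) = \ker(A^*)$ and $\operatorname{ran}(A^+) = \operatorname{ran}(A^*)$.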

Limit relations

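The pseudoinverse can be expressed as limits:
$A^+ = \lim_{\delta \searrow 0} (A^* A + \delta I)^{-1} A^* = \lim_{\delta \searrow 0} A^* (A A^* + \delta I)^{-1}$
(see Tikhonov regularization). These limits exist even if $(A A^*)^{-1}$ or $(A^* A)^{-1}$ do not exist.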
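The limit relation $A^+ = \lim_{\delta \searrow 0} (A^* A + \delta I)^{-1} A^*$ can be observed numerically. The sketch below uses NumPy on a singular matrix chosen for illustration; the helper name `regularized_inverse` and the values of $\delta$ are assumptions.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0]])     # singular, so (A* A)^{-1} does not exist
A_pinv = np.linalg.pinv(A, rcond=1e-12)

def regularized_inverse(A, delta):
    """Compute (A* A + delta I)^{-1} A* by solving a linear system."""
    n = A.shape[1]
    return np.linalg.solve(A.conj().T @ A + delta * np.eye(n), A.conj().T)

deltas = (1e-1, 1e-3, 1e-5)
errors = [np.linalg.norm(regularized_inverse(A, d) - A_pinv) for d in deltas]
```

For this matrix the error shrinks roughly in proportion to $\delta$. Very small values of $\delta$ make the regularized system ill-conditioned, so the limit is mainly of theoretical interest.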

Continuity

In contrast to ordinary matrix inversion, the process of taking pseudoinverses is not continuous: if the sequence $(A_n)$ converges to the matrix $A$ (in the maximum norm or Frobenius norm, say), then $(A_n)^+$ need not converge to $A^+$. However, if all the matrices $A_n$ have the same rank as $A$, then $(A_n)^+$ will converge to $A^+$.
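A small numerical experiment makes the discontinuity concrete. In this sketch (NumPy; the diagonal matrices are illustrative choices) one sequence converges to $A$ while having higher rank, and another converges to $A$ with the same rank as $A$.

```python
import numpy as np

A = np.diag([1.0, 0.0])                  # limit matrix, rank 1
A_pinv = np.linalg.pinv(A)

gaps_rank_jump = []   # ||pinv(A_n) - pinv(A)|| when rank(A_n) = 2
gaps_same_rank = []   # same quantity when rank(A_n) = rank(A) = 1
for eps in (1e-2, 1e-4, 1e-6):
    jump = np.diag([1.0, eps])           # converges to A, but has rank 2
    same = np.diag([1.0 + eps, 0.0])     # converges to A and keeps rank 1
    gaps_rank_jump.append(np.linalg.norm(np.linalg.pinv(jump) - A_pinv))
    gaps_same_rank.append(np.linalg.norm(np.linalg.pinv(same) - A_pinv))
```

In the first sequence the gap grows like $1/\varepsilon$, because the small diagonal entry is inverted; in the second it shrinks with $\varepsilon$.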

Derivative

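Let $x \mapsto A(x)$ be a real-valued differentiable matrix function that has constant rank near a point $x_0$. The derivative of $x \mapsto A^+(x)$ at $x_0$ may be calculated in terms of the derivative of $A$ at $x_0$:
$\frac{\mathrm{d}}{\mathrm{d}x} A^+ = -A^+ \left( \frac{\mathrm{d}A}{\mathrm{d}x} \right) A^+ + A^+ A^{+\mathsf{T}} \left( \frac{\mathrm{d}A^\mathsf{T}}{\mathrm{d}x} \right) (I - A A^+) + (I - A^+ A) \left( \frac{\mathrm{d}A^\mathsf{T}}{\mathrm{d}x} \right) A^{+\mathsf{T}} A^+$,
where $A$, $A^+$ and the derivatives on the right-hand side are evaluated at $x_0$.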
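As a sanity check, the standard derivative formula for a real constant-rank matrix function, $\mathrm{d}A^+ = -A^+ \, \mathrm{d}A \, A^+ + A^+ A^{+\mathsf{T}} \, \mathrm{d}A^\mathsf{T} (I - AA^+) + (I - A^+A) \, \mathrm{d}A^\mathsf{T} A^{+\mathsf{T}} A^+$, can be compared against a central finite difference. The family $A(x) = (B_0 + x B_1) C$ below is a hypothetical example built to have rank 2 near $x = 0$; the step size and cutoff are assumptions.

```python
import numpy as np

B0 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 2.0]])
B1 = np.array([[0.0, 1.0], [1.0, 0.0], [0.0, 0.0], [1.0, 1.0]])
C = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])

def A_of(x):
    # 4x3 matrix of rank 2 for all x close to 0
    return (B0 + x * B1) @ C

def pinv(M):
    # explicit cutoff so the numerically zero singular value is never inverted
    return np.linalg.pinv(M, rcond=1e-10)

A = A_of(0.0)
dA = B1 @ C                      # derivative of A(x) at x = 0
Ap = pinv(A)
I_m, I_n = np.eye(4), np.eye(3)

analytic = (-Ap @ dA @ Ap
            + Ap @ Ap.T @ dA.T @ (I_m - A @ Ap)
            + (I_n - Ap @ A) @ dA.T @ Ap.T @ Ap)

h = 1e-6
numeric = (pinv(A_of(h)) - pinv(A_of(-h))) / (2 * h)   # central difference
max_gap = np.max(np.abs(analytic - numeric))
```

Because $A(x)$ is rank-deficient with constant rank, all three terms of the formula contribute in this example.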

Examples

Since for invertible matrices the pseudoinverse equals the usual inverse, only examples of non-invertible matrices are considered below.

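For $A = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$, the pseudoinverse is $A^+ = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$. The uniqueness of this pseudoinverse can be seen from the requirement $A^+ = A^+ A A^+$, since multiplication by a zero matrix would always produce a zero matrix.
For $A = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$, the pseudoinverse is $A^+ = \begin{pmatrix} 1/2 & 1/2 \\ 0 & 0 \end{pmatrix}$.

Indeed, $A A^+ = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}$, and thus $A A^+ A = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix} = A$.
Similarly, $A^+ A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, and thus $A^+ A A^+ = \begin{pmatrix} 1/2 & 1/2 \\ 0 & 0 \end{pmatrix} = A^+$.

For $A = \begin{pmatrix} 1 & 0 \\ -1 & 0 \end{pmatrix}$, $A^+ = \begin{pmatrix} 1/2 & -1/2 \\ 0 & 0 \end{pmatrix}$.
For $A = \begin{pmatrix} 1 & 0 \\ 2 & 0 \end{pmatrix}$, $A^+ = \begin{pmatrix} 1/5 & 2/5 \\ 0 & 0 \end{pmatrix}$. (The denominators are $5 = 1^2 + 2^2$.)
For $A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$, $A^+ = \begin{pmatrix} 1/4 & 1/4 \\ 1/4 & 1/4 \end{pmatrix}$.
For $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{pmatrix}$, the pseudoinverse is $A^+ = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/2 & 1/2 \end{pmatrix}$.

Note that for this matrix, the left inverse exists and thus equals $A^+$; indeed, $A^+ A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.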
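Small examples of this kind are easy to confirm with software. The sketch below checks a handful of non-invertible matrices against hand-computed pseudoinverses using NumPy; the specific matrices are illustrative choices, and the cutoff `rcond=1e-12` is an assumption to guard against round-off.

```python
import numpy as np

# (matrix, hand-computed pseudoinverse) pairs
cases = [
    ([[0, 0], [0, 0]],         [[0, 0], [0, 0]]),
    ([[1, 0], [1, 0]],         [[0.5, 0.5], [0, 0]]),
    ([[1, 0], [-1, 0]],        [[0.5, -0.5], [0, 0]]),
    ([[1, 0], [2, 0]],         [[0.2, 0.4], [0, 0]]),
    ([[1, 1], [1, 1]],         [[0.25, 0.25], [0.25, 0.25]]),
    ([[1, 0], [0, 1], [0, 1]], [[1, 0, 0], [0, 0.5, 0.5]]),
]

results = [
    np.allclose(np.linalg.pinv(np.array(A, dtype=float), rcond=1e-12), expected)
    for A, expected in cases
]
```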

Special cases

Scalars

It is also possible to define a pseudoinverse for scalars and vectors. This amounts to treating these as matrices. The pseudoinverse of a scalar $x$ is zero if $x$ is zero and the reciprocal of $x$ otherwise:
$x^+ = 0$ if $x = 0$, and $x^+ = x^{-1}$ otherwise.

Vectors

The pseudoinverse of the null vector is the transposed null vector. The pseudoinverse of a non-null vector is the conjugate transposed vector divided by its squared magnitude:
$x^+ = 0^\mathsf{T}$ if $x = 0$, and $x^+ = \dfrac{x^*}{x^* x}$ otherwise.

Linearly independent columns

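If the columns of $A \in \mathbb{K}^{m \times n}$ are linearly independent (so that $m \ge n$), then $A^* A$ is invertible. In this case, an explicit formula is:
$A^+ = (A^* A)^{-1} A^*$.
It follows that $A^+$ is then a left inverse of $A$: $A^+ A = I_n$.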

Linearly independent rows

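If the rows of $A \in \mathbb{K}^{m \times n}$ are linearly independent (so that $m \le n$), then $A A^*$ is invertible. In this case, an explicit formula is:
$A^+ = A^* (A A^*)^{-1}$.
It follows that $A^+$ is a right inverse of $A$: $A A^+ = I_m$.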
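The two full-rank formulas, $(A^* A)^{-1} A^*$ for independent columns and $A^* (A A^*)^{-1}$ for independent rows, can be compared with a library pseudoinverse. This is a sketch with NumPy on small illustrative matrices; it solves linear systems rather than forming explicit inverses, which is a design choice of the example.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])      # 3x2, linearly independent columns
B = A.conj().T                  # 2x3, linearly independent rows

# Left inverse: (A* A)^{-1} A*, obtained by solving (A* A) X = A*
left = np.linalg.solve(A.conj().T @ A, A.conj().T)

# Right inverse: B* (B B*)^{-1}. Since B B* is Hermitian, this equals
# the conjugate transpose of the solution Y of (B B*) Y = B.
right = np.linalg.solve(B @ B.conj().T, B).conj().T

left_ok = np.allclose(left @ A, np.eye(2))     # A^+ A = I_n
right_ok = np.allclose(B @ right, np.eye(2))   # B B^+ = I_m
```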

Orthonormal columns or rows

This is a special case of either full column rank or full row rank (treated above). If $A$ has orthonormal columns ($A^* A = I_n$) or orthonormal rows ($A A^* = I_m$), then:
$A^+ = A^*$.

Orthogonal projection matrices

If $A$ is an orthogonal projection matrix, that is, $A = A^*$ and $A^2 = A$, then the pseudoinverse trivially coincides with the matrix itself:
$A^+ = A$.

Circulant matrices

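For a circulant matrix $C$, the decomposition needed for the pseudoinverse is given by the Fourier transform: the Fourier coefficients are the eigenvalues of $C$, and their magnitudes are its singular values. Let $\mathcal{F}$ be the unitary discrete Fourier transform (DFT) matrix and $\Sigma$ the diagonal matrix of Fourier coefficients; then, with a suitable sign convention for the transform,
$C = \mathcal{F} \cdot \Sigma \cdot \mathcal{F}^*$,
$C^+ = \mathcal{F} \cdot \Sigma^+ \cdot \mathcal{F}^*$.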
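Because a circulant matrix is diagonalized by the discrete Fourier transform, its pseudoinverse is again circulant and can be obtained by pseudoinverting the Fourier coefficients of its first column. The sketch below uses NumPy's FFT; the helper names `circulant` and `circulant_pinv` and the relative tolerance are assumptions made for this example.

```python
import numpy as np

def circulant(c):
    """Circulant matrix with first column c, i.e. C[i, j] = c[(i - j) mod n]."""
    c = np.asarray(c)
    n = len(c)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return c[idx]

def circulant_pinv(c, tol=1e-12):
    """Pseudoinverse of circulant(c), computed from the Fourier coefficients."""
    lam = np.fft.fft(c)                       # eigenvalues of the circulant matrix
    lam_pinv = np.zeros_like(lam)
    keep = np.abs(lam) > tol * np.abs(lam).max(initial=0.0)
    lam_pinv[keep] = 1.0 / lam[keep]          # invert only the nonzero coefficients
    return circulant(np.fft.ifft(lam_pinv))   # first column of the pseudoinverse

c = np.array([1.0, -1.0, 0.0, 0.0])           # entries sum to zero, so C is singular
C = circulant(c)
C_pinv = circulant_pinv(c)
```

The result is returned as a complex array whose imaginary parts are at round-off level when `c` is real.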

Construction

Rank decomposition

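Let $r \le \min(m, n)$ denote the rank of $A \in \mathbb{K}^{m \times n}$. Then $A$ can be decomposed as
$A = BC$,
where $B \in \mathbb{K}^{m \times r}$ and $C \in \mathbb{K}^{r \times n}$ are of rank $r$. Then $A^+ = C^+ B^+ = C^* (C C^*)^{-1} (B^* B)^{-1} B^*$.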
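A rank decomposition $A = BC$ with full-rank factors gives $A^+ = C^* (C C^*)^{-1} (B^* B)^{-1} B^*$. The sketch below builds $A$ from two hand-picked factors of rank 2 (an assumption for illustration; in practice the factors would have to be computed) and compares the result with NumPy's SVD-based routine.

```python
import numpy as np

B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])          # 3x2, rank 2
C = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])     # 2x3, rank 2
A = B @ C                           # 3x3, rank 2

C_pinv = C.conj().T @ np.linalg.inv(C @ C.conj().T)   # C* (C C*)^{-1}
B_pinv = np.linalg.inv(B.conj().T @ B) @ B.conj().T   # (B* B)^{-1} B*
A_pinv = C_pinv @ B_pinv                              # A^+ = C^+ B^+
```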

The QR method

For $A \in \mathbb{K}^{m \times n}$, computing the product $A A^*$ or $A^* A$ and their inverses explicitly is often a source of numerical rounding errors and computational cost in practice. An alternative approach using the QR decomposition of $A$ may be used instead.
Consider the case when $A$ is of full column rank, so that $A^+ = (A^* A)^{-1} A^*$. Then the Cholesky decomposition $A^* A = R^* R$, where $R$ is an upper triangular matrix, may be used. Multiplication by the inverse is then done easily by solving a system with multiple right-hand sides,
$A^+ = (A^* A)^{-1} A^* \iff (A^* A) A^+ = A^* \iff R^* R A^+ = A^*$,
which may be solved by forward substitution followed by back substitution.
The Cholesky decomposition may be computed without forming $A^* A$ explicitly, by alternatively using the QR decomposition of $A = QR$, where $Q$ has orthonormal columns, $Q^* Q = I$, and $R$ is upper triangular. Then
$A^* A = (QR)^* (QR) = R^* Q^* Q R = R^* R$,
so $R$ is the Cholesky factor of $A^* A$.
The case of full row rank is treated similarly by using the formula $A^+ = A^* (A A^*)^{-1}$ and using a similar argument, swapping the roles of $A$ and $A^*$.
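For a matrix of full column rank, the QR route amounts to solving $R^* R X = A^*$ in two triangular stages. The following sketch uses NumPy only; `np.linalg.solve` is a general solver and does not exploit the triangular structure, so a dedicated triangular solver would be the natural choice in production code. The test matrix is an illustrative assumption.

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [0.0, 0.0, 1.0]])     # 4x3, full column rank

Q, R = np.linalg.qr(A)              # reduced QR: Q is 4x3, R is 3x3 upper triangular

# Stage 1 (forward substitution in principle): R* Y = A*
Y = np.linalg.solve(R.conj().T, A.conj().T)
# Stage 2 (back substitution in principle): R X = Y
A_pinv_qr = np.linalg.solve(R, Y)

# R is (up to signs of its rows) the Cholesky factor of A* A
gram_matches = np.allclose(R.conj().T @ R, A.conj().T @ A)
```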

Singular value decomposition (SVD)

A computationally simple and accurate way to compute the pseudoinverse is by using the singular value decomposition. If $A = U \Sigma V^*$ is the singular value decomposition of $A$, then $A^+ = V \Sigma^+ U^*$. For a rectangular diagonal matrix such as $\Sigma$, we get the pseudoinverse by taking the reciprocal of each non-zero element on the diagonal, leaving the zeros in place, and then transposing the matrix. In numerical computation, only elements larger than some small tolerance are taken to be nonzero, and the others are replaced by zeros. For example, in the MATLAB or GNU Octave function pinv, the tolerance is taken to be $t = \varepsilon \cdot \max(m, n) \cdot \max(\Sigma)$, where $\varepsilon$ is the machine epsilon; the NumPy function pinv applies a relative cutoff of the same kind through its rcond parameter.
The computational cost of this method is dominated by the cost of computing the SVD, which is several times higher than matrix–matrix multiplication, even if a state-of-the-art implementation is used.
The above procedure shows why taking the pseudoinverse is not a continuous operation: if the original matrix $A$ has a singular value 0 (a diagonal entry of the matrix $\Sigma$ above), then modifying $A$ slightly may turn this zero into a tiny positive number, thereby affecting the pseudoinverse dramatically as we now have to take the reciprocal of a tiny number.
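The SVD recipe can be written out in a few lines. The function below is a sketch, not the implementation used by any particular library; the name `pinv_svd` is hypothetical and the tolerance $\varepsilon \cdot \max(m, n) \cdot \sigma_{\max}$ is one common choice.

```python
import numpy as np

def pinv_svd(A):
    """Pseudoinverse via the SVD, zeroing singular values below a tolerance."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)   # A = U diag(s) Vh
    tol = np.finfo(s.dtype).eps * max(A.shape) * s.max(initial=0.0)
    s_inv = np.zeros_like(s)
    keep = s > tol
    s_inv[keep] = 1.0 / s[keep]        # reciprocal of the nonzero singular values
    # V diag(s_inv) U*: scale the columns of V, then multiply by U*
    return (Vh.conj().T * s_inv) @ U.conj().T

A = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 0.0]])             # rank 1
A_pinv = pinv_svd(A)
```

Singular values at or below the tolerance are treated as exact zeros, which is what makes the result insensitive to round-off but also discontinuous in $A$.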

Block matrices

Optimized approaches exist for calculating the pseudoinverse of block-structured matrices.

The iterative method of Ben-Israel and Cohen

Another method for computing the pseudoinverse uses the recursion
$A_{i+1} = 2 A_i - A_i A A_i$,
which is sometimes referred to as the hyper-power sequence. This recursion produces a sequence converging quadratically to the pseudoinverse of $A$ if it is started with an appropriate $A_0$ satisfying $A_0 A = (A_0 A)^*$. The choice $A_0 = \alpha A^*$ (where $0 < \alpha < 2 / \sigma_1^2(A)$, with $\sigma_1(A)$ denoting the largest singular value of $A$) has been argued not to be competitive with the method using the SVD mentioned above, because even for moderately ill-conditioned matrices it takes a long time before $A_i$ enters the region of quadratic convergence. However, if started with $A_0$ already close to the Moore–Penrose inverse and $A_0 A = (A_0 A)^*$, for example $A_0 := (A^* A + \delta I)^{-1} A^*$, convergence is fast (quadratic).
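A minimal sketch of the iteration, assuming the recursion $A_{i+1} = 2 A_i - A_i A A_i$ with the starting value $A_0 = \alpha A^*$ and $\alpha = 1 / \sigma_1^2(A)$. The function name `pinv_hyperpower`, the stopping rule, and the test matrix are assumptions for this example; the sketch is only exercised on a well-conditioned matrix of full column rank.

```python
import numpy as np

def pinv_hyperpower(A, max_iter=100, tol=1e-12):
    """Iterate X <- 2X - X A X starting from X = alpha * A*."""
    alpha = 1.0 / np.linalg.norm(A, 2) ** 2    # satisfies 0 < alpha < 2 / sigma_1^2
    X = alpha * A.conj().T                     # X A = alpha A* A is Hermitian
    iterations = 0
    for iterations in range(1, max_iter + 1):
        X_next = 2 * X - X @ A @ X
        converged = np.linalg.norm(X_next - X) <= tol * np.linalg.norm(X_next)
        X = X_next
        if converged:
            break
    return X, iterations

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
A_pinv_iter, n_iter = pinv_hyperpower(A)
```

Even for this small matrix, with condition number below 20, well over ten iterations pass before the quadratic phase begins, which illustrates why the simple starting value is considered uncompetitive.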

Updating the pseudoinverse

For the cases where $A$ has full row or column rank, and the inverse of the correlation matrix ($A A^*$ for $A$ with full row rank or $A^* A$ for full column rank) is already known, the pseudoinverse for matrices related to $A$ can be computed by applying the Sherman–Morrison–Woodbury formula to update the inverse of the correlation matrix, which may need less work. In particular, if the related matrix differs from the original one by only a changed, added or deleted row or column, incremental algorithms exist that exploit the relationship.
Similarly, it is possible to update the Cholesky factor when a row or column is added, without creating the inverse of the correlation matrix explicitly. However, updating the pseudoinverse in the general rank-deficient case is much more complicated.

Software libraries

The Python package NumPy provides a pseudoinverse calculation through its functions matrix.I and linalg.pinv; its pinv uses the SVD-based algorithm. SciPy adds a function scipy.linalg.pinv; older SciPy releases implemented it with a least-squares solver, while more recent releases also use the SVD. High-quality implementations of SVD, QR, and back substitution are available in standard libraries, such as LAPACK. Writing one's own implementation of SVD is a major programming project that requires significant numerical expertise. In special circumstances, such as parallel computing or embedded computing, however, alternative implementations by QR or even the use of an explicit inverse might be preferable, and custom implementations may be unavoidable.
The MASS package for R provides a calculation of the Moore–Penrose inverse through the ginv function. The ginv function calculates a pseudoinverse using the singular value decomposition provided by the svd function in the base R package. An alternative is to employ the pinv function available in the pracma package.
The Octave programming language provides a pseudoinverse through the standard package function pinv and the pseudo_inverse method.
In Julia, the LinearAlgebra package of the standard library provides an implementation of the Moore–Penrose pseudoinverse, pinv, implemented via singular value decomposition.

Applications

Linear least-squares

The pseudoinverse provides a least squares solution to a system of linear equations.
For $A \in \mathbb{K}^{m \times n}$, given a system of linear equations
$A x = b$,
in general, a vector $x$ that solves the system may not exist, or if one does exist, it may not be unique. The pseudoinverse solves the "least-squares" problem as follows:

For all $x \in \mathbb{K}^n$, we have $\|Ax - b\|_2 \ge \|Az - b\|_2$, where $z = A^+ b$ and $\|\cdot\|_2$ denotes the Euclidean norm. This weak inequality holds with equality if and only if $x = A^+ b + (I - A^+ A) w$ for some vector $w$; this provides an infinitude of minimizing solutions unless $A$ has full column rank, in which case $I - A^+ A$ is a zero matrix. The solution with minimum Euclidean norm is $z$.
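The least-squares property of $z = A^+ b$ can be illustrated on a small overdetermined system. The sketch below uses NumPy and compares the pseudoinverse solution with `np.linalg.lstsq`; the data are made up for the example.

```python
import numpy as np

# Fit a line y = c0 + c1 * t through three points that are not collinear.
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0])

z = np.linalg.pinv(A) @ b                          # z = A^+ b
z_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)    # reference least-squares solver

residual = np.linalg.norm(A @ z - b)

# Any other candidate gives a residual at least as large.
perturbations = (np.array([0.1, 0.0]), np.array([-0.2, 0.3]))
other_residuals = [np.linalg.norm(A @ (z + d) - b) for d in perturbations]
```

Here $A$ has full column rank, so the minimizer is unique and equals $(2/3, 1/2)$.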

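This result is easily extended to systems with multiple right-hand sides, when the Euclidean norm is replaced by the Frobenius norm. Let $B \in \mathbb{K}^{m \times p}$.

For all $X \in \mathbb{K}^{n \times p}$, we have $\|AX - B\|_{\mathrm{F}} \ge \|AZ - B\|_{\mathrm{F}}$, where $Z = A^+ B$ and $\|\cdot\|_{\mathrm{F}}$ denotes the Frobenius norm.

Obtaining all solutions of a linear system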

If the linear system
$A x = b$
has any solutions, they are all given by
$x = A^+ b + (I - A^+ A) w$
for arbitrary vector $w$. Solutions exist if and only if $A A^+ b = b$. If the latter holds, then the solution is unique if and only if $A$ has full column rank, in which case $I - A^+ A$ is a zero matrix. If solutions exist but $A$ does not have full column rank, then we have an indeterminate system, all of whose infinitude of solutions are given by this last equation.
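The solvability test $A A^+ b = b$ and the parametrization $x = A^+ b + (I - A^+ A) w$ can be sketched as follows with NumPy. The helper names `is_solvable` and `solution` and the example systems are assumptions for illustration.

```python
import numpy as np

def pinv(M):
    # explicit cutoff so that rank-deficient examples are handled robustly
    return np.linalg.pinv(M, rcond=1e-12)

def is_solvable(A, b):
    """A x = b has a solution exactly when A A^+ b = b."""
    return bool(np.allclose(A @ pinv(A) @ b, b))

def solution(A, b, w):
    """One member of the solution family, selected by the free vector w."""
    n = A.shape[1]
    return pinv(A) @ b + (np.eye(n) - pinv(A) @ A) @ w

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])     # underdetermined: infinitely many solutions
b = np.array([2.0, 3.0])
xs = [solution(A, b, w) for w in (np.zeros(3), np.array([1.0, 0.0, 0.0]))]

A_bad = np.array([[1.0, 1.0],
                  [1.0, 1.0]])
b_bad = np.array([1.0, 2.0])        # inconsistent: both rows agree, right-hand sides differ
```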

Minimum norm solution to a linear system

For linear systems $A x = b$ with non-unique solutions (such as under-determined systems), the pseudoinverse may be used to construct the solution of minimum Euclidean norm $\|x\|_2$
among all solutions.

If $A x = b$ is satisfiable, the vector $z = A^+ b$ is a solution, and satisfies $\|z\|_2 \le \|x\|_2$ for all solutions.
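The minimum-norm property can be checked by comparing $z = A^+ b$ with other solutions of the same system. A small sketch with NumPy, on an illustrative under-determined system:

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])     # 2 equations, 3 unknowns
b = np.array([2.0, 3.0])

A_pinv = np.linalg.pinv(A)
z = A_pinv @ b                      # candidate minimum-norm solution

# Other solutions differ from z by an element of ker(A).
kernel_projector = np.eye(3) - A_pinv @ A
others = [z + kernel_projector @ w
          for w in (np.array([1.0, 0.0, 0.0]), np.array([0.0, -2.0, 5.0]))]

norm_z = np.linalg.norm(z)
other_norms = [np.linalg.norm(x) for x in others]
```

Since $z$ is orthogonal to the kernel of $A$, adding any nonzero kernel component can only increase the norm.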

This result is easily extended to systems with multiple right-hand sides, when the Euclidean norm is replaced by the Frobenius norm. Let $B \in \mathbb{K}^{m \times p}$.

If $A X = B$ is satisfiable, the matrix $Z = A^+ B$ is a solution, and satisfies $\|Z\|_{\mathrm{F}} \le \|X\|_{\mathrm{F}}$ for all solutions.

Condition number

Using the pseudoinverse and a matrix norm, one can define a condition number for any matrix:
$\operatorname{cond}(A) = \|A\| \, \|A^+\|$.
A large condition number implies that the problem of finding least-squares solutions to the corresponding system of linear equations is ill-conditioned in the sense that small errors in the entries of $A$ can lead to huge errors in the entries of the solution.
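With the spectral norm, the definition $\operatorname{cond}(A) = \|A\| \, \|A^+\|$ reduces to the ratio of the largest to the smallest nonzero singular value. The sketch below compares it with `np.linalg.cond` for a rectangular matrix of full column rank; the matrix is an arbitrary example.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Spectral norm of A times spectral norm of its pseudoinverse
cond_pinv = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.pinv(A), 2)

# NumPy's default condition number uses the 2-norm, computed from singular values
cond_numpy = np.linalg.cond(A)
```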

Generalizations

Besides matrices over the real and complex numbers, the conditions hold for matrices over biquaternions, also called "complex quaternions".
In order to solve more general least-squares problems, one can define Moore–Penrose inverses for all continuous linear operators $A : H_1 \to H_2$ between two Hilbert spaces $H_1$ and $H_2$, using the same four conditions as in our definition above. It turns out that not every continuous linear operator has a continuous linear pseudoinverse in this sense. Those that do are precisely the ones whose range is closed in $H_2$.
In abstract algebra, a Moore–Penrose inverse may be defined on a *-regular semigroup. This abstract definition coincides with the one in linear algebra.
