Numerical methods for linear least squares

Numerical methods for linear least squares entail the numerical analysis of linear least squares problems.

Introduction

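A general approach to the least squares problem $\min_{\beta} \|y - X\beta\|^2$, where $X$ is an m by n matrix and $y$ is a vector of m observations, can be described as follows. Suppose that we can find an n by m matrix S such that XS is an orthogonal projection onto the image of X. Then a solution to our minimization problem is given by
$$\hat{\beta} = Sy$$
simply because
$$X\hat{\beta} = X(Sy) = (XS)y$$
is exactly the sought-for orthogonal projection of $y$ onto the image of X.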
A few popular ways to find such a matrix S are described below.
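The defining property of S can be checked numerically. The sketch below assumes NumPy is available and uses the Moore–Penrose pseudoinverse as one possible choice of S; the data and all variable names are hypothetical and serve only as an illustration.

```python
import numpy as np

# Hypothetical example data: m = 5 observations, n = 2 parameters.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([1.0, 2.9, 5.1, 7.0, 9.2])

# One valid choice of S (n by m): the Moore-Penrose pseudoinverse of X.
S = np.linalg.pinv(X)

# P = XS should be an orthogonal projection: symmetric and idempotent.
P = X @ S
is_symmetric = np.allclose(P, P.T)
is_idempotent = np.allclose(P @ P, P)

# P leaves the columns of X unchanged, so it projects onto the image of X.
fixes_columns = np.allclose(P @ X, X)

# Candidate solution; its residual must be orthogonal to every column of X.
beta = S @ y
residual = y - X @ beta
residual_is_orthogonal = np.allclose(X.T @ residual, 0.0)
```

All four flags should evaluate to true for any full-rank X, which is what makes `beta` a least squares solution.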

Inverting the matrix of the normal equations

The normal equations are $(X^TX)\beta = X^Ty$. The algebraic solution of the normal equations with a full-rank matrix X^TX can be written as
$$\hat{\beta} = (X^TX)^{-1}X^Ty = X^{+}y$$
where X⁺ is the Moore–Penrose pseudoinverse of X. Although this equation is correct and can work in many applications, it is not computationally efficient to invert the normal-equations matrix. An exception occurs in numerical smoothing and differentiation where an analytical expression is required.
If the matrix X^TX is well-conditioned and positive definite, implying that it has full rank, the normal equations can be solved directly by using the Cholesky decomposition R^TR, where R is an upper triangular matrix, giving:
$$R^TR\hat{\beta} = X^Ty$$
The solution is obtained in two stages, a forward substitution step, solving for z:
$$R^Tz = X^Ty$$
followed by a backward substitution, solving for $\hat{\beta}$:
$$R\hat{\beta} = z$$
Both substitutions are facilitated by the triangular nature of R.
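The two-stage Cholesky solve described above can be sketched as follows. This is a minimal illustration that assumes NumPy is available; the function names `forward_substitution`, `back_substitution` and `lstsq_cholesky` and the sample data are hypothetical. Note that `np.linalg.cholesky` returns the lower triangular factor L with X^TX = LL^T, so the upper triangular R of the text is L^T.

```python
import numpy as np


def forward_substitution(L, b):
    """Solve L z = b for a lower triangular matrix L."""
    n = L.shape[0]
    z = np.zeros(n)
    for i in range(n):
        z[i] = (b[i] - L[i, :i] @ z[:i]) / L[i, i]
    return z


def back_substitution(R, z):
    """Solve R beta = z for an upper triangular matrix R."""
    n = R.shape[0]
    beta = np.zeros(n)
    for i in range(n - 1, -1, -1):
        beta[i] = (z[i] - R[i, i + 1:] @ beta[i + 1:]) / R[i, i]
    return beta


def lstsq_cholesky(X, y):
    """Least squares via the normal equations and a Cholesky factorization."""
    A = X.T @ X                      # normal-equations matrix (must be positive definite)
    b = X.T @ y
    L = np.linalg.cholesky(A)        # NumPy returns the lower factor: A = L L^T
    R = L.T                          # upper triangular factor, A = R^T R
    z = forward_substitution(L, b)   # solves R^T z = X^T y
    return back_substitution(R, z)   # solves R beta = z


# Hypothetical data: a straight-line fit with m = 5 points.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([1.0, 2.9, 5.1, 7.0, 9.2])
beta_chol = lstsq_cholesky(X, y)
```

The explicit loops are written for clarity; each substitution costs on the order of n² operations, which is why the triangular structure makes the solve cheap.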

Orthogonal decomposition methods

Orthogonal decomposition methods of solving the least squares problem are slower than the normal equations method but are more numerically stable because they avoid forming the product X^TX.
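The residuals are written in matrix notation as
$$r = y - X\hat{\beta}.$$
The matrix X is subjected to an orthogonal decomposition, e.g., the QR decomposition, as follows:
$$X = Q\begin{pmatrix} R \\ 0 \end{pmatrix},$$
where Q is an m×m orthogonal matrix ($Q^TQ = I$) and R is an n×n upper triangular matrix with $r_{ii} > 0$.
The residual vector is left-multiplied by Q^T:
$$Q^Tr = Q^Ty - (Q^TQ)\begin{pmatrix} R \\ 0 \end{pmatrix}\hat{\beta} = \begin{bmatrix} (Q^Ty)_n - R\hat{\beta} \\ (Q^Ty)_{m-n} \end{bmatrix} = \begin{bmatrix} u \\ v \end{bmatrix},$$
where $(Q^Ty)_n$ denotes the first n components of $Q^Ty$ and $(Q^Ty)_{m-n}$ the remaining m−n components.
Because Q is orthogonal, the sum of squares of the residuals, s, may be written as:
$$s = \|r\|^2 = r^Tr = r^TQQ^Tr = u^Tu + v^Tv.$$
Since v does not depend on β, the minimum value of s is attained when the upper block, u, is zero. Therefore, the parameters are found by solving:
$$R\hat{\beta} = (Q^Ty)_n.$$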
These equations are easily solved as R is upper triangular.
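The QR procedure described above can be sketched as follows, assuming NumPy is available. The function name `lstsq_qr` and the sample data are hypothetical. NumPy does not guarantee positive diagonal entries in R, which changes the signs of some rows of R and columns of Q but not the solution.

```python
import numpy as np


def lstsq_qr(X, y):
    """Least squares via the full QR decomposition X = Q [R; 0]."""
    m, n = X.shape
    Q, R_full = np.linalg.qr(X, mode="complete")  # Q is m x m, R_full is m x n
    R = R_full[:n, :]                # upper triangular n x n block
    qty = Q.T @ y                    # rotate the observations
    # Upper block: solve R beta = (Q^T y)_n. np.linalg.solve is a general
    # solver; a dedicated triangular solver would exploit the structure of R.
    beta = np.linalg.solve(R, qty[:n])
    v = qty[n:]                      # lower block, independent of beta
    return beta, float(v @ v)        # minimiser and minimum sum of squares


# Hypothetical data: a straight-line fit with m = 5 points.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])
y = np.array([1.0, 2.9, 5.1, 7.0, 9.2])
beta_qr, s_min = lstsq_qr(X, y)
```

The second return value is the squared norm of the lower block, which equals the residual sum of squares at the minimum, so it comes for free once Q^T y has been formed.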
An alternative decomposition of X is the singular value decomposition (SVD)
$$X = U\Sigma V^T,$$
where U is an m by m orthogonal matrix, V is an n by n orthogonal matrix and $\Sigma$ is an m by n matrix with all its elements outside of the main diagonal equal to 0. The pseudoinverse $\Sigma^{+}$ of $\Sigma$ is easily obtained by inverting its non-zero diagonal elements and transposing. Hence, with $X^{+} = V\Sigma^{+}U^T$,
$$XX^{+} = U\Sigma V^TV\Sigma^{+}U^T = UPU^T,$$
where $P = \Sigma\Sigma^{+}$ is an m by m diagonal matrix with ones in the positions of the non-zero diagonal elements of $\Sigma$ and zeros elsewhere. Since $(XX^{+})^T = XX^{+}$ (a property of the pseudoinverse), the matrix $UPU^T$ is an orthogonal projection onto the image of X. In accordance with the general approach described in the introduction above,
$$S = X^{+},$$
and thus,
$$\hat{\beta} = V\Sigma^{+}U^Ty$$
is a solution of the least squares problem. This method is the most computationally intensive, but is particularly useful if the normal equations matrix, X^TX, is very ill-conditioned. In that case, including the smallest singular values in the inversion merely adds numerical noise to the solution. This can be cured with the truncated SVD approach, which explicitly sets to zero all singular values below a certain threshold and so ignores them, giving a more stable solution. The process is closely related to factor analysis.
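The truncated SVD approach described above can be sketched as follows, assuming NumPy is available. The function name `lstsq_tsvd`, the relative threshold `rel_tol` and its default value are hypothetical choices for illustration; a suitable threshold depends on the problem and on the noise level of the data.

```python
import numpy as np


def lstsq_tsvd(X, y, rel_tol=1e-10):
    """Least squares via a truncated SVD.

    Singular values not exceeding rel_tol times the largest one are treated
    as zero, i.e. they are skipped rather than inverted.
    """
    U, sigma, Vt = np.linalg.svd(X, full_matrices=False)  # sigma is sorted descending
    keep = sigma > rel_tol * sigma[0]
    inv_sigma = np.zeros_like(sigma)
    inv_sigma[keep] = 1.0 / sigma[keep]       # invert only the retained values
    beta = Vt.T @ (inv_sigma * (U.T @ y))     # V Sigma^+ U^T y
    return beta, int(keep.sum())              # solution and number of values kept


# Hypothetical rank-deficient design: the third column is the sum of the first two,
# so X^T X is singular and the normal equations cannot be solved by Cholesky.
X = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 2.0],
              [1.0, 2.0, 3.0],
              [1.0, 3.0, 4.0],
              [1.0, 4.0, 5.0]])
y = np.array([1.0, 2.9, 5.1, 7.0, 9.2])
beta_tsvd, n_kept = lstsq_tsvd(X, y)
```

In this example one singular value is numerically zero and is dropped, so the function returns the minimum-norm least squares solution instead of amplifying rounding noise.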

Discussion

The numerical methods for linear least squares are important because linear regression models are among the most important types of model, both as formal statistical models and for exploration of data-sets. The majority of statistical computer packages contain facilities for regression analysis that make use of linear least squares computations. Hence it is appropriate that considerable effort has been devoted to the task of ensuring that these computations are undertaken efficiently and with due regard to round-off error.
Individual statistical analyses are seldom undertaken in isolation, but rather are part of a sequence of investigatory steps. Some of the topics involved in considering numerical methods for linear least squares relate to this point. Important topics therefore include:

Computations where a number of similar, and often nested, models are considered for the same data-set. That is, where models with the same dependent variable but different sets of independent variables are to be considered, for essentially the same set of data-points.
Computations for analyses that occur in a sequence, as the number of data-points increases.
Special considerations for very extensive data-sets.
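As one illustration of the last two points, data that arrive in batches, or that are too extensive to hold in memory at once, can be handled by accumulating the normal-equations quantities X^TX and X^Ty block by block. The sketch below assumes NumPy is available; the function name `accumulate_normal_equations`, the block size and the synthetic data are hypothetical. It is only one possible strategy and inherits the conditioning limitations of the normal equations.

```python
import numpy as np


def accumulate_normal_equations(chunks, n):
    """Accumulate X^T X and X^T y over an iterable of (X_chunk, y_chunk) pairs."""
    A = np.zeros((n, n))
    b = np.zeros(n)
    for X_chunk, y_chunk in chunks:
        A += X_chunk.T @ X_chunk     # each block adds its own contribution
        b += X_chunk.T @ y_chunk
    return A, b


# Hypothetical data set, processed in blocks of 100 rows.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.standard_normal(1000)
chunks = ((X[i:i + 100], y[i:i + 100]) for i in range(0, 1000, 100))

A, b = accumulate_normal_equations(chunks, n=3)
beta_stream = np.linalg.solve(A, b)  # same solution as fitting all rows at once
```

Only the n by n matrix `A` and the length-n vector `b` are kept between blocks, so memory use does not grow with the number of data points, and the fit can be refreshed whenever a new block arrives.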

Fitting of linear models by least squares often, but not always, arises in the context of statistical analysis. It can therefore be important that considerations of computational efficiency for such problems extend to all of the auxiliary quantities required for such analyses, and are not restricted to the formal solution of the linear least squares problem.
Matrix calculations, like any other, are affected by rounding errors. An early summary of these effects, regarding the choice of computation methods for matrix inversion, was provided by Wilkinson.
