QR Factorization for Least Squares
Solving the normal equations directly can be numerically unstable: squaring can double the condition number. The preferred approach factors into , where has orthonormal columns and is upper triangular.
QR factorization via Gram-Schmidt orthogonalization: start with the columns of . Successively subtract projections to make them orthogonal, then normalize. The orthonormal set forms the columns of , and the coefficients form .
Once , the least squares problem simplifies beautifully: . Substituting : . The first term is minimized by solving the upper triangular system via back-substitution.
Compared to the normal equations, QR is numerically superior: is orthogonal so squaring is avoided, and back-substitution on is stable. For this reason, most software (NumPy, MATLAB) uses QR under the hood for lstsq.
Formal View
The Gram-Schmidt process constructs Q and R column by column. Modified Gram-Schmidt is numerically more stable than classical Gram-Schmidt.
Interactive Visualization
Least Squares Fitting
Why This Matters
QR factorization is the standard algorithm for solving least squares problems in scientific computing.
- Numerical linear algebra: `numpy.linalg.lstsq` uses QR internally
- Signal processing: QR-based adaptive filtering (RLS algorithm)
- Machine learning: QR decomposition is used in Gram-Schmidt orthogonalization of attention heads
- Eigenvalue algorithms: the QR algorithm iterates QR factorizations to find eigenvalues
Quiz
In the QR decomposition , which property does have?
Using QR, the least squares solution satisfies:
QR factorization is preferred over normal equations for numerical stability.
Common Mistakes
- Confusing the QR algorithm for eigenvalues with QR factorization for least squares — they both use QR but for different purposes.
- Assuming requires to be square — the reduced QR only gives (not ).
- Skipping QR and always using normal equations — fine for small problems but numerically dangerous for nearly rank-deficient .