Linear Algebra
7.1012 min read

QR Factorization for Least Squares

Solving the normal equations ATAx^=ATbA^T A \hat{x} = A^T b directly can be numerically unstable: squaring AA can double the condition number. The preferred approach factors AA into A=QRA = QR, where QQ has orthonormal columns and RR is upper triangular.

QR factorization via Gram-Schmidt orthogonalization: start with the columns a1,,ana_1, \ldots, a_n of AA. Successively subtract projections to make them orthogonal, then normalize. The orthonormal set forms the columns of QQ, and the coefficients form RR.

Once A=QRA = QR, the least squares problem simplifies beautifully: bAx2=bQRx2\|b - Ax\|^2 = \|b - QRx\|^2. Substituting QTQ=IQ^T Q = I: QTbRx2+bQQTb2\|Q^T b - Rx\|^2 + \|b - QQ^T b\|^2. The first term is minimized by solving the upper triangular system Rx^=QTbR\hat{x} = Q^T b via back-substitution.

Compared to the normal equations, QR is numerically superior: QQ is orthogonal so squaring is avoided, and back-substitution on RR is stable. For this reason, most software (NumPy, MATLAB) uses QR under the hood for lstsq.

Formal View

Theorem 7.10 — QR Least Squares
If A=QRA = QR where QRm×nQ \in \mathbb{R}^{m \times n} has orthonormal columns and RRn×nR \in \mathbb{R}^{n \times n} is upper triangular with positive diagonal, then the unique least squares solution (when AA has full column rank) is
x^=R1QTb,\hat{x} = R^{-1} Q^T b,
equivalently obtained by solving Rx^=QTbR\hat{x} = Q^T b via back-substitution.
Definition 7.10 — QR Factorization
Every matrix ARm×nA \in \mathbb{R}^{m \times n} with mnm \geq n and full column rank can be written A=QRA = QR where QRm×nQ \in \mathbb{R}^{m \times n} satisfies QTQ=InQ^T Q = I_n (orthonormal columns) and RRn×nR \in \mathbb{R}^{n \times n} is upper triangular with positive diagonal entries. This is the reduced (thin) QR factorization.

The Gram-Schmidt process constructs Q and R column by column. Modified Gram-Schmidt is numerically more stable than classical Gram-Schmidt.

Interactive Visualization

Least Squares Fitting

Why This Matters

QR factorization is the standard algorithm for solving least squares problems in scientific computing.

  • Numerical linear algebra: `numpy.linalg.lstsq` uses QR internally
  • Signal processing: QR-based adaptive filtering (RLS algorithm)
  • Machine learning: QR decomposition is used in Gram-Schmidt orthogonalization of attention heads
  • Eigenvalue algorithms: the QR algorithm iterates QR factorizations to find eigenvalues

Quiz

Question 1

In the QR decomposition A=QRA = QR, which property does QQ have?

Question 2

Using QR, the least squares solution satisfies:

Question 3

QR factorization is preferred over normal equations for numerical stability.

Common Mistakes

  • Confusing the QR algorithm for eigenvalues with QR factorization for least squares — they both use QR but for different purposes.
  • Assuming QT=Q1Q^T = Q^{-1} requires QQ to be square — the reduced QR only gives QTQ=InQ^T Q = I_n (not QQT=ImQ Q^T = I_m).
  • Skipping QR and always using normal equations — fine for small problems but numerically dangerous for nearly rank-deficient AA.