The Least Squares Problem
Many real-world systems are overdetermined: they have more equations than unknowns. A GPS receiver solves ~30 satellite equations for 4 unknowns; a data scientist fits a line through thousands of points. There is usually no exact solution to $Ax = b$, so we ask: what $\hat{x}$ makes $A\hat{x}$ as close as possible to $b$?
We measure closeness with the Euclidean norm: the residual is $r = b - Ax$, and its length $\|r\|$ is the error. The least squares problem asks us to minimize $\|b - Ax\|^2$ over all $x$.
Geometrically, $Ax$ can only reach the column space of $A$. The closest point in the column space to $b$ is the orthogonal projection $p$. The optimal $\hat{x}$ satisfies $A\hat{x} = p$.
Since $b - A\hat{x}$ must be perpendicular to every column of $A$, we get the condition $A^T(b - A\hat{x}) = 0$, which rearranges to the normal equations $A^TA\hat{x} = A^Tb$. We will solve these in section 7.9.
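As a quick numerical check of the orthogonality condition, here is a minimal sketch that fits a line through four points. It assumes NumPy is available; the data values and the names `t`, `x_hat`, `p`, and `r` are made up for illustration and do not come from the text.

```python
import numpy as np

# Hypothetical data: fit a line y = c0 + c1 * t through four points.
t = np.array([0.0, 1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, 2.0, 4.0])

# Each row of A encodes one equation c0 + c1 * t_i = b_i
# (4 equations, 2 unknowns, so the system is overdetermined).
A = np.column_stack([np.ones_like(t), t])

# Solve the normal equations A^T A x = A^T b for the least squares solution.
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# Projection of b onto the column space of A, and the residual.
p = A @ x_hat
r = b - p

print("x_hat:", x_hat)
print("A^T r:", A.T @ r)  # numerically zero: r is perpendicular to every column of A
print("error:", np.linalg.norm(r))
```

For this data the fitted intercept and slope both come out to 0.9, and `A.T @ r` vanishes up to rounding, which is exactly the perpendicularity condition behind the normal equations.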
Formal View
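One possible formal statement of the problem, for an $m \times n$ matrix $A$ and a vector $b$ with $m$ entries (the names $m$ and $n$ are introduced here for convenience), together with a calculus route to the same normal equations:

```latex
\[
\hat{x} \in \arg\min_{x \in \mathbb{R}^n} \|b - Ax\|^2,
\qquad A \in \mathbb{R}^{m \times n},\quad b \in \mathbb{R}^m .
\]
Expanding the objective gives
\[
\|b - Ax\|^2 = x^T A^T A x - 2\, x^T A^T b + b^T b ,
\]
and setting its gradient with respect to $x$ to zero yields
\[
2 A^T A x - 2 A^T b = 0
\quad\Longrightarrow\quad
A^T A \hat{x} = A^T b .
\]
```

Because the objective is a convex quadratic in $x$, any point where the gradient vanishes is a global minimizer, so this agrees with the geometric argument based on orthogonal projection.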
Why This Matters
Least squares is the backbone of data fitting and statistical estimation, appearing everywhere from GPS to machine learning.
- Linear regression: fitting the best line or plane through data
- Navigation systems: estimating position from overdetermined satellite equations
- Signal processing: filtering and denoising noisy measurements
- Computer vision: fitting geometric models to point clouds
Quiz
The least squares solution minimizes which quantity?
A least squares solution exists for every matrix $A$ and every vector $b$.
If $A \in \mathbb{R}^{m \times n}$ with $m > n$ has full column rank, the unique least squares solution is:
Common Mistakes
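- Trying to solve $Ax = b$ directly when $A$ has more rows than columns ($m > n$): this system is generally inconsistent; use the normal equations instead.
- Forgetting that a unique solution of the normal equations requires $A$ to have full column rank; if the columns are dependent, $A^TA$ is singular and there are infinitely many least squares solutions.
- Confusing minimizing $\|b - Ax\|$ with minimizing $\|b - Ax\|^2$: they give the same minimizer $\hat{x}$, but squaring makes the algebra smoother.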
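The rank pitfall in the list above can be seen numerically. The sketch below assumes NumPy and uses a made-up matrix whose second column is twice the first; the names `gram`, `x_min`, and `x_other` are illustrative only.

```python
import numpy as np

# Hypothetical matrix with dependent columns: column 2 = 2 * column 1.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
b = np.array([1.0, 2.0, 4.0])

gram = A.T @ A  # the matrix A^T A from the normal equations
print("rank of A:    ", np.linalg.matrix_rank(A))
print("rank of A^T A:", np.linalg.matrix_rank(gram))  # rank-deficient, so singular

# np.linalg.lstsq still returns a least squares solution
# (the one with the smallest norm among all minimizers).
x_min, _, rank, _ = np.linalg.lstsq(A, b, rcond=None)

# Adding any null-space vector of A leaves the residual unchanged,
# so the minimizer is not unique.
x_other = x_min + np.array([2.0, -1.0])  # A @ [2, -1] = 0
print("residual norms:",
      np.linalg.norm(b - A @ x_min),
      np.linalg.norm(b - A @ x_other))
```

Both candidates satisfy the normal equations and give the same residual norm, which is why solving with the inverse of $A^TA$ is not an option when the columns are dependent.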