9.107 min read
Linearization by Centering
After finding , center the data: . The centered data satisfies . Now the problem is purely linear: find minimizing .
The term is the orthogonal projection of onto the subspace spanned by . The residual is perpendicular to the subspace. We minimize the total squared residual length.
Formal View
Definition 9.2 — Centered Data Matrix
Given data with mean , the centered data matrix is where . It satisfies .
Why This Matters
Centering is the first step of virtually all dimensionality reduction algorithms.
- PCA always centers first — without centering, the first component is dominated by the mean direction.
- Standardization (centering + scaling) is standard preprocessing in machine learning.
- Finance: centering returns removes market-wide trend before cross-sectional analysis.
Quiz
Question 1
After centering, .
Question 2
The term represents:
Common Mistakes
- Thinking centering is optional — for affine subspace fitting, the optimal center is the mean.
- Confusing (back in ) with (coordinates, in ).