Linear Algebra

Linearization by Centering

After finding $\mathbf{m}^*$, center the data: $\mathbf{a}_i = \mathbf{q}_i - \mathbf{m}^*$. The centered data satisfies $\sum_{i=1}^N \mathbf{a}_i = \mathbf{0}$. Now the problem is purely linear: find $V_k$ minimizing $\sum_i \|\mathbf{a}_i - V_k V_k^\top \mathbf{a}_i\|^2$.
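The centering step can be sketched in a few lines of NumPy. This is a minimal illustration on made-up data, assuming the points are stored as the columns of a matrix `Q` (the names `Q`, `m_star`, and `A` are chosen here for illustration) and that $\mathbf{m}^*$ is the sample mean.

```python
import numpy as np

# Hypothetical toy data: N = 5 points in R^3, stored as columns of Q (shape m x N).
rng = np.random.default_rng(0)
offset = np.array([[2.0], [-1.0], [0.5]])
Q = rng.normal(size=(3, 5)) + offset

# Assumption: the optimal center m* is the sample mean of the columns.
m_star = Q.mean(axis=1, keepdims=True)   # shape (3, 1)

# Center: a_i = q_i - m*, applied to every column via broadcasting.
A = Q - m_star

# The centered columns sum to the zero vector (up to floating-point rounding).
col_sum = A.sum(axis=1)
print(np.allclose(col_sum, 0.0))
```

Keeping `keepdims=True` makes `m_star` a column, so the subtraction broadcasts across all $N$ points without an explicit loop.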

The term $V_k V_k^\top \mathbf{a}_i$ is the orthogonal projection of $\mathbf{a}_i$ onto the subspace spanned by the columns of $V_k$, which are assumed orthonormal ($V_k^\top V_k = I_k$). The residual $\mathbf{a}_i - V_k V_k^\top \mathbf{a}_i$ is perpendicular to that subspace. We minimize the total squared residual length.
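A quick numerical check of the projection and its residual, as a sketch only: the orthonormal basis below is generated with a QR factorization purely for illustration, and the dimensions `m = 4`, `k = 2` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
m, k = 4, 2

# Hypothetical orthonormal basis V_k (m x k), obtained from a reduced QR factorization.
V_k, _ = np.linalg.qr(rng.normal(size=(m, k)))

a = rng.normal(size=m)      # stands in for one centered data point a_i

proj = V_k @ (V_k.T @ a)    # orthogonal projection of a onto span(V_k)
resid = a - proj            # residual a - V_k V_k^T a

# The residual is perpendicular to every basis vector of the subspace.
print(np.allclose(V_k.T @ resid, 0.0))

# Pythagoras: |a|^2 = |proj|^2 + |resid|^2.
print(np.isclose(a @ a, proj @ proj + resid @ resid))
```

Because of the Pythagorean identity, minimizing the total squared residual is the same as maximizing the total squared length of the projections.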

Formal View

Definition 9.2 — Centered Data Matrix
Given data with mean $\mathbf{m} = \frac{1}{N}\sum_i \mathbf{q}_i$, the centered data matrix is $A = [\mathbf{a}_1 \cdots \mathbf{a}_N]$ where $\mathbf{a}_i = \mathbf{q}_i - \mathbf{m}$. It satisfies $A\mathbf{1} = \mathbf{0}$.
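The identity $A\mathbf{1} = \mathbf{0}$ follows directly from the definition of the mean. Since multiplying $A$ by the all-ones vector sums its columns, a short derivation reads:

```latex
\begin{aligned}
A\mathbf{1}
  &= \sum_{i=1}^N \mathbf{a}_i
   = \sum_{i=1}^N (\mathbf{q}_i - \mathbf{m}) \\
  &= \sum_{i=1}^N \mathbf{q}_i - N\mathbf{m}
   = N\mathbf{m} - N\mathbf{m}
   = \mathbf{0}.
\end{aligned}
```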

Why This Matters

Centering is the first step of PCA and of most related dimensionality reduction methods.

  • PCA always centers first — without centering, the first component is dominated by the mean direction.
  • Standardization (centering + scaling) is standard preprocessing in machine learning.
  • Finance: centering returns removes market-wide trend before cross-sectional analysis.
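The PCA point above can be illustrated on synthetic data. The sketch below assumes a deliberately extreme setup (a mean far from the origin, chosen only to make the effect visible) and compares the leading singular direction of the raw and the centered data with the mean direction.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 500

# Hypothetical 2-D data: large spread along x, small spread along y,
# shifted far from the origin along the y axis. Points are columns.
Q = np.vstack([rng.normal(scale=3.0, size=N),
               rng.normal(scale=0.3, size=N)]) + np.array([[0.0], [50.0]])

m = Q.mean(axis=1, keepdims=True)
A = Q - m

# Leading left singular vector = first principal direction of the columns.
u_raw = np.linalg.svd(Q, full_matrices=False)[0][:, 0]
u_ctr = np.linalg.svd(A, full_matrices=False)[0][:, 0]

mean_dir = (m / np.linalg.norm(m)).ravel()

# |cosine| between each leading direction and the mean direction.
cos_raw = abs(u_raw @ mean_dir)
cos_ctr = abs(u_ctr @ mean_dir)
print(f"uncentered: {cos_raw:.3f}, centered: {cos_ctr:.3f}")
```

In this setup the uncentered direction is nearly parallel to the mean, while the centered direction follows the axis of largest spread. With a mean close to the origin the difference would be much smaller.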

Quiz

Question 1

After centering, $\sum_{i=1}^N \mathbf{a}_i = \mathbf{0}$.

Question 2

The term $V_k V_k^\top \mathbf{a}_i$ represents:

Common Mistakes

  • Thinking centering is optional — for affine subspace fitting, the optimal center is the mean.
  • Confusing $V_k V_k^\top \mathbf{a}_i$ (back in $\mathbb{R}^m$) with $V_k^\top \mathbf{a}_i$ (coordinates, in $\mathbb{R}^k$).
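Checking array shapes is one way to keep the two objects in the last item apart. The sketch below is illustrative only: the basis is a hypothetical orthonormal one built via QR, with arbitrary sizes `m = 5` and `k = 2`.

```python
import numpy as np

rng = np.random.default_rng(3)
m, k = 5, 2

# Hypothetical orthonormal basis (m x k) and a centered point in R^m.
V_k, _ = np.linalg.qr(rng.normal(size=(m, k)))
a = rng.normal(size=m)

coords = V_k.T @ a      # k coordinates of a in the basis V_k (lives in R^k)
recon = V_k @ coords    # the projected point, expressed back in R^m

print(coords.shape, recon.shape)

# Taking coordinates of the reconstruction gives the same k numbers again.
print(np.allclose(V_k.T @ recon, coords))
```

The coordinates are what a reduced representation would store; the reconstruction is what gets compared against the original point when measuring the residual.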