Linearization by Centering

After finding the optimal center $\mathbf{m}^*$ (the data mean), center the data: $\mathbf{a}_i = \mathbf{q}_i - \mathbf{m}^*$. The centered data satisfies $\sum_{i=1}^N \mathbf{a}_i = \mathbf{0}$. The problem is now purely linear: find $V_k$ with orthonormal columns that minimizes $\sum_i \|\mathbf{a}_i - V_k V_k^\top \mathbf{a}_i\|^2$.

The term $V_k V_k^\top \mathbf{a}_i$ is the orthogonal projection of $\mathbf{a}_i$ onto the subspace spanned by $V_k$. The residual $\mathbf{a}_i - V_k V_k^\top \mathbf{a}_i$ is perpendicular to the subspace. We minimize the total squared residual length.
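
A minimal NumPy sketch of centering followed by projection, assuming the points $\mathbf{q}_i$ are stored as the columns of an array. The names `Q`, `A`, `V_k`, `m_dim` and the synthetic data are illustrative and not part of the lesson. The basis is taken from the SVD of the centered matrix, which is one standard way to obtain a minimizer; the last lines check that the centered columns sum to zero and that the residual is perpendicular to the subspace.

```python
import numpy as np

rng = np.random.default_rng(0)
m_dim, N, k = 5, 200, 2

# Synthetic data (assumption): columns are the points q_i in R^m.
Q = rng.normal(size=(m_dim, N)) + 3.0

# Center: a_i = q_i - m*, where m* is the mean of the columns.
m_star = Q.mean(axis=1, keepdims=True)
A = Q - m_star

# Orthonormal basis V_k: the top-k left singular vectors of A.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
V_k = U[:, :k]

# Orthogonal projection onto span(V_k) and the residual.
P = V_k @ (V_k.T @ A)
R = A - P

col_sum_norm = np.linalg.norm(A.sum(axis=1))   # close to 0 after centering
orth_err = np.abs(V_k.T @ R).max()             # close to 0: residual is perpendicular
total_sq_residual = (R ** 2).sum()             # the objective being minimized
```

With this choice of `V_k`, `total_sq_residual` equals the sum of the squared discarded singular values, `(s[k:] ** 2).sum()`, up to rounding.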

Formal View

Definition 9.2 — Centered Data Matrix

Given data with mean $\mathbf{m} = \frac{1}{N}\sum_{i=1}^N \mathbf{q}_i$, the centered data matrix is $A = [\mathbf{a}_1 \cdots \mathbf{a}_N]$ where $\mathbf{a}_i = \mathbf{q}_i - \mathbf{m}$. It satisfies $A\mathbf{1} = \mathbf{0}$.
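
The zero-sum property follows in one line from the definition of the mean. Written out, with $\mathbf{1}$ taken to be the all-ones vector in $\mathbb{R}^N$:

$$
A\mathbf{1} = \sum_{i=1}^N \mathbf{a}_i = \sum_{i=1}^N (\mathbf{q}_i - \mathbf{m}) = \sum_{i=1}^N \mathbf{q}_i - N\mathbf{m} = N\mathbf{m} - N\mathbf{m} = \mathbf{0}
$$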

Why This Matters

Centering is the first step of most linear dimensionality reduction methods.

PCA always centers first — without centering, the first component is dominated by the mean direction.
Standardization (centering + scaling) is standard preprocessing in machine learning.
Finance: centering returns removes market-wide trend before cross-sectional analysis.
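
The PCA point above can be illustrated with a small sketch, assuming synthetic 2-D data whose cloud sits far from the origin. The data, the offset, and all variable names are made up for illustration. It compares the first left singular vector of the raw and the centered matrix against the direction of the mean.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 500

# Synthetic 2-D data (assumption): most variance lies along the x-axis,
# but the cloud is shifted far from the origin along the direction (1, 1).
spread = np.array([[2.0], [0.3]])
offset = np.array([[10.0], [10.0]])
Q = spread * rng.normal(size=(2, N)) + offset

m = Q.mean(axis=1, keepdims=True)
mean_dir = (m / np.linalg.norm(m)).ravel()

# First left singular vector without and with centering.
u_raw = np.linalg.svd(Q, full_matrices=False)[0][:, 0]
u_centered = np.linalg.svd(Q - m, full_matrices=False)[0][:, 0]

# |cosine| of the angle to the mean direction
# (absolute value because the sign of a singular vector is arbitrary).
align_raw = abs(u_raw @ mean_dir)
align_centered = abs(u_centered @ mean_dir)

# |cosine| of the angle between the centered component and the x-axis.
x_axis_align = abs(u_centered[0])
```

In this setup `align_raw` is close to 1, so the uncentered first component mostly points at the mean, while `x_axis_align` is close to 1, so the centered one recovers the direction of largest spread.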

Learning Resources

PCA: centering and standardization

StatQuest

Why and how to center data before PCA.

22 min

Linear reduction and centering

MIT OpenCourseWare

Strang connects centering to the linear reduction problem.

45 min

Quiz

Question 1

After centering, $\sum_{i=1}^N \mathbf{a}_i = \mathbf{0}$.

Question 2

The term $V_k V_k^\top \mathbf{a}_i$ represents:

Common Mistakes

Thinking centering is optional — for affine subspace fitting, the optimal center is the mean.
Confusing $V_k V_k^\top \mathbf{a}_i$ (the projected point, back in $\mathbb{R}^m$) with $V_k^\top \mathbf{a}_i$ (its coordinates in the basis, in $\mathbb{R}^k$).
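
The difference between the two expressions is easiest to see from array shapes. A short sketch, assuming a hypothetical orthonormal basis built with a QR factorization (the names `V_k`, `a`, `coords`, `projection` are illustrative):

```python
import numpy as np

m_dim, k = 4, 2
rng = np.random.default_rng(2)

# Hypothetical orthonormal basis V_k (m_dim x k) from a reduced QR factorization.
V_k, _ = np.linalg.qr(rng.normal(size=(m_dim, k)))
a = rng.normal(size=m_dim)

coords = V_k.T @ a          # coordinates in the basis, shape (k,)
projection = V_k @ coords   # projected point back in R^m, shape (m_dim,)
```

Because the columns of `V_k` are orthonormal, `coords` and `projection` have the same Euclidean length even though they live in spaces of different dimension.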