Linear Algebra

The Linear Reduction Problem

After centering, the linear reduction problem is: minimize $\|A - V_k V_k^\top A\|_F^2$ over all $m \times k$ matrices $V_k$ with orthonormal columns.

Equivalently, this maximizes the variance captured: $\|V_k^\top A\|_F^2 = \|A\|_F^2 - \|A - V_k V_k^\top A\|_F^2$. Since $\|A\|_F^2$ is fixed, minimizing the residual is the same as maximizing the captured variance. This is the key duality behind PCA.
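The identity can be checked numerically. The following is a minimal sketch, assuming NumPy and a data layout with $m$ features as rows and $N$ samples as columns; the sizes, the random data, and the use of a QR factorization to obtain an arbitrary orthonormal $V_k$ are illustrative choices, not part of the lesson.

```python
import numpy as np

rng = np.random.default_rng(0)
m, N, k = 5, 200, 2  # hypothetical sizes: m features, N samples, k directions

# Synthetic data, centered so that each feature (row) has mean zero.
X = rng.normal(size=(m, N))
A = X - X.mean(axis=1, keepdims=True)

# Any m x k matrix with orthonormal columns will do; here we take the
# Q factor of a random matrix (an assumption made for illustration).
V_k, _ = np.linalg.qr(rng.normal(size=(m, k)))

residual = np.linalg.norm(A - V_k @ V_k.T @ A, "fro") ** 2
captured = np.linalg.norm(V_k.T @ A, "fro") ** 2
total = np.linalg.norm(A, "fro") ** 2

# residual + captured should match total up to floating-point error.
print(np.isclose(residual + captured, total))
```

Because the check holds for an arbitrary orthonormal $V_k$, it illustrates that the split into residual and captured parts does not depend on choosing the PCA directions.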

Formal View

Theorem 9.4 — Equivalence: Minimize Residual = Maximize Captured Variance
For a centered data matrix $A$ and $V_k$ with orthonormal columns:
$$\|A - V_k V_k^\top A\|_F^2 = \|A\|_F^2 - \|V_k^\top A\|_F^2.$$
Minimizing reconstruction error is equivalent to maximizing $\|V_k^\top A\|_F^2$.
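One way to see why Theorem 9.4 holds is a Pythagorean argument. The sketch below introduces the shorthand $P = V_k V_k^\top$ (a symbol not used elsewhere in this lesson) and relies only on $V_k^\top V_k = I_k$, which gives $P^\top = P$ and $P^2 = P$.

```latex
\begin{align*}
A &= PA + (I - P)A, \\
\langle PA,\ (I - P)A \rangle_F
  &= \operatorname{tr}\!\big(A^\top P^\top (I - P) A\big)
   = \operatorname{tr}\!\big(A^\top (P - P^2) A\big) = 0, \\
\|A\|_F^2 &= \|PA\|_F^2 + \|(I - P)A\|_F^2, \\
\|PA\|_F^2
  &= \operatorname{tr}\!\big(A^\top V_k V_k^\top V_k V_k^\top A\big)
   = \operatorname{tr}\!\big(A^\top V_k V_k^\top A\big)
   = \|V_k^\top A\|_F^2 .
\end{align*}
```

Substituting the last line into the third and rearranging gives $\|A - V_k V_k^\top A\|_F^2 = \|A\|_F^2 - \|V_k^\top A\|_F^2$.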

Why This Matters

The equivalence between minimizing residuals and maximizing variance is what makes PCA "optimal" in the least-squares sense.

  • PCA finds directions of maximum variance, which are the same as the directions of minimum reconstruction error.
  • Sensor placement: find $k$ locations capturing the most variance in a field.
  • Visualization in 2D: maximizing the captured variance preserves the most structure.
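The optimality claim above can be illustrated numerically. This is a minimal sketch, assuming NumPy and using the standard fact that the top $k$ left singular vectors of $A$ minimize the residual; the synthetic data, the per-feature scales, and the helper name `residual` are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
m, N, k = 6, 300, 2  # hypothetical sizes

# Synthetic data whose features have very different variances.
scales = np.array([5.0, 3.0, 1.0, 0.5, 0.2, 0.1])[:, None]
X = scales * rng.normal(size=(m, N))
A = X - X.mean(axis=1, keepdims=True)  # center each feature (row)

def residual(V):
    """Squared Frobenius reconstruction error for an orthonormal basis V."""
    return np.linalg.norm(A - V @ V.T @ A, "fro") ** 2

# PCA directions: the top-k left singular vectors of the centered data.
U, s, _ = np.linalg.svd(A, full_matrices=False)
V_pca = U[:, :k]
pca_residual = residual(V_pca)

# Compare against 100 random orthonormal bases of the same size.
random_residuals = [
    residual(np.linalg.qr(rng.normal(size=(m, k)))[0]) for _ in range(100)
]

print(pca_residual <= min(random_residuals))
# The PCA residual equals the sum of the discarded squared singular values.
print(np.isclose(pca_residual, np.sum(s[k:] ** 2)))
```

A random search is of course not a proof; it only shows that none of the sampled bases does better than the PCA basis on this data.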

Quiz

Question 1

Minimizing the total reconstruction error is equivalent to maximizing the total variance captured.

Question 2

The reconstruction error $\|A - V_k V_k^\top A\|_F^2$ equals:

Common Mistakes

  • Thinking we maximize the variance of the original data. That quantity is fixed; we maximize the variance of the projected data $V_k^\top A$.
  • Confusing $V_k^\top A$ (coordinates, $k \times N$) with $V_k V_k^\top A$ (projection, $m \times N$).
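The difference between coordinates and projection is easiest to see from the array shapes. Below is a minimal sketch, assuming NumPy, with hypothetical sizes and a random orthonormal $V_k$ obtained from a QR factorization; the variable names `coords` and `projection` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
m, N, k = 4, 10, 2  # hypothetical sizes

A = rng.normal(size=(m, N))
A = A - A.mean(axis=1, keepdims=True)  # center each feature (row)
V_k, _ = np.linalg.qr(rng.normal(size=(m, k)))  # orthonormal columns

coords = V_k.T @ A         # coordinates in the k-dimensional basis
projection = V_k @ coords  # the same points expressed back in R^m

print(coords.shape)      # (2, 10)
print(projection.shape)  # (4, 10)

# Different shapes, same Frobenius norm, because V_k has orthonormal columns.
print(np.isclose(np.linalg.norm(coords), np.linalg.norm(projection)))
```

The two arrays carry the same information about the data: `coords` is what one would plot or store, while `projection` is what one compares against $A$ to measure reconstruction error.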