The Affine Reduction Problem
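The dimensionality reduction problem: given data points $x_1, \dots, x_n \in \mathbb{R}^d$, find a $k$-dimensional affine subspace closest to all points. Formally, minimize $\sum_{i=1}^{n} \|(I - UU^\top)(x_i - \mu)\|^2$ over the center $\mu \in \mathbb{R}^d$ and orthonormal frame $U \in \mathbb{R}^{d \times k}$.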
The data mean $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is always an optimal center. Once it is known, subtracting the mean turns the affine problem into a linear one: find the best linear subspace for the centered data $x_i - \bar{x}$.
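The two steps above (center, then solve a linear problem) can be sketched in NumPy. This is a minimal illustration, not a reference implementation: the function names `fit_affine_subspace` and `reconstruction_error` are made up for this example, and the linear step uses the top-$k$ right singular vectors of the centered data, which is the standard PCA solution rather than something derived in this section.

```python
import numpy as np

def fit_affine_subspace(X, k):
    """Return (mu, U): the data mean and a d x k orthonormal frame.

    X is an (n, d) array with one data point per row.
    """
    mu = X.mean(axis=0)   # center of the affine subspace: the data mean
    Xc = X - mu           # centered data, so the problem is now linear
    # Rows of Vt are right singular vectors, sorted by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    U = Vt[:k].T          # top-k directions as columns
    return mu, U

def reconstruction_error(X, mu, U):
    """Sum of squared distances from the rows of X to mu + span(U)."""
    Xc = X - mu
    residual = Xc - Xc @ U @ U.T
    return float(np.sum(residual ** 2))

# Points lying exactly on the line y = 2x + 5, which does not pass through the origin.
t = np.linspace(-1.0, 1.0, 20)
X = np.column_stack([t, 2.0 * t + 5.0])

mu, U = fit_affine_subspace(X, k=1)
err = reconstruction_error(X, mu, U)
```

Because the sample points sit on a single line, the fitted one-dimensional affine subspace should recover that line, so `err` is zero up to floating-point rounding.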
Formal View
Definition 9.1 — Affine Dimensionality Reduction Problem
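Given points $x_1, \dots, x_n \in \mathbb{R}^d$ and target dimension $k \le d$, find $\mu \in \mathbb{R}^d$ and $U \in \mathbb{R}^{d \times k}$ with orthonormal columns minimizing $\sum_{i=1}^{n} \|(I - UU^\top)(x_i - \mu)\|^2$.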
The optimal center is always $\mu = \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ (the mean).
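A short sketch of why the mean works, under assumed notation: write $x_1, \dots, x_n$ for the points, $\bar{x}$ for their mean, $\mu$ for a candidate center, $U$ for any fixed frame with orthonormal columns, and $P = I - UU^\top$ for the projector onto the orthogonal complement of its span. The objective then splits as follows.

```latex
\begin{aligned}
\sum_{i=1}^{n} \|P(x_i - \mu)\|^2
  &= \sum_{i=1}^{n} \|P(x_i - \bar{x}) + P(\bar{x} - \mu)\|^2 \\
  &= \sum_{i=1}^{n} \|P(x_i - \bar{x})\|^2
     + 2 \Big\langle P \sum_{i=1}^{n} (x_i - \bar{x}),\; P(\bar{x} - \mu) \Big\rangle
     + n \, \|P(\bar{x} - \mu)\|^2 \\
  &= \sum_{i=1}^{n} \|P(x_i - \bar{x})\|^2 + n \, \|P(\bar{x} - \mu)\|^2 .
\end{aligned}
```

The cross term vanishes because $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$. The remaining penalty $n\,\|P(\bar{x} - \mu)\|^2$ is nonnegative and equals zero at $\mu = \bar{x}$ for every choice of $U$, so the mean is an optimal center regardless of the frame. Any other point of the same affine subspace gives the same value, which is why the mean is the conventional choice rather than the only one.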
Why This Matters
Dimensionality reduction is fundamental to data science: it compresses data while preserving as much of its structure as possible.
- Face recognition: represent faces in a low-dimensional "face space".
- Genome-wide association studies: reduce thousands of genetic markers to principal components.
- Anomaly detection: points far from the low-dimensional subspace are outliers.
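The anomaly-detection use in the list above can be illustrated with a small sketch. The helper name `subspace_distances` and the synthetic data are assumptions made for this example, and it simply picks the farthest point instead of proposing a particular outlier threshold.

```python
import numpy as np

def subspace_distances(X, k):
    """Distance from each row of X to the best-fit k-dimensional affine subspace."""
    mu = X.mean(axis=0)
    Xc = X - mu
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    U = Vt[:k].T
    residual = Xc - Xc @ U @ U.T   # component of each point outside the subspace
    return np.linalg.norm(residual, axis=1)

# 50 synthetic points on the line y = x + 3, plus one point far from it.
t = np.linspace(-5.0, 5.0, 50)
inliers = np.column_stack([t, t + 3.0])
outlier = np.array([[0.0, 10.0]])
X = np.vstack([inliers, outlier])

dist = subspace_distances(X, k=1)
flagged = int(np.argmax(dist))   # index of the point farthest from the subspace
```

Note that the outlier is included when the subspace is fitted, so it pulls the fit slightly toward itself. In this example the effect is small and the last point still stands out clearly.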
Quiz
Question 1
What is the optimal center for the affine reduction problem?
Question 2
True or false: after subtracting the mean, the affine reduction problem becomes a linear reduction problem.
Common Mistakes
- Skipping centering and fitting a linear subspace through the origin. This gives suboptimal results unless the data already have zero mean.
- Confusing dimension reduction (finding a low-dimensional subspace whose coordinates combine the original features) with feature selection (choosing a subset of the original features).
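The first mistake can be checked numerically. The sketch below is illustrative only: `subspace_error` is a hypothetical helper, and it relies on the standard fact that the squared error of the best rank-$k$ fit equals the sum of the squared singular values beyond the top $k$.

```python
import numpy as np

def subspace_error(X, k, center):
    """Squared error of the best k-dimensional fit, with or without mean-centering."""
    mu = X.mean(axis=0) if center else np.zeros(X.shape[1])
    Xc = X - mu
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, largest first
    # Residual of the best rank-k fit: squared singular values beyond the top k.
    return float(np.sum(s[k:] ** 2))

# Points on the line y = 4 - x, which misses the origin.
t = np.linspace(-3.0, 3.0, 7)
X = np.column_stack([t, 4.0 - t])

err_centered = subspace_error(X, k=1, center=True)
err_uncentered = subspace_error(X, k=1, center=False)
```

Here the centered fit recovers the line exactly, so `err_centered` is zero up to rounding, while the line forced through the origin leaves a clearly positive `err_uncentered`.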