Principal Coordinates from Gram Matrix

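Let $A$ be the $m \times N$ data matrix whose columns $\mathbf{a}_i$ are the data vectors. Its Gram matrix $G = A^\top A$ has the spectral decomposition $G = V\Lambda_G V^\top$, with eigenvalues $\sigma_i^2$ equal to the squared singular values of $A$. The principal coordinates can be recovered directly from it: the columns of $V_k \Lambda_{G,k}^{1/2}$ give the scores on the top $k$ components, where $V_k$ and $\Lambda_{G,k}$ keep the $k$ largest eigenpairs. This requires only the pairwise inner products $G_{ij} = \mathbf{a}_i^\top \mathbf{a}_j$, not the raw data vectors.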
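As a minimal sketch of this recovery, the NumPy snippet below builds the scores from $G$ alone and compares them with scores obtained by projecting $A$ onto its top left singular vectors. The function name `principal_coordinates_from_gram`, the random test data, and the choice of $k = 2$ are illustrative assumptions, and the sketch follows the text in using the uncentered $G = A^\top A$.

```python
import numpy as np

def principal_coordinates_from_gram(G, k):
    """Return the N x k matrix V_k Lambda_k^{1/2} from an N x N Gram matrix."""
    # eigh handles symmetric matrices and returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(G)
    order = np.argsort(eigvals)[::-1][:k]       # indices of the k largest eigenvalues
    lam = np.clip(eigvals[order], 0.0, None)    # guard against tiny negative round-off
    return eigvecs[:, order] * np.sqrt(lam)     # scale each eigenvector column

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 6))        # assumed data: m = 50 features, N = 6 samples (columns a_i)
G = A.T @ A                         # from here on only inner products are used

scores_gram = principal_coordinates_from_gram(G, k=2)

# Reference: project the samples onto the top-2 left singular vectors of A
U, s, Vt = np.linalg.svd(A, full_matrices=False)
scores_svd = (U[:, :2].T @ A).T     # N x 2

# Eigenvectors are only defined up to sign, so compare absolute values
agree = np.allclose(np.abs(scores_gram), np.abs(scores_svd))
```

Row $i$ of `scores_gram` holds the $k$ coordinates of sample $\mathbf{a}_i$; the sign of each column is arbitrary, which is why the comparison ignores signs.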

This is the key insight behind kernel PCA: replace $G_{ij} = \mathbf{a}_i^\top \mathbf{a}_j$ with $K_{ij} = k(\mathbf{a}_i, \mathbf{a}_j)$ for a kernel function $k(\cdot, \cdot)$, and the same eigendecomposition yields non-linear principal coordinates.
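A rough sketch of that substitution, assuming a Gaussian (RBF) kernel with a hand-picked `gamma` and a toy data set of two concentric rings; the helper names `rbf_kernel` and `kernel_pca` are hypothetical, and the kernel matrix is centered before the eigendecomposition, as is usual for kernel PCA.

```python
import numpy as np

def rbf_kernel(A, gamma):
    """K_ij = exp(-gamma * ||a_i - a_j||^2) for the columns a_i of A."""
    sq_norms = np.sum(A**2, axis=0)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (A.T @ A)
    return np.exp(-gamma * np.clip(sq_dists, 0.0, None))

def kernel_pca(K, k):
    """Top-k kernel principal coordinates (N x k) from an N x N kernel matrix."""
    N = K.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N        # centering matrix
    Kc = J @ K @ J                             # center the data in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:k]      # k largest eigenvalues
    lam = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(lam)

# Assumed toy data: 20 points on a ring of radius 1 and 20 on a ring of radius 3
rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 2.0 * np.pi, size=40)
radii = np.repeat([1.0, 3.0], 20)
A = np.vstack([radii * np.cos(theta), radii * np.sin(theta)])   # 2 x 40, columns a_i

coords = kernel_pca(rbf_kernel(A, gamma=0.5), k=2)              # 40 x 2
```

Only the kernel matrix enters `kernel_pca`, so any other positive semidefinite kernel could be swapped in without touching the eigendecomposition step.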

Formal View

Theorem 9.7 — Principal Coordinates from Gram

Let $G = A^\top A = V\Lambda_G V^\top$. The principal coordinates are the columns of $C_k^\top = V_k \Lambda_{G,k}^{1/2}$.

Kernel PCA uses $K_{ij} = k(\mathbf{a}_i, \mathbf{a}_j)$ instead of $G_{ij} = \mathbf{a}_i^\top \mathbf{a}_j$.
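A short justification, assuming the thin SVD $A = U\Sigma V^\top$ (the symbols $U$ and $\Sigma$ are introduced here for the derivation and are not part of the theorem statement):

$$
G = A^\top A = V\Sigma U^\top U\Sigma V^\top = V\Sigma^2 V^\top
\quad\Longrightarrow\quad \Lambda_G = \Sigma^2,
$$

$$
C_k = U_k^\top A = U_k^\top U\Sigma V^\top = \Sigma_k V_k^\top
\quad\Longrightarrow\quad C_k^\top = V_k\Sigma_k = V_k\Lambda_{G,k}^{1/2}.
$$

Projecting the data onto the top $k$ principal directions $U_k$ therefore gives exactly the quantity computed from the eigenpairs of $G$, without ever forming $U$.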

Why This Matters

Gram-matrix PCA enables kernel PCA and non-linear dimensionality reduction.

Kernel PCA: non-linear principal coordinates via kernel functions.
When $m \gg N$: only the $N \times N$ Gram matrix needs to be eigendecomposed, instead of the $m \times m$ matrix $AA^\top$.
Metric embeddings: recover coordinates from pairwise inner products.
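The last two points can be illustrated together. In the sketch below, with assumed sizes $m = 2000$ and $N = 5$ and random data, the raw vectors are used only to form $G$; coordinates are then rebuilt from $G$ alone and checked against the original inner products and one pairwise distance.

```python
import numpy as np

rng = np.random.default_rng(2)
m, N = 2000, 5                        # assumed sizes with m >> N
A = rng.normal(size=(m, N))           # raw data: N samples in m dimensions
G = A.T @ A                           # N x N matrix of pairwise inner products

# The eigenproblem is N x N, so its cost does not depend on m
eigvals, V = np.linalg.eigh(G)
lam = np.clip(eigvals, 0.0, None)     # remove tiny negative round-off
Y = V * np.sqrt(lam)                  # row i: recovered coordinates of sample i

# The recovered coordinates reproduce every inner product ...
inner_products_match = np.allclose(Y @ Y.T, G)

# ... and therefore every pairwise distance, e.g. between samples 0 and 1
d_raw = np.linalg.norm(A[:, 0] - A[:, 1])
d_emb = np.linalg.norm(Y[0] - Y[1])
```

The recovered coordinates live in at most $N$ dimensions and match the original configuration only up to an orthogonal transformation, which is all that inner products can determine.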

Learning Resources

Kernel PCA

StatQuest

Kernel PCA extending standard PCA via Gram matrices.

20 min

PCA via Gram matrix

MIT OpenCourseWare

Dual formulation of PCA and Gram matrices.

45 min

Quiz

Question 1

Principal coordinates can be computed from $G = A^\top A$ without knowing the raw data vectors $\mathbf{a}_i$.

Question 2

Kernel PCA replaces $G_{ij} = \mathbf{a}_i^\top \mathbf{a}_j$ with:

Common Mistakes

Forgetting to center the Gram matrix before kernel PCA.
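Confusing $A^\top A$ (the $N \times N$ Gram matrix) with $AA^\top$ (the $m \times m$ scatter matrix, proportional to the covariance when the data are centered).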
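Both mistakes can be checked numerically. The sketch below, using assumed random data with a non-zero mean and a hypothetical helper `center_gram`, verifies that double-centering $G$ gives the same matrix as centering the data first, and prints out how the shapes of $A^\top A$ and $AA^\top$ differ.

```python
import numpy as np

def center_gram(K):
    """Double-center an N x N Gram or kernel matrix: J K J with J = I - (1/N) 1 1^T."""
    N = K.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N
    return J @ K @ J

rng = np.random.default_rng(3)
A = rng.normal(loc=5.0, size=(4, 7))             # m = 4 features, N = 7 samples (columns a_i)

G = A.T @ A                                      # Gram matrix of the uncentered data
A_centered = A - A.mean(axis=1, keepdims=True)   # subtract the mean sample from every column
G_from_centered = A_centered.T @ A_centered

# Centering G directly is equivalent to centering the data first
same = np.allclose(center_gram(G), G_from_centered)

gram_shape = (A.T @ A).shape       # N x N: one entry per pair of samples
scatter_shape = (A @ A.T).shape    # m x m: one entry per pair of features
print(gram_shape, scatter_shape)
```

For a kernel matrix the raw feature-space vectors are not available, so the `center_gram` route is the only one of the two that applies.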