Linear Algebra
9.188 min read

Dual PCA

The Dual PCA gives the same reconstruction as Classical PCA but computed via the N×NN \times N Gram matrix G=AAG = A^\top A instead of the m×mm \times m covariance K=AAK = AA^\top. It uses the right singular vectors VV and the formula q~=m+AVkVkA+(qm)\tilde{\mathbf{q}} = \mathbf{m} + A V_k V_k^\top A^+ (\mathbf{q} - \mathbf{m}).

Dual PCA is efficient when N<mN < m (fewer data points than dimensions), since the Gram matrix is smaller. Both formulations produce identical reconstructions.

Formal View

Definition 9.7 — Dual PCA
For centered data A=UΣVA = U\Sigma V^\top, Dual PCA computes via the N×NN \times N Gram matrix G=AA=VΛGVG = A^\top A = V\Lambda_G V^\top. Principal coordinates: Ck=VkΛG,k1/2C_k^\top = V_k \Lambda_{G,k}^{1/2}. Same reconstructions as classical PCA.

Classical PCA uses m×mm \times m covariance; Dual PCA uses N×NN \times N Gram. Choose the smaller one.

Why This Matters

Dual PCA handles the common case where N<mN < m — more common in genomics, imaging, and text mining.

  • Genomics: N=1000N=1000 patients, m=100000m=100000 genes — Gram is 1000×10001000 \times 1000, much smaller.
  • Kernel PCA replaces the Gram with a kernel matrix for non-linear PCA.
  • When N<mN < m, dual PCA is strictly faster.

Quiz

Question 1

When is Dual PCA more efficient?

Question 2

Classical and Dual PCA always give the same reconstructions.

Common Mistakes

  • Thinking dual PCA gives different results — reconstructions are identical.
  • Forgetting to center data before either variant.