Linear Algebra
8.1810 min read

Normal Equations Always Give a PSD Matrix

For any m×nm \times n matrix MM (any mm, any nn, any entries), the matrix A=MMA = M^\top M is always positive semidefinite. The proof is elegant: for any vector xRn\mathbf{x} \in \mathbb{R}^n, x(MM)x=(Mx)(Mx)=Mx20.\mathbf{x}^\top (M^\top M) \mathbf{x} = (M\mathbf{x})^\top (M\mathbf{x}) = \|M\mathbf{x}\|^2 \geq 0. A squared norm is always non-negative. So xAx0\mathbf{x}^\top A \mathbf{x} \geq 0 for all x\mathbf{x}, making AA PSD.

When is MMM^\top M actually positive definite (strictly)? Exactly when MM has full column rank — i.e., the null space of MM contains only the zero vector. If Mx=0M\mathbf{x} = \mathbf{0} only for x=0\mathbf{x} = \mathbf{0}, then Mx2=0    x=0\|M\mathbf{x}\|^2 = 0 \implies \mathbf{x} = 0, so the PSD condition strengthens to PD.

This is why the normal equations MMx=MbM^\top M \mathbf{x} = M^\top \mathbf{b} have a unique solution when MM has full column rank — the coefficient matrix MMM^\top M is PD (hence invertible).

Formal View

Theorem 8.7 — Gram Matrix is Always PSD
For any m×nm \times n matrix MM, the Gram matrix A=MMA = M^\top M is symmetric and positive semidefinite (PSD). Furthermore, MMM^\top M is positive definite (PD) if and only if MM has full column rank (i.e., ker(M)={0}\ker(M) = \{\mathbf{0}\}).

Proof of PSD: xMMx=Mx20\mathbf{x}^\top M^\top M \mathbf{x} = \|M\mathbf{x}\|^2 \geq 0. MATLAB: `A = M' * M` always gives PSD matrix.

Interactive Visualization

Matrix-Vector Multiplication

Why This Matters

The fact that MMM^\top M is always PSD is why least squares always works — the normal equations always have a solution, and have a unique solution when MM has full column rank.

  • Linear regression: AAA^\top A is PSD, PD when columns are linearly independent — guaranteeing unique least squares solution.
  • Kernel methods: the Gram matrix of any kernel function is PSD — this is the mathematical foundation for support vector machines.
  • Covariance estimation: sample covariance 1nXX\frac{1}{n}X^\top X is always PSD.

Quiz

Question 1

For any matrix MM, the matrix MMM^\top M is positive semidefinite.

Question 2

When is MMM^\top M positive definite (rather than just PSD)?

Common Mistakes

  • Thinking MMM^\top M is PD in general — it is only PD when MM has full column rank. Tall matrices with dependent columns give PSD (not PD) MMM^\top M.
  • Forgetting that MMM^\top M being PSD does not mean it is invertible — you need full column rank for invertibility.