Hessians and Convexity
For twice-differentiable functions, there is a beautiful Hessian-based test for convexity: a global version of the second derivative test.
The theorem: a twice-differentiable function $f$ over a convex domain is convex if and only if its Hessian is PSD at every point. If the Hessian is everywhere PD, then $f$ is strictly convex, but the converse fails: $f(x) = x^4$ is strictly convex even though its second derivative vanishes at $x = 0$.
Compare this to the second derivative test: there, having a PSD Hessian at one specific critical point tells us very little. Here, knowing the Hessian is PSD everywhere is much stronger — it gives global convexity.
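As a rough illustration of what "PSD everywhere" means in practice, the sketch below samples points and checks the smallest Hessian eigenvalue at each one. The function $f(x, y) = x^4 + y^2$, the sampling box, and the tolerance are all assumptions chosen for this example, and sampling can only fail to find a counterexample; it does not prove convexity.

```python
import numpy as np

def hessian_f(x, y):
    # Analytic Hessian of the illustrative function f(x, y) = x**4 + y**2.
    return np.array([[12.0 * x**2, 0.0],
                     [0.0, 2.0]])

def min_eigenvalue(H):
    # eigvalsh assumes a symmetric matrix and returns eigenvalues in ascending order.
    return np.linalg.eigvalsh(H)[0]

# Sample points from an arbitrary box; the box size is an assumption of this sketch.
rng = np.random.default_rng(0)
samples = rng.uniform(-3.0, 3.0, size=(1000, 2))
min_eigs = np.array([min_eigenvalue(hessian_f(x, y)) for x, y in samples])

tol = 1e-12  # small slack for floating-point round-off
psd_on_samples = bool(np.all(min_eigs >= -tol))

# At the origin the Hessian is diag(0, 2): PSD but not PD.
min_eig_at_origin = min_eigenvalue(hessian_f(0.0, 0.0))

print("PSD at every sampled point:", psd_on_samples)
print("Smallest eigenvalue at the origin:", min_eig_at_origin)
```

This particular function is strictly convex even though its Hessian is only PSD at points with $x = 0$, which matches the remark that PD everywhere is sufficient but not necessary for strict convexity.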
Combined with the convexity-implies-global-minimum theorem: if you find a critical point of a function whose Hessian is everywhere PSD, you have found a global minimum. Any method that converges to a critical point, gradient descent included, therefore converges to a global minimizer.
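A minimal sketch of this, assuming a hypothetical convex quadratic $f(x) = \tfrac{1}{2} x^\top Q x - b^\top x$ with a PD matrix $Q$ (so the Hessian is $Q$ everywhere): gradient descent with step size $1/L$, where $L$ is the largest eigenvalue of $Q$, is compared against the critical point obtained by solving $Qx = b$ directly. The matrix, vector, starting point, and iteration count are illustrative choices.

```python
import numpy as np

# Illustrative PD matrix and vector; the Hessian of f is Q at every point.
Q = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])

def grad(x):
    # Gradient of f(x) = 0.5 * x^T Q x - b^T x.
    return Q @ x - b

# Step size 1/L, where L is the largest eigenvalue of the Hessian.
L = np.linalg.eigvalsh(Q)[-1]
step = 1.0 / L

x = np.array([5.0, -4.0])  # arbitrary starting point
for _ in range(500):
    x = x - step * grad(x)

# The unique critical point solves Q x = b; convexity makes it the global minimizer.
x_star = np.linalg.solve(Q, b)

print("gradient descent iterate:", x)
print("critical point from solve:", x_star)
```

The agreement between the two points is specific to this well-conditioned example; on a general convex function the step size and the number of iterations would need more care.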
Formal View
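One standard way to state the criterion formally is sketched below. The symbols $D$, $f$, and $\nabla^2 f$ are notation assumed here, as are the hypotheses that $D$ is open and that $f$ is twice continuously differentiable, which is the usual textbook setting.

```latex
\textbf{Theorem.} Let $D \subseteq \mathbb{R}^n$ be open and convex, and let
$f : D \to \mathbb{R}$ be twice continuously differentiable. Then
\[
  f \text{ is convex on } D
  \iff
  \nabla^2 f(x) \succeq 0 \quad \text{for all } x \in D .
\]
Moreover,
\[
  \nabla^2 f(x) \succ 0 \quad \text{for all } x \in D
  \implies
  f \text{ is strictly convex on } D ,
\]
and the reverse implication fails: $f(x) = x^4$ is strictly convex on
$\mathbb{R}$, yet $f''(0) = 0$.

\textbf{Corollary.} If $f$ is convex on $D$ and $\nabla f(x^\ast) = 0$ for some
$x^\ast \in D$, then $f(x^\ast) \le f(x)$ for all $x \in D$.
```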
Why This Matters
This criterion is how you verify that a problem is convex. Once verified, you know that any critical point gradient descent converges to is a global optimum.
- Verifying ML loss functions are convex: check Hessian is globally PSD
- Least squares: the Hessian of $\|Ax - b\|^2$ is $2A^\top A$ (PSD everywhere) → every critical point is a global min, unique when $A$ has full column rank
- Neural networks: non-convex loss landscape — Hessian is indefinite at many points
- Convex relaxations: approximate a non-convex problem with one having a PSD Hessian
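The least squares case in the list above can be checked numerically. The sketch below assumes the objective $\|Ax - b\|^2$, whose Hessian is $2A^\top A$, and uses randomly generated matrices that are purely illustrative. It compares a full-column-rank matrix with a rank-deficient one to show the difference between PD and merely PSD.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative design matrices: one with full column rank, one without.
A_full = rng.normal(size=(6, 3))
A_deficient = A_full.copy()
A_deficient[:, 2] = A_full[:, 0] + A_full[:, 1]  # third column is a linear combination

def least_squares_hessian(A):
    # Hessian of f(x) = ||A x - b||^2; it does not depend on x or b.
    return 2.0 * A.T @ A

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
eig_full = np.linalg.eigvalsh(least_squares_hessian(A_full))
eig_deficient = np.linalg.eigvalsh(least_squares_hessian(A_deficient))

print("eigenvalues, full column rank:", eig_full)
print("eigenvalues, rank deficient:  ", eig_deficient)
```

In both cases no eigenvalue is negative beyond round-off, so the objective is convex. Only the full-rank case has all eigenvalues strictly positive, which is what makes the minimizer unique.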
Quiz
A function on a convex domain is convex if and only if:
If the Hessian of $f$ is PD everywhere, then $f$ is strictly convex.
For , is convex on ?
You find a critical point of $f$ and verify that its Hessian is everywhere PSD. The critical point is:
The least squares objective is convex.
Common Mistakes
- Confusing "PSD at a critical point (inconclusive 2nd derivative test)" with "PSD everywhere (convex function)" — these are very different claims.
- Thinking PD everywhere is required for convexity — PSD everywhere suffices.
- Assuming a convex function must have a global minimum. It might not (e.g., $e^x$ on $\mathbb{R}$ has no minimum).
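The first mistake in the list above can be made concrete with a small sketch. The function $f(x, y) = x^2 - y^4$ is an example chosen here for illustration: its Hessian is PSD at the critical point $(0, 0)$, yet the function is not convex and the origin is not a minimum.

```python
import numpy as np

def f(x, y):
    # Illustrative non-convex function with a critical point at the origin.
    return x**2 - y**4

def hessian_f(x, y):
    # Analytic Hessian of f(x, y) = x**2 - y**4.
    return np.array([[2.0, 0.0],
                     [0.0, -12.0 * y**2]])

# PSD at the critical point: the eigenvalues there are 0 and 2.
eigs_at_origin = np.linalg.eigvalsh(hessian_f(0.0, 0.0))

# Not PSD everywhere: away from y = 0 one eigenvalue is negative.
eigs_away = np.linalg.eigvalsh(hessian_f(0.0, 1.0))

# The origin is not even a local minimum: moving along y lowers f.
value_nearby = f(0.0, 0.1)

print("eigenvalues at the origin:", eigs_at_origin)
print("eigenvalues at (0, 1):    ", eigs_away)
print("f(0, 0.1) < f(0, 0):      ", value_nearby < f(0.0, 0.0))
```

A PSD Hessian at one point leaves the second derivative test inconclusive, while the negative eigenvalue at $(0, 1)$ is enough to rule out convexity.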