The Hessian Matrix
Just as the gradient vector organizes all first partial derivatives, the Hessian matrix organizes all second partial derivatives of a function $f: \mathbb{R}^n \to \mathbb{R}$ into an $n \times n$ matrix.
The $(i, j)$ entry of the Hessian is $\partial^2 f / \partial x_i \partial x_j$. Because continuous mixed partials are equal (Clairaut's theorem), the Hessian is always symmetric for $f \in C^2$.
The Hessian captures the "curvature" of $f$ at a point. Just as the gradient tells you the direction in which $f$ increases fastest (first-order information), the Hessian tells you how fast that rate is changing (second-order information).
For $f: \mathbb{R}^n \to \mathbb{R}$, the Hessian function is $H_f: \mathbb{R}^n \to \mathbb{R}^{n \times n}$, a matrix-valued function of $x$.
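To make the definition concrete, here is a minimal sketch that approximates a Hessian numerically with central differences. The function `f`, the helper name `hessian_fd`, and the step size `h` are illustrative assumptions, not part of the lesson; in practice an automatic-differentiation library would usually be preferred.

```python
def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**3 + 2 * x * y + y**2


def hessian_fd(func, point, h=1e-3):
    """Approximate the Hessian of func at point using central differences."""
    n = len(point)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Evaluate func with coordinate i moved by si*h and j by sj*h.
                p = list(point)
                p[i] += si * h
                p[j] += sj * h
                return func(*p)

            # Four-point stencil for the second partial d^2 f / dx_i dx_j.
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H


# Analytically, the Hessian of f is [[6x, 2], [2, 2]], so at (1, 2) it is [[6, 2], [2, 2]].
H = hessian_fd(f, [1.0, 2.0])
```

The off-diagonal entries `H[0][1]` and `H[1][0]` agree up to rounding, which is the symmetry described above, and the top-left entry changes with `x`, so the Hessian of this `f` is not constant.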
Formal View
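One standard way to state the "curvature" role formally is the second-order Taylor expansion. The sketch below assumes $f: \mathbb{R}^n \to \mathbb{R}$ is twice continuously differentiable and writes $H_f(x)$ for the Hessian of $f$ at $x$ (this notation is an assumption made here):

```latex
f(x + h) = f(x) + \nabla f(x)^{\top} h + \tfrac{1}{2}\, h^{\top} H_f(x)\, h + o(\|h\|^2)
\qquad \text{as } h \to 0 .
```

The gradient controls the linear term and the Hessian controls the quadratic term, which is why the Hessian is the natural second-order object.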
Why This Matters
The Hessian is the multivariable analog of the second derivative in single-variable calculus — the key object for understanding curvature and classifying critical points.
- Newton's method updates: $x_{k+1} = x_k - H_f(x_k)^{-1} \nabla f(x_k)$, where $H_f(x_k)$ is the Hessian at $x_k$
- Quasi-Newton methods (e.g., BFGS, L-BFGS) approximate the Hessian or its inverse from gradient information, avoiding the cost of forming and inverting the exact Hessian
- Second-order sensitivity analysis in engineering and economics
- Hessian-vector products for efficient curvature computation in deep learning
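As an illustration of the Newton update in the list above, the following sketch runs Newton's method on a small two-variable problem. The test function $f(x, y) = x^4 + x^2 + xy + y^2$, the starting point, and the iteration count are assumptions chosen for the example; its gradient and Hessian are written out by hand.

```python
def grad(x, y):
    # Gradient of the illustrative function f(x, y) = x**4 + x**2 + x*y + y**2.
    return (4 * x**3 + 2 * x + y, x + 2 * y)


def hess(x, y):
    # Hessian of the same f; it is symmetric and its top-left entry depends on x.
    return ((12 * x**2 + 2, 1.0), (1.0, 2.0))


def newton_step(x, y):
    """Return the next iterate by solving the 2x2 system H d = g and subtracting d."""
    g1, g2 = grad(x, y)
    (a, b), (c, d) = hess(x, y)
    det = a * d - b * c  # at least 3 for this f, so the system is always solvable
    # Cramer's rule for the 2x2 linear system.
    d1 = (d * g1 - b * g2) / det
    d2 = (a * g2 - c * g1) / det
    return x - d1, y - d2


point = (1.0, 1.0)
for _ in range(20):
    point = newton_step(*point)
```

This `f` is strictly convex with its minimum at the origin, so the iterates approach `(0, 0)`. Solving the linear system rather than forming the inverse explicitly is the usual design choice, since it is cheaper and numerically safer in higher dimensions.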
Quiz
The Hessian is always symmetric when:
For on , what is the size of the Hessian matrix?
For , what is ?
The Hessian at a point is a single number (scalar).
Common Mistakes
- Confusing the Hessian with the Jacobian — the Jacobian is the matrix of first partials, the Hessian is the matrix of second partials.
- Forgetting to divide by 2 when reconstructing a quadratic function from its Hessian (see the quadratic section).
- Thinking the Hessian is always constant: it depends on position unless $f$ is quadratic.
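The last two mistakes can be checked on a small worked example. The functions $f$ and $g$ below are illustrative choices, not taken from the lesson, and $H_f$, $H_g$ denote their Hessians:

```latex
f(x, y) = x^2 + 3xy + 2y^2
\quad\Longrightarrow\quad
H_f = \begin{pmatrix} 2 & 3 \\ 3 & 4 \end{pmatrix},
\qquad
\tfrac{1}{2}
\begin{pmatrix} x & y \end{pmatrix}
\begin{pmatrix} 2 & 3 \\ 3 & 4 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
= \tfrac{1}{2}\left(2x^2 + 6xy + 4y^2\right) = f(x, y).

g(x, y) = x^3 + y^2
\quad\Longrightarrow\quad
H_g(x, y) = \begin{pmatrix} 6x & 0 \\ 0 & 2 \end{pmatrix}.
```

Without the factor $\tfrac{1}{2}$ the quadratic form would give $2f$ rather than $f$. The Hessian of $g$ changes with $x$ because $g$ is not quadratic.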