The Hessian Matrix

Just as the gradient vector organizes all first partial derivatives, the Hessian matrix organizes all second partial derivatives into an $n \times n$ matrix.

The $(i,j)$ entry of the Hessian is $\frac{\partial^2 f}{\partial x_i \partial x_j}$. Because mixed partials are equal (Clairaut's theorem), the Hessian is always symmetric for $f \in D^2$.

The Hessian captures the "curvature" of $f$ at a point. Just as the gradient tells you which direction $f$ increases fastest (first-order information), the Hessian tells you how fast that rate is changing (second-order information).
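
One standard way to make "second-order information" concrete is the second-order Taylor approximation, sketched below. It assumes $f$ is smooth enough near $\mathbf{x}_0$ (for instance, continuous second partials); the displacement vector $\mathbf{h}$ is a symbol introduced here for illustration and does not appear elsewhere in this section.

```latex
f(\mathbf{x}_0 + \mathbf{h}) \approx f(\mathbf{x}_0)
  + \nabla f(\mathbf{x}_0) \cdot \mathbf{h}
  + \tfrac{1}{2}\, \mathbf{h}^{T}\, Hf(\mathbf{x}_0)\, \mathbf{h}
```

The gradient term is the linear (first-order) correction, and the Hessian term is the quadratic correction that describes how $f$ bends away from its tangent plane in the direction $\mathbf{h}$.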

For $f(x,y) = xy^3 + x^2 - y$, the Hessian function is $Hf(x,y) = \begin{bmatrix} 2 & 3y^2 \\ 3y^2 & 6xy \end{bmatrix}$, a matrix-valued function of $(x,y)$.

Formal View

Definition 16.3 — Hessian Matrix
Given $f(\mathbf{x})$ on $\mathbb{R}^n$ with all second partials at $\mathbf{x}_0$, the Hessian $Hf(\mathbf{x}_0)$ is the $n \times n$ matrix with $(i,j)$ entry $\frac{\partial^2 f}{\partial x_i \partial x_j}(\mathbf{x}_0)$. When $f \in D^2$, $Hf$ is symmetric.
Example 16.1 — Computing the Hessian
For $f(x,y) = xy^3 + x^2 - y$: first partials are $f_x = y^3 + 2x$ and $f_y = 3xy^2 - 1$. Second partials: $f_{xx} = 2$, $f_{xy} = f_{yx} = 3y^2$, $f_{yy} = 6xy$. Hessian: $Hf(x,y) = \begin{bmatrix} 2 & 3y^2 \\ 3y^2 & 6xy \end{bmatrix}$.
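
A hand-computed Hessian can be sanity-checked numerically. The following is a minimal sketch in plain Python that compares the matrix from Example 16.1 against a central-difference approximation. The helper names (`analytic_hessian`, `numeric_hessian`), the step size `h`, and the test point $(1, 2)$ are illustrative choices, not part of the original text.

```python
def f(x, y):
    """Example function from the text: f(x, y) = x*y^3 + x^2 - y."""
    return x * y**3 + x**2 - y


def analytic_hessian(x, y):
    """Hessian from Example 16.1: [[2, 3y^2], [3y^2, 6xy]]."""
    return [[2.0, 3.0 * y**2],
            [3.0 * y**2, 6.0 * x * y]]


def numeric_hessian(func, point, h=1e-4):
    """Approximate the Hessian of func at point with central differences.

    The step size h is an assumed default; it trades truncation error
    against floating-point round-off.
    """
    n = len(point)
    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate func with coordinate i moved by di*h and j by dj*h.
                p = list(point)
                p[i] += di * h
                p[j] += dj * h
                return func(*p)
            hess[i][j] = (shifted(1, 1) - shifted(1, -1)
                          - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
    return hess


point = (1.0, 2.0)
H_exact = analytic_hessian(*point)
H_approx = numeric_hessian(f, point)
max_error = max(abs(H_exact[i][j] - H_approx[i][j])
                for i in range(2) for j in range(2))
print(H_exact)    # [[2.0, 12.0], [12.0, 12.0]]
print(max_error)  # small if the analytic second partials are right
```

A large `max_error` would usually point to a slip in one of the second partials, such as a dropped factor in $f_{yy}$.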

Why This Matters

The Hessian is the multivariable analog of the second derivative $f''(x)$ in single-variable calculus: the key object for understanding curvature and classifying critical points.

  • Newton's method updates: $\mathbf{x}_{k+1} = \mathbf{x}_k - [Hf(\mathbf{x}_k)]^{-1} \nabla f(\mathbf{x}_k)$
  • Quasi-Newton methods (e.g., BFGS, L-BFGS) build an approximation to the Hessian or its inverse from gradient information, avoiding the cost of computing second partials
  • Second-order sensitivity analysis in engineering and economics
  • Hessian-vector products, which provide curvature information in deep learning without forming the full Hessian
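
The Newton update in the first bullet can be sketched in plain Python for the example function $f(x,y) = xy^3 + x^2 - y$. This is an illustration under stated assumptions rather than a production optimizer: the starting point $(0.5, -1)$, the tolerance, and the function names are hypothetical choices, and the $2 \times 2$ system is solved with Cramer's rule instead of forming the inverse explicitly.

```python
def gradient(x, y):
    """Gradient of f(x, y) = x*y^3 + x^2 - y."""
    return (y**3 + 2.0 * x, 3.0 * x * y**2 - 1.0)


def hessian(x, y):
    """Hessian of the same f: [[2, 3y^2], [3y^2, 6xy]]."""
    return ((2.0, 3.0 * y**2),
            (3.0 * y**2, 6.0 * x * y))


def newton_step(x, y):
    """One update x_{k+1} = x_k - H^{-1} grad, solving H s = grad by Cramer's rule."""
    gx, gy = gradient(x, y)
    (a, b), (c, d) = hessian(x, y)
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("Hessian is (nearly) singular; Newton step undefined")
    sx = (d * gx - b * gy) / det
    sy = (a * gy - c * gx) / det
    return x - sx, y - sy


def newton(x, y, tol=1e-12, max_iter=50):
    """Iterate until the gradient norm drops below tol (assumed stopping rule)."""
    for _ in range(max_iter):
        gx, gy = gradient(x, y)
        if gx * gx + gy * gy < tol * tol:
            break
        x, y = newton_step(x, y)
    return x, y


x_star, y_star = newton(0.5, -1.0)  # starting point chosen for illustration
print(x_star, y_star)
print(gradient(x_star, y_star))     # both components should be close to zero
```

Note that the plain iteration looks for a point where the gradient vanishes, not specifically a minimum. For this $f$ the critical point it approaches has a Hessian with negative determinant, so it is a saddle; practical optimizers typically add safeguards such as line searches or Hessian modifications.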

Quiz

Question 1

The Hessian $Hf(\mathbf{x}_0)$ is always symmetric when:

Question 2

For $f$ on $\mathbb{R}^3$, what is the size of the Hessian matrix?

Question 3

For $f(x,y) = 3x^2 + 2xy + 5y^2$, what is $Hf$?

Question 4

The Hessian $Hf$ at a point $\mathbf{x}_0$ is a single number (scalar).

Common Mistakes

  • Confusing the Hessian with the Jacobian: the Jacobian is the matrix of first partials of a vector-valued function, while the Hessian is the matrix of second partials of a scalar-valued function.
  • Forgetting to divide by 2 when reconstructing $f$ from its Hessian (see the quadratic section).
  • Thinking the Hessian is always constant: it depends on position unless $f$ is a polynomial of degree at most 2.
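
The last two points can be illustrated with a small worked example. The symmetric matrix $A$ below is a hypothetical choice made for illustration and is not taken from the original text.

```latex
A = \begin{bmatrix} 4 & 1 \\ 1 & 2 \end{bmatrix}, \qquad
f(x,y) = \tfrac{1}{2} \begin{bmatrix} x & y \end{bmatrix} A \begin{bmatrix} x \\ y \end{bmatrix}
       = 2x^2 + xy + y^2

f_{xx} = 4, \quad f_{xy} = f_{yx} = 1, \quad f_{yy} = 2
\quad\Longrightarrow\quad Hf(x,y) = A \ \text{at every point}
```

Without the factor $\tfrac{1}{2}$, the function $\begin{bmatrix} x & y \end{bmatrix} A \begin{bmatrix} x \\ y \end{bmatrix} = 4x^2 + 2xy + 2y^2$ has Hessian $2A$, which is the mismatch that omitting the division by 2 produces. The Hessian here is also constant in $(x,y)$, as expected for a quadratic.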