
U-Derivative and the LLA

The local linear approximation (LLA) gives a direct formula for directional derivatives. If $f$ is differentiable at $\mathbf{a}$, then for small $t$ we have $f(\mathbf{a}+t\mathbf{u}) \approx f(\mathbf{a}) + Df(\mathbf{a})(t\mathbf{u}) = f(\mathbf{a}) + t \cdot Df(\mathbf{a})\mathbf{u}$, where the last step uses the linearity of $Df(\mathbf{a})$.

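More precisely, differentiability means $f(\mathbf{a}+t\mathbf{u}) = f(\mathbf{a}) + t \cdot Df(\mathbf{a})\mathbf{u} + o(t)$, where $o(t)/t \to 0$ as $t \to 0$. Substituting into the definition: $D_\mathbf{u} f(\mathbf{a}) = \lim_{t\to 0} \frac{f(\mathbf{a}+t\mathbf{u})-f(\mathbf{a})}{t} = \lim_{t\to 0} \frac{t \cdot Df(\mathbf{a})\mathbf{u} + o(t)}{t} = Df(\mathbf{a})\mathbf{u}$.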

In terms of the gradient: $D_\mathbf{u} f(\mathbf{a}) = Df(\mathbf{a})\mathbf{u} = \nabla f(\mathbf{a}) \cdot \mathbf{u}$. This is the fundamental formula connecting the Jacobian, the gradient, and directional derivatives.
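The limit argument above can be checked numerically. The sketch below is only an illustration: the function `f`, the point `a`, and the direction `u` are hypothetical choices, not taken from the lesson. It compares the difference quotient from the definition with the value of $\nabla f(\mathbf{a}) \cdot \mathbf{u}$ for shrinking $t$.

```python
import numpy as np

def f(p):
    # Hypothetical example function: f(x, y) = x^2 * y + sin(y)
    x, y = p
    return x**2 * y + np.sin(y)

def grad_f(p):
    # Gradient of the example function, computed by hand
    x, y = p
    return np.array([2 * x * y, x**2 + np.cos(y)])

a = np.array([1.0, 2.0])          # assumed base point
u = np.array([3.0, 4.0]) / 5.0    # assumed direction, already a unit vector

def difference_quotient(t):
    # (f(a + t u) - f(a)) / t, the expression inside the limit
    return (f(a + t * u) - f(a)) / t

formula_value = float(grad_f(a) @ u)
quotients = [difference_quotient(t) for t in (1e-1, 1e-2, 1e-3, 1e-4)]

for t, q in zip((1e-1, 1e-2, 1e-3, 1e-4), quotients):
    print(f"t = {t:g}: quotient = {q:.6f}, error = {abs(q - formula_value):.2e}")
```

The printed error should shrink roughly in proportion to $t$, which is the $o(t)/t \to 0$ behaviour the derivation relies on.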

Formal View

Theorem 12.2 — Directional Derivative from Jacobian

If $f: D \subseteq \mathbb{R}^n \to \mathbb{R}$ is differentiable at $\mathbf{a}$, then for any unit vector $\mathbf{u} \in \mathbb{R}^n$,

$D_\mathbf{u} f(\mathbf{a}) = Df(\mathbf{a})\mathbf{u} = \nabla f(\mathbf{a}) \cdot \mathbf{u}$

This shows that once you know the gradient, you know all directional derivatives. The gradient encodes all first-order information about $f$.
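As a small worked instance of the theorem, consider the following example. The function, point, and direction are chosen here purely for illustration; $f$ is a polynomial, so it is differentiable everywhere and the formula applies.

```latex
f(x, y, z) = x^2 y + z, \qquad \mathbf{a} = (1, 2, 3), \qquad \mathbf{u} = \tfrac{1}{3}(1, 2, 2)

\nabla f(x, y, z) = (2xy,\; x^2,\; 1) \quad\Longrightarrow\quad \nabla f(\mathbf{a}) = (4, 1, 1)

D_\mathbf{u} f(\mathbf{a}) = \nabla f(\mathbf{a}) \cdot \mathbf{u} = \tfrac{1}{3}(4 \cdot 1 + 1 \cdot 2 + 1 \cdot 2) = \tfrac{8}{3}
```

Note that $\|\mathbf{u}\| = \tfrac{1}{3}\sqrt{1 + 4 + 4} = 1$, so $\mathbf{u}$ is already a unit vector.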

Why This Matters

This formula reduces computing directional derivatives to computing the gradient once and taking dot products.

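Efficient computation: compute $\nabla f(\mathbf{a})$ once, then each directional derivative costs only one dot product
Understanding why gradient descent works: among unit vectors, $D_\mathbf{u} f$ is largest when $\mathbf{u}$ points along $\nabla f$, so $-\nabla f$ is the direction of steepest decrease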
Basis for Lagrange multiplier methods and constrained optimization
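The first two points can be sketched in code. This is a minimal illustration under assumed inputs: the function behind `grad_f`, the point `a`, and the random directions are hypothetical. The gradient is evaluated once, reused for many directions, and compared against the direction of the gradient itself.

```python
import numpy as np

def grad_f(p):
    # Gradient of the hypothetical example f(x, y, z) = x*y + z**2
    x, y, z = p
    return np.array([y, x, 2 * z])

a = np.array([1.0, 2.0, 3.0])   # assumed base point
g = grad_f(a)                   # gradient evaluated once: (2, 1, 6)

# 1000 random unit vectors, one per row
rng = np.random.default_rng(0)
directions = rng.normal(size=(1000, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# Every directional derivative is one dot product with the stored gradient
directional_derivatives = directions @ g

# The unit vector along the gradient attains the largest value, ||grad f(a)||
steepest = g / np.linalg.norm(g)
max_value = float(g @ steepest)

print("largest sampled value:", directional_derivatives.max())
print("value along gradient: ", max_value)
```

No sampled direction should exceed the value along the gradient, and the direction `-steepest` gives the most negative directional derivative, which is the step direction gradient descent uses.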

Learning Resources

Gradient Formula for Directional Derivatives

Khan Academy

Deriving the gradient formula for directional derivatives.

11 min

Directional Derivatives Using the Gradient

MIT OpenCourseWare

Complete MIT lecture on directional derivatives and the gradient.

40 min

Quiz

Question 1

If $\nabla f(\mathbf{a}) = (4, -1, 2)$ and $\mathbf{u} = (0, 1, 0)$, what is $D_\mathbf{u} f(\mathbf{a})$?

Question 2

True or false: if $f$ has all partial derivatives at $\mathbf{a}$, then the directional derivative formula $D_\mathbf{u} f(\mathbf{a}) = \nabla f(\mathbf{a}) \cdot \mathbf{u}$ holds.

Common Mistakes

Applying the formula $D_\mathbf{u} f = \nabla f \cdot \mathbf{u}$ without verifying differentiability.
Confusing $D_\mathbf{u} f$ (scalar) with $Df$ (matrix/row vector).
Forgetting to normalize $\mathbf{u}$ before applying the formula.
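The first mistake can be made concrete with a standard counterexample, which is not taken from the lesson: $f(x, y) = x^2 y / (x^2 + y^2)$ with $f(0, 0) = 0$. This function has a directional derivative in every direction at the origin but is not differentiable there. The sketch below, with the hypothetical helper `directional_quotient`, compares the gradient formula with the definition.

```python
import math

def f(x, y):
    # Counterexample: f(x, y) = x^2 y / (x^2 + y^2), with f(0, 0) = 0
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x * x + y * y)

def directional_quotient(u, t=1e-6):
    # Difference quotient (f(t u) - f(0)) / t at the origin for a small t
    return (f(t * u[0], t * u[1]) - f(0.0, 0.0)) / t

# f vanishes on both axes, so both partial derivatives at the origin are 0
grad_at_origin = (
    directional_quotient((1.0, 0.0)),
    directional_quotient((0.0, 1.0)),
)

u = (1 / math.sqrt(2), 1 / math.sqrt(2))   # unit vector along the diagonal

formula_value = grad_at_origin[0] * u[0] + grad_at_origin[1] * u[1]
actual_value = directional_quotient(u)

print("gradient formula:", formula_value)
print("from definition: ", actual_value)
```

The formula gives 0, while the definition gives about 0.354 (exactly $1/(2\sqrt{2})$), so the formula cannot be trusted where differentiability fails.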