
Detailed Justification

Let's trace through the justification of the chain rule more carefully in the case of a single underlying variable $t$. With $h(t) = g(\mathbf{f}(t))$, we want to show $h'(t) = \nabla g(\mathbf{f}(t)) \cdot \mathbf{f}'(t)$.

By definition: $h'(t) = \lim_{s\to 0}\frac{h(t+s)-h(t)}{s} = \lim_{s\to 0}\frac{g(\mathbf{f}(t+s))-g(\mathbf{f}(t))}{s}$.

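Write $\mathbf{f}(t+s) = \mathbf{f}(t) + s\mathbf{f}'(t) + \mathbf{e}(s)$, where $\mathbf{e}(s) = o(s)$, that is, $\mathbf{e}(s)/s \to \mathbf{0}$ as $s \to 0$. Then, to first order, $g(\mathbf{f}(t+s)) = g(\mathbf{f}(t) + s\mathbf{f}'(t) + \mathbf{e}(s)) \approx g(\mathbf{f}(t)) + \nabla g(\mathbf{f}(t)) \cdot (s\mathbf{f}'(t) + \mathbf{e}(s))$. Subtracting $g(\mathbf{f}(t))$ and dividing by $s$ gives $\nabla g(\mathbf{f}(t)) \cdot (\mathbf{f}'(t) + \mathbf{e}(s)/s)$. Since $\mathbf{e}(s)/s \to \mathbf{0}$, taking the limit yields $h'(t) = \nabla g(\mathbf{f}(t)) \cdot \mathbf{f}'(t)$. This is only a sketch: the $\approx$ hides an error term that must itself be shown to be $o(s)$.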
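The limit above can be checked numerically. The following is a minimal sketch, assuming a particular curve `f` and function `g` chosen purely for illustration (they are not from the text): it compares the difference quotient $\frac{h(t+s)-h(t)}{s}$ with the value $\nabla g(\mathbf{f}(t)) \cdot \mathbf{f}'(t)$ for shrinking $s$.

```python
import math

# Hypothetical example functions, chosen only for illustration.
def f(t):
    """Curve f: R -> R^2."""
    return (math.cos(t), t * t)

def f_prime(t):
    """Derivative of f, computed by hand."""
    return (-math.sin(t), 2.0 * t)

def g(x, y):
    """Scalar function g: R^2 -> R."""
    return x * x * y + math.sin(y)

def grad_g(x, y):
    """Gradient of g, computed by hand."""
    return (2.0 * x * y, x * x + math.cos(y))

def h(t):
    """Composite h(t) = g(f(t))."""
    return g(*f(t))

def chain_rule_derivative(t):
    """Right-hand side of the chain rule: grad g(f(t)) . f'(t)."""
    gx, gy = grad_g(*f(t))
    fx, fy = f_prime(t)
    return gx * fx + gy * fy

def difference_quotient(t, s):
    """Left-hand side before the limit: (h(t+s) - h(t)) / s."""
    return (h(t + s) - h(t)) / s

t0 = 0.7
exact = chain_rule_derivative(t0)
for s in (1e-1, 1e-2, 1e-3, 1e-4):
    approx = difference_quotient(t0, s)
    print(f"s={s:.0e}  quotient={approx:.8f}  "
          f"chain rule={exact:.8f}  gap={abs(approx - exact):.2e}")
```

If the chain rule holds for this example, the gap column should shrink roughly in proportion to $s$.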

Formal View

Theorem 14.7 — Chain Rule — Detailed Proof (Scalar Case)

The approximation $h(t+s) - h(t) = \left(\nabla g(\mathbf{f}(t)) \cdot \mathbf{f}'(t)\right) s + o(s)$ follows from:

1. $\mathbf{f}(t+s) = \mathbf{f}(t) + s\mathbf{f}'(t) + o(s)$ (differentiability of $\mathbf{f}$)
2. $g(\mathbf{f}(t)+\boldsymbol{\delta}) = g(\mathbf{f}(t)) + \nabla g(\mathbf{f}(t))\cdot\boldsymbol{\delta} + o(\|\boldsymbol{\delta}\|)$ (differentiability of $g$)
3. $\|\boldsymbol{\delta}\| = O(|s|)$, so $o(\|\boldsymbol{\delta}\|) = o(|s|)$
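The three ingredients can be observed numerically. The sketch below is illustrative only: it assumes $\boldsymbol{\delta} = \mathbf{f}(t+s) - \mathbf{f}(t)$ and uses hypothetical example functions `f` and `g` that do not come from the text. For each $s$ it reports $\|\boldsymbol{\delta}\|/|s|$ (which should stay bounded), the remainder of $g$ divided by $\|\boldsymbol{\delta}\|$, and the total remainder divided by $|s|$ (both of which should tend to zero).

```python
import math

# Hypothetical example functions, chosen only for illustration.
def f(t):
    return (math.cos(t), t * t)

def f_prime(t):
    return (-math.sin(t), 2.0 * t)

def g(x, y):
    return x * x * y + math.sin(y)

def grad_g(x, y):
    return (2.0 * x * y, x * x + math.cos(y))

def error_terms(t, s):
    """Return (|delta|/|s|, |g-remainder|/|delta|, |total remainder|/|s|)."""
    x0, y0 = f(t)
    x1, y1 = f(t + s)
    dx, dy = x1 - x0, y1 - y0              # delta = f(t+s) - f(t)
    norm_delta = math.hypot(dx, dy)
    gx, gy = grad_g(x0, y0)
    fx, fy = f_prime(t)
    increment = g(x1, y1) - g(x0, y0)      # h(t+s) - h(t)
    # Step 2: remainder of the linear approximation of g at f(t).
    g_remainder = increment - (gx * dx + gy * dy)
    # Final claim: remainder of h(t+s) - h(t) after removing the linear term in s.
    total_remainder = increment - s * (gx * fx + gy * fy)
    return (norm_delta / abs(s),
            abs(g_remainder) / norm_delta,
            abs(total_remainder) / abs(s))

for s in (1e-1, 1e-2, 1e-3, 1e-4):
    ratio, g_err, total_err = error_terms(0.7, s)
    print(f"s={s:.0e}  |delta|/|s|={ratio:.4f}  "
          f"g-remainder/|delta|={g_err:.2e}  total/|s|={total_err:.2e}")
```

The first column staying near $\|\mathbf{f}'(t)\|$ is what step 3 relies on: it lets an error that is small relative to $\|\boldsymbol{\delta}\|$ be restated as small relative to $|s|$.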

Why This Matters

Working through the detailed proof builds confidence and reveals what conditions are truly necessary.

Seeing exactly where differentiability is used clarifies which hypotheses the chain rule requires.
It also lays the foundation for understanding when generalizations (e.g., to non-smooth functions) are possible.
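To see that differentiability of $g$ is genuinely needed, and that the mere existence of partial derivatives is not enough, consider a standard counterexample. It is an illustrative choice, not taken from the text: $g(x,y) = \frac{x^2 y}{x^2+y^2}$ with $g(0,0)=0$, composed with the curve $\mathbf{f}(t) = (t,t)$. Both partial derivatives of $g$ at the origin are $0$, so the chain rule formula would predict $h'(0) = 0$, yet $h(t) = t/2$ and so $h'(0) = 1/2$. A minimal numerical sketch:

```python
def g(x, y):
    """g has both partial derivatives at the origin but is not differentiable there."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x * x + y * y)

def partials_at_origin(step=1e-6):
    """Forward-difference estimates of the partial derivatives of g at (0, 0)."""
    gx = (g(step, 0.0) - g(0.0, 0.0)) / step
    gy = (g(0.0, step) - g(0.0, 0.0)) / step
    return gx, gy

def h(t):
    """Composite h(t) = g(f(t)) along the curve f(t) = (t, t)."""
    return g(t, t)

def h_prime_at_zero(step=1e-6):
    """Forward-difference estimate of h'(0)."""
    return (h(step) - h(0.0)) / step

gx, gy = partials_at_origin()
naive = gx * 1.0 + gy * 1.0   # "gradient" dotted with f'(0) = (1, 1)
print("chain rule formula predicts:", naive)
print("actual derivative h'(0):   ", h_prime_at_zero())
```

The mismatch arises because step 2 of the proof, the linear approximation of $g$ with an $o(\|\boldsymbol{\delta}\|)$ error, fails for this $g$ at the origin.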

Learning Resources

Chain Rule Proof Details

MIT OpenCourseWare

Full detailed proof of the chain rule with all error bounds.

48 min

Total Derivative and Chain Rule

Khan Academy

Step-by-step justification of the multivariate chain rule.

9 min

Quiz

Question 1

In the detailed justification, which step uses differentiability of $\mathbf{f}$ at $t$?

Common Mistakes

Skipping the step where $\|\boldsymbol{\delta}\| = O(|s|)$ is used to upgrade $o(\|\boldsymbol{\delta}\|)$ to $o(|s|)$.
Treating the proof sketch as a complete proof without bounding the error terms.