Matrix Form Justification
The matrix multiplication form of the chain rule can be understood through the LLA perspective: composing two affine maps gives another affine map.
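The LLA of $f$ near $a$ is $L_f(x) = f(a) + J_f(a)(x - a)$, an affine map in $x$. The LLA of $g$ near $b = f(a)$ is $L_g(y) = g(b) + J_g(b)(y - b)$.
Substituting $y = L_f(x)$: $L_g(L_f(x)) = g(b) + J_g(b)\,J_f(a)(x - a)$. The composed LLA has the linear part $J_g(b)\,J_f(a)$, which is the Jacobian of the composition $g \circ f$ at $a$.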
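A quick numerical check can make this concrete. The sketch below is illustrative only: the maps `f` and `g`, the point `a`, and the finite-difference helper `jacobian` are hypothetical choices, not part of the lesson. It compares a finite-difference Jacobian of the composition against the product of the two individual Jacobians.

```python
import math

def f(x):
    # Hypothetical inner map f: R^2 -> R^2 (chosen only for illustration)
    x1, x2 = x
    return [x1 * x2, x1 + math.sin(x2)]

def g(y):
    # Hypothetical outer map g: R^2 -> R^2 (chosen only for illustration)
    y1, y2 = y
    return [math.exp(y1), y1 * y2 * y2]

def jacobian(func, a, h=1e-6):
    # Central-difference approximation: column j is (func(a + h e_j) - func(a - h e_j)) / (2h)
    m = len(func(a))
    J = [[0.0] * len(a) for _ in range(m)]
    for j in range(len(a)):
        plus, minus = list(a), list(a)
        plus[j] += h
        minus[j] -= h
        fp, fm = func(plus), func(minus)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

def matmul(A, B):
    # Plain matrix product of two nested-list matrices
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

a = [0.5, 1.2]
b = f(a)  # the point where the outer Jacobian must be evaluated

J_composed = jacobian(lambda x: g(f(x)), a)          # Jacobian of g(f(x)) at a
J_product = matmul(jacobian(g, b), jacobian(f, a))   # outer Jacobian times inner Jacobian

# Largest entrywise difference; small up to finite-difference error
max_gap = max(abs(J_composed[i][j] - J_product[i][j])
              for i in range(2) for j in range(2))
print(max_gap)
```

Note that the outer Jacobian is evaluated at `b = f(a)`, not at `a`; evaluating it at the wrong point is an easy way to make this check fail.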
Formal View
Remark 14.3 — Why It's Matrix Multiplication
Composing affine maps is matrix multiplication: if $L_1(x) = Ax + c$ and $L_2(y) = By + d$, then $L_2(L_1(x)) = B(Ax + c) + d = BAx + (Bc + d)$. The linear part is $BA$, a matrix product. The chain rule is simply this fact applied to the LLAs.
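The remark can be checked by hand on small matrices. In the sketch below, the matrices `A`, `B` and offsets `c`, `d` are made-up values, and `affine`, `matvec`, and `matmul` are hypothetical helpers written for this example. It builds the two affine maps, composes them directly, and compares against the single affine map whose linear part is the matrix product.

```python
def matvec(M, v):
    # Matrix times column vector
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    # Plain matrix product of two nested-list matrices
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def affine(M, t):
    # Returns the affine map v -> M v + t
    return lambda v: [p + q for p, q in zip(matvec(M, v), t)]

# Made-up example data: L1 maps R^2 -> R^3, L2 maps R^3 -> R^2
A, c = [[1, 2], [0, 1], [3, -1]], [1, 0, 2]
B, d = [[2, 0, 1], [1, -1, 0]], [5, -3]

L1, L2 = affine(A, c), affine(B, d)

# Predicted composite: linear part B A, constant part B c + d
BA = matmul(B, A)
offset = [p + q for p, q in zip(matvec(B, c), d)]
composed = affine(BA, offset)

x = [4, -2]
lhs = L2(L1(x))    # apply L1, then L2
rhs = composed(x)  # apply the single composed map
print(BA, lhs, rhs)  # [[5, 3], [1, 1]] [23, 0] [23, 0]
```

The shapes also line up the way the formula requires: the 2×3 matrix of the second map multiplies the 3×2 matrix of the first, giving a 2×2 linear part.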
Interactive Visualization
[Interactive figure: Matrix Product, Column Perspective]
Why This Matters
Seeing the chain rule as matrix multiplication unifies calculus and linear algebra.
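- Deep learning: each layer contributes its Jacobian to the product, and backprop evaluates that product starting from the output layer, multiplying Jacobians back to front
- Automatic differentiation: forward and reverse mode are two ways of associating the same Jacobian product, right to left (starting at the input) or left to right (starting at the output)
- Numerical linear algebra: Krylov methods need only matrix-vector products, so the Jacobian of a composition can be applied one factor at a time without ever forming the full product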
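The two orderings can be sketched on a three-factor product. The matrices `J1`, `J2`, `J3` below are made-up stand-ins for layer Jacobians, and the helpers are hypothetical. Forward mode pushes an input direction `v` through the factors from the input side; reverse mode pulls an output weight `u` back from the output side. Neither forms the full product, yet both agree with it.

```python
def matvec(M, v):
    # Matrix times column vector: M v
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def vecmat(u, M):
    # Row vector times matrix: u^T M
    return [sum(u[i] * M[i][j] for i in range(len(M))) for j in range(len(M[0]))]

def matmul(A, B):
    # Plain matrix product of two nested-list matrices
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Made-up Jacobians for a chain R^2 -> R^3 -> R^2 -> R^1
J1 = [[1, 2], [3, 4], [0, 1]]   # 3x2, innermost function
J2 = [[1, 0, 2], [0, 1, 1]]     # 2x3
J3 = [[2, 1]]                   # 1x2, outermost function

full = matmul(J3, matmul(J2, J1))   # the full 1x2 Jacobian, formed only for comparison

# Forward mode: J3 (J2 (J1 v)), a Jacobian-vector product starting at the input
v = [1, -1]
forward = matvec(J3, matvec(J2, matvec(J1, v)))

# Reverse mode: ((u^T J3) J2) J1, a vector-Jacobian product starting at the output
u = [1]
reverse = vecmat(vecmat(vecmat(u, J3), J2), J1)

print(full, forward, reverse)  # [[5, 13]] [-8] [5, 13]
```

With a scalar output, one reverse pass recovers the whole gradient row, whereas forward mode would need one pass per input direction. That cost asymmetry is the usual reason reverse mode is preferred for training neural networks.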
Quiz
Question 1
Composing two linear maps $x \mapsto Ax$ and $y \mapsto By$ gives the linear map $x \mapsto BAx$. This is analogous to the chain rule with $J_{g \circ f}(a) = J_g(f(a))\,J_f(a)$ because:
Common Mistakes
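- Thinking the chain rule involves adding Jacobians. Composition always multiplies Jacobians; the only sums are those inside the matrix product itself.
- Confusing the order of multiplication when composing three or more functions. For $h \circ g \circ f$ the product is $J_h\,J_g\,J_f$, with the outermost function's Jacobian leftmost, and matrix products do not commute in general.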
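The ordering mistake is easy to see numerically. In this minimal sketch, `J_f`, `J_g`, and `J_h` are made-up square matrices standing in for the Jacobians of the innermost, middle, and outermost functions. They are square so that both orders are defined. Multiplying in the wrong order gives a different matrix.

```python
def matmul(A, B):
    # Plain matrix product of two nested-list matrices
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Made-up Jacobians, evaluated at the appropriate points along the chain
J_f = [[1, 1], [0, 1]]   # innermost function
J_g = [[1, 0], [2, 1]]   # middle function
J_h = [[0, 1], [1, 0]]   # outermost function

# Correct order: outermost Jacobian on the left
correct = matmul(J_h, matmul(J_g, J_f))

# Common mistake: factors written in the order the functions are applied
reversed_order = matmul(J_f, matmul(J_g, J_h))

print(correct)         # [[2, 3], [1, 1]]
print(reversed_order)  # [[1, 3], [1, 2]]
```

When the intermediate dimensions differ, the wrong order usually fails outright because the shapes no longer match, which makes a shape check a cheap safeguard.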