Multiple Outputs
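The general case, with a vector-valued outer function $f$ (multiple outputs), follows the same Jacobian multiplication formula. Each output component satisfies the chain rule, and stacking all outputs gives the Jacobian form.
Concretely, for $g: \mathbb{R}^n \to \mathbb{R}^m$ and $f: \mathbb{R}^m \to \mathbb{R}^p$ with $y = g(x)$: the $(i, j)$ entry of $J_{f \circ g}(x)$ is $\frac{\partial f_i}{\partial y_k} \frac{\partial g_k}{\partial x_j}$ summed over $k = 1, \dots, m$, which is exactly matrix multiplication, entry by entry.
This is the most general form of the chain rule. It subsumes all special cases: scalar input ($n = 1$, tangent vector), scalar output ($p = 1$, gradient), and everything in between.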
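The entry-by-entry sum can be checked numerically. The sketch below is illustrative only: the maps `g` (from $\mathbb{R}^2$ to $\mathbb{R}^3$) and `f` (from $\mathbb{R}^3$ to $\mathbb{R}^2$), the test point `x0`, and all helper names are hypothetical choices, not part of the lesson. It multiplies the two hand-derived Jacobians and compares the product against a finite-difference Jacobian of the composition.

```python
import math

def g(x):
    """Inner map g: R^2 -> R^3 (hypothetical example)."""
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x2 ** 2]

def f(y):
    """Outer map f: R^3 -> R^2 (hypothetical example)."""
    y1, y2, y3 = y
    return [y1 + y2 * y3, math.exp(y1) - y3]

def jac_g(x):
    """Analytic 3x2 Jacobian of g: rows are outputs, columns are inputs."""
    x1, x2 = x
    return [[x2, x1],
            [math.cos(x1), 0.0],
            [0.0, 2.0 * x2]]

def jac_f(y):
    """Analytic 2x3 Jacobian of f."""
    y1, y2, y3 = y
    return [[1.0, y3, y2],
            [math.exp(y1), 0.0, -1.0]]

def matmul(A, B):
    """Entry (i, j) is the sum over k of A[i][k] * B[k][j]."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def numeric_jacobian(func, x, eps=1e-6):
    """Central-difference Jacobian, used only as an independent check."""
    fx = func(x)
    J = [[0.0] * len(x) for _ in fx]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = func(xp), func(xm)
        for i in range(len(fx)):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J

x0 = [0.7, -1.3]
# Chain rule: evaluate J_f at g(x0), then multiply (2x3)(3x2) -> 2x2.
J_chain = matmul(jac_f(g(x0)), jac_g(x0))
J_numeric = numeric_jacobian(lambda x: f(g(x)), x0)
max_err = max(abs(J_chain[i][j] - J_numeric[i][j])
              for i in range(2) for j in range(2))
print(len(J_chain), len(J_chain[0]), max_err < 1e-6)
```

Note that `jac_f` is evaluated at `g(x0)`, not at `x0`. Evaluating the outer Jacobian at the wrong point is an easy slip that this kind of numerical check tends to expose.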
Formal View
Corollary 14.1 — Chain Rule with Multiple Inputs and Outputs
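For differentiable $g: \mathbb{R}^n \to \mathbb{R}^m$ and $f: \mathbb{R}^m \to \mathbb{R}^p$: $J_{f \circ g}(x) = J_f(g(x))\, J_g(x)$.
Dimensions: $(p \times n) = (p \times m)(m \times n)$. All special cases are obtained by substituting $n = 1$, $p = 1$, or both.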
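As a worked illustration of how the special cases fall out, assume the notation $g: \mathbb{R}^n \to \mathbb{R}^m$, $f: \mathbb{R}^m \to \mathbb{R}^p$, and $h = f \circ g$ (the symbol names are an assumption made for this sketch). Each line pairs a special case with its shape bookkeeping:

$$
\begin{aligned}
n = 1:&\quad h'(t) = J_f(g(t))\, g'(t), && (p \times 1) = (p \times m)(m \times 1) \\
p = 1:&\quad \nabla h(x)^\top = \nabla f(g(x))^\top J_g(x), && (1 \times n) = (1 \times m)(m \times n) \\
n = p = 1:&\quad h'(t) = \nabla f(g(t)) \cdot g'(t), && (1 \times 1) = (1 \times m)(m \times 1)
\end{aligned}
$$

With the convention that a Jacobian is (output dim) × (input dim), the gradient of a scalar-valued function appears here as a row vector, which is why the transposes show up in the second line.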
Why This Matters
The general chain rule applies to every layer of a neural network and every composed function in applied mathematics.
- Multi-output regression: chain rule for vector-valued loss functions
- Sensor fusion: combining multiple measurements through a composed model
- Robotics forward kinematics: composing transformations for each joint
Quiz
Question 1
For with and , what is the size of ?
Common Mistakes
- Getting confused by all the dimensions — always remember: Jacobian is (output dim) × (input dim).
- Forgetting that the chain rule formula applies row-by-row for each output component.
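Both mistakes can be guarded against with a small shape check. The following is a minimal sketch, assuming Jacobians are stored as nested lists; the helper names `shape` and `chain` and the numeric matrices are hypothetical, chosen only for illustration.

```python
def shape(J):
    """Return (rows, cols) = (output dim, input dim) of a nested-list Jacobian."""
    return (len(J), len(J[0]))

def chain(J_f, J_g):
    """Multiply J_f (p x m) by J_g (m x n), refusing mismatched inner dims."""
    p, m = shape(J_f)
    m2, n = shape(J_g)
    if m != m2:
        raise ValueError(f"inner dimensions differ: {p}x{m} times {m2}x{n}")
    # Row i of the product uses only row i of J_f: one row per output component.
    return [[sum(J_f[i][k] * J_g[k][j] for k in range(m)) for j in range(n)]
            for i in range(p)]

# Hypothetical numeric Jacobians: f maps R^3 -> R^2, g maps R^4 -> R^3.
J_f = [[1.0, 0.0, 2.0],
       [0.0, 3.0, 1.0]]            # 2 x 3
J_g = [[1.0, 2.0, 0.0, 0.0],
       [0.0, 1.0, 1.0, 0.0],
       [4.0, 0.0, 0.0, 1.0]]       # 3 x 4

J_h = chain(J_f, J_g)
print(shape(J_h))                  # (output dim of f, input dim of g)

# Row-by-row view: the first output component alone gives the first row.
row0 = chain([J_f[0]], J_g)[0]

# Multiplying in the wrong order is caught by the shape check.
try:
    chain(J_g, J_f)
    wrong_order_rejected = False
except ValueError:
    wrong_order_rejected = True
```

Here `row0` matches the first row of `J_h`, which is the row-by-row reading of the formula: each output component of the outer function contributes exactly one row of the composed Jacobian.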