
Multiple Outputs

The general case $\mathbf{h} = \mathbf{g}\circ\mathbf{f}$ with vector-valued $\mathbf{g}$ (multiple outputs) follows the same Jacobian multiplication formula. Each output component $h_i = g_i(\mathbf{f}(\mathbf{x}))$ satisfies the scalar-output chain rule, and stacking these results as rows, one per output, gives the Jacobian form.

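Concretely, for $\mathbf{f}: \mathbb{R}^n \to \mathbb{R}^k$ and $\mathbf{g}: \mathbb{R}^k \to \mathbb{R}^m$, write $\mathbf{u} = \mathbf{f}(\mathbf{x})$. The $(i,j)$ entry of $J\mathbf{h} \in \mathbb{R}^{m\times n}$ is $[J\mathbf{h}]_{ij} = \frac{\partial h_i}{\partial x_j} = \sum_{\ell=1}^{k} \frac{\partial g_i}{\partial u_\ell}\frac{\partial f_\ell}{\partial x_j} = \sum_{\ell=1}^{k} [J\mathbf{g}]_{i\ell}[J\mathbf{f}]_{\ell j}$, which is exactly the $(i,j)$ entry of the matrix product $J\mathbf{g}\,J\mathbf{f}$. Here the partial derivatives of $\mathbf{g}$ are evaluated at $\mathbf{u} = \mathbf{f}(\mathbf{x})$ and those of $\mathbf{f}$ at $\mathbf{x}$.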
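The entry-wise formula can be checked numerically. The sketch below is illustrative only: the maps `f` and `g`, the evaluation point `a`, and the helper names `jacobian_fd` and `matmul` are assumptions made for this example, and the Jacobians are approximated by central finite differences rather than computed exactly.

```python
import math

def f(x):
    # Example map f: R^2 -> R^3 (n = 2, k = 3); chosen arbitrarily.
    x1, x2 = x
    return [x1 * x2, math.sin(x1), x2 ** 2]

def g(u):
    # Example map g: R^3 -> R^2 (k = 3, m = 2); chosen arbitrarily.
    u1, u2, u3 = u
    return [u1 + u2 * u3, math.exp(u1) * u3]

def h(x):
    # The composition h = g o f: R^2 -> R^2.
    return g(f(x))

def jacobian_fd(func, point, eps=1e-6):
    """Approximate the Jacobian of func at point by central differences.

    The result is a list of rows with shape (output dim) x (input dim).
    """
    n_out = len(func(point))
    n_in = len(point)
    J = [[0.0] * n_in for _ in range(n_out)]
    for j in range(n_in):
        plus = list(point)
        minus = list(point)
        plus[j] += eps
        minus[j] -= eps
        f_plus, f_minus = func(plus), func(minus)
        for i in range(n_out):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return J

def matmul(A, B):
    # Entry (i, j) is the sum over l of A[i][l] * B[l][j].
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

a = [0.5, -1.2]
J_f = jacobian_fd(f, a)          # shape 3 x 2
J_g = jacobian_fd(g, f(a))       # shape 2 x 3, evaluated at f(a)
J_h_chain = matmul(J_g, J_f)     # shape 2 x 2, via the chain rule
J_h_direct = jacobian_fd(h, a)   # shape 2 x 2, differentiating h directly

# Largest entry-wise disagreement between the two computations.
max_err = max(abs(J_h_chain[i][j] - J_h_direct[i][j])
              for i in range(2) for j in range(2))
```

Note that `J_g` is evaluated at `f(a)`, not at `a`. With these example maps, `max_err` should be small, limited only by finite-difference error.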

This is the most general form of the rule. It subsumes all special cases: scalar input ($n=1$, where $J\mathbf{h}$ is a tangent vector), scalar output ($m=1$, where $J\mathbf{h}$ is the gradient written as a row), and everything in between.
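As a sketch of how the two special cases fall out of the general formula (the symbol $t$ for a scalar input and the gradient notation $\nabla$ are conventions assumed here):

$$
n = 1:\quad \mathbf{h}'(t) = J\mathbf{g}(\mathbf{f}(t))\,\mathbf{f}'(t), \qquad (m\times 1) = (m\times k)\cdot(k\times 1)
$$

$$
m = 1:\quad \nabla h(\mathbf{x})^{\top} = \nabla g(\mathbf{f}(\mathbf{x}))^{\top}\, J\mathbf{f}(\mathbf{x}), \qquad (1\times n) = (1\times k)\cdot(k\times n)
$$

In the first case the Jacobians of $\mathbf{f}$ and $\mathbf{h}$ are single columns. In the second, the Jacobians of $g$ and $h$ are single rows.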

Formal View

Corollary 14.1 — Chain Rule with Multiple Inputs and Outputs

For $\mathbf{h} = \mathbf{g}\circ\mathbf{f}$, $J\mathbf{h}(\mathbf{a}) = J\mathbf{g}(\mathbf{f}(\mathbf{a})) \cdot J\mathbf{f}(\mathbf{a})$. Dimensions: $(m\times n) = (m\times k) \cdot (k\times n)$. All special cases are obtained by substituting $m=1$, $n=1$, or both.
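A small worked example may make the dimensions concrete. The maps below are chosen purely for illustration: take $\mathbf{f}(x_1, x_2) = (x_1 x_2,\; x_1,\; x_2^2)$ and $\mathbf{g}(u_1, u_2, u_3) = (u_1 + u_3,\; u_2 u_3)$, so that the input has 2 components, the intermediate vector has 3, and the output has 2.

$$
J\mathbf{f}(\mathbf{x}) = \begin{pmatrix} x_2 & x_1 \\ 1 & 0 \\ 0 & 2x_2 \end{pmatrix},
\qquad
J\mathbf{g}(\mathbf{u}) = \begin{pmatrix} 1 & 0 & 1 \\ 0 & u_3 & u_2 \end{pmatrix}
$$

Evaluating $J\mathbf{g}$ at $\mathbf{u} = \mathbf{f}(\mathbf{x})$, that is $u_2 = x_1$ and $u_3 = x_2^2$, and multiplying gives

$$
J\mathbf{g}(\mathbf{f}(\mathbf{x}))\, J\mathbf{f}(\mathbf{x})
= \begin{pmatrix} 1 & 0 & 1 \\ 0 & x_2^2 & x_1 \end{pmatrix}
\begin{pmatrix} x_2 & x_1 \\ 1 & 0 \\ 0 & 2x_2 \end{pmatrix}
= \begin{pmatrix} x_2 & x_1 + 2x_2 \\ x_2^2 & 2 x_1 x_2 \end{pmatrix}
$$

This agrees with differentiating $\mathbf{h}(\mathbf{x}) = (x_1 x_2 + x_2^2,\; x_1 x_2^2)$ directly, and the shapes follow the pattern $(2\times 2) = (2\times 3)\cdot(3\times 2)$.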

Why This Matters

The general chain rule applies to any composition of differentiable maps, including the layer-by-layer structure of a neural network.

Multi-output regression: chain rule for vector-valued loss functions
Sensor fusion: combining multiple measurements through a composed model
Robotics forward kinematics: composing transformations for each joint

Learning Resources

General Chain Rule — Vector Case

Steve Brunton

Chain rule for vector-valued composed functions in full generality.

20 min

Jacobian Chain Rule

MIT OpenCourseWare

Full generality of the Jacobian chain rule.

48 min

Quiz

Question 1

For $\mathbf{h} = \mathbf{g}\circ\mathbf{f}$ with $\mathbf{f}: \mathbb{R}^5 \to \mathbb{R}^3$ and $\mathbf{g}: \mathbb{R}^3 \to \mathbb{R}^4$, what is the size of $J\mathbf{h}$?

Common Mistakes

Mixing up the dimensions: the Jacobian is always (output dim) × (input dim).
Forgetting that the chain rule applies row by row: row $i$ of $J\mathbf{h}$ is the chain rule for the output component $h_i$.
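One way to guard against the dimension mistake is to track Jacobian shapes explicitly before multiplying. The following is a minimal sketch; the helper names `jacobian_shape` and `chain_shape` and the example dimensions are hypothetical, introduced only for illustration.

```python
def jacobian_shape(input_dim, output_dim):
    """Shape of the Jacobian of a map R^input_dim -> R^output_dim."""
    return (output_dim, input_dim)  # rows = outputs, columns = inputs

def chain_shape(outer, inner):
    """Shape of the product J_outer * J_inner.

    Raises ValueError if the inner dimensions do not match.
    """
    m, k_outer = outer
    k_inner, n = inner
    if k_outer != k_inner:
        raise ValueError(f"inner dimensions differ: {k_outer} vs {k_inner}")
    return (m, n)

# Hypothetical example: f: R^4 -> R^2 and g: R^2 -> R^3.
J_f = jacobian_shape(4, 2)   # (2, 4)
J_g = jacobian_shape(2, 3)   # (3, 2)

# Correct order for h = g o f: Jacobian of g times Jacobian of f.
J_h = chain_shape(J_g, J_f)

# Reversed order: (2, 4) times (3, 2) has mismatched inner dimensions.
try:
    chain_shape(J_f, J_g)
    reversed_ok = True
except ValueError:
    reversed_ok = False
```

A shape check like this catches a reversed product only when the input and output dimensions of the composition differ. If they happen to be equal, the reversed product has a valid shape but is still the wrong matrix.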