Linear Algebra

One Underlying Variable

An important special case: $\mathbf{u} = \mathbf{f}(t) \in \mathbb{R}^k$ depends on a single scalar variable $t$, and $h = g(\mathbf{u}) = g(\mathbf{f}(t)) \in \mathbb{R}$ is a scalar output. This is the case $n=1$, $m=1$.

The Jacobian of $\mathbf{f}: \mathbb{R} \to \mathbb{R}^k$ is the column vector $\mathbf{f}'(t) = (f_1'(t), \ldots, f_k'(t))^T \in \mathbb{R}^{k\times 1}$. The Jacobian of $g: \mathbb{R}^k \to \mathbb{R}$ is the row vector $\nabla^T g = (\partial g/\partial u_1, \ldots, \partial g/\partial u_k) \in \mathbb{R}^{1\times k}$.

The chain rule gives $h'(t) = \nabla^T g(\mathbf{f}(t))\,\mathbf{f}'(t) = \nabla g(\mathbf{f}(t)) \cdot \mathbf{f}'(t)$. The first expression is the matrix product of a $1\times k$ row vector with a $k\times 1$ column vector; the second writes the same scalar as a dot product.
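The formula can be checked numerically. The sketch below is only an illustration: the curve `f` and the function `g` are hypothetical choices (not from the text) with $k = 3$, and the chain-rule value is compared against a central finite difference.

```python
import math

def f(t):
    """Hypothetical curve f: R -> R^3 (chosen for illustration)."""
    return (math.cos(t), math.sin(t), t)

def f_prime(t):
    """Componentwise derivative f'(t), i.e. the k x 1 Jacobian of f."""
    return (-math.sin(t), math.cos(t), 1.0)

def g(u):
    """Hypothetical scalar function g: R^3 -> R (chosen for illustration)."""
    u1, u2, u3 = u
    return u1 * u1 + u2 * u3

def grad_g(u):
    """Gradient of g; its transpose is the 1 x k Jacobian of g."""
    u1, u2, u3 = u
    return (2.0 * u1, u3, u2)

def h_prime_chain(t):
    """h'(t) as the dot product of grad g, evaluated at f(t), with f'(t)."""
    return sum(gi * fi for gi, fi in zip(grad_g(f(t)), f_prime(t)))

def h_prime_numeric(t, eps=1e-6):
    """Central finite-difference approximation of h'(t) for h = g(f(t))."""
    return (g(f(t + eps)) - g(f(t - eps))) / (2.0 * eps)

t0 = 0.7
print(h_prime_chain(t0), h_prime_numeric(t0))
```

The two printed values should agree to several decimal places; the small remaining gap comes from the finite-difference approximation, not from the chain rule.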

Formal View

Theorem 14.6 — Chain Rule: One Underlying Variable
If $\mathbf{f}: \mathbb{R} \to \mathbb{R}^k$ is differentiable at $t$ and $g: \mathbb{R}^k \to \mathbb{R}$ is differentiable at $\mathbf{f}(t)$, then $h(t) = g(\mathbf{f}(t))$ satisfies
$$h'(t) = \nabla g(\mathbf{f}(t)) \cdot \mathbf{f}'(t) = \sum_{i=1}^{k} \frac{\partial g}{\partial u_i}(\mathbf{f}(t))\, f_i'(t)$$

In Leibniz notation: $\frac{dh}{dt} = \sum_{i=1}^{k} \frac{\partial g}{\partial u_i}\frac{du_i}{dt}$.
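As a worked illustration of the theorem with $k = 2$, take the hypothetical choices $\mathbf{f}(t) = (t^2, \sin t)$ and $g(u_1, u_2) = u_1 u_2$ (these functions are assumptions made for the example, not part of the text). The chain rule and direct differentiation of $h(t) = t^2 \sin t$ give the same result:

```latex
\begin{align*}
\nabla g(u_1, u_2) &= (u_2,\; u_1), &
\mathbf{f}'(t) &= (2t,\; \cos t)^T, \\
h'(t) &= \nabla g(\mathbf{f}(t)) \cdot \mathbf{f}'(t)
       = (\sin t)(2t) + (t^2)(\cos t) \\
      &= 2t \sin t + t^2 \cos t
       = \frac{d}{dt}\left[ t^2 \sin t \right].
\end{align*}
```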

Why This Matters

The one-underlying-variable case is ubiquitous in physics, where a quantity depends on position and position in turn depends on time.

  • Rate of change of a scalar quantity along a trajectory: $\frac{d}{dt}[f(\boldsymbol{\gamma}(t))] = \nabla f(\boldsymbol{\gamma}(t)) \cdot \boldsymbol{\gamma}'(t)$
  • Hamiltonian mechanics (for $H$ with no explicit time dependence): $\dot{H} = \nabla_\mathbf{q} H \cdot \dot{\mathbf{q}} + \nabla_\mathbf{p} H \cdot \dot{\mathbf{p}}$
  • Neural ODE: continuous-depth neural networks parameterized by a single "depth" variable
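The Hamiltonian item can be made concrete with a small sketch. It assumes a specific system not named in the text: a one-dimensional harmonic oscillator with unit mass and frequency, $H(q, p) = (p^2 + q^2)/2$, on the trajectory $q(t) = \cos t$, $p(t) = -\sin t$. Along this trajectory the chain-rule sum vanishes, which is energy conservation.

```python
import math

def hamiltonian_rate(t):
    """dH/dt along q(t) = cos t, p(t) = -sin t for the assumed H = (p^2 + q^2) / 2."""
    q, p = math.cos(t), -math.sin(t)
    # Time derivatives of the trajectory (they satisfy Hamilton's equations
    # q_dot = dH/dp = p and p_dot = -dH/dq = -q for this H).
    q_dot, p_dot = -math.sin(t), -math.cos(t)
    # Partial derivatives of H evaluated at the current point (q, p).
    dH_dq, dH_dp = q, p
    # Chain rule: dH/dt = dH/dq * q_dot + dH/dp * p_dot.
    return dH_dq * q_dot + dH_dp * p_dot

for t in (0.0, 0.5, 1.0, 2.0):
    print(t, hamiltonian_rate(t))
```

Each printed rate should be zero up to floating-point rounding, since the two terms cancel at every time.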

Quiz

Question 1

If $h(t) = g(f_1(t), f_2(t))$, then $h'(t)$ equals:

Common Mistakes

  • Evaluating $\nabla g$ at $(t, t)$ instead of at $(f_1(t), f_2(t))$.
  • Treating $h'(t)$ as a vector. The output is scalar, so $h'(t)$ is a scalar.
  • Omitting the dot product: writing $\nabla g \cdot \mathbf{f}'$ as $\nabla g\,\mathbf{f}'$ without the dot, which causes dimension confusion (two $k\times 1$ columns cannot be multiplied as matrices).
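The first mistake is easy to see numerically. The sketch below uses hypothetical functions chosen for the example, $\mathbf{f}(t) = (t^2, 3t)$ and $g(u_1, u_2) = u_1 u_2$, so that $h(t) = 3t^3$ and $h'(t) = 9t^2$. It compares the gradient evaluated at $\mathbf{f}(t)$ with the gradient wrongly evaluated at $(t, t)$.

```python
def f(t):
    """Hypothetical curve f(t) = (t^2, 3t), chosen for illustration."""
    return (t * t, 3.0 * t)

def f_prime(t):
    """Derivative of the curve: f'(t) = (2t, 3)."""
    return (2.0 * t, 3.0)

def grad_g(u1, u2):
    """Gradient of the hypothetical g(u1, u2) = u1 * u2."""
    return (u2, u1)

def dot(a, b):
    """Dot product of two equal-length tuples."""
    return sum(x * y for x, y in zip(a, b))

t0 = 2.0
correct = dot(grad_g(*f(t0)), f_prime(t0))  # gradient evaluated at f(t0)
wrong = dot(grad_g(t0, t0), f_prime(t0))    # mistake: gradient evaluated at (t0, t0)
exact = 9.0 * t0 * t0                       # h'(t) = 9 t^2 from h(t) = 3 t^3

print(correct, wrong, exact)
```

Only the value computed with the gradient at $\mathbf{f}(t_0)$ matches the exact derivative.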