Linear Algebra

Two Underlying Variables

Another key special case: $\mathbf{u} = \mathbf{f}(x,y) \in \mathbb{R}^k$ depends on two scalar variables, and $h = g(\mathbf{u}) \in \mathbb{R}$ is a scalar output. Here $n=2$, $m=1$.

The Jacobian of $\mathbf{f}: \mathbb{R}^2 \to \mathbb{R}^k$ is a $k\times 2$ matrix. The chain rule gives the $1\times 2$ Jacobian of $h = g\circ\mathbf{f}$: $Dh = \nabla^T g \cdot J\mathbf{f} \in \mathbb{R}^{1\times 2}$.
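The matrix product above can be checked numerically. The following is a minimal sketch in plain Python, assuming a hypothetical inner map $\mathbf{f}(x,y) = (x+y,\ xy,\ x^2)$ and outer function $g(\mathbf{u}) = u_1 u_2 + u_3$ (so $k=3$); neither function comes from the text. The helper names are illustrative only. It multiplies the row vector $\nabla^T g$ by the $k\times 2$ Jacobian and compares the result with central finite differences.

```python
# Hypothetical example functions (assumptions, chosen for illustration):
# f(x, y) = (x + y, x*y, x**2) and g(u) = u1*u2 + u3, so k = 3.

def f(x, y):
    return [x + y, x * y, x ** 2]

def g(u):
    return u[0] * u[1] + u[2]

def jacobian_f(x, y):
    # k x 2 matrix: row i holds (d f_i / dx, d f_i / dy), derived by hand.
    return [
        [1.0, 1.0],
        [y, x],
        [2.0 * x, 0.0],
    ]

def grad_g(u):
    # Gradient of g with respect to (u1, u2, u3), derived by hand.
    return [u[1], u[0], 1.0]

def chain_rule_gradient(x, y):
    # Row vector (grad g)^T times the k x 2 Jacobian gives the 1 x 2 derivative Dh.
    u = f(x, y)  # partials of g are evaluated at u = f(x, y)
    gg = grad_g(u)
    J = jacobian_f(x, y)
    return [sum(gg[i] * J[i][j] for i in range(len(u))) for j in range(2)]

def finite_difference_gradient(x, y, eps=1e-6):
    # Central differences on h = g(f(x, y)), used only as a sanity check.
    def h(a, b):
        return g(f(a, b))
    dx = (h(x + eps, y) - h(x - eps, y)) / (2 * eps)
    dy = (h(x, y + eps) - h(x, y - eps)) / (2 * eps)
    return [dx, dy]

dh = chain_rule_gradient(1.5, -0.5)
fd = finite_difference_gradient(1.5, -0.5)
print("chain rule:        ", dh)
print("finite differences:", fd)
```

The two results should agree up to the finite-difference error, which is a quick way to catch a wrong entry in a hand-derived Jacobian.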

In coordinates, writing $i$ for the summation index to keep it distinct from the dimension $k$: $\frac{\partial h}{\partial x} = \sum_{i=1}^{k} \frac{\partial g}{\partial u_i}\frac{\partial f_i}{\partial x}$ and $\frac{\partial h}{\partial y} = \sum_{i=1}^{k} \frac{\partial g}{\partial u_i}\frac{\partial f_i}{\partial y}$. These are the two components of the gradient of $h$ with respect to $(x,y)$.
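As a small worked instance of the coordinate formulas, take the hypothetical choice $\mathbf{f}(x,y) = (x+y,\ xy)$ and $g(u_1,u_2) = u_1^2 + u_2$ (an illustration, not an example from the text). Each partial of $g$ is evaluated at $\mathbf{u} = \mathbf{f}(x,y)$:

```latex
\begin{aligned}
h(x,y) &= g(\mathbf{f}(x,y)) = (x+y)^2 + xy, \\
\frac{\partial h}{\partial x}
  &= \frac{\partial g}{\partial u_1}\frac{\partial f_1}{\partial x}
   + \frac{\partial g}{\partial u_2}\frac{\partial f_2}{\partial x}
   = 2(x+y)\cdot 1 + 1\cdot y = 2x + 3y, \\
\frac{\partial h}{\partial y}
  &= \frac{\partial g}{\partial u_1}\frac{\partial f_1}{\partial y}
   + \frac{\partial g}{\partial u_2}\frac{\partial f_2}{\partial y}
   = 2(x+y)\cdot 1 + 1\cdot x = 3x + 2y.
\end{aligned}
```

Differentiating $h(x,y) = (x+y)^2 + xy$ directly gives the same two partials, which confirms the computation.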

Formal View

Theorem 14.8 — Chain Rule: Two Underlying Variables
For $h(x,y) = g(\mathbf{f}(x,y))$ with $\mathbf{f}: \mathbb{R}^2 \to \mathbb{R}^k$ and $g: \mathbb{R}^k \to \mathbb{R}$:
$\frac{\partial h}{\partial x} = \sum_{i=1}^{k} \frac{\partial g}{\partial u_i}\frac{\partial f_i}{\partial x}, \quad \frac{\partial h}{\partial y} = \sum_{i=1}^{k} \frac{\partial g}{\partial u_i}\frac{\partial f_i}{\partial y}$
where the partial derivatives of $g$ are evaluated at $\mathbf{u} = \mathbf{f}(x,y)$.

Why This Matters

The two-underlying-variable case is the core of partial differentiation for composed functions.

  • Temperature in a medium: temperature depends on position, and position in turn depends on two parameters $(x,y)$, so $T(x,y)$ is a composition
  • Reparametrization: rewriting a function in different coordinates
  • Automatic differentiation with two parameters (e.g., bivariate polynomial fitting)
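The automatic-differentiation item can be illustrated with a forward-mode sketch. The code below is one possible minimal approach, not a reference implementation: the class `Dual2` and the polynomial `bivariate_poly` are hypothetical names introduced here. Each value carries its partial derivatives with respect to $x$ and $y$, and every arithmetic operation propagates both partials by the chain rule.

```python
class Dual2:
    """A value together with its partial derivatives with respect to x and y."""

    def __init__(self, val, dx=0.0, dy=0.0):
        self.val, self.dx, self.dy = val, dx, dy

    @staticmethod
    def _lift(other):
        # Plain numbers are constants: both partials are zero.
        return other if isinstance(other, Dual2) else Dual2(float(other))

    def __add__(self, other):
        o = Dual2._lift(other)
        return Dual2(self.val + o.val, self.dx + o.dx, self.dy + o.dy)

    __radd__ = __add__

    def __mul__(self, other):
        o = Dual2._lift(other)
        # Product rule, applied to each partial separately.
        return Dual2(
            self.val * o.val,
            self.dx * o.val + self.val * o.dx,
            self.dy * o.val + self.val * o.dy,
        )

    __rmul__ = __mul__


def bivariate_poly(x, y):
    # Hypothetical polynomial p(x, y) = 3x^2 y + 2xy + y^2 + 1.
    return 3 * x * x * y + 2 * x * y + y * y + 1


# Seed x with (dx, dy) = (1, 0) and y with (dx, dy) = (0, 1).
x = Dual2(2.0, 1.0, 0.0)
y = Dual2(-1.0, 0.0, 1.0)
p = bivariate_poly(x, y)
print(p.val, p.dx, p.dy)  # value, dp/dx, dp/dy at (2, -1)
```

A single evaluation yields both components of the gradient, at the cost of carrying two extra numbers per intermediate value. The result can be compared against the hand-derived partials $6xy + 2y$ and $3x^2 + 2x + 2y$.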

Quiz

Question 1

For $h(x,y) = g(x^2, xy, y^2)$, the partial derivative $\partial h/\partial x$ is:

Question 2

For $h(x,y) = g(f_1(x,y), f_2(x,y))$, the gradient $\nabla h$ is the $2\times 1$ column vector of partial derivatives $\partial h/\partial x$ and $\partial h/\partial y$.

Common Mistakes

  • Forgetting to compute both $\partial h/\partial x$ and $\partial h/\partial y$: the chain rule gives a gradient, not just one partial.
  • Misidentifying the partial derivatives of inner functions when they depend on both $x$ and $y$.