
Cutting a Wire

To make the backward pass precise, we need a formal way to say "how does the output depend on this specific internal wire?" The tool is the wire-cut circuit.

Pick any wire named $x$. The $x$-wire-cut circuit removes that wire and treats its value as a free external input to the rest of the circuit. The output now depends on $x$ (what you feed in) as well as the original inputs $\mathbf{t}$. We write its functional behavior as $f(x, \mathbf{t})$.

Now $\frac{\partial f(x, \mathbf{t})}{\partial x}(x_0, \mathbf{t}_0)$ is a well-defined partial: how does the circuit output change when we wiggle just the $x$ wire, holding everything else fixed?
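As a minimal sketch, assume a hypothetical two-input circuit $f(t_1, t_2) = (t_1 + t_2) \cdot t_2$ whose internal wire $x$ carries $t_1 + t_2$. The circuit and all function names here are illustrative assumptions, not taken from the text. Cutting the $x$ wire gives $f(x, \mathbf{t}) = x \cdot t_2$, and a central finite difference approximates the partial at the forward-pass values.

```python
def forward(t1, t2):
    """Original (uncut) circuit: returns the value on wire x and the output f."""
    x = t1 + t2   # internal wire x, produced by an adder
    f = x * t2    # output wire, produced by a multiplier
    return x, f


def wire_cut(x, t1, t2):
    """x-wire-cut circuit: x is a free input and the adder that fed it is bypassed.

    In this particular circuit t1 only reached the output through x,
    so the cut circuit no longer depends on t1 at all.
    """
    return x * t2


def partial_x(x0, t1, t2, h=1e-6):
    """Central finite-difference estimate of the partial of f(x, t) in x."""
    return (wire_cut(x0 + h, t1, t2) - wire_cut(x0 - h, t1, t2)) / (2 * h)


# Forward pass fixes the point (x0, t0) at which the partial is evaluated.
t1_0, t2_0 = 2.0, 3.0
x0, f0 = forward(t1_0, t2_0)

# Wiggle only the x wire; analytically the result is t2_0.
sensitivity = partial_x(x0, t1_0, t2_0)
```

Feeding the forward-pass value `x0` back into `wire_cut` reproduces the original output `f0`, which is why the partial is evaluated at $(x_0, \mathbf{t}_0)$.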

To seed the backward pass, note that for the output wire $f$ itself: $f(f, \mathbf{t}) = f$ (trivially), so $\frac{\partial f(f, \mathbf{t})}{\partial f} = 1$. This is the number we inject at the start.

Formal View

Definition 15.4 — Wire-Cut Circuit
For a wire $x$ in the circuit, the $x$-wire-cut circuit replaces that internal wire with a free input. Its functional behavior is $f(x, \mathbf{t})$. The partial $\frac{\partial f(x, \mathbf{t})}{\partial x}(x_0, \mathbf{t}_0)$ measures the circuit output's sensitivity to the $x$-wire at the forward-pass values.
Remark 15.3 — Seeding the Backward Pass
For the output wire $f$: $f(f, \mathbf{t}) = f$ (identity), so $\frac{\partial f(f, \mathbf{t})}{\partial f}(f_0, \mathbf{t}_0) = 1$. This seed value of 1 is placed on the output wire to start the backward pass.
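To see the seed in action, here is a hand-unrolled sketch of a backward pass on a hypothetical circuit $f = x \cdot t_2$ with internal wire $x = t_1 + t_2$. The circuit, the function name, and the variable names are assumptions made for illustration. Each `grad_*` value is meant to be the wire-cut partial for that wire, evaluated at the forward-pass values.

```python
def backward_pass(t1, t2):
    """Return the wire-cut partial for every wire of f = (t1 + t2) * t2."""
    # Forward pass: record the value on every wire.
    x = t1 + t2
    f = x * t2  # output value (not needed for the gradients of this circuit)

    # Seed: cutting the output wire leaves the identity map, so its partial is 1.
    grad_f = 1.0

    # Multiplier f = x * t2: local partials are t2 (in x) and x (in t2).
    grad_x = grad_f * t2
    grad_t2_via_mul = grad_f * x

    # Adder x = t1 + t2: both local partials are 1.
    grad_t1 = grad_x * 1.0
    grad_t2_via_add = grad_x * 1.0

    # t2 fans out to two gates, so its two contributions are summed.
    grad_t2 = grad_t2_via_mul + grad_t2_via_add
    return {"f": grad_f, "x": grad_x, "t1": grad_t1, "t2": grad_t2}


grads = backward_pass(2.0, 3.0)
```

Expanding $f = t_1 t_2 + t_2^2$ by hand gives $\partial f / \partial t_1 = t_2$ and $\partial f / \partial t_2 = t_1 + 2 t_2$, which the values in `grads` should match at $(2, 3)$.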

Why This Matters

Wire cutting gives the backward pass a rigorous foundation — it turns gradient propagation into precise partial derivative computation.

  • Foundation for formal proofs of backpropagation correctness
  • Basis for higher-order autodiff (computing Hessians via backprop through backprop)
  • Forward-mode vs. reverse-mode autodiff differ in which wires they "cut" and when
  • Sensitivity analysis in scientific computing and engineering design

Quiz

Question 1

What is the value of $\frac{\partial f(f, \mathbf{t})}{\partial f}(f_0, \mathbf{t}_0)$?

Question 2

In $f(x, \mathbf{t})$, what is $x$?

Question 3

True or false: $f(x, \mathbf{t})$ and $f(\mathbf{t})$ describe the same function.

Question 4

The goal of the backward pass is to compute, for each $t_i$:

Common Mistakes

  • Thinking wire-cutting physically breaks the circuit — it is a conceptual formalism for defining partial derivatives, not an actual modification.
  • Forgetting that the seed value 1 comes from $\frac{d}{df}[f] = 1$.