Linear Algebra

Differentiation Rules

Rather than computing every derivative from the limit definition, we use differentiation rules that package common patterns. The essential rules: constant $(c)' = 0$; power $(x^k)' = kx^{k-1}$; sum $(f + g)' = f' + g'$; product $(fg)' = f'g + fg'$; chain rule $(f \circ g)' = (f' \circ g) \cdot g'$.
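One way to build trust in these rules is to compare them against a numerical approximation of the limit definition. The sketch below is only an illustration: the functions `f` and `g`, the evaluation point `x0`, and the step size `h` are arbitrary choices made for this example, and the central difference is an approximation, not an exact derivative.

```python
import math

def central_difference(func, x, h=1e-5):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Illustrative functions chosen for this example (not taken from the text).
f = lambda x: x ** 3           # power rule gives f'(x) = 3x^2
g = lambda x: math.sin(x)      # special derivative gives g'(x) = cos(x)
df = lambda x: 3 * x ** 2
dg = lambda x: math.cos(x)

x0 = 1.3  # arbitrary evaluation point

# Sum rule: (f + g)' = f' + g'
sum_numeric = central_difference(lambda x: f(x) + g(x), x0)
sum_rule = df(x0) + dg(x0)

# Product rule: (fg)' = f'g + fg'
product_numeric = central_difference(lambda x: f(x) * g(x), x0)
product_rule = df(x0) * g(x0) + f(x0) * dg(x0)

print("sum rule:    ", sum_rule, "numeric:", sum_numeric)
print("product rule:", product_rule, "numeric:", product_numeric)
```

The rule-based values and the numerical estimates should agree to several decimal places; the small remaining gap comes from the finite step size and floating-point rounding.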

The chain rule is the most important: if $y = f(u)$ and $u = g(x)$, then $\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$. In Leibniz notation, the derivatives multiply as if they were fractions with the $du$ terms cancelling, which makes the rule easy to remember even though the symbols are not literal fractions. This rule will be central in Chapter 14 when we extend it to multivariate functions.
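As a worked instance of the Leibniz form, consider the function $y = (x^2 + 1)^3$, an example chosen here for illustration rather than one taken from the text, with inner function $u = x^2 + 1$:

$$
\begin{aligned}
y &= u^3, \qquad u = x^2 + 1 \\
\frac{dy}{du} &= 3u^2, \qquad \frac{du}{dx} = 2x \\
\frac{dy}{dx} &= \frac{dy}{du} \cdot \frac{du}{dx} = 3(x^2 + 1)^2 \cdot 2x = 6x(x^2 + 1)^2
\end{aligned}
$$

Note that $\frac{dy}{du}$ is written back in terms of $x$ by substituting $u = x^2 + 1$ before multiplying.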

Special derivatives: $\frac{d}{dx} e^x = e^x$, $\frac{d}{dx} \sin x = \cos x$, $\frac{d}{dx} \ln x = \frac{1}{x}$ (for $x > 0$).

Formal View

Theorem 10.1 — Differentiation Rules
For differentiable $f, g$:
  • Power: $(x^k)' = kx^{k-1}$
  • Sum: $(f + g)' = f' + g'$
  • Product: $(fg)' = f'g + fg'$
  • Quotient: $(f/g)' = (f'g - fg')/g^2$, wherever $g \neq 0$
  • Chain: $(f(g(x)))' = f'(g(x)) \cdot g'(x)$

The chain rule in Leibniz form: $\frac{d}{dx} f(g(x)) = \frac{df}{dg} \cdot \frac{dg}{dx}$.
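The quotient rule is the one rule in the theorem not listed among the essential rules earlier, so a short worked example may help. The function below is an illustrative choice, not one from the text; it takes $f(x) = \ln x$ and $g(x) = x$ on $x > 0$, so that $f'(x) = \frac{1}{x}$ and $g'(x) = 1$:

$$
\left( \frac{\ln x}{x} \right)' = \frac{f'g - fg'}{g^2} = \frac{\frac{1}{x} \cdot x - \ln x \cdot 1}{x^2} = \frac{1 - \ln x}{x^2}
$$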

Why This Matters

Differentiation rules make computing derivatives fast, which is essential for implementing gradient descent and backpropagation.

  • Backpropagation: the chain rule applied recursively to a composition of functions (the neural network).
  • Automatic differentiation: modern ML frameworks (PyTorch, JAX) implement these rules algorithmically.
  • Sensitivity analysis: product and chain rules let you trace how changes propagate through complex models.
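To make the idea of implementing the rules algorithmically concrete, here is a minimal sketch of forward-mode differentiation with dual numbers. It is a toy written for this section, not how PyTorch or JAX are actually implemented, and every name in it (`Dual`, `_lift`, `sin`, `exp`, `derivative`, and the test function `h`) is hypothetical. Each operation carries a value together with its derivative and applies the matching rule from this section.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to the input x."""
    value: float
    deriv: float

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (f + g)' = f' + g'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (fg)' = f'g + fg'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, whose derivative is 0."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def sin(u):
    # Chain rule: (sin u)' = cos(u) * u'
    return Dual(math.sin(u.value), math.cos(u.value) * u.deriv)


def exp(u):
    # Chain rule: (exp u)' = exp(u) * u'
    return Dual(math.exp(u.value), math.exp(u.value) * u.deriv)


def derivative(func, x):
    """Seed the input with dx/dx = 1 and read off func'(x)."""
    return func(Dual(x, 1.0)).deriv


# Hypothetical test function: h(x) = x * sin(x^2) + 3 * exp(2x)
h = lambda x: x * sin(x * x) + 3 * exp(2 * x)

x0 = 0.7
autodiff = derivative(h, x0)
# Derivative worked out by hand with the product and chain rules.
by_hand = math.sin(x0 ** 2) + 2 * x0 ** 2 * math.cos(x0 ** 2) + 6 * math.exp(2 * x0)
print(autodiff, by_hand)
```

The two printed numbers should match up to floating-point rounding. Backpropagation in real frameworks typically uses reverse mode rather than this forward mode, but the same local rules are applied at each operation.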

Quiz

Question 1

What is the derivative of $f(x) = x^5$?

Question 2

The chain rule says: if $y = f(g(x))$, then $\frac{dy}{dx} =$

Common Mistakes

  • Applying the power rule as $(x^k)' = x^{k-1}$ (forgetting the factor of $k$); it should be $kx^{k-1}$.
  • Chain rule error: computing $f'(x) \cdot g'(x)$ instead of $f'(g(x)) \cdot g'(x)$; the outer derivative must be evaluated at the inner function.
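The chain rule mistake is easy to detect numerically. The sketch below uses an illustrative composition, $\sin(x^2)$, and an arbitrary point `x0`, both chosen for this example; it compares the correct formula and the mistaken one against a central-difference estimate.

```python
import math

def central_difference(func, x, h=1e-5):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Illustrative composition y = f(g(x)) with f(u) = sin(u) and g(x) = x^2.
g = lambda x: x ** 2
dg = lambda x: 2 * x            # g'(x) = 2x
df = lambda u: math.cos(u)      # f'(u) = cos(u)

x0 = 1.5  # arbitrary evaluation point
numeric = central_difference(lambda x: math.sin(g(x)), x0)
correct = df(g(x0)) * dg(x0)    # f'(g(x)) * g'(x)
wrong = df(x0) * dg(x0)         # f'(x) * g'(x): outer derivative at the wrong point

print(f"numeric {numeric:.6f}  correct {correct:.6f}  wrong {wrong:.6f}")
```

The correct formula should track the numerical estimate closely, while the mistaken one lands on a clearly different value because $\cos$ is evaluated at $x$ instead of at $x^2$.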