Why We Need Derivatives
To minimize $f$, we need to know which direction is "downhill" at any point. This is exactly what the derivative tells us: the rate at which $f$ changes as we perturb the input. Without derivatives, we would need to evaluate $f$ at every possible input, which is impossible when the input ranges over a continuum.
For a one-variable function $f(x)$, the derivative $f'(x)$ gives the slope of the tangent line at $x$. If $f'(x) > 0$, moving right increases $f$; if $f'(x) < 0$, moving right decreases $f$. At a minimum, $f'(x) = 0$: the tangent is flat, and a small step in either direction gives no first-order improvement.
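The sign rule above can be checked numerically. The following is a minimal sketch, assuming an illustrative function $f(x) = (x - 2)^2$ and a central-difference approximation of the derivative; the helper name `numerical_derivative` and the step size `h` are choices made for this example, not part of the lesson.

```python
def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x). The step h is an assumed default."""
    return (f(x + h) - f(x - h)) / (2 * h)


def f(x):
    # Illustrative function (an assumption): its minimum is at x = 2.
    return (x - 2) ** 2


for x in (0.0, 2.0, 5.0):
    slope = numerical_derivative(f, x)
    if slope > 1e-6:
        advice = "moving right increases f, so step left"
    elif slope < -1e-6:
        advice = "moving right decreases f, so step right"
    else:
        advice = "flat tangent, candidate minimum"
    print(f"x = {x}: slope is about {slope:.4f} ({advice})")
```

The threshold `1e-6` only absorbs floating-point noise around a zero slope; it is not a general-purpose tolerance.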
In multiple variables, the gradient $\nabla f$ generalizes the derivative: it is the vector of partial derivatives, and it points in the direction of steepest ascent. The negative gradient $-\nabla f$ points downhill. This is the direction we follow in gradient descent.
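To make "follow the negative gradient" concrete, here is a small gradient descent sketch. It assumes an illustrative two-variable function $f(x, y) = (x - 1)^2 + 2(y + 3)^2$ with a hand-written gradient; the step size and iteration count are arbitrary choices for this example rather than recommended settings.

```python
def grad_f(x, y):
    # Gradient of the assumed example f(x, y) = (x - 1)**2 + 2 * (y + 3)**2.
    return 2 * (x - 1), 4 * (y + 3)


def gradient_descent(grad, start, step_size=0.1, num_steps=200):
    """Repeatedly step against the gradient, starting from `start`."""
    x, y = start
    for _ in range(num_steps):
        gx, gy = grad(x, y)
        # Move in the direction of the negative gradient (downhill).
        x -= step_size * gx
        y -= step_size * gy
    return x, y


x_min, y_min = gradient_descent(grad_f, start=(0.0, 0.0))
print(x_min, y_min)  # approaches the minimizer (1, -3)
```

With a step size that is too large the iterates can overshoot and diverge, so in practice the step size has to be tuned to the function.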
Formal View
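One standard way to state these ideas formally, assuming $f$ is differentiable, is sketched below. The symbols $h$ (perturbation size), $\mathbf{d}$ (a unit direction), and $\alpha$ (step size) are notation introduced for this sketch.

```latex
% Derivative of a one-variable function: the limiting slope.
f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}

% Gradient of a function of n variables: the vector of partial derivatives.
\nabla f(\mathbf{x}) =
  \left( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \right)

% First-order approximation along a unit direction d:
f(\mathbf{x} + h\,\mathbf{d}) \approx f(\mathbf{x}) + h\,\nabla f(\mathbf{x}) \cdot \mathbf{d}

% The dot product is most negative when d is opposite to the gradient,
% which motivates the gradient descent update:
\mathbf{x}_{k+1} = \mathbf{x}_k - \alpha\,\nabla f(\mathbf{x}_k)
```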
Why This Matters
Derivatives make optimization tractable — they tell you where to step without evaluating everywhere.
- Backpropagation in neural networks: compute derivatives of loss w.r.t. every parameter via the chain rule.
- Newton's method: uses the first derivative to find roots, and both first and second derivatives to find minima, typically converging faster than gradient descent near the solution.
- Sensitivity analysis: derivatives tell us how much the optimum changes if a constraint changes slightly.
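As an illustration of the Newton's method item above, the sketch below minimizes an assumed example function $f(x) = e^x - 2x$, whose minimizer is $x = \ln 2$. The derivatives are written by hand, and the function name `newton_minimize` and the iteration count are choices for this example only.

```python
import math


def f_prime(x):
    # First derivative of the assumed example f(x) = exp(x) - 2 * x.
    return math.exp(x) - 2


def f_double_prime(x):
    # Second derivative of the same example.
    return math.exp(x)


def newton_minimize(d1, d2, x0, num_steps=10):
    """Newton iteration on f'(x) = 0: x <- x - f'(x) / f''(x)."""
    x = x0
    for _ in range(num_steps):
        x -= d1(x) / d2(x)
    return x


x_star = newton_minimize(f_prime, f_double_prime, x0=0.0)
print(x_star, math.log(2))  # the two values should agree closely
```

Because the update divides by the second derivative, this sketch assumes $f''(x)$ stays away from zero along the iterates, which holds for this example but not in general.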
Quiz
If $f'(x) = 0$, then $x$ must be a minimum.
The negative gradient points in the direction of:
Common Mistakes
- Thinking $f'(x) = 0$ is sufficient for a minimum. It is only necessary: for example, $f(x) = x^3$ has $f'(0) = 0$ but no minimum at $0$. Check second-order conditions or function values to confirm.
- Confusing the derivative (a number, the slope) with the gradient (a vector, pointing uphill).
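The first mistake can be illustrated with the second-derivative test. The sketch below assumes three example functions, each with $f'(0) = 0$, and classifies the point $x = 0$ from the value of $f''(0)$; the function name `classify_stationary_point` and its tolerance are hypothetical choices for this example.

```python
def classify_stationary_point(second_derivative_value, tol=1e-12):
    """Second-derivative test at a point where f'(x) = 0."""
    if second_derivative_value > tol:
        return "local minimum"
    if second_derivative_value < -tol:
        return "local maximum"
    # f''(x) = 0: the test says nothing; inspect function values instead.
    return "inconclusive"


# Each assumed example has f'(0) = 0; the values are f''(0), computed by hand.
examples = {
    "x**2": 2.0,    # f''(x) = 2
    "-x**2": -2.0,  # f''(x) = -2
    "x**3": 0.0,    # f''(x) = 6x, which is 0 at x = 0
}

for name, d2 in examples.items():
    print(f"f(x) = {name}: x = 0 is {classify_stationary_point(d2)}")
```

For $f(x) = x^3$ the test is inconclusive, and comparing function values on either side of $0$ shows that it is neither a minimum nor a maximum.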