Optimization Problems
An optimization problem asks: find a vector $x$ that minimizes (or maximizes) a scalar-valued function $f(x)$. In machine learning, $f$ might be the training loss; in engineering, the potential energy; in finance, the portfolio risk. Because maximizing $f$ is the same as minimizing $-f$, we can always phrase the goal as finding the input that makes the output as small as possible.
Why is this hard? For a general function $f$, the minimizer could be anywhere, and nothing tells us in advance where to look. Calculus provides the tools to narrow the search: derivatives tell us which direction is "downhill," critical points are where the gradient is zero, and second-order conditions distinguish minima from maxima and saddle points.
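To make the role of second-order conditions concrete, here is a minimal sketch of the second-derivative test for a function of two variables. The example function $f(x, y) = x^3 - 3x + y^2$ and the helper names are assumptions chosen for illustration; they do not come from the text.

```python
# Illustrative function (an assumption): f(x, y) = x^3 - 3x + y^2.
def grad_example(x, y):
    """Gradient of f: (3x^2 - 3, 2y)."""
    return (3.0 * x**2 - 3.0, 2.0 * y)

def hessian_example(x, y):
    """Hessian of f: [[6x, 0], [0, 2]]."""
    return [[6.0 * x, 0.0], [0.0, 2.0]]

def classify_critical_point(hessian):
    """Second-derivative test for a symmetric 2x2 Hessian."""
    (a, b), (c, d) = hessian
    det = a * d - b * c
    if det < 0:
        return "saddle point"      # eigenvalues have opposite signs
    if det > 0:
        # Same-sign eigenvalues; the sign of the top-left entry decides which.
        return "local minimum" if a > 0 else "local maximum"
    return "inconclusive"          # the test gives no answer when det == 0

# The gradient vanishes at (1, 0) and (-1, 0), so both are critical points.
critical_points = [(1.0, 0.0), (-1.0, 0.0)]
labels = {}
for point in critical_points:
    assert grad_example(*point) == (0.0, 0.0)
    labels[point] = classify_critical_point(hessian_example(*point))
    print(point, labels[point])
```

Both points have zero gradient, yet only (1, 0) is a local minimum; (-1, 0) is a saddle point. The gradient alone cannot tell them apart, which is why the second-order information is needed.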
The iterative strategy called gradient descent starts from an initial guess and repeatedly takes a small step in the direction of steepest descent, which is the direction of the negative gradient. Understanding this algorithm deeply requires understanding derivatives, which are the subject of this and the next several chapters.
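The loop described above can be sketched in a few lines. This is a minimal illustration, not a production optimizer: the objective $f(x, y) = (x - 3)^2 + (y + 1)^2$, the fixed step size, and the fixed iteration count are all assumptions made for the example.

```python
def gradient_descent(grad, x0, step_size=0.1, num_steps=100):
    """Start at x0 and repeatedly step against the gradient."""
    x = list(x0)
    for _ in range(num_steps):
        g = grad(x)
        # Move each coordinate a small distance in the downhill direction.
        x = [xi - step_size * gi for xi, gi in zip(x, g)]
    return x

# Illustrative objective (an assumption): f(x, y) = (x - 3)^2 + (y + 1)^2,
# whose unique minimizer is (3, -1).
def grad_quadratic(x):
    return [2.0 * (x[0] - 3.0), 2.0 * (x[1] + 1.0)]

x_min = gradient_descent(grad_quadratic, x0=[0.0, 0.0])
print(x_min)  # close to [3.0, -1.0]
```

With this step size, each iteration shrinks the distance to the minimizer by a constant factor, so the iterates approach (3, -1) quickly. A step size that is too large can make the iterates diverge instead, so in practice it has to be tuned.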
Formal View
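One standard way to state the ideas of this section formally is sketched below. The notation is an assumption, since the section does not fix its own symbols: $x \in \mathbb{R}^n$ is the input, $f : \mathbb{R}^n \to \mathbb{R}$ is the objective, $x^\star$ is a minimizer, and $\alpha > 0$ is a step size.

```latex
% Global minimizer: no other input gives a smaller value.
x^\star \in \arg\min_{x \in \mathbb{R}^n} f(x)
\quad\Longleftrightarrow\quad
f(x^\star) \le f(x) \ \text{ for all } x \in \mathbb{R}^n

% Critical point: the gradient vanishes.
\nabla f(x^\star) = 0

% Gradient descent update, starting from an initial guess x_0.
x_{k+1} = x_k - \alpha \, \nabla f(x_k), \qquad k = 0, 1, 2, \dots
```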
Why This Matters
Optimization is the engine of machine learning, operations research, and engineering design.
- Neural network training: minimize the loss function over billions of parameters.
- Supply chain: minimize cost subject to delivery constraints.
- Antenna design: maximize signal strength subject to power constraints.
Quiz
A local minimum of $f$ is always a global minimum.
Gradient descent finds a minimum by:
Common Mistakes
- Confusing local and global minima: gradient descent typically finds local minima, which may not be global.
- Assuming every function has a global minimum: unbounded or oscillating functions may not attain their infimum. For example, $f(x) = x$ is unbounded below, so it has no minimum at all.
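The first mistake is easy to reproduce numerically. The sketch below runs one-dimensional gradient descent from two starting points on $f(x) = x^4 - 3x^2 + x$, a function chosen here purely for illustration because it has two local minima of different depth. The step size and iteration count are likewise assumptions.

```python
def f(x):
    """Illustrative objective (an assumption) with two local minima."""
    return x**4 - 3.0 * x**2 + x

def df(x):
    """Derivative of f."""
    return 4.0 * x**3 - 6.0 * x + 1.0

def descend(x0, step_size=0.01, num_steps=2000):
    """Plain 1-D gradient descent from the starting point x0."""
    x = x0
    for _ in range(num_steps):
        x -= step_size * df(x)
    return x

x_left = descend(-2.0)   # settles in the left valley
x_right = descend(2.0)   # settles in the right valley

print("start -2.0:", x_left, f(x_left))
print("start +2.0:", x_right, f(x_right))
```

Both runs stop at a point where the derivative is essentially zero, but the left valley is deeper. The run started at 2.0 ends in a local minimum that is not global, and nothing in the algorithm signals that a better point exists elsewhere.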