Strategies for Finding Global Minima
When a function may not be convex, finding the global minimum requires more sophisticated strategies. The main analytical approaches are: (1) find all critical points and compare their values, selecting the smallest; (2) check the boundary if the domain is bounded; (3) exploit problem structure (e.g., separability, periodicity) to reduce the search.
For practical problems, commonly used strategies include: random restarts (run gradient descent from many random starting points), simulated annealing (probabilistic jumps to escape local minima), evolutionary/genetic algorithms (population-based global search), and convex relaxation (replace the non-convex problem with a tractable convex approximation).
In machine learning, the loss landscape typically has many local minima of similar quality, so gradient descent often finds a good (if not globally optimal) solution. Stochastic gradient descent (using random subsets of data) adds noise that helps escape shallow local minima.
Formal View
Why This Matters
Choosing the right global optimization strategy is crucial for practical problem solving.
- Hyperparameter tuning in machine learning uses random search and Bayesian optimization
- Molecular dynamics simulations use simulated annealing to find ground states
- Operations research uses branch-and-bound for combinatorial optimization
Quiz
For a non-convex function with multiple local minima, what is the most rigorous analytical approach to finding the global minimum?
Stochastic gradient descent is guaranteed to find the global minimum of a non-convex function.
Common Mistakes
- Trusting the first local minimum found by gradient descent as the global minimum for non-convex problems.
- Forgetting to evaluate the function at the boundary when the domain is compact.
- Over-applying random restarts without considering whether the problem has structure that can be exploited.