Learning to Optimize
Original abstract
Algorithm design is a laborious process and often requires many iterations of ideation and validation. In this paper, we explore automating algorithm design and present a method to learn an optimization algorithm, which we believe to be the first method that can automatically discover a better algorithm. We approach this problem from a reinforcement learning perspective and represent any particular optimization algorithm as a policy. We learn an optimization algorithm using guided policy search and demonstrate that the resulting algorithm outperforms existing hand-engineered algorithms in terms of convergence speed and/or the final objective value.
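To make the policy view concrete, here is a minimal sketch of an optimizer whose update rule is a policy: the observation is built from the current and recent gradients, and the action is the parameter update. The feature choice, the linear form of `policy`, and names like `learned_optimizer` are illustrative assumptions rather than the paper's parameterization, and the weights `W` would be trained (the paper uses guided policy search) instead of hand-set as here.

```python
import numpy as np

def features(grad, prev_grads):
    # Observation for the policy: the current gradient concatenated
    # with up to the last three gradients, zero-padded when the
    # history is short. A simplified stand-in for the paper's state.
    hist = np.concatenate(prev_grads[-3:]) if prev_grads else np.zeros(0)
    hist = np.pad(hist, (0, 3 * grad.size - hist.size))
    return np.concatenate([grad, hist])

def policy(state, W):
    # Action: the parameter update. A fixed linear map stands in for
    # the neural network policy that guided policy search would train.
    return W @ state

def learned_optimizer(grad_f, x0, W, steps=100):
    x, prev_grads = x0.copy(), []
    for _ in range(steps):
        g = grad_f(x)
        x = x + policy(features(g, prev_grads), W)  # the policy picks the step
        prev_grads.append(g)
    return x

# Sanity check: a W that picks out -0.1 times the current gradient
# recovers plain gradient descent on f(x) = ||x||^2 / 2 (grad_f(x) = x).
n = 5
W = np.hstack([-0.1 * np.eye(n), np.zeros((n, 3 * n))])
x_final = learned_optimizer(lambda x: x, np.ones(n), W)
```

The sanity check is the point of the framing: hand-engineered optimizers such as gradient descent are particular policies in this class, so a search over policies can in principle find rules that outperform them.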
This paper has not been read by Pith yet.
Forward citations
Cited by 3 Pith papers
- Regret Equals Covariance: A Closed-Form Characterization for Stochastic Optimization. Expected regret equals the covariance between costs and optimal decisions for linear and quadratic stochastic programs, with explicit bounds on the residual (one reading of this identity is sketched after this list).
- Learning to Cut: Reinforcement Learning for Benders Decomposition. RLBD trains a neural policy with REINFORCE to select cuts adaptively in Benders decomposition, yielding faster convergence and better generalization than standard BD or SVM-based LearnBD on an EV charging problem (a rough sketch of the REINFORCE ingredient follows after this list).
- Learning to Test: Physics-Informed Representation for Dynamical Instability Detection. A physics-informed neural representation is learned from safe data to support distributional hypothesis testing for dynamical instability in stochastic DAE systems without repeated simulations.
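One way to read the regret-covariance claim, sketched here as a generic identity rather than the cited paper's exact statement (the regret definition, problem class, and residual bounds below are assumptions): for a linear program $\min_{x \in X} c^\top x$ with random cost $c$, let $x^*(c)$ be the scenario-wise optimizer and $\bar{x} = \arg\min_{x \in X} \mathbb{E}[c]^\top x$ the mean-value solution. Then

$$
\mathbb{E}\big[c^\top \bar{x} - c^\top x^*(c)\big]
= -\operatorname{tr}\operatorname{Cov}\big(c,\, x^*(c)\big)
\;+\; \mathbb{E}[c]^\top\big(\bar{x} - \mathbb{E}[x^*(c)]\big),
$$

using $\mathbb{E}[c^\top x^*(c)] = \mathbb{E}[c]^\top \mathbb{E}[x^*(c)] + \operatorname{tr}\operatorname{Cov}(c, x^*(c))$ and the fact that $\bar{x}$ is deterministic. The covariance term is typically nonnegative for a minimization (costlier components get less weight in $x^*$), and the second term, the residual, is nonpositive whenever $X$ is convex, since then $\mathbb{E}[x^*(c)] \in X$ and $\bar{x}$ minimizes the mean cost over $X$.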
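For the second summary, a rough sketch of the REINFORCE ingredient with a linear-softmax policy over hand-made cut features; RLBD's actual state, features, and network are not specified in the summary, and the names here (`select_cut`, `reinforce_step`) and feature choices are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def select_cut(cut_feats, theta):
    # cut_feats: (num_candidates, d) per-cut features, e.g. violation
    # magnitude and cut depth (illustrative choices, not RLBD's).
    probs = softmax(cut_feats @ theta)
    i = rng.choice(len(probs), p=probs)
    # Gradient of log pi(i | features) for a linear-softmax policy.
    glogp = cut_feats[i] - probs @ cut_feats
    return i, glogp

def reinforce_step(theta, glogps, reward, lr=0.01):
    # Vanilla REINFORCE: move theta along the summed grad-log-probs of
    # the sampled cuts, scaled by the episode reward (e.g. minus the
    # number of Benders iterations, so shorter runs score higher).
    return theta + lr * reward * np.sum(glogps, axis=0)
```

In a Benders loop one would call `select_cut` once per master iteration, collect the returned gradients, and after the run apply `reinforce_step` with reward equal to minus the iteration count, so cut-selection policies that converge faster are reinforced.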
Discussion (0)