Accelerating Quadratic Optimization with Reinforcement Learning

Bartolomeo Stellato; Francesco Borrelli; Goran Banjac; Ion Stoica; Jeffrey Ichnowski; Joseph E. Gonzalez; Ken Goldberg; Michael Luo; Paras Jain

arxiv: 2107.10847 · v1 · pith:H4QI4VLGnew · submitted 2021-07-22 · 💻 cs.LG · math.OC

Jeffrey Ichnowski, Paras Jain, Bartolomeo Stellato, Goran Banjac, Michael Luo, Francesco Borrelli, Joseph E. Gonzalez, Ion Stoica, Ken Goldberg

keywords rlqp · learning · problems · convergence · methods · optimization · policy · quadratic

First-order methods for quadratic optimization such as OSQP are widely used for large-scale machine learning and embedded optimal control, where many related problems must be rapidly solved. These methods face two persistent challenges: manual hyperparameter tuning and convergence time to high-accuracy solutions. To address these, we explore how Reinforcement Learning (RL) can learn a policy to tune parameters to accelerate convergence. In experiments with well-known QP benchmarks we find that our RL policy, RLQP, significantly outperforms state-of-the-art QP solvers by up to 3x. RLQP generalizes surprisingly well to previously unseen problems with varying dimension and structure from different applications, including the QPLIB, Netlib LP and Maros-Meszaros problems. Code for RLQP is available at https://github.com/berkeleyautomation/rlqp.
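To make the idea of a parameter-tuning policy concrete, here is a minimal sketch in plain Python of an ADMM loop for a small QP of the form minimize ½xᵀPx + qᵀx subject to l ≤ Ax ≤ u, with a callback that picks the step-size parameter rho from the current residuals. This is not the RLQP or OSQP implementation. The names `admm_qp`, `residual_balancing_policy`, `matvec`, and `solve_linear` are hypothetical, the policy is a hand-written residual-balancing heuristic standing in where a learned policy would sit, and the dense linear solve is only suitable for tiny problems.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Policy = Callable[[float, float, float], float]  # (rho, r_prim, r_dual) -> new rho


def matvec(M: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def solve_linear(M: Matrix, b: Vector) -> Vector:
    """Gaussian elimination with partial pivoting (tiny dense systems only)."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def residual_balancing_policy(rho: float, r_prim: float, r_dual: float) -> float:
    """Hand-written stand-in for a learned policy (assumption, not RLQP)."""
    if r_prim > 10.0 * r_dual:
        return min(rho * 2.0, 1e6)   # primal residual dominates: raise rho
    if r_dual > 10.0 * r_prim:
        return max(rho / 2.0, 1e-6)  # dual residual dominates: lower rho
    return rho


def admm_qp(P: Matrix, q: Vector, A: Matrix, l: Vector, u: Vector,
            policy: Policy, rho: float = 0.1, sigma: float = 1e-6,
            tol: float = 1e-6, max_iter: int = 5000,
            adapt_every: int = 10) -> Tuple[Vector, int]:
    """ADMM for: minimize 0.5 x'Px + q'x subject to l <= Ax <= u."""
    n, m = len(q), len(l)
    At = [[A[i][j] for i in range(m)] for j in range(n)]
    x, z, y = [0.0] * n, [0.0] * m, [0.0] * m
    for k in range(1, max_iter + 1):
        # x-update: (P + sigma*I + rho*A'A) x = sigma*x_prev - q + A'(rho*z - y)
        K = [[P[i][j] + (sigma if i == j else 0.0)
              + rho * sum(A[r][i] * A[r][j] for r in range(m))
              for j in range(n)] for i in range(n)]
        coupling = matvec(At, [rho * zi - yi for zi, yi in zip(z, y)])
        rhs = [sigma * xi - qi + ci for xi, qi, ci in zip(x, q, coupling)]
        x = solve_linear(K, rhs)
        Ax = matvec(A, x)
        # z-update: project Ax + y/rho onto the box [l, u]
        z = [min(max(axi + yi / rho, li), ui)
             for axi, yi, li, ui in zip(Ax, y, l, u)]
        # dual update (y is the unscaled multiplier, so no rescaling when rho changes)
        y = [yi + rho * (axi - zi) for yi, axi, zi in zip(y, Ax, z)]
        r_prim = max(abs(axi - zi) for axi, zi in zip(Ax, z))
        Px, Aty = matvec(P, x), matvec(At, y)
        r_dual = max(abs(a + b + c) for a, b, c in zip(Px, q, Aty))
        if r_prim < tol and r_dual < tol:
            return x, k
        if k % adapt_every == 0:
            rho = policy(rho, r_prim, r_dual)
    return x, max_iter


# Small illustrative problem (made up for this sketch).
P = [[4.0, 1.0], [1.0, 2.0]]
q = [1.0, 1.0]
A = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
l = [1.0, 0.0, 0.0]
u = [1.0, 0.7, 0.7]

x_opt, iters = admm_qp(P, q, A, l, u, residual_balancing_policy)
print(x_opt, iters)
```

Passing `lambda rho, r_prim, r_dual: rho` as the policy keeps rho fixed, which gives a baseline for comparing iteration counts against any adaptive rule plugged into the same slot.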

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

An AI-ready, Polarized Electron-Positron Collision Dataset
hep-ex 2026-05 unverdicted novelty 5.0

Release of an AI-ready dataset containing approximately 660,000 reconstructed polarized e+e- collision events at 91.2 GeV from the SLD experiment, translated from legacy formats with accompanying digitized documentation.