AlphaOPT: Formulating optimization programs with self-improving LLM experience library

Minwei Kong, Ao Qu, Xiaotong Guo, Wenbin Ouyang, Chonghe Jiang, Han Zheng, Yining Ma, Dingyi Zhuang, Yuhan Tang, Junyi Li, Shenhao Wang, Haris Koutsopoulos, Hai Wang, Cathy Wu, Jinhua Zhao · 2025 · cs.AI · arXiv 2510.18428

4 Pith papers cite this work. Polarity classification is still indexing.

abstract

Optimization modeling underlies critical decision-making across industries, yet remains difficult to automate: natural-language problem descriptions must be translated into precise mathematical formulations and executable solver code. Existing LLM-based approaches typically rely on brittle prompting or costly retraining, both of which offer limited generalization. Recent work suggests that large models can improve via experience reuse, but how to systematically acquire, refine, and reuse such experience in structurally constrained settings remains unclear. We present AlphaOPT, a self-improving experience library that enables LLMs to learn optimization modeling knowledge from limited supervision, including answer-only feedback without gold-standard programs, annotated reasoning traces, or parameter updates. AlphaOPT operates in a continual two-phase cycle: a Library Learning phase that extracts solver-verified, structured insights from failed attempts, and a Library Evolution phase that refines the applicability of stored insights based on aggregate evidence across tasks. This design allows the model to accumulate reusable modeling principles, improve transfer across problem instances, and maintain bounded library growth over time. Evaluated on multiple optimization benchmarks, AlphaOPT steadily improves as more training data become available (65% → 72% from 100 to 300 training items) and outperforms the strongest baseline by 9.1% and 8.2% on two out-of-distribution datasets. These results demonstrate that structured experience learning, grounded in solver feedback, provides a practical alternative to retraining for complex reasoning tasks requiring precise formulation and execution. All code and data are available at: https://github.com/Minw913/AlphaOPT.
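The two-phase cycle described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that); the class names, the keyword-based `triggers` field, and the rule of dropping a trigger once it has hurt more often than it has helped are all assumptions chosen to keep the example small.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Insight:
    """One stored modeling lesson (hypothetical schema, not from the paper)."""
    advice: str
    triggers: set                      # keywords under which the insight applies
    helped: Counter = field(default_factory=Counter)
    hurt: Counter = field(default_factory=Counter)


class ExperienceLibrary:
    def __init__(self):
        self.insights = []

    def learn(self, advice, triggers, solver_verified):
        """Library Learning: keep an insight from a failed attempt only if a
        solver check (represented here by a boolean) confirmed it."""
        if not solver_verified:
            return None
        insight = Insight(advice, set(triggers))
        self.insights.append(insight)
        return insight

    def retrieve(self, description):
        """Return insights whose triggers overlap the task description."""
        words = set(description.lower().split())
        return [i for i in self.insights if i.triggers & words]

    def record(self, insight, description, answer_correct):
        """Log answer-only feedback against each trigger that matched."""
        words = set(description.lower().split())
        tally = insight.helped if answer_correct else insight.hurt
        for trigger in insight.triggers & words:
            tally[trigger] += 1

    def evolve(self):
        """Library Evolution: narrow applicability using aggregate evidence,
        then drop insights left with no triggers so the library stays bounded."""
        for insight in self.insights:
            insight.triggers = {
                t for t in insight.triggers
                if insight.helped[t] >= insight.hurt[t]
            }
        self.insights = [i for i in self.insights if i.triggers]


# Usage: one insight, one helpful use, one harmful use, then refinement.
lib = ExperienceLibrary()
ins = lib.learn("Declare vehicle counts as integer variables.",
                {"trucks", "budget"}, solver_verified=True)
lib.record(ins, "how many trucks to dispatch", answer_correct=True)
lib.record(ins, "allocate budget across ads", answer_correct=False)
lib.evolve()
```

After `evolve()`, the insight is still retrieved for descriptions mentioning "trucks" but no longer for ones mentioning "budget". A real system would presumably use richer applicability conditions than keyword overlap; the sketch only shows how answer-level feedback alone can drive both phases.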

citation-role summary

background 1 baseline 1

citation-polarity summary

background 1 baseline 1

representative citing papers

FrontierOR: Benchmarking LLMs' Capacity for Efficient Algorithm Design in Large-Scale Optimization

cs.AI · 2026-05-24 · unverdicted · novelty 7.0

FrontierOR benchmark shows frontier LLMs outperform Gurobi on solution quality and efficiency in only 31% of one-shot cases and 50% with test-time evolution on hard large-scale optimization tasks.

Automated Reformulation of Robust Optimization via Memory-Augmented Large Language Models

cs.AI · 2026-05-12 · unverdicted · novelty 6.0

AutoREM augments LLMs with a structured memory of failed reformulation trajectories to improve accuracy and efficiency on robust optimization tasks without parameter updates or expert knowledge.

From Soliloquy to Agora: Memory-Enhanced LLM Agents with Decentralized Debate for Optimization Modeling

math.OC · 2026-04-28 · unverdicted · novelty 6.0

Agora-Opt uses decentralized debate among LLM agent teams plus a read-write memory bank to produce more accurate optimization models from text than prior LLM methods.

Vehicle Routing Problem Meets Large Language Models: An Overview and Perspectives

math.OC · 2026-07-01 · unverdicted · novelty 4.0

Survey organizing LLM uses for VRP into modeler, designer, and coordinator roles, covering variants, solvers, benchmarks, and two experiments.

citing papers explorer

Showing 4 of 4 citing papers.

FrontierOR: Benchmarking LLMs' Capacity for Efficient Algorithm Design in Large-Scale Optimization

cs.AI · 2026-05-24 · unverdicted · none · ref 11 · internal anchor

Automated Reformulation of Robust Optimization via Memory-Augmented Large Language Models

cs.AI · 2026-05-12 · unverdicted · none · ref 26 · internal anchor

From Soliloquy to Agora: Memory-Enhanced LLM Agents with Decentralized Debate for Optimization Modeling

math.OC · 2026-04-28 · unverdicted · none · ref 8 · internal anchor

Vehicle Routing Problem Meets Large Language Models: An Overview and Perspectives

math.OC · 2026-07-01 · unverdicted · none · ref 39 · internal anchor
