FrontierOR benchmark shows frontier LLMs outperform Gurobi on solution quality and efficiency in only 31% of one-shot cases and 50% with test-time evolution on hard large-scale optimization tasks.
AlphaOPT: Formulating optimization programs with self-improving LLM experience library
4 Pith papers cite this work. Polarity classification is still indexing.
abstract
Optimization modeling underlies critical decision-making across industries, yet remains difficult to automate: natural-language problem descriptions must be translated into precise mathematical formulations and executable solver code. Existing LLM-based approaches typically rely on brittle prompting or costly retraining, both of which offer limited generalization. Recent work suggests that large models can improve via experience reuse, but how to systematically acquire, refine, and reuse such experience in structurally constrained settings remains unclear. We present \textbf{AlphaOPT}, a self-improving experience library that enables LLMs to learn optimization modeling knowledge from limited supervision, including answer-only feedback without gold-standard programs, annotated reasoning traces, or parameter updates. AlphaOPT operates in a continual two-phase cycle: a \emph{Library Learning} phase that extracts solver-verified, structured insights from failed attempts, and a \emph{Library Evolution} phase that refines the applicability of stored insights based on aggregate evidence across tasks. This design allows the model to accumulate reusable modeling principles, improve transfer across problem instances, and maintain bounded library growth over time. Evaluated on multiple optimization benchmarks, AlphaOPT steadily improves as more training data become available (65\% $\rightarrow$ 72\% from 100 to 300 training items) and outperforms the strongest baseline by 9.1\% and 8.2\% on two out-of-distribution datasets. These results demonstrate that structured experience learning, grounded in solver feedback, provides a practical alternative to retraining for complex reasoning tasks requiring precise formulation and execution. All code and data are available at: https://github.com/Minw913/AlphaOPT.
citation-role summary
citation-polarity summary
years
2026 4verdicts
UNVERDICTED 4representative citing papers
AutoREM augments LLMs with a structured memory of failed reformulation trajectories to improve accuracy and efficiency on robust optimization tasks without parameter updates or expert knowledge.
Agora-Opt uses decentralized debate among LLM agent teams plus a read-write memory bank to produce more accurate optimization models from text than prior LLM methods.
Survey organizing LLM uses for VRP into modeler, designer, and coordinator roles, covering variants, solvers, benchmarks, and two experiments.
citing papers explorer
-
FrontierOR: Benchmarking LLMs' Capacity for Efficient Algorithm Design in Large-Scale Optimization
FrontierOR benchmark shows frontier LLMs outperform Gurobi on solution quality and efficiency in only 31% of one-shot cases and 50% with test-time evolution on hard large-scale optimization tasks.
-
Automated Reformulation of Robust Optimization via Memory-Augmented Large Language Models
AutoREM augments LLMs with a structured memory of failed reformulation trajectories to improve accuracy and efficiency on robust optimization tasks without parameter updates or expert knowledge.
-
From Soliloquy to Agora: Memory-Enhanced LLM Agents with Decentralized Debate for Optimization Modeling
Agora-Opt uses decentralized debate among LLM agent teams plus a read-write memory bank to produce more accurate optimization models from text than prior LLM methods.
-
Vehicle Routing Problem Meets Large Language Models: An Overview and Perspectives
Survey organizing LLM uses for VRP into modeler, designer, and coordinator roles, covering variants, solvers, benchmarks, and two experiments.