AlphaOPT: Formulating Optimization Programs with Self-Improving LLM Experience Library

Ao Qu; Cathy Wu; Chonghe Jiang; Dingyi Zhuang; Hai Wang; Han Zheng; Haris Koutsopoulos; Jinhua Zhao; Junyi Li; Minwei Kong

arxiv: 2510.18428 · v4 · pith:52E6P2VYnew · submitted 2025-10-21 · 💻 cs.AI


Minwei Kong, Ao Qu, Xiaotong Guo, Wenbin Ouyang, Chonghe Jiang, Han Zheng, Yining Ma, Dingyi Zhuang, Yuhan Tang, Junyi Li, Shenhao Wang, Haris Koutsopoulos, Hai Wang, Cathy Wu, Jinhua Zhao



keywords: alphaopt, experience, library, optimization, across, modeling, available, code


Optimization modeling underlies critical decision-making across industries, yet remains difficult to automate: natural-language problem descriptions must be translated into precise mathematical formulations and executable solver code. Existing LLM-based approaches typically rely on brittle prompting or costly retraining, both of which offer limited generalization. Recent work suggests that large models can improve via experience reuse, but how to systematically acquire, refine, and reuse such experience in structurally constrained settings remains unclear. We present AlphaOPT, a self-improving experience library that enables LLMs to learn optimization modeling knowledge from limited supervision, including answer-only feedback without gold-standard programs, annotated reasoning traces, or parameter updates. AlphaOPT operates in a continual two-phase cycle: a Library Learning phase that extracts solver-verified, structured insights from failed attempts, and a Library Evolution phase that refines the applicability of stored insights based on aggregate evidence across tasks. This design allows the model to accumulate reusable modeling principles, improve transfer across problem instances, and maintain bounded library growth over time. Evaluated on multiple optimization benchmarks, AlphaOPT steadily improves as more training data become available (65% → 72% from 100 to 300 training items) and outperforms the strongest baseline by 9.1% and 8.2% on two out-of-distribution datasets. These results demonstrate that structured experience learning, grounded in solver feedback, provides a practical alternative to retraining for complex reasoning tasks requiring precise formulation and execution. All code and data are available at: https://github.com/Minw913/AlphaOPT.
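The two-phase cycle described in the abstract can be illustrated with a toy sketch. It is not the AlphaOPT implementation: the `Insight` schema, the task tags, the `min_trials` and `min_rate` thresholds, and all method names are hypothetical, and the LLM and solver calls are replaced by plain numbers and strings. It only shows the general shape of storing a lesson after a failed attempt (using answer-only feedback) and later narrowing where that lesson applies based on aggregate outcomes.

```python
from dataclasses import dataclass, field


@dataclass
class Insight:
    """One stored modeling lesson (hypothetical schema, not the paper's)."""
    text: str
    applies_to: set[str]  # task tags for which this insight is retrieved
    outcomes: dict[str, list[bool]] = field(default_factory=dict)  # tag -> success history


class ExperienceLibrary:
    def __init__(self) -> None:
        self.insights: list[Insight] = []

    def retrieve(self, tag: str) -> list[Insight]:
        """Return the insights currently considered applicable to a task tag."""
        return [i for i in self.insights if tag in i.applies_to]

    def learn(self, tag: str, predicted: float, reference: float,
              lesson: str, tol: float = 1e-6) -> bool:
        """Library Learning stand-in: keep a lesson only when the attempt failed.

        Only the reference optimal value is compared (answer-only feedback);
        no gold-standard program is needed. Returns True if the task was solved.
        """
        solved = abs(predicted - reference) <= tol
        if not solved:
            for ins in self.insights:
                if ins.text == lesson:       # known lesson: widen its applicability
                    ins.applies_to.add(tag)
                    break
            else:                            # new lesson: store it
                self.insights.append(Insight(lesson, {tag}))
        return solved

    def record(self, insight: Insight, tag: str, success: bool) -> None:
        """Log whether using an insight on a task with this tag led to success."""
        insight.outcomes.setdefault(tag, []).append(success)

    def evolve(self, min_trials: int = 2, min_rate: float = 0.5) -> None:
        """Library Evolution stand-in: refine applicability from aggregate evidence."""
        for ins in self.insights:
            for tag, history in ins.outcomes.items():
                if len(history) >= min_trials and sum(history) / len(history) < min_rate:
                    ins.applies_to.discard(tag)
        # Insights that no longer apply anywhere are dropped, which bounds growth.
        self.insights = [i for i in self.insights if i.applies_to]
```

In this toy version, a lesson learned on a "knapsack" task and again on a "scheduling" task would be retrieved for both until `evolve` sees that it repeatedly failed on one of them and removes that tag. The real system's insight extraction, solver verification, and refinement criteria are described in the paper and repository, not here.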


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Automated Reformulation of Robust Optimization via Memory-Augmented Large Language Models
cs.AI 2026-05 unverdicted novelty 6.0

AutoREM augments LLMs with a structured memory of failed reformulation trajectories to improve accuracy and efficiency on robust optimization tasks without parameter updates or expert knowledge.

From Soliloquy to Agora: Memory-Enhanced LLM Agents with Decentralized Debate for Optimization Modeling
math.OC 2026-04 unverdicted novelty 6.0

Agora-Opt uses decentralized debate among LLM agent teams plus a read-write memory bank to produce more accurate optimization models from text than prior LLM methods.