Mamo: A mathematical modeling benchmark with solvers

Xuhan Huang, Qingning Shen, Yan Hu, Anningzhe Gao, Benyou Wang · 2024 · arXiv 2405.13144

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · dataset: 1

citation-polarity summary

background: 1 · use dataset: 1

representative citing papers

ReLoop: Structured Modeling and Behavioral Verification for Reliable LLM-Based Optimization

cs.SE · 2026-02-17 · unverdicted · novelty 7.0

ReLoop closes the feasibility-correctness gap in LLM optimization code via structured generation and behavioral verification with parameter perturbations, reaching 100% executability and accuracy gains on benchmarks while releasing RetailOpt-190.

SciML Agents: Write the Solver, Not the Solution

cs.LG · 2025-09-12 · unverdicted · novelty 7.0

LLMs prompted with domain knowledge can generate runnable, numerically valid code for stiff and non-stiff ODEs on new diagnostic and 1000-task benchmarks.

PARM: Pipeline-Adapted Reward Model

cs.AI · 2026-04-20 · unverdicted · novelty 6.0

PARM adapts reward models to multi-stage LLM pipelines via pipeline data and direct preference optimization, improving execution rate and solving accuracy on optimization benchmarks and showing transfer to GSM8K.

AutoOR: Scalably Post-training LLMs to Autoformalize Operations Research Problems

cs.LG · 2026-04-18 · unverdicted · novelty 6.0

AutoOR uses synthetic data generation and RL post-training with solver feedback to enable 8B LLMs to autoformalize linear, mixed-integer, and non-linear OR problems, matching larger models on benchmarks.

Co-evolving Agent Architectures and Interpretable Reasoning for Automated Optimization

cs.AI · 2026-04-20

citing papers explorer

Showing 4 of 4 citing papers after filters.

ReLoop: Structured Modeling and Behavioral Verification for Reliable LLM-Based Optimization

cs.SE · 2026-02-17 · unverdicted · none · ref 19

SciML Agents: Write the Solver, Not the Solution

cs.LG · 2025-09-12 · unverdicted · none · ref 29

PARM: Pipeline-Adapted Reward Model

cs.AI · 2026-04-20 · unverdicted · none · ref 40

AutoOR: Scalably Post-training LLMs to Autoformalize Operations Research Problems

cs.LG · 2026-04-18 · unverdicted · none · ref 21
