Automated optimization modeling via a localizable error-driven perspective

Weiting Liu, Han Wu, Yufei Kuang, Xiongwei Han, Tao Zhong, Jianfeng Feng, Wenlian Lu · 2026 · arXiv 2602.11164

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

From $P(y|x)$ to $P(y)$: Investigating Reinforcement Learning in Pre-train Space

cs.LG · 2026-04-15 · unverdicted · novelty 6.0

PreRL applies reward-driven updates to P(y) in pre-train space, uses Negative Sample Reinforcement to prune bad reasoning paths and boost reflection, and combines with standard RL in Dual Space RL to outperform baselines on reasoning tasks.

citing papers explorer

Showing 1 of 1 citing paper.

From $P(y|x)$ to $P(y)$: Investigating Reinforcement Learning in Pre-train Space · cs.LG · 2026-04-15 · unverdicted · none · ref 34
PreRL applies reward-driven updates to P(y) in pre-train space, uses Negative Sample Reinforcement to prune bad reasoning paths and boost reflection, and combines with standard RL in Dual Space RL to outperform baselines on reasoning tasks.

