Improve mathematical reasoning in language models with automated process supervision, 2025 b

Liangchen Luo, Yinxiao Liu, Rosanne Liu, Samrat Phatale, Meiqi Guo, Harsh Lara, Yunxuan Li, Lei Shu, Lei Meng, Jiao Sun, Abhinav Rastogi · 2025

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

Process Supervision of Confidence Margin for Calibrated LLM Reasoning

cs.LG · 2026-04-25 · unverdicted · novelty 6.0

RLCM trains LLMs with a margin-enhanced process reward that widens the gap between correct and incorrect reasoning steps, improving calibration on math, code, logic, and science tasks without hurting accuracy.

citing papers explorer

Showing 1 of 1 citing paper.

Process Supervision of Confidence Margin for Calibrated LLM Reasoning cs.LG · 2026-04-25 · unverdicted · none · ref 50
RLCM trains LLMs with a margin-enhanced process reward that widens the gap between correct and incorrect reasoning steps, improving calibration on math, code, logic, and science tasks without hurting accuracy.

Improve mathematical reasoning in language models with automated process supervision, 2025 b

fields

years

verdicts

representative citing papers

citing papers explorer