Wong, Lidia S

Shudong Liu, Zhaocong Li, Xuebo Liu, Runzhe Zhan, Derek F · 2024 · DOI 10.18653/v1/2024.emnlp-main.1205

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Process Supervision of Confidence Margin for Calibrated LLM Reasoning

cs.LG · 2026-04-25 · unverdicted · novelty 6.0

RLCM trains LLMs with a margin-enhanced process reward that widens the gap between correct and incorrect reasoning steps, improving calibration on math, code, logic, and science tasks without hurting accuracy.

citing papers explorer

Process Supervision of Confidence Margin for Calibrated LLM Reasoning · cs.LG · 2026-04-25 · unverdicted · none · ref 47
