Addressing the Binning Problem in Calibration Assessment through Scalar Annotations

Jiang, Zhengping, Liu, Anqi, Durme, Benjamin Van · 2024 · DOI 10.1162/tacl_a_00636

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Process Supervision of Confidence Margin for Calibrated LLM Reasoning

cs.LG · 2026-04-25 · unverdicted · novelty 6.0

RLCM trains LLMs with a margin-enhanced process reward that widens the gap between correct and incorrect reasoning steps, improving calibration on math, code, logic, and science tasks without hurting accuracy.

Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate

cs.CL · 2023-05-30 · conditional · novelty 5.0

Multi-agent debate with tit-for-tat arguments and a judge LLM improves reasoning by preventing LLMs from locking into incorrect initial solutions.

citing papers explorer

Showing 2 of 2 citing papers.

Process Supervision of Confidence Margin for Calibrated LLM Reasoning · cs.LG · 2026-04-25 · unverdicted · none · ref 29
Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate · cs.CL · 2023-05-30 · conditional · none · ref 189
