CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought

Boxuan Zhang, Ruqi Zhang · 2025 · DOI 10.18653/v1/2025.findings-acl.1339

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

When Calibration Rankings Reverse: Accuracy-Controlled Evaluation for Fair Comparison of LLMs

cs.CL · 2026-06-29 · unverdicted · novelty 6.0

Global calibration metrics like ECE are confounded by accuracy; the proposed ACE framework with three accuracy-controlled views shows many prior calibration advantages weaken or reverse.

From 'May' to 'Is': Certainty Distortion in Language Model Rewriting

cs.CL · 2026-06-06 · unverdicted · novelty 6.0

LMs systematically inflate expressed certainty during rewriting, affecting up to 75% of outputs with a 1.5-2x bias toward increasing rather than decreasing certainty, and the effect compounds over iterations.

citing papers explorer

When Calibration Rankings Reverse: Accuracy-Controlled Evaluation for Fair Comparison of LLMs

cs.CL · 2026-06-29 · unverdicted · none · ref 92

From 'May' to 'Is': Certainty Distortion in Language Model Rewriting

cs.CL · 2026-06-06 · unverdicted · none · ref 28