SaySelf: Teaching LLMs to express confidence with self-reflective rationales. arXiv preprint arXiv:2405.20974
3 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (3)
Representative citing papers
C3RL is a new RL algorithm combining correctness, calibration, and reference accuracy rewards to improve LLM confidence calibration, enabling CAS to outperform majority voting with up to 12.33x lower inference cost.
When Calibration Rankings Reverse: Accuracy-Controlled Evaluation for Fair Comparison of LLMs
Global calibration metrics like ECE are confounded by accuracy; the proposed ACE framework with three accuracy-controlled views shows many prior calibration advantages weaken or reverse.