Quantifying uncertainty in answers from any language model and enhancing their trustworthiness

Association for Computational Linguistics · 2024 · DOI 10.18653/v1/2024.acl-long.283

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Can LLMs Use Linguistic Uncertainty Markers to Reliably Reflect Intrinsic Confidence?

cs.CL · 2026-05-27 · unverdicted · novelty 7.0

LLMs struggle to associate epistemic markers with stable internal confidence levels across distributions, even under model-centric interpretations, while maintaining somewhat consistent marker rankings.

Reinforcement Learning with Metacognitive Feedback Elicits Faithful Uncertainty Expression in LLMs

cs.CL · 2026-06-30 · unverdicted · novelty 6.0

RLMF uses the quality of model self-judgments to refine RL rankings and select training data, achieving state-of-the-art faithful calibration while preserving accuracy and outperforming standard RL by up to 63%.

Quantifying Faithful Confidence Expression in Large Reasoning Models

cs.CL · 2026-06-02 · unverdicted · novelty 6.0

A new framework quantifies faithful confidence expression in large reasoning models by comparing linguistic decisiveness to token probabilities, hidden states, and response consistency, revealing that faithful expression remains a persistent challenge.

Clustered Self-Assessment: A Simple yet Effective Method for Uncertainty Quantification in Large Language Models

cs.CL · 2026-06-02 · unverdicted · novelty 6.0

Clustered Self-Assessment groups sampled LLM responses into semantic clusters, presents clusters as multiple-choice options, and uses the LLM's assigned probabilities to those options as direct uncertainty estimates, outperforming entropy baselines with as few as two extra samples.

Aligning LLM Uncertainty with Human Disagreement in Subjectivity Analysis

cs.CL · 2026-05-11 · unverdicted · novelty 6.0 · 2 refs

DPUA is a two-phase framework that aligns LLM uncertainty expressions with human disagreement distributions in subjectivity analysis while preserving task performance.

Separate First, Fuse Later: Mitigating Cross-Modal Interference in Audio-Visual LLMs Reasoning with Modality-Specific Chain-of-Thought

cs.AI · 2026-05-11 · unverdicted · novelty 6.0

Separating modality-specific reasoning before fusion reduces hallucinations and improves accuracy in audio-visual LLMs by enforcing isolated reasoning traces and then integrating the evidence.
