Explaining Sources of Uncertainty in Automated Fact-Checking
Pith reviewed 2026-05-19 13:19 UTC · model grok-4.3
The pith
CLUE explains language model uncertainty in fact-checking by linking it to specific conflicts and agreements between text spans.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CLUE is the first framework to generate natural language explanations of model uncertainty. It first identifies, in an unsupervised way, relationships between spans of text that expose the claim-evidence or inter-evidence conflicts and agreements driving the model's predictive uncertainty. It then generates explanations via prompting and attention steering that verbalize these critical interactions. Across three language models and two fact-checking datasets, CLUE produces explanations that are more faithful to the model's uncertainty and more consistent with fact-checking decisions than prompting for uncertainty explanations without span-interaction guidance.
What carries the argument
CLUE framework, which identifies relationships between spans of text in an unsupervised manner to expose conflicts and agreements, then generates explanations via prompting and attention steering.
If this is right
- Explanations explicitly link uncertainty to evidence conflicts, supporting better human-AI collaboration in fact-checking.
- CLUE requires no fine-tuning or architectural changes and works as plug-and-play for any white-box language model.
- Human evaluators judge the explanations more helpful, informative, less redundant, and more logically consistent with the input.
- The approach generalizes readily to other tasks that require reasoning over complex information.
Where Pith is reading between the lines
- If span relationships reliably capture uncertainty sources, users could resolve disagreements with AI fact-checkers by examining the highlighted conflicts.
- The unsupervised span detection step could be adapted to other reasoning domains where conflicting information drives model doubt.
- Testing CLUE in live fact-checking workflows would show whether the explanations actually help people correct or trust model outputs.
Load-bearing premise
Identifying relationships between spans of text in an unsupervised manner accurately exposes the claim-evidence or inter-evidence conflicts and agreements that are the primary drivers of the model's predictive uncertainty.
What would settle it
An experiment in which CLUE explanations score no higher than baseline prompting on faithfulness to uncertainty or consistency with fact-checking decisions would falsify the central claim.
Original abstract
Understanding sources of a model's uncertainty regarding its predictions is crucial for effective human-AI collaboration. Prior work proposes using numerical uncertainty or hedges ("I'm not sure, but ..."), which do not explain uncertainty that arises from conflicting evidence, leaving users unable to resolve disagreements or rely on the output. We introduce CLUE (Conflict-and-Agreement-aware Language-model Uncertainty Explanations), the first framework to generate natural language explanations of model uncertainty by (i) identifying relationships between spans of text that expose claim-evidence or inter-evidence conflicts and agreements that drive the model's predictive uncertainty in an unsupervised way, and (ii) generating explanations via prompting and attention steering that verbalize these critical interactions. Across three language models and two fact-checking datasets, we show that CLUE produces explanations that are more faithful to the model's uncertainty and more consistent with fact-checking decisions than prompting for uncertainty explanations without span-interaction guidance. Human evaluators judge our explanations to be more helpful, more informative, less redundant, and more logically consistent with the input than this baseline. CLUE requires no fine-tuning or architectural changes, making it plug-and-play for any white-box language model. By explicitly linking uncertainty to evidence conflicts, it offers practical support for fact-checking and generalises readily to other tasks that require reasoning over complex information.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces CLUE, a framework for generating natural language explanations of model uncertainty in automated fact-checking. It first identifies relationships between spans of text in an unsupervised manner to expose claim-evidence or inter-evidence conflicts and agreements that drive predictive uncertainty, then generates explanations via prompting and attention steering to verbalize these interactions. Across three language models and two fact-checking datasets, CLUE is shown to yield explanations more faithful to the model's uncertainty and more consistent with fact-checking decisions than a direct-prompting baseline without span-interaction guidance. Human evaluators rate the explanations higher on helpfulness, informativeness, reduced redundancy, and logical consistency. The approach requires no fine-tuning or architectural changes and is presented as plug-and-play for white-box models.
Significance. If the core assumption holds, this work would meaningfully advance explainable AI for fact-checking by moving beyond numerical uncertainty scores or hedges to explanations that explicitly tie uncertainty to evidence conflicts and agreements. Such explanations could improve human-AI collaboration in misinformation detection and other reasoning tasks over complex inputs. The no-fine-tuning design increases practical applicability across models.
major comments (1)
- [Abstract and unsupervised identification step] The central claim that CLUE produces explanations more faithful to the model's uncertainty rests on the unsupervised identification of span relationships accurately exposing the primary drivers of predictive uncertainty (claim-evidence or inter-evidence conflicts/agreements). The abstract describes this step as unsupervised with no mention of validation against ground-truth conflicts, perturbation tests, or correlation with changes in uncertainty. If the identified spans are salient but not causally decisive for the uncertainty, the subsequent prompting and attention-steering explanations may be consistent with decisions yet fail to be faithful to uncertainty sources, weakening the comparison to the baseline.
minor comments (2)
- [Human Evaluation] The human evaluation section would benefit from reporting inter-annotator agreement statistics and more detailed rubrics for the criteria of helpfulness, informativeness, redundancy, and logical consistency to strengthen the qualitative claims.
- [Experiments] Additional details on the precise implementation of the direct-prompting baseline, including prompt templates and any hyperparameter choices, would aid reproducibility and allow clearer assessment of the reported improvements.
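One way to report the inter-annotator agreement requested in the first minor comment is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below is a minimal illustration and not the paper's evaluation code: it assumes exactly two annotators rating the same items on a categorical scale, and the function name `cohens_kappa` is chosen here for illustration.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items.

    Assumption: both sequences are aligned item by item and use the
    same categorical label set (e.g. Likert points treated as categories).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")

    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

With more than two annotators per item, or with ordinal scales where near-misses should count partially, a different statistic (for example Krippendorff's alpha or a weighted kappa) would be the more appropriate choice.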
Simulated Author's Rebuttal
We thank the referee for their constructive feedback and for highlighting the importance of validating the unsupervised identification step. We address the major comment below and will revise the manuscript to strengthen this aspect of the presentation.
Point-by-point responses
-
Referee: [Abstract and unsupervised identification step] The central claim that CLUE produces explanations more faithful to the model's uncertainty rests on the unsupervised identification of span relationships accurately exposing the primary drivers of predictive uncertainty (claim-evidence or inter-evidence conflicts/agreements). The abstract describes this step as unsupervised with no mention of validation against ground-truth conflicts, perturbation tests, or correlation with changes in uncertainty. If the identified spans are salient but not causally decisive for the uncertainty, the subsequent prompting and attention-steering explanations may be consistent with decisions yet fail to be faithful to uncertainty sources, weakening the comparison to the baseline.
Authors: We acknowledge that the abstract does not explicitly discuss validation of the unsupervised span-relationship identification. The full manuscript reports that CLUE yields higher faithfulness to model uncertainty (via consistency with predictive changes) and better human ratings than the direct-prompting baseline across three models and two datasets; these downstream gains provide indirect evidence that the identified conflicts and agreements are relevant drivers. Because the method is intentionally unsupervised and the existing fact-checking datasets lack span-level ground-truth conflict labels, we did not perform direct validation against such annotations in the original submission. We will revise the abstract to note this design choice, add a limitations paragraph, and include new perturbation experiments (masking identified spans and measuring resulting uncertainty shifts) plus correlation analysis between identified relationships and uncertainty magnitude. These additions will be present in the revised version.
Revision: yes
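The perturbation experiment described in the response (masking identified spans and measuring the resulting uncertainty shift) could be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: uncertainty is taken to be the Shannon entropy of the label distribution, masking is plain string replacement, and `toy_predict_proba` is a hypothetical stand-in for a real white-box model's label probabilities.

```python
import math
from typing import Callable, List, Sequence


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mask_spans(text: str, spans: Sequence[str], mask_token: str = "[MASK]") -> str:
    """Replace every occurrence of each span with a mask token."""
    for span in spans:
        text = text.replace(span, mask_token)
    return text


def uncertainty_shift(
    predict_proba: Callable[[str], List[float]],
    text: str,
    spans: Sequence[str],
) -> float:
    """Entropy after masking minus entropy before masking.

    A clearly non-zero shift suggests the masked spans influence the
    model's uncertainty; a shift near zero suggests they do not.
    """
    before = predictive_entropy(predict_proba(text))
    after = predictive_entropy(predict_proba(mask_spans(text, spans)))
    return after - before


def toy_predict_proba(text: str) -> List[float]:
    """Hypothetical model: uncertain only when conflicting phrases co-occur.

    Returns probabilities over (supports, refutes, not enough info).
    """
    conflict = "reduces risk" in text and "no effect" in text
    return [0.4, 0.4, 0.2] if conflict else [0.8, 0.1, 0.1]


# Illustrative input only; not drawn from HealthVer or DRUID.
TEXT = (
    "Claim: X reduces risk. "
    "E1: X reduces risk in trials. "
    "E2: X has no effect on risk."
)
```

In this toy setting, masking the conflicting span `"no effect"` lowers the entropy, while masking an unrelated span such as `"in trials"` leaves it unchanged. With a real model, the same comparison against randomly chosen spans of equal length would indicate whether the identified spans matter more than arbitrary ones.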
Circularity Check
No significant circularity detected in derivation or claims
full rationale
The paper presents CLUE as an unsupervised span-relationship identification step followed by prompting and attention-steering to produce natural-language uncertainty explanations, then reports empirical comparisons against a direct-prompting baseline on three models and two datasets. No equations, fitted parameters, or self-citations are shown that reduce the faithfulness or consistency results to definitional equivalence with the inputs. The central claims rest on experimental outcomes rather than any self-referential construction, making the derivation self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Relationships between text spans can be identified without supervision to expose the conflicts and agreements that drive predictive uncertainty in fact-checking.
Forward citations
Cited by 1 Pith paper
-
Learning from AVA: Early Lessons from a Curated and Trustworthy Generative AI for Policy and Development Research
AVA is a specialized GenAI platform for development policy research that provides verifiable syntheses from World Bank reports and is associated with 2.4-3.9 hours of weekly time savings in a large-scale user evaluation.