arxiv: 2505.17855 · v2 · submitted 2025-05-23 · 💻 cs.CL

Explaining Sources of Uncertainty in Automated Fact-Checking

Pith reviewed 2026-05-19 13:19 UTC · model grok-4.3

classification 💻 cs.CL
keywords uncertainty explanations · fact-checking · language models · conflict and agreement · span relationships · natural language explanations · human-AI collaboration

The pith

CLUE explains language model uncertainty in fact-checking by linking it to specific conflicts and agreements between text spans.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CLUE as a way to generate natural language explanations for why models are uncertain about fact-checking predictions. It works by first detecting relationships among spans of text that reveal claim-evidence or inter-evidence conflicts and agreements. These relationships then guide prompting and attention steering to produce verbal explanations. Experiments across three models and two datasets show the resulting explanations align better with actual uncertainty and decisions than explanations from standard prompting. Human judges rate them as more helpful, informative, and logically consistent.

Core claim

CLUE is the first framework to generate natural language explanations of model uncertainty, and it does so in two steps. First, it identifies, in an unsupervised way, relationships between spans of text that expose the claim-evidence or inter-evidence conflicts and agreements driving the model's predictive uncertainty. Second, it generates explanations via prompting and attention steering that verbalize these critical interactions. Across three language models and two fact-checking datasets, CLUE produces explanations that are more faithful to the model's uncertainty and more consistent with fact-checking decisions than prompting for uncertainty explanations without span-interaction guidance.

What carries the argument

The CLUE framework: it identifies relationships between spans of text in an unsupervised manner to expose conflicts and agreements, then generates explanations via prompting and attention steering.
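
The prompting half of this pipeline can be pictured with a minimal sketch. It assumes the span interactions have already been extracted and labeled; the `SpanInteraction` type, the pair tags ("C-E1", "C-E2", "E1-E2"), the three relation labels, and the prompt wording are illustrative assumptions rather than the authors' implementation, and attention steering is not shown.

```python
from dataclasses import dataclass
from typing import Sequence

# Hypothetical label set for illustration; the paper speaks of conflicts and agreements.
VALID_RELATIONS = {"agree", "disagree", "unrelated"}


@dataclass(frozen=True)
class SpanInteraction:
    """One pre-extracted pair of spans plus its relation label (assumed structure)."""
    span_a: str
    span_b: str
    pair: str      # which texts the spans come from, e.g. "C-E1" (assumed tag)
    relation: str  # one of VALID_RELATIONS


def format_interactions(interactions: Sequence[SpanInteraction]) -> str:
    """Render each interaction as one numbered line of the prompt."""
    lines = []
    for index, item in enumerate(interactions, start=1):
        if item.relation not in VALID_RELATIONS:
            raise ValueError(f"unknown relation: {item.relation!r}")
        lines.append(
            f'{index}. "{item.span_a}" - "{item.span_b}" ({item.pair}) '
            f"relation: {item.relation}"
        )
    return "\n".join(lines)


def build_prompt(claim: str, evidence_1: str, evidence_2: str,
                 interactions: Sequence[SpanInteraction]) -> str:
    """Assemble a span-guided prompt asking the model to explain its uncertainty."""
    return (
        f"Claim: {claim}\n"
        f"Evidence 1: {evidence_1}\n"
        f"Evidence 2: {evidence_2}\n"
        "Span interactions:\n"
        f"{format_interactions(interactions)}\n"
        "Explain your prediction's uncertainty by referring to the span "
        "interactions above."
    )
```

The point of the design is that the explanation is anchored to specific, pre-selected spans rather than left to the model's free choice, which is what separates this setup from the direct-prompting baseline.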

If this is right

  • Explanations explicitly link uncertainty to evidence conflicts, supporting better human-AI collaboration in fact-checking.
  • CLUE requires no fine-tuning or architectural changes and works as plug-and-play for any white-box language model.
  • Human evaluators judge the explanations to be more helpful, more informative, less redundant, and more logically consistent with the input than the baseline's.
  • The approach generalizes readily to other tasks that require reasoning over complex information.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If span relationships reliably capture uncertainty sources, users could resolve disagreements with AI fact-checkers by examining the highlighted conflicts.
  • The unsupervised span detection step could be adapted to other reasoning domains where conflicting information drives model doubt.
  • Testing CLUE in live fact-checking workflows would show whether the explanations actually help people correct or trust model outputs.

Load-bearing premise

Identifying relationships between spans of text in an unsupervised manner accurately exposes the claim-evidence or inter-evidence conflicts and agreements that are the primary drivers of the model's predictive uncertainty.

What would settle it

An experiment in which CLUE explanations score no higher than baseline prompting on faithfulness to uncertainty or consistency with fact-checking decisions would falsify the central claim.
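
One way to make "no higher than baseline" operational is a paired significance test on per-example scores. The sketch below is a generic one-sided sign-flip permutation test, not the evaluation procedure used in the paper; the function name and its inputs (two equal-length lists of per-example faithfulness scores) are assumptions made for illustration.

```python
import random
from typing import Sequence


def paired_permutation_pvalue(clue_scores: Sequence[float],
                              baseline_scores: Sequence[float],
                              n_resamples: int = 10_000,
                              seed: int = 0) -> float:
    """One-sided p-value for the hypothesis that mean(clue - baseline) > 0.

    Under the null hypothesis the sign of each paired difference is arbitrary,
    so we flip signs at random and count how often the resampled mean is at
    least as large as the observed mean.
    """
    if len(clue_scores) != len(baseline_scores) or not clue_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    diffs = [c - b for c, b in zip(clue_scores, baseline_scores)]
    observed = sum(diffs) / len(diffs)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        resampled = sum(d if rng.random() < 0.5 else -d for d in diffs) / len(diffs)
        if resampled >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly positive.
    return (hits + 1) / (n_resamples + 1)
```

A large p-value on both faithfulness and label-explanation consistency would be the outcome described above; a small one would leave the central claim standing.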

Figures

Figures reproduced from arXiv: 2505.17855 by Greta Warren, Irina Shklovski, Isabelle Augenstein, Jingyi Sun.

Figure 1: An example of CLUE output for a claim with …
Figure 2: Illustrative comparison of verdict-oriented fact-checking explanations (e-FEVER ( …
Figure 3: Prompt template for span interaction relation …
Figure 4: Three-shot prompt for PromptBaseline (Shots 2-3 omitted) on the HealthVer and DRUID datasets.
  Adjacent paper text captured with the caption: "… 0.08 for PromptBaseline, from 0.006 to 0.10 for CLUE-Span, and from 0.033 to 0.13 for CLUE-Span+Steering. This may be because, when NLEs are grounded in richer evidence, they capture more nuances in both claim-evidence and inter-evidence interactions, thereby more faithfully reflecting the model's fact-checking de…"
Figure 5: Three-shot prompt for CLUE-Span and CLUE-Span+Steering (Shots 2-3 omitted) on the HealthVer and DRUID datasets.
  Adjacent paper text captured with the caption: "Overall, both variants of our CLUE model, CLUE-Span and CLUE-Span+Steering, show higher Faithfulness and Label-Explanation Entailment than the baseline method PromptBaseline, confirming the effectiveness of our framework. It is also notable that, compared with the evaluation results on other mo…"
Figure 6: Example of human evaluation set-up. Expla…

Original abstract

Understanding sources of a model's uncertainty regarding its predictions is crucial for effective human-AI collaboration. Prior work proposes using numerical uncertainty or hedges ("I'm not sure, but ..."), which do not explain uncertainty that arises from conflicting evidence, leaving users unable to resolve disagreements or rely on the output. We introduce CLUE (Conflict-and-Agreement-aware Language-model Uncertainty Explanations), the first framework to generate natural language explanations of model uncertainty by (i) identifying relationships between spans of text that expose claim-evidence or inter-evidence conflicts and agreements that drive the model's predictive uncertainty in an unsupervised way, and (ii) generating explanations via prompting and attention steering that verbalize these critical interactions. Across three language models and two fact-checking datasets, we show that CLUE produces explanations that are more faithful to the model's uncertainty and more consistent with fact-checking decisions than prompting for uncertainty explanations without span-interaction guidance. Human evaluators judge our explanations to be more helpful, more informative, less redundant, and more logically consistent with the input than this baseline. CLUE requires no fine-tuning or architectural changes, making it plug-and-play for any white-box language model. By explicitly linking uncertainty to evidence conflicts, it offers practical support for fact-checking and generalises readily to other tasks that require reasoning over complex information.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces CLUE, a framework for generating natural language explanations of model uncertainty in automated fact-checking. It first identifies relationships between spans of text in an unsupervised manner to expose claim-evidence or inter-evidence conflicts and agreements that drive predictive uncertainty, then generates explanations via prompting and attention steering to verbalize these interactions. Across three language models and two fact-checking datasets, CLUE is shown to yield explanations more faithful to the model's uncertainty and more consistent with fact-checking decisions than a direct-prompting baseline without span-interaction guidance. Human evaluators rate the explanations as more helpful, more informative, less redundant, and more logically consistent. The approach requires no fine-tuning or architectural changes and is presented as plug-and-play for white-box models.

Significance. If the core assumption holds, this work would meaningfully advance explainable AI for fact-checking by moving beyond numerical uncertainty scores or hedges to explanations that explicitly tie uncertainty to evidence conflicts and agreements. Such explanations could improve human-AI collaboration in misinformation detection and other reasoning tasks over complex inputs. The no-fine-tuning design increases practical applicability across models.

major comments (1)
  1. [Abstract and unsupervised identification step] The central claim that CLUE produces explanations more faithful to the model's uncertainty rests on the unsupervised identification of span relationships accurately exposing the primary drivers of predictive uncertainty (claim-evidence or inter-evidence conflicts/agreements). The abstract describes this step as unsupervised with no mention of validation against ground-truth conflicts, perturbation tests, or correlation with changes in uncertainty. If the identified spans are salient but not causally decisive for the uncertainty, the subsequent prompting and attention-steering explanations may be consistent with decisions yet fail to be faithful to uncertainty sources, weakening the comparison to the baseline.
minor comments (2)
  1. [Human Evaluation] The human evaluation section would benefit from reporting inter-annotator agreement statistics and more detailed rubrics for the criteria of helpfulness, informativeness, redundancy, and logical consistency to strengthen the qualitative claims.
  2. [Experiments] Additional details on the precise implementation of the direct-prompting baseline, including prompt templates and any hyperparameter choices, would aid reproducibility and allow clearer assessment of the reported improvements.
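
For reference, the kind of inter-annotator agreement statistic requested in minor comment 1 can be computed in a few lines. The sketch below implements Cohen's kappa for two annotators; the choice of kappa and the input format (two equal-length lists of categorical ratings) are assumptions, since the review does not say which statistic or rating scheme would apply.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(ratings_a: Sequence[Hashable],
                 ratings_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement if each annotator labeled independently at their own rates.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

With more than two annotators per item, a multi-rater statistic such as Fleiss' kappa or Krippendorff's alpha would be the usual substitute.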

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback and for highlighting the importance of validating the unsupervised identification step. We address the major comment below and will revise the manuscript to strengthen this aspect of the presentation.

Point-by-point responses
  1. Referee: [Abstract and unsupervised identification step] The central claim that CLUE produces explanations more faithful to the model's uncertainty rests on the unsupervised identification of span relationships accurately exposing the primary drivers of predictive uncertainty (claim-evidence or inter-evidence conflicts/agreements). The abstract describes this step as unsupervised with no mention of validation against ground-truth conflicts, perturbation tests, or correlation with changes in uncertainty. If the identified spans are salient but not causally decisive for the uncertainty, the subsequent prompting and attention-steering explanations may be consistent with decisions yet fail to be faithful to uncertainty sources, weakening the comparison to the baseline.

    Authors: We acknowledge that the abstract does not explicitly discuss validation of the unsupervised span-relationship identification. The full manuscript reports that CLUE yields higher faithfulness to model uncertainty (via consistency with predictive changes) and better human ratings than the direct-prompting baseline across three models and two datasets; these downstream gains provide indirect evidence that the identified conflicts and agreements are relevant drivers. Because the method is intentionally unsupervised and the existing fact-checking datasets lack span-level ground-truth conflict labels, we did not perform direct validation against such annotations in the original submission. We will revise the abstract to note this design choice, add a limitations paragraph, and include new perturbation experiments (masking identified spans and measuring resulting uncertainty shifts) plus correlation analysis between identified relationships and uncertainty magnitude. These additions will be present in the revised version.

    Revision: yes
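
The masking perturbation mentioned in the rebuttal can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' experiment: uncertainty is taken to be the Shannon entropy of the predicted label distribution, `predict` is a hypothetical caller-supplied function mapping input text to label probabilities, and spans are masked by plain string replacement.

```python
import math
from typing import Callable, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def mask_spans(text: str, spans: Sequence[str], mask_token: str = "[MASK]") -> str:
    """Replace every occurrence of each span with a mask token."""
    for span in spans:
        text = text.replace(span, mask_token)
    return text


def uncertainty_shift(predict: Callable[[str], Sequence[float]],
                      text: str, spans: Sequence[str]) -> float:
    """Entropy after masking minus entropy before masking.

    A large absolute shift suggests the masked spans matter for the model's
    uncertainty; a shift near zero suggests they are salient but not decisive.
    """
    before = entropy(predict(text))
    after = entropy(predict(mask_spans(text, spans)))
    return after - before
```

Comparing the shifts for CLUE-selected spans against shifts for randomly chosen spans of similar length would be one way to address the referee's concern about causal relevance.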

Circularity Check

0 steps flagged

No significant circularity detected in derivation or claims

The paper presents CLUE as an unsupervised span-relationship identification step followed by prompting and attention-steering to produce natural-language uncertainty explanations, then reports empirical comparisons against a direct-prompting baseline on three models and two datasets. No equations, fitted parameters, or self-citations are shown that reduce the faithfulness or consistency results to definitional equivalence with the inputs. The central claims rest on experimental outcomes rather than any self-referential construction, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework builds on standard LLM prompting and attention mechanisms without introducing new fitted parameters or postulated entities; the core premise is a domain assumption about what drives model uncertainty.

axioms (1)
  • domain assumption: Relationships between text spans can be identified in an unsupervised manner to expose the conflicts and agreements that drive predictive uncertainty in fact-checking.
    This premise underpins the first step of the CLUE framework as described in the abstract.

pith-pipeline@v0.9.0 · 5759 in / 1170 out tokens · 62710 ms · 2026-05-19T13:19:55.265795+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Learning from AVA: Early Lessons from a Curated and Trustworthy Generative AI for Policy and Development Research

    cs.HC · 2026-04 · unverdicted · novelty 5.0

    AVA is a specialized GenAI platform for development policy research that provides verifiable syntheses from World Bank reports and is associated with 2.4-3.9 hours of weekly time savings in a large-scale user evaluation.
