Taxonomy of Risks on Automated Fact-Checking Systems Considering its Propagation
Pith reviewed 2026-06-25 20:34 UTC · model grok-4.3
The pith
Automated fact-checking systems contain 32 specific risks that a three-stage propagation model can surface.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By tracing risk factors through hazardous situations to harms in a three-stage propagation model, the authors derive 32 specific risks in automated fact-checking systems; these risks serve as guide words that enable risk assessment of systems such as DEFAME and uncover issues not identified by STRIDE.
What carries the argument
The three-stage risk propagation model that converts risk factors into hazardous situations and then into harms, yielding 32 guide words for systematic assessment.
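A minimal sketch of how one three-stage chain might be represented as data, assuming nothing about the paper's actual notation. The class `RiskChain`, the helper `guide_words`, and both example entries are hypothetical. The entries paraphrase the harms named in the abstract (misinformation spread, defamation) and are not items from the paper's list of 32.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskChain:
    """One risk traced through the three stages (hypothetical structure)."""
    risk_factor: str          # stage 1: what goes wrong inside the system
    hazardous_situation: str  # stage 2: the exposure that failure creates
    harm: str                 # stage 3: the resulting damage


# Illustrative entries only, paraphrased from the abstract; NOT the paper's taxonomy.
EXAMPLE_CHAINS = [
    RiskChain(
        risk_factor="incorrect veracity judgment",
        hazardous_situation="incorrect result posted on social media",
        harm="spread of misinformation",
    ),
    RiskChain(
        risk_factor="incorrect veracity judgment",
        hazardous_situation="incorrect result posted on social media",
        harm="defamation",
    ),
]


def guide_words(chains):
    """Render each chain as a one-line prompt an assessor could walk through."""
    return [
        f"{c.risk_factor} -> {c.hazardous_situation} -> {c.harm}"
        for c in chains
    ]


for line in guide_words(EXAMPLE_CHAINS):
    print(line)
```

Keeping the three stages as separate fields means the same risk factor can fan out into several harms, which is the propagation idea the model relies on.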
If this is right
- The 32 risks provide a checklist that can be applied during the design or evaluation of any automated fact-checking system.
- Using the guide words allows assessors to detect risks that standard IT security methods overlook.
- Mitigating the identified risks would reduce the chance that an automated system itself spreads incorrect information or causes defamation.
- The taxonomy supplies concrete analytical cues for reviewing implementations such as DEFAME.
Where Pith is reading between the lines
- The same three-stage approach could be tested on other AI tools that make public claims, such as content moderation or recommendation engines.
- Early integration of the 32 guide words into development workflows might prevent costly redesigns after deployment.
- If the model is applied across multiple fact-checking platforms, patterns in shared risks could point to common architectural fixes.
Load-bearing premise
The three-stage risk propagation model is assumed to be complete enough to capture all relevant risks without leaving out important ones or drawing arbitrary boundaries.
What would settle it
Finding a documented risk in an automated fact-checking system that cannot be placed into any of the 32 categories derived from the model, or showing that STRIDE identifies every risk the guide words find.
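The two conditions can be phrased as set checks. This is a sketch under stated assumptions: the function `settle`, its parameters, and the sample labels are hypothetical, and assigning a documented risk to a category is treated as a given input rather than something the code decides.

```python
def settle(taxonomy, documented, guide_word_findings, stride_findings):
    """Evaluate the two falsification conditions as set operations.

    taxonomy            -- set of category labels (the paper has 32; not listed here)
    documented          -- dict mapping a documented risk to its category, or None
    guide_word_findings -- set of risks surfaced with the guide words
    stride_findings     -- set of risks surfaced with STRIDE
    """
    # Condition 1: a documented risk that fits no category of the taxonomy.
    unplaced = sorted(
        risk for risk, category in documented.items()
        if category not in taxonomy
    )
    # Condition 2: STRIDE already finds everything the guide words find.
    stride_covers_all = guide_word_findings <= stride_findings
    return {
        "unplaced_risks": unplaced,
        "stride_covers_all": stride_covers_all,
        "claim_challenged": bool(unplaced) or stride_covers_all,
    }


# Hypothetical inputs for illustration only.
result = settle(
    taxonomy={"category-a", "category-b"},
    documented={"risk-1": "category-a", "risk-2": None},
    guide_word_findings={"risk-1", "risk-3"},
    stride_findings={"risk-1"},
)
print(result)
```

Either condition alone is enough to challenge the claim, so the sketch combines them with a logical or.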
Original abstract
In recent years, the posting of fake news including disinformation and misinformation on social networking services (SNS) has become a social problem. To combat this fake news, fact-checking that is the process of assessing the veracity of posts on SNS has become increasingly important. While fact-checking is currently performed by fact-checking organizations, it is difficult to fact-check all posts on SNS. Therefore, the use of automated fact-checking systems is effective. Recent automated fact-checking systems utilize artificial intelligence and large language models, so there are risks of incorrect judgments and posting incorrect results on social media which can lead to the spread of misinformation or to engage in defamation. In this paper, as a first step toward enabling the safe use of automated fact-checking systems, we categorize the specific risks on automated fact-checking systems. In this categorizing, we consider a three-stage risk propagation: risk factors, hazardous situations, and harm. Our analysis revealed that 32 specific risks exist in automated fact-checking systems. In this paper, we utilize the categorized risks as analytical cues (guide words) to present the risk assessment of the automated fact-checking system DEFAME. This assessment result indicates that risks that cannot be derived using STRIDE, a conventional IT security risk assessment method can be derived using our guide words.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a taxonomy of risks for automated fact-checking systems by modeling risk propagation through three stages (risk factors, hazardous situations, and harm). Analysis via this model yields 32 specific risks, which are then used as guide words to assess the DEFAME system; the assessment claims to surface risks not derivable via the STRIDE method.
Significance. If the three-stage model proves both appropriate and exhaustive for this domain, the resulting taxonomy and guide-word approach could provide a practical extension of conventional security analysis methods to AI-based fact-checking, addressing domain-specific issues such as incorrect veracity judgments and downstream misinformation spread.
major comments (3)
- [Abstract and three-stage model section] The claim that the three-stage propagation (risk factors → hazardous situations → harm) systematically surfaces all relevant risks is made without argument, literature comparison, or sensitivity analysis. Nothing shows that alternative framings (STPA, HAZOP variants, or AI risk ontologies) would not yield a materially different set, or that the model's boundaries do not exclude categories such as training-data poisoning or long-term societal feedback loops.
- [Taxonomy section] The count of exactly 32 risks and their categorization are stated as the outcome of the analysis, yet no derivation steps, inter-rater process, or validation method is supplied, leaving the reproducibility and completeness of the taxonomy unassessable.
- [DEFAME assessment section] The claim that the guide words derive risks that cannot be obtained with STRIDE is made without concrete examples of the missed risks or a side-by-side comparison, so the asserted superiority rests on an unshown differential rather than demonstrated evidence.
minor comments (2)
- [Abstract and introduction] The abstract and introduction would benefit from explicit definitions of the three stages before their use in the taxonomy.
- [Taxonomy presentation] The figure or table summarizing the 32 risks should include a column or note indicating which stage each risk belongs to, for traceability.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will revise the paper accordingly to strengthen the justification, reproducibility, and evidence presented.
Point-by-point responses
- Referee: [Abstract and three-stage model section] The claim that the three-stage propagation (risk factors → hazardous situations → harm) systematically surfaces all relevant risks is made without argument, literature comparison, or sensitivity analysis. Nothing shows that alternative framings (STPA, HAZOP variants, or AI risk ontologies) would not yield a materially different set, or that the model's boundaries do not exclude categories such as training-data poisoning or long-term societal feedback loops.
  Authors: We acknowledge that the three-stage model is introduced without explicit comparison to alternatives or sensitivity analysis. In the revision we will add a dedicated subsection to the model description that (1) motivates the propagation framing from the safety-engineering literature on hazard analysis, (2) briefly contrasts it with STPA and selected AI risk ontologies, and (3) states the scope limitation that training-data poisoning and long-term societal feedback loops fall outside the current boundary and are flagged for future work. A full comparative sensitivity study remains outside the paper's scope.
  Revision: yes
- Referee: [Taxonomy section] The count of exactly 32 risks and their categorization are stated as the outcome of the analysis, yet no derivation steps, inter-rater process, or validation method is supplied, leaving the reproducibility and completeness of the taxonomy unassessable.
  Authors: The 32 risks were obtained by the authors applying the three-stage model component by component to automated fact-checking pipelines. To improve assessability we will add an appendix that lists the derivation steps, the component-to-risk mapping, and the explicit categorization criteria used. Because the analysis was performed by the author team rather than by multiple independent raters, no inter-rater reliability statistic is available; we will therefore describe the process transparently instead of claiming formal validation.
  Revision: partial
- Referee: [DEFAME assessment section] The claim that the guide words derive risks that cannot be obtained with STRIDE is made without concrete examples of the missed risks or a side-by-side comparison, so the asserted superiority rests on an unshown differential rather than demonstrated evidence.
  Authors: We agree that the differential must be shown explicitly. The revised DEFAME section will contain a short table that pairs each guide-word-derived risk with the closest STRIDE category (or notes its absence) and supplies two concrete examples that STRIDE does not surface: an incorrect veracity judgment leading to defamation, and propagation of a false "fact-check" result. This will replace the current qualitative claim with documented evidence.
  Revision: yes
- Full sensitivity analysis demonstrating that no alternative framing (STPA, HAZOP, AI ontologies) would produce a materially different risk set
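The side-by-side table promised in the third response could take a shape like the following. This is a sketch only: the helper `comparison_rows` and the pairing data are hypothetical, the two risk descriptions are the examples named in the rebuttal, and marking them as having no STRIDE counterpart restates the authors' claim rather than an independent result. The six STRIDE category names are the standard ones.

```python
STRIDE = (
    "Spoofing",
    "Tampering",
    "Repudiation",
    "Information disclosure",
    "Denial of service",
    "Elevation of privilege",
)


def comparison_rows(pairings):
    """Pair each guide-word risk with its closest STRIDE category, or 'none'."""
    rows = []
    for risk, category in pairings:
        # Reject labels that are not one of the six STRIDE categories.
        if category is not None and category not in STRIDE:
            raise ValueError(f"unknown STRIDE category: {category}")
        rows.append((risk, category or "none"))
    return rows


# Pairings restate the authors' claim; they are not verified here.
pairings = [
    ("incorrect veracity judgment leading to defamation", None),
    ("propagation of a false fact-check result", None),
]

for risk, category in comparison_rows(pairings):
    print(f"{risk:<52} | {category}")
```

A row whose second column reads `none` is exactly the kind of entry the referee asks to see documented.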
Circularity Check
No circularity: taxonomy derived by applying an introduced framework to the domain
The paper introduces a three-stage risk propagation model (risk factors, hazardous situations, harm) as its analytical lens and applies it to automated fact-checking systems to enumerate 32 risks, then uses the resulting guide words to assess DEFAME and contrast with STRIDE. No equations, fitted parameters, or self-citations appear in the provided text; the count of 32 risks and the STRIDE comparison are outputs of the application rather than inputs that presuppose the result. The derivation chain is therefore self-contained domain analysis rather than a reduction to prior results or definitions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Risks in automated fact-checking systems can be systematically categorized using a three-stage propagation model consisting of risk factors, hazardous situations, and harm.