Taxonomy of Risks on Automated Fact-Checking Systems Considering its Propagation

Jun Yajima; Takao Okubo; Tatsuya Oka

arxiv: 2606.25645 · v1 · pith:ZBVPHPMMnew · submitted 2026-06-24 · 💻 cs.CR · cs.AI

Pith reviewed 2026-06-25 20:34 UTC · model grok-4.3

classification 💻 cs.CR cs.AI

keywords automated fact-checking, risk taxonomy, risk propagation, disinformation, misinformation, AI risks, STRIDE, defamation

The pith

Automated fact-checking systems contain 32 specific risks that a three-stage propagation model can surface.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to build a taxonomy of risks in automated fact-checking systems by modeling how initial risk factors create hazardous situations that then produce concrete harms such as misinformation spread or defamation. A sympathetic reader would care because these AI-based systems are proposed as a solution to fake news on social media, yet their own failures could worsen the problem they target. The authors apply the model to identify exactly 32 risks and demonstrate that the resulting categories, used as guide words, reveal issues in a sample system that a conventional security method misses. This matters for anyone evaluating whether to deploy or rely on automated verification tools.

Core claim

By tracing risk factors through hazardous situations to harms in a three-stage propagation model, the authors derive 32 specific risks in automated fact-checking systems; these risks serve as guide words that enable risk assessment of systems such as DEFAME and uncover issues not identified by STRIDE.

What carries the argument

The three-stage risk propagation model that converts risk factors into hazardous situations and then into harms, yielding 32 guide words for systematic assessment.
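
A minimal sketch of how one such propagation chain could be represented, assuming each risk is a (risk factor, hazardous situation, harm) triple. The class name `PropagationChain`, the helper `harms_reachable_from`, and the two example chains are illustrative assumptions that paraphrase the abstract; they are not entries from the paper's list of 32.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropagationChain:
    """One risk, expressed as the three stages it passes through."""
    risk_factor: str          # stage 1: what goes wrong inside the system
    hazardous_situation: str  # stage 2: the exposed state that results
    harm: str                 # stage 3: the concrete damage

    def describe(self) -> str:
        return " -> ".join((self.risk_factor, self.hazardous_situation, self.harm))


# Hypothetical chains paraphrasing the abstract, not taken from the taxonomy.
chains = [
    PropagationChain(
        "incorrect veracity judgment by the model",
        "incorrect result posted on social media",
        "spread of misinformation",
    ),
    PropagationChain(
        "incorrect veracity judgment by the model",
        "incorrect result posted on social media",
        "defamation",
    ),
]


def harms_reachable_from(risk_factor: str, all_chains: list[PropagationChain]) -> set[str]:
    """Collect every harm that a given risk factor can propagate to."""
    return {c.harm for c in all_chains if c.risk_factor == risk_factor}


for chain in chains:
    print(chain.describe())
```

Keeping the three stages as separate fields makes it possible to ask which harms share a risk factor, which is the kind of question a propagation view is meant to answer.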

If this is right

The 32 risks provide a checklist that can be applied during the design or evaluation of any automated fact-checking system.
Using the guide words allows assessors to detect risks that standard IT security methods overlook.
Mitigating the identified risks would reduce the chance that an automated system itself spreads incorrect information or causes defamation.
The taxonomy supplies concrete analytical cues for reviewing implementations such as DEFAME.
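
As a rough illustration of using risks as guide words, the sketch below pairs every element of a data flow diagram with every guide word, producing a worksheet that an assessor would then fill in by hand. The element names, the guide words, and the recorded note are placeholders invented for this example; the actual DEFAME diagram and the 32 guide words are in the paper.

```python
from itertools import product

# Placeholder DFD elements and guide words; both lists are assumptions.
dfd_elements = ["claim input", "evidence retrieval", "verdict output"]
guide_words = ["incorrect judgment", "incorrect result posted"]


def build_worksheet(elements, words):
    """Return one row per (element, guide word) pair, initially unassessed."""
    return [
        {"element": e, "guide_word": w, "applies": None, "note": ""}
        for e, w in product(elements, words)
    ]


worksheet = build_worksheet(dfd_elements, guide_words)

# A hypothetical assessor judgment for one pair; the other rows stay open.
worksheet[5]["applies"] = True
worksheet[5]["note"] = "verdict may be published without human review"

open_rows = [row for row in worksheet if row["applies"] is None]
print(f"{len(worksheet)} rows, {len(open_rows)} still to assess")
```

Enumerating every pair up front is deliberately exhaustive: the point of a guide-word method is that the assessor considers each combination rather than only the ones that come to mind.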

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same three-stage approach could be tested on other AI tools that make public claims, such as content moderation or recommendation engines.
Early integration of the 32 guide words into development workflows might prevent costly redesigns after deployment.
If the model is applied across multiple fact-checking platforms, patterns in shared risks could point to common architectural fixes.

Load-bearing premise

The three-stage risk propagation model is assumed to be complete enough to capture all relevant risks without leaving out important ones or drawing arbitrary boundaries.

What would settle it

Finding a documented risk in an automated fact-checking system that cannot be placed into any of the 32 categories derived from the model, or showing that STRIDE identifies every risk the guide words find.
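
Both tests reduce to set comparisons, sketched below under the assumption that risks can be given stable identifiers. The function names, the category labels, and the identifiers such as `R01` are hypothetical and carry no data from the paper.

```python
def unplaced_risks(documented: set[str], taxonomy: dict[str, set[str]]) -> set[str]:
    """Documented risks that no taxonomy category claims."""
    covered = set().union(*taxonomy.values())
    return documented - covered


def stride_matches_guide_words(stride_found: set[str], guide_found: set[str]) -> bool:
    """True when STRIDE already finds everything the guide words find."""
    return guide_found <= stride_found


# Hypothetical identifiers, for illustration only.
taxonomy = {"category A": {"R01", "R02"}, "category B": {"R03"}}
documented = {"R01", "R03", "R99"}

stride_found = {"R01"}
guide_found = {"R01", "R02"}

print("unplaced:", sorted(unplaced_risks(documented, taxonomy)))
print("STRIDE covers guide words:", stride_matches_guide_words(stride_found, guide_found))
```

A non-empty result from the first function would count against the completeness of the taxonomy; a `True` from the second would count against the claimed advantage over STRIDE.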

Figures

Figures reproduced from arXiv: 2606.25645 by Jun Yajima, Takao Okubo, Tatsuya Oka.

**Figure 3.** In this figure, risks that occur in nor [...]

**Figure 2.** Data Flow Diagram (DFD) on DEFAME

**Figure 3.** Result of risk assessment on DEFAME

Original abstract

In recent years, the posting of fake news, including disinformation and misinformation, on social networking services (SNS) has become a social problem. To combat this fake news, fact-checking, the process of assessing the veracity of posts on SNS, has become increasingly important. While fact-checking is currently performed by fact-checking organizations, it is difficult to fact-check all posts on SNS. Therefore, the use of automated fact-checking systems is effective. Recent automated fact-checking systems utilize artificial intelligence and large language models, so there is a risk that they make incorrect judgments and post incorrect results on social media, which can spread misinformation or amount to defamation. In this paper, as a first step toward enabling the safe use of automated fact-checking systems, we categorize the specific risks of automated fact-checking systems. In this categorization, we consider a three-stage risk propagation: risk factors, hazardous situations, and harm. Our analysis revealed that 32 specific risks exist in automated fact-checking systems. We then use the categorized risks as analytical cues (guide words) to present a risk assessment of the automated fact-checking system DEFAME. The assessment indicates that our guide words can derive risks that cannot be derived using STRIDE, a conventional IT security risk assessment method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper offers a 32-risk taxonomy for automated fact-checking built on a three-stage propagation model and claims it finds risks STRIDE misses, but the derivation and completeness of that count are not shown.

The main thing to know is that this paper builds a taxonomy of exactly 32 risks in automated fact-checking systems using three stages of propagation (risk factors, hazardous situations, harm) and applies the resulting guide words to assess the existing DEFAME system, arguing that they surface issues that STRIDE does not.

What is new is the direct tailoring of this propagation framing to AI-driven fact-checking, with an explicit list and a side-by-side comparison to a standard method. The paper does a reasonable job naming domain-specific concerns, such as erroneous LLM judgments leading to defamation or misinformation spread on social media, that generic IT security approaches can overlook.

The soft spots are straightforward. The abstract states the count of 32 and the superiority claim but gives no steps for how the risks were identified, no validation process, and no argument that the three-stage model is exhaustive or that other frameworks would not produce a different set. The boundaries between stages could easily exclude categories like training data issues or longer-term feedback loops, and nothing in the summary addresses that.

This work is for researchers and engineers working on risk assessment for automated content tools or AI safety in misinformation contexts. Someone looking for structured guide words in this narrow area could get some practical categories from it.

The central argument is a reasonable starting map but rests on an asserted model rather than demonstrated completeness. It deserves peer review so the categorization process and the DEFAME assessment can be examined in full.

Referee Report

3 major / 2 minor

Summary. The paper proposes a taxonomy of risks for automated fact-checking systems by modeling risk propagation through three stages (risk factors, hazardous situations, and harm). Analysis via this model yields 32 specific risks, which are then used as guide words to assess the DEFAME system; the assessment claims to surface risks not derivable via the STRIDE method.

Significance. If the three-stage model proves both appropriate and exhaustive for this domain, the resulting taxonomy and guide-word approach could provide a practical extension of conventional security analysis methods to AI-based fact-checking, addressing domain-specific issues such as incorrect veracity judgments and downstream misinformation spread.

major comments (3)

[Abstract and three-stage model section] The claim that the three-stage propagation (risk factors → hazardous situations → harm) systematically surfaces all relevant risks is made without argument, literature comparison, or sensitivity analysis. Nothing shows that alternative framings (STPA, HAZOP variants, or AI risk ontologies) would not yield a materially different set, or that the model's boundaries do not exclude categories such as training-data poisoning or long-term societal feedback loops.
[Taxonomy section] The count of exactly 32 risks and their categorization are stated as the outcome of the analysis, yet no derivation steps, inter-rater process, or validation method are supplied, leaving the reproducibility and completeness of the taxonomy unassessable.
[DEFAME assessment section] The claim that the guide words derive risks that cannot be obtained with STRIDE is made without concrete examples of the missed risks or a side-by-side comparison, so the asserted superiority rests on an unshown differential rather than demonstrated evidence.

minor comments (2)

[Abstract and introduction] The abstract and introduction would benefit from explicit definitions of the three stages before their use in the taxonomy.
[Taxonomy presentation] Figure or table summarizing the 32 risks should include a column or note indicating which stage each risk belongs to for traceability.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will revise the paper accordingly to strengthen the justification, reproducibility, and evidence presented.

Referee: [Abstract and three-stage model section] The claim that the three-stage propagation (risk factors → hazardous situations → harm) systematically surfaces all relevant risks is made without argument, literature comparison, or sensitivity analysis. Nothing shows that alternative framings (STPA, HAZOP variants, or AI risk ontologies) would not yield a materially different set, or that the model's boundaries do not exclude categories such as training-data poisoning or long-term societal feedback loops.

Authors: We acknowledge that the three-stage model is introduced without explicit comparison to alternatives or sensitivity analysis. In revision we will add a dedicated subsection in the model description that (1) motivates the propagation framing from safety-engineering literature on hazard analysis, (2) briefly contrasts it with STPA and selected AI risk ontologies, and (3) states the scope limitation that training-data poisoning and long-term societal loops fall outside the current boundary and are flagged for future work. A full comparative sensitivity study remains outside the paper's scope.
revision: yes

Referee: [Taxonomy section] The count of exactly 32 risks and their categorization are stated as the outcome of the analysis, yet no derivation steps, inter-rater process, or validation method are supplied, leaving the reproducibility and completeness of the taxonomy unassessable.

Authors: The 32 risks were obtained by the authors applying the three-stage model component-wise to automated fact-checking pipelines. To improve assessability we will insert an appendix that lists the derivation steps, the component-to-risk mapping, and the explicit categorization criteria used. Because the analysis was performed by the author team rather than multiple independent raters, no inter-rater reliability statistic is available; we will therefore describe the process transparently instead of claiming formal validation.
revision: partial

Referee: [DEFAME assessment section] The claim that the guide words derive risks that cannot be obtained with STRIDE is made without concrete examples of the missed risks or a side-by-side comparison, so the asserted superiority rests on an unshown differential rather than demonstrated evidence.

Authors: We agree that the differential must be shown explicitly. The revised DEFAME section will contain a short table that pairs each guide-word-derived risk with the closest STRIDE category (or notes its absence) and supplies two concrete examples that STRIDE does not surface: an incorrect veracity judgment leading to defamation, and propagation of a false “fact-check” result. This will replace the current qualitative claim with documented evidence.
revision: yes

standing simulated objections not resolved

Full sensitivity analysis demonstrating that no alternative framing (STPA, HAZOP, AI ontologies) would produce a materially different risk set

Circularity Check

0 steps flagged

No circularity: taxonomy derived by applying an introduced framework to the domain

The paper introduces a three-stage risk propagation model (risk factors, hazardous situations, harm) as its analytical lens and applies it to automated fact-checking systems to enumerate 32 risks, then uses the resulting guide words to assess DEFAME and contrast with STRIDE. No equations, fitted parameters, or self-citations appear in the provided text; the count of 32 risks and the STRIDE comparison are outputs of the application rather than inputs that presuppose the result. The derivation chain is therefore self-contained domain analysis rather than a reduction to prior results or definitions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The work rests on the domain assumption that risks propagate in three identifiable stages and that this model yields a useful and exhaustive checklist; no free parameters or invented entities are described.

axioms (1)

domain assumption: Risks in automated fact-checking systems can be systematically categorized using a three-stage propagation model consisting of risk factors, hazardous situations, and harm.
This model is invoked to generate the 32 specific risks and the guide words for assessment.

Reference graph

Works this paper leans on

20 extracted references

[1] BBC, “The saga of 'Pizzagate': The fake story that shows how conspiracy theories spread,” https://www.bbc.com/news/blogs-trending-38156985
[2] Japan Broadcasting Corporation (NHK), “Caution urged over Japan quake fake posts,” https://www3.nhk.or.jp/nhkworld/en/news/backstories/4758/
[3] PolitiFact, https://www.politifact.com/
[4] Full Fact, https://fullfact.org/
[5] Japan Fact-check Center, https://www.factcheckcenter.jp/
[6] Fujitsu Limited press release, “Fujitsu to combat fake news in collaboration with leading Japanese organizations,” https://info.archives.global.fujitsu/global/about/resources/news/press-releases/2024/1016-01.html, 2024
[7] Gordon Pennycook, David G. Rand, “The Psychology of Fake News,” Trends in Cognitive Sciences
[8] Microsoft Corporation, “Microsoft Threat Modeling Tool,” https://learn.microsoft.com/ja-jp/azure/security/develop/threat-modeling-tool-threats
[9] Preslav Nakov, David Corney, Maram Hasanain, Firoj Alam, Tamer Elsayed, Alberto Barrón-Cedeño, Paolo Papotti, Shaden Shaar, Giovanni Da San Martino, “Automated Fact-Checking for Assisting Human Fact-Checkers,” arXiv, 2021, https://arxiv.org/abs/2103.07769
[10] Zhijiang Guo, Michael Schlichtkrull, Andreas Vlachos, “A Survey on Automated Fact-Checking,” Transactions of the Association for Computational Linguistics, Volume 10
[11] Japan AI Safety Institute, “Known Attacks and Their Impacts on AI Systems,” https://aisi.go.jp/assets/pdf/Known_Attacks_and_Their_Impacts_on_AI_Systems_V2_EN.pdf
[12] Baybars Örsek, “Opinion | Structural problems with tech platforms prevent fact-checkers from focusing on harm and virality,” https://www.poynter.org/commentary/2024/structural-problems-with-tech-platforms-prevent-fact-checkers-from-focusing-on-harm-and-virality/, 2024
[13] H. Tanaka, M. Ide, J. Yajima, S. Onodera, K. Munakata, N. Yoshioka, “Taxonomy of Generative AI Applications for Risk Assessment,” 2024 IEEE/ACM 3rd International Conference on AI Engineering - Software Engineering for AI (CAIN 2024), 2024
[14] Japan AI Safety Institute, “Guide to Evaluation Perspectives on AI Safety (version 1.10),” https://aisi.go.jp/assets/pdf/ai_safety_eval_v1.10_en.pdf
[15] Japan Today, “Man arrested for posting false tweet claiming lion on the loose after Kumamoto quake,” https://japantoday.com/category/crime/man-arrested-for-posting-false-tweet-claiming-lion-on-the-loose-after-kumamoto-quake
[16] T. Braun, M. Rothermel, M. Rohrbach, A. Rohrbach, “DEFAME: Dynamic Evidence-based FAct-checking with Multimodal Experts,” arXiv, https://arxiv.org/abs/2412.10510
[17] I-C. Chern, S. Chern, S. Chen, W. Yuan, K. Feng, C. Zhou, J. He, G. Neubig, P. Liu, “FacTool: Factuality Detection in Generative AI – A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios,” arXiv, 2023, https://arxiv.org/abs/2307.13528
[18] International Organization for Standardization (ISO), “ISO/IEC Guide 51:2014, Safety aspects – Guidelines for their inclusion in standards,” https://www.iso.org/standard/53940.html, 2014
[19] Phillip J. Brooke, Richard F. Paige, “Fault trees for security system design and analysis,” Computers & Security, Volume 22, Issue 3
[20] International Organization for Standardization (ISO), “IEC Guide 31010:2019, Risk management – Risk assessment techniques,” https://www.iso.org/standard/72140.html, 2019
