ZoFia: Zero-Shot Fake News Detection with Entity-Guided Retrieval and Multi-LLM Interaction
Pith reviewed 2026-05-18 01:44 UTC · model grok-4.3
The pith
ZoFia detects fake news in zero-shot settings by retrieving evidence with core entities and verifying through multi-LLM adversarial debate.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors propose ZoFia, a two-stage zero-shot fake news detection framework. The first stage uses a novel Hierarchical Salience and Salience-Calibrated Minimum Marginal Relevance (SC-MMR) algorithm to extract core entities that drive dual-source retrieval to overcome knowledge and evidence gaps. The second stage employs a multi-agent system for multi-perspective reasoning and verification in parallel, achieving an explainable and robust result via adversarial debate. Comprehensive experiments on two public datasets show that ZoFia outperforms existing zero-shot baselines and even most few-shot methods.
What carries the argument
Entity-guided dual-source retrieval using the SC-MMR algorithm for core entity extraction, paired with a multi-agent adversarial debate system.
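This page does not give the SC-MMR scoring function, so the following is only a minimal sketch of the general idea: MMR-style greedy selection in which each candidate entity is rewarded for its salience and penalized for similarity to entities already chosen. The names `select_entities` and `lam`, the salience values, and the toy two-dimensional embeddings are all assumptions made for illustration, not the authors' formulation.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_entities(candidates, k, lam=0.7):
    """Greedily pick up to k entities, trading salience against redundancy.

    candidates: dict mapping entity name -> (salience, embedding).
    lam: hypothetical trade-off weight; 1.0 ranks by salience alone.
    """
    selected = []
    remaining = dict(candidates)
    while remaining and len(selected) < k:

        def score(name):
            salience, emb = remaining[name]
            # Redundancy is the highest similarity to any entity already picked.
            redundancy = max(
                (cosine(emb, candidates[s][1]) for s in selected), default=0.0
            )
            return lam * salience - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


# Toy input: two near-duplicate mentions and one distinct entity.
CANDIDATES = {
    "Pfizer": (0.90, [1.0, 0.0]),
    "Pfizer Inc.": (0.85, [1.0, 0.05]),
    "FDA": (0.60, [0.0, 1.0]),
}
```

With `lam=0.5` the near-duplicate mention is skipped in favour of the less salient but distinct entity; with `lam=1.0` the selection falls back to a plain salience ranking.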
If this is right
- Detection of time-sensitive fake news becomes possible without task-specific training or labeled data.
- The adversarial debate provides built-in explanations for the detection decision.
- Knowledge cutoffs and hallucinations are mitigated through external evidence retrieval.
- Confirmation bias is reduced by requiring multiple models to reconcile differing perspectives.
Where Pith is reading between the lines
- The entity extraction technique might apply to improving retrieval in other LLM applications like question answering.
- Multi-agent debate could be tested for enhancing decision making in fields with high uncertainty such as climate science reporting.
- Integrating real-time web search into the retrieval stage would likely strengthen performance on breaking stories.
Load-bearing premise
The approach assumes two things: that automatically identified core entities yield useful dual-source evidence, and that adversarial debate among multiple LLMs corrects individual models' biases and knowledge limitations.
What would settle it
If evaluations on additional recent news datasets reveal that ZoFia performs no better than a simple single-LLM classification prompt, this would indicate that the retrieval and debate components do not deliver the expected improvements.
Original abstract
The rapid spread of fake news threatens social stability and public trust, highlighting the urgent need for its effective detection. Although large language models (LLMs) show potential in fake news detection, they are limited by knowledge cutoff and easily generate factual hallucinations when handling time-sensitive news. Furthermore, the thinking of a single LLM easily falls into early stance locking and confirmation bias, making it hard to handle both content reasoning and fact checking simultaneously. To address these challenges, we propose ZoFia, a two-stage zero-shot fake news detection framework. In the first retrieval stage, we propose novel Hierarchical Salience and Salience-Calibrated Minimum Marginal Relevance (SC-MMR) algorithm to extract core entities accurately, which drive dual-source retrieval to overcome knowledge and evidence gaps. In the subsequent stage, a multi-agent system conducts multi-perspective reasoning and verification in parallel and achieves an explainable and robust result via adversarial debate. Comprehensive experiments on two public datasets show that ZoFia outperforms existing zero-shot baselines and even most few-shot methods. Our code has been open-sourced to facilitate the research community at https://github.com/SakiRinn/ZoFia.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents ZoFia, a two-stage zero-shot fake news detection framework. Stage one extracts core entities via a Hierarchical Salience method and the Salience-Calibrated Minimum Marginal Relevance (SC-MMR) algorithm to drive dual-source retrieval, addressing knowledge cutoffs and evidence gaps. Stage two deploys a multi-agent LLM system for parallel multi-perspective reasoning and adversarial debate to yield explainable, robust verdicts. Comprehensive experiments on two public datasets are reported to show that ZoFia outperforms existing zero-shot baselines and most few-shot methods. The code is open-sourced.
Significance. If the results hold under rigorous evaluation, the work supplies a practical zero-shot pipeline that combines entity-guided retrieval with multi-LLM adversarial interaction to mitigate hallucinations and confirmation bias. The open-sourced implementation is a clear strength that aids reproducibility and community follow-up.
major comments (1)
- [Experimental Evaluation] Experimental Evaluation section: The central claim is that entity-guided dual-source retrieval plus multi-LLM debate overcomes knowledge cutoffs and hallucinations specifically for time-sensitive news. Standard public fake-news corpora (e.g., LIAR, FakeNewsNet) predominantly contain items published well before the training cutoffs of the LLMs used (2021–2023). In this regime the base models already encode the relevant facts, so measured gains cannot be attributed to the retrieval stage’s ability to supply post-cutoff evidence. A controlled evaluation on recent, post-cutoff news items is required to substantiate the motivating failure mode.
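The controlled evaluation requested above can be sketched as a split of the test items by publication date relative to a model's knowledge cutoff, with accuracy reported separately on each side. The field names `published` and `label`, the cutoff date, the toy items, and the `closed_book` stand-in predictor are all hypothetical; none of them come from the paper or its datasets.

```python
from datetime import date


def split_by_cutoff(items, cutoff):
    """Partition items into (pre-cutoff, post-cutoff) lists by publication date."""
    pre = [it for it in items if it["published"] <= cutoff]
    post = [it for it in items if it["published"] > cutoff]
    return pre, post


def accuracy(items, predict):
    """Fraction of items whose predicted label matches the gold label."""
    if not items:
        return None  # avoid dividing by zero on an empty split
    correct = sum(1 for it in items if predict(it) == it["label"])
    return correct / len(items)


# Assumed cutoff and toy data, for illustration only.
CUTOFF = date(2023, 12, 31)
ITEMS = [
    {"published": date(2017, 5, 1), "label": "fake"},
    {"published": date(2020, 3, 9), "label": "real"},
    {"published": date(2024, 6, 2), "label": "fake"},
]


def closed_book(item):
    """Stand-in for a model without retrieval: right before the cutoff, guesses 'real' after it."""
    return item["label"] if item["published"] <= CUTOFF else "real"


pre_items, post_items = split_by_cutoff(ITEMS, CUTOFF)
pre_acc = accuracy(pre_items, closed_book)
post_acc = accuracy(post_items, closed_book)
```

Running the same two-way report for the pipeline with and without its retrieval stage would show whether any gain is concentrated on the post-cutoff items, which is what the motivating claim predicts.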
minor comments (2)
- [Abstract] Abstract: The claim of outperformance is stated without any quantitative metrics, baseline names, or significance tests. A concise summary of key F1 or accuracy deltas would make the abstract self-contained.
- [Methods] Methods: The precise formulation of the SC-MMR scoring function and the protocol for the adversarial debate (number of agents, turn structure, aggregation rule) would benefit from pseudocode or a worked example to improve clarity and reproducibility.
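To make the second minor comment concrete, the kind of protocol description being asked for might look like the sketch below. It is not the authors' protocol: the number of agents, the round structure (every agent speaks once per round and sees only earlier rounds), and the majority-vote aggregation are assumptions chosen for illustration, and the agents are plain Python callables standing in for LLM calls.

```python
from collections import Counter


def run_debate(claim, evidence, agents, rounds=2):
    """Run a fixed number of debate rounds and aggregate by majority vote.

    agents: dict mapping agent name -> callable(claim, evidence, history)
            returning (verdict, argument). History is a list of
            (agent name, verdict, argument) tuples from earlier rounds.
    """
    if not agents or rounds < 1:
        raise ValueError("need at least one agent and one round")
    transcript = []
    for _ in range(rounds):
        history = list(transcript)  # agents in a round all see the same history
        for name, agent in agents.items():
            verdict, argument = agent(claim, evidence, history)
            transcript.append((name, verdict, argument))
    # Aggregate only the final-round verdicts; ties go to the verdict seen first.
    final_round = transcript[-len(agents):]
    votes = Counter(verdict for _, verdict, _ in final_round)
    return votes.most_common(1)[0][0], transcript


def evidence_checker(claim, evidence, history):
    """Stub: calls the claim fake when no supporting evidence was retrieved."""
    if evidence:
        return "real", "retrieved evidence supports the claim"
    return "fake", "no supporting evidence retrieved"


def style_critic(claim, evidence, history):
    """Stub: flags all-caps claims as fake."""
    if claim.isupper():
        return "fake", "sensational all-caps phrasing"
    return "real", "neutral phrasing"


def follower(claim, evidence, history):
    """Stub: opens with 'real', then adopts the majority verdict so far."""
    if not history:
        return "real", "opening position"
    majority = Counter(v for _, v, _ in history).most_common(1)[0][0]
    return majority, "deferring to the earlier majority"


AGENTS = {
    "evidence_checker": evidence_checker,
    "style_critic": style_critic,
    "follower": follower,
}
verdict, transcript = run_debate("MIRACLE CURE FOUND", [], AGENTS, rounds=2)
```

Even a description at this level of detail (who speaks when, what each agent sees, how the final label is chosen) would let readers reproduce the debate stage independently of the released code.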
Simulated Author's Rebuttal
We thank the referee for their thoughtful and constructive review of our manuscript. We address the major comment regarding the experimental evaluation below, providing a point-by-point response and indicating planned revisions to strengthen the work.
- Referee: [Experimental Evaluation] Experimental Evaluation section: The central claim is that entity-guided dual-source retrieval plus multi-LLM debate overcomes knowledge cutoffs and hallucinations specifically for time-sensitive news. Standard public fake-news corpora (e.g., LIAR, FakeNewsNet) predominantly contain items published well before the training cutoffs of the LLMs used (2021–2023). In this regime the base models already encode the relevant facts, so measured gains cannot be attributed to the retrieval stage’s ability to supply post-cutoff evidence. A controlled evaluation on recent, post-cutoff news items is required to substantiate the motivating failure mode.
  Authors: We appreciate the referee's careful analysis of the temporal characteristics of our evaluation datasets. We acknowledge that the LIAR and FakeNewsNet corpora primarily consist of news items predating the knowledge cutoffs of the LLMs employed in our experiments. While the current results demonstrate the benefits of hierarchical entity salience retrieval and multi-LLM adversarial debate for robust zero-shot detection, we agree that these benchmarks do not fully isolate the framework's ability to address post-cutoff knowledge gaps in time-sensitive scenarios. To directly substantiate this aspect of our motivating claims, we will revise the Experimental Evaluation section to include a controlled evaluation on a set of recent news items published after 2023. This addition will feature a new dataset or curated collection of contemporary articles, with ablation studies isolating the contribution of the dual-source retrieval stage in supplying up-to-date evidence. We believe this revision will better align the empirical evaluation with the paper's focus on overcoming knowledge cutoffs.
  Revision: yes
Circularity Check
No circularity: empirical pipeline with no derivations or self-referential reductions
The paper proposes ZoFia as a two-stage zero-shot framework: entity extraction via Hierarchical Salience and SC-MMR for dual-source retrieval, followed by multi-agent adversarial debate. No equations, fitted parameters, predictions derived from inputs, or load-bearing self-citations appear in the abstract or described method. The central claims rest on experimental outperformance on public datasets rather than any mathematical derivation that reduces to its own definitions or prior author results. This is a standard engineering contribution validated empirically and is self-contained against external benchmarks.