Multi-Agentic System Leveraging Open-Source LLMs to Mitigate Disinformation Threats

Martin Tamajka; Sebastian Kula

arxiv: 2606.30259 · v1 · pith:CRL4ZRB2new · submitted 2026-06-29 · 💻 cs.CL

Pith reviewed 2026-06-30 06:24 UTC · model grok-4.3

classification 💻 cs.CL

keywords multi-agent systems · disinformation detection · large language models · open-source LLMs · consensus mechanisms · fact-checking · multilingual NLP · hierarchical AI

The pith

A multi-agent system of open-source LLMs outperforms single models like GPT-4 at disinformation detection by using consensus, diverse knowledge, and hierarchy to mimic human annotators.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a multi-agent system of open-source large language models to detect disinformation more reliably than single models. By incorporating mechanisms for consensus among agents, diversity in their knowledge and thinking styles, and a hierarchical structure, the system aims to replicate effective human annotation teams. Experiments show better results than GPT-4 and GPT-3.5 on tasks involving English, Polish, Slovak, and Bulgarian texts. This matters because manual verification cannot keep up with the volume of online content, so automated methods need to be as accurate as possible.

Core claim

The authors claim that their multi-agent system, which uses a consensus mechanism, diversity in cognition and knowledge, and a hierarchical structure inspired by human annotators, achieves superior results in disinformation detection compared to individual LLMs, including GPT-4 and GPT-3.5. The system is built with open-source models such as LLaMA for transparency and is evaluated on three disinformation-related tasks, using datasets in four languages with varying resource levels.

What carries the argument

A multi-agent system that emulates human annotators through a consensus mechanism, diversity in cognition and knowledge, and a hierarchical structure.

If this is right

The system improves performance on low-resource languages such as Slovak and Bulgarian.
It enables transparent, open-source alternatives to proprietary models for fact-checking tasks.
The method applies across direct detection, identifying texts worthy of verification, and detecting verifiable claims.
It provides a scalable automated approach to handle high volumes of disinformation where manual review falls short.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could extend to other content moderation problems that benefit from ensemble decision-making.
Diversity across open models may help reduce certain biases that appear in single proprietary LLMs.
Real-time deployment on social media streams would test whether the consensus overhead remains practical at scale.

Load-bearing premise

Emulating human annotator behavior with consensus, diversity in cognition and knowledge, and hierarchy in a multi-agent LLM setup will reliably produce better disinformation detection than single models.

What would settle it

A demonstration that GPT-4 or another single LLM matches or exceeds the multi-agent system's accuracy on the same multilingual datasets and tasks would falsify the superiority claim.
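One way to make such a head-to-head comparison concrete is a paired significance test on the same test items. The sketch below uses an exact McNemar test, which is a technique chosen here for illustration and is not taken from the paper; the function name `mcnemar_exact` and the toy labels are hypothetical and do not represent any reported result.

```python
from math import comb
from typing import Sequence


def mcnemar_exact(gold: Sequence[int], pred_a: Sequence[int], pred_b: Sequence[int]) -> float:
    """Two-sided exact McNemar p-value for two classifiers scored on the same items."""
    # b: items system A gets right and system B gets wrong; c: the reverse.
    b = sum(1 for g, a, p in zip(gold, pred_a, pred_b) if a == g and p != g)
    c = sum(1 for g, a, p in zip(gold, pred_a, pred_b) if a != g and p == g)
    n = b + c
    if n == 0:
        return 1.0  # the two systems never disagree in correctness
    k = min(b, c)
    # Binomial tail under the null hypothesis that disagreements split 50/50.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Toy data, invented purely for illustration.
gold = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
system_a = list(gold)              # correct on every item
system_b = [0] * 10                # wrong on the six positive items
p_value = mcnemar_exact(gold, system_a, system_b)
```

Only the items on which the two systems disagree in correctness enter the test, so it answers whether one system is reliably better on the shared data rather than whether both are merely accurate.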

Figures

Figures reproduced from arXiv: 2606.30259 by Martin Tamajka, Sebastian Kula.

**Figure 1.** The architecture of the proposed MAS with leader agent in the center. The architecture represents the star (hub-and-spoke) topology with agents based on various LLMs. The proposed system comprises four agents: a designated leader agent and three expert annotator agents operating as equals. The leader assumes responsibility for moderating discussions within a hierarchical structure, functioning as the cent…
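The star topology described in the caption can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Annotator` type, the stub agents, the two-thirds agreement threshold, and the rule that the leader decides when the annotators do not agree are hypothetical and not taken from the paper, where each agent would be backed by an LLM rather than a fixed rule.

```python
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical type: an annotator maps a text to a label.
Annotator = Callable[[str], str]


def leader_decide(text: str, annotators: List[Annotator], leader: Annotator,
                  min_agreement: float = 2 / 3) -> Tuple[str, str]:
    """Collect votes from the spoke agents; the hub (leader) resolves disagreement."""
    votes = [agent(text) for agent in annotators]
    label, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= min_agreement:
        return label, "consensus"
    # Assumed fallback: the leader decides when agreement is too low.
    return leader(text), "leader"


# Stub agents standing in for LLM-backed annotators (illustrative only).
def keyword_agent(text: str) -> str:
    return "disinfo" if "miracle cure" in text.lower() else "credible"


def punctuation_agent(text: str) -> str:
    return "disinfo" if "!" in text else "credible"


def skeptic_agent(text: str) -> str:
    return "disinfo"


def leader_agent(text: str) -> str:
    return "credible"


annotators = [keyword_agent, punctuation_agent, skeptic_agent]
unanimous = leader_decide("Miracle cure found!", annotators, leader_agent)
majority = leader_decide("The council met on Tuesday.", annotators, leader_agent)
escalated = leader_decide("The council met on Tuesday.", annotators, leader_agent,
                          min_agreement=1.0)
```

Raising `min_agreement` to 1.0 forces the leader path whenever a single annotator dissents, which shows how the threshold trades annotator autonomy against leader control.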

**Figure 2.** Decision and data flow with possible scenarios in the proposed system.

… the CLEF22 Twitter collection [23], labelled for verifiable claim presence, offered a social media context for assessing factual claim detection capabilities. Four distinct experimental tasks were designed to leverage these datasets, enabling a comprehensive assessment of the proposed system’s capabilities. TASK 1 focused on disinformatio…

Original abstract

In contemporary societies, the threat of disinformation has reached alarming levels, exacerbated by the proliferation of electronic communication, social media, and advancements in artificial intelligence. As a result, there is an urgent need to develop effective countermeasures to mitigate this menace. However, the sheer scale of the problem renders manual fact-checking and human-based verification inadequate, underscoring the necessity for automated methods to detect and debunk disinformation. This article proposes a novel approach based on a multi-agent system that emulates the decision-making processes of human annotators engaged in disinformation detection tasks. By incorporating a consensus mechanism, diversity in cognition and diversity in knowledge, and also hierarchical structure, inspired by human annotators' behavior, the proposed method achieves superior results compared to individual Large Language Models (LLMs), including GPT 4 and GPT 3.5. The system leverages open models (e.g., LLaMA, Kimi, Qwen, Deepseek and LLaMA-Nemotron) to ensure greater transparency. The evaluation of the proposed method encompasses datasets in languages with varying resource availability, including English (high-resource), Polish (medium-resource), Slovak (low-resource) and Bulgarian (low-resource). Experiments were conducted on tasks such as direct disinformation detection, identification of texts worthy of verification, and detection of texts containing verifiable factual claims.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper describes a multi-agent open-LLM setup for disinformation detection across languages but gives no metrics, baselines, or ablations to support its superiority claims over GPT-4.

The main point is a system that runs several open models (LLaMA, Qwen, Deepseek and similar) together with consensus, diversity in knowledge and cognition, and a hierarchy to mimic human fact-checkers. It runs on English, Polish, Slovak, and Bulgarian data for direct detection, verification-worthiness, and factual-claim spotting.

The work does apply known multi-agent patterns to this domain and sticks to open models, which matters for transparency and for lower-resource languages. That choice is reasonable and practical.

The problems are straightforward and central. The abstract states superior results without any numbers, tables, or statistical tests. The stress-test note is correct: there is no sign of ablations that turn off consensus or hierarchy while keeping the model set fixed, so any reported edge could come from generic ensembling, prompt length, or other uncontrolled factors. Without those controls the causal claim about the human-emulation structure does not hold up.

The paper is aimed at people who build practical disinformation tools and want open-model options. A reader looking for reproducible empirical work or a clear advance in agentic methods will not get enough to use or extend.

I would not bring this to a reading group or cite it. It does not deserve peer review in its current state because the evaluation evidence is missing. Recommendation: desk reject unless a revision supplies the actual results and the required ablations.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a multi-agent system using open-source LLMs (LLaMA, Kimi, Qwen, Deepseek, LLaMA-Nemotron) to detect disinformation by emulating human annotators via consensus mechanisms, cognitive and knowledge diversity, and hierarchical structure. It evaluates the approach on three tasks—direct disinformation detection, identification of texts worthy of verification, and detection of verifiable factual claims—across English (high-resource), Polish (medium-resource), Slovak and Bulgarian (low-resource) datasets, claiming superior performance to individual models including GPT-4 and GPT-3.5 while emphasizing transparency.

Significance. If substantiated with quantitative evidence and controls, the work could advance transparent, open-source tools for disinformation mitigation in multilingual settings, including low-resource languages. The multi-agent emulation of human processes offers a potentially interesting direction for robustness, though its advantages require verification against simpler baselines.

major comments (2)

[Abstract] The assertion that the method 'achieves superior results compared to individual Large Language Models (LLMs), including GPT 4 and GPT 3.5' is made without any reported metrics, baselines, statistical tests, dataset statistics, or effect sizes. This directly prevents assessment of the central empirical claim.
[Evaluation] No ablation studies are described that isolate the effects of the consensus mechanism, cognitive/knowledge diversity, or hierarchical structure (e.g., by removing one while holding model set and prompting fixed). Without them, gains cannot be attributed to the claimed human-emulation design rather than generic ensembling or prompt variations, weakening the causal link in the main contribution.
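The ablation design requested in the second major comment can be sketched as a configuration grid. This is a hedged illustration only: the component names mirror the three mechanisms named in the report, while `ablation_grid` and `leave_one_out` are hypothetical helpers, and the model set and prompts are assumed to be held fixed outside this snippet.

```python
from itertools import product
from typing import Dict, List

# The three mechanisms the referee asks to isolate.
COMPONENTS = ("consensus", "diversity", "hierarchy")


def ablation_grid() -> List[Dict[str, bool]]:
    """Every on/off combination of the components (full factorial design)."""
    return [dict(zip(COMPONENTS, flags))
            for flags in product((True, False), repeat=len(COMPONENTS))]


def leave_one_out() -> List[Dict[str, bool]]:
    """Cheaper design: the full system with exactly one component disabled."""
    return [{name: name != dropped for name in COMPONENTS} for dropped in COMPONENTS]


grid = ablation_grid()
loo = leave_one_out()
```

The all-off configuration in the grid corresponds to a plain ensemble without the human-emulation mechanisms, which is the baseline needed to separate their effect from generic ensembling.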

minor comments (2)

The distinction between 'diversity in cognition' and 'diversity in knowledge' is mentioned but not operationalized with concrete implementation details or examples; this should be clarified in the methods for reproducibility.
Specify exact model versions, sizes, and prompting templates used for the open-source LLMs to support replication.
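The replication details requested above could be recorded in a small machine-readable manifest. The sketch below is one possible shape, not the authors' format: the `AgentSpec` class, its field names, and the placeholder values are all hypothetical, and no actual model version or size from the paper is implied.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AgentSpec:
    role: str             # e.g. "leader" or "annotator"
    model_id: str         # exact checkpoint identifier (placeholder below)
    parameters: str       # model size as stated by the provider (placeholder below)
    prompt_template: str  # full prompt text with a {text} slot
    temperature: float    # decoding temperature used in the experiments


# Placeholder values only; real entries would come from the experiment logs.
spec = AgentSpec(
    role="annotator",
    model_id="<exact-model-id>",
    parameters="<size>",
    prompt_template="Label the following text: {text}",
    temperature=0.0,
)

manifest = json.dumps(asdict(spec), sort_keys=True, indent=2)
restored = AgentSpec(**json.loads(manifest))
```

Publishing one such record per agent alongside the results would let a reader rebuild each agent's configuration without guessing at prompts or decoding settings.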

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which highlights opportunities to strengthen the empirical presentation and causal attribution in our work. We address each major comment below and commit to revisions that improve clarity and rigor without altering the core claims.

Referee: [Abstract] The assertion that the method 'achieves superior results compared to individual Large Language Models (LLMs), including GPT 4 and GPT 3.5' is made without any reported metrics, baselines, statistical tests, dataset statistics, or effect sizes. This directly prevents assessment of the central empirical claim.

Authors: We agree that the abstract should be self-contained and include quantitative support for the central claim. The body of the manuscript reports full metrics, baselines (including GPT-4 and GPT-3.5), statistical tests, dataset sizes, and effect sizes across the four languages and three tasks. In revision we will condense the key results (e.g., accuracy/F1 gains, p-values, and dataset statistics) into the abstract so readers can immediately evaluate the claim.
Revision: yes

Referee: [Evaluation] No ablation studies are described that isolate the effects of the consensus mechanism, cognitive/knowledge diversity, or hierarchical structure (e.g., by removing one while holding model set and prompting fixed). Without them, gains cannot be attributed to the claimed human-emulation design rather than generic ensembling or prompt variations, weakening the causal link in the main contribution.

Authors: The referee correctly notes the absence of component-wise ablations. The current experiments compare the full multi-agent system against single models and simpler ensembles, but do not systematically disable consensus, diversity, or hierarchy while holding the model set and prompting constant. We will add these ablation experiments in the revised manuscript to quantify the incremental contribution of each design element and thereby strengthen the causal link to the human-emulation approach.
Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical system description without derivations or self-referential fits

The paper describes an empirical multi-agent LLM architecture for disinformation detection tasks across languages, claiming performance gains from consensus, cognitive/knowledge diversity, and hierarchy. No equations, fitted parameters, predictions, or first-principles derivations appear in the provided text. Claims rest on experimental comparisons to single models rather than any self-definitional loop, fitted-input renaming, or load-bearing self-citation chain. The central mechanisms are presented as design choices inspired by human annotators, not as outputs forced by the evaluation results themselves. This is a standard empirical systems paper with independent experimental content.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the unverified assumption that the described multi-agent mechanisms emulate human annotators effectively; no free parameters, invented entities, or additional axioms are extractable from the abstract alone.

axioms (1)

domain assumption: Consensus, diversity in cognition/knowledge, and hierarchical structure in multi-agent LLMs emulate human annotator decision-making for disinformation detection. Invoked as the basis for the system's design and claimed superiority.

pith-pipeline@v0.9.1-grok · 5762 in / 1186 out tokens · 25376 ms · 2026-06-30T06:24:11.503686+00:00 · methodology

Reference graph

Works this paper leans on

[1] Alam, F., Barrón-Cedeño, A., Cheema, G.S., Hakimov, S., et al.: Overview of the CLEF-2023 CheckThat! lab task 1 on check-worthiness in multimodal and multigenre content. In: Working Notes of CLEF 2023 - Conference and Labs of the Evaluation Forum. CLEF ’2023, Thessaloniki, Greece (2023)
[2] Alam, F., Shaar, S., Dalvi, F., Sajjad, H., et al.: Fighting the COVID-19 infodemic: Modeling the perspective of journalists, fact-checkers, social media platforms, policy makers, and the society. In: Moens, M.F., Huang, X., Specia, L., Yih, S.W.t. (eds.) Findings of the Association for Computational Linguistics: EMNLP 2021. pp. 611–649. Association for...
[3] Avram, A.A., Groza, A., Lecu, A.: MCP-Orchestrated multi-agent system for automated disinformation detection (2025), https://arxiv.org/abs/2508.10143v1, accessed: 2026-02-26
[4] Bercovich, A., Levy, I., Golan, I., Dabbah, M., et al.: Llama-nemotron: Efficient reasoning models (2025), https://arxiv.org/abs/2505.00949
[5] Bethany, M., Vishwamitra, N., Chiang, C.Y.J., Najafirad, P.: Camouflage: Exploiting misinformation detection systems through llm-driven adversarial claim transformation (2025), https://arxiv.org/abs/2505.01900
[6] Bisong, E.: Google colaboratory. In: Building Machine Learning and Deep Learning Models on Google Cloud Platform: A Comprehensive Guide for Beginners. pp. 59–. Apress, Berkeley, CA (2019). https://doi.org/10.1007/978-1-4842-4470-8_7
[7] Cui, Z., Huang, T., Chiang, C.E., Du, C.: Toward verifiable misinformation detection: A multi-tool LLM agent framework. In: Proceedings of the 2025 International Conference on Generative Artificial Intelligence for Business. pp. 179–185. ACM, New York, NY, USA (2025). https://doi.org/10.1145/3766918.3766948
[8] Das, S., Kumari, R., Singh, R.K.: Detection of fake news by convolutional neural networks and recurrent neural networks. Procedia Computer Science 258, 2278–2289 (2025). https://doi.org/10.1016/j.procs.2025.04.481
[9] DeepSeek-AI, Liu, A., Feng, B., Xue, B., et al.: Deepseek-v3 technical report (2025), https://arxiv.org/abs/2412.19437
[10] Demagog.sk: Demagog.sk, https://demagog.sk, accessed: March 07, 2026
[11] Fakty, A.: AFP Fakty, https://fakty.afp.com, accessed: March 07, 2026
[12] Fenza, G., Furno, D., Loia, V., Trotta, P.: Multi-LLM agents architecture for claim verification (2024)
[13] Grattafiori, A., Dubey, A., Jauhri, A., Pandey, A., et al.: The llama 3 herd of models (2024), https://arxiv.org/abs/2407.21783
[14] Hoa, T.T., Duy, T.Q., Tran, K.Q., Nguyen, K.V.: Vifactcheck: A new benchmark dataset and methods for multi-domain news fact-checking in vietnamese (2024), https://arxiv.org/abs/2412.15308
[15] Imbwaga, J.L., Chittaragi, N., Koolagudi, S.G.: Fake news detection using machine learning algorithms. In: Proceedings of the 2022 Fourteenth International Conference on Contemporary Computing (aug 2022). https://doi.org/10.1145/3549206.3549256
[16] Kula, S., Gregor, M.: Multilingual models for check-worthy social media posts detection (2024), https://arxiv.org/abs/2408.06737
[17] Kula, S., Kozik, R., Choraś, M., Woźniak, M.: Transformer based models in fake news detection. In: Computational Science – ICCS 2021. Lecture Notes in Computer Science, vol. 12747, pp. 28–38. Springer International Publishing, Cham (2021). https://doi.org/10.1007/978-3-030-77970-2_3
[18] Lakara, K., Channing, G., Rupprecht, C., Sock, J., et al.: MAD-Sherlock: Multi-agent debate for visual misinformation detection (2024), https://arxiv.org/abs/2410.20140
[19] Li, X., Zhang, Y., Malthouse, E.C.: Large language model agent for fake news detection (2024), https://arxiv.org/abs/2405.01593
[20] Liu, Y., Liu, Y., Zhang, X., Chen, X., Yan, R.: The truth becomes clearer through debate! Multi-Agent systems with large language models unmask fake news (2025), https://arxiv.org/abs/2505.08532v1, accessed: 2026-02-26
[21] META: The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation, https://ai.meta.com/blog/llama-4-multimodal-intelligence/, accessed: March 07, 2026
[22] Modzelewski, A., Da San Martino, G., Savov, P., Wilczyńska, M.A., Wierzbicki, A.: MIPD: Exploring manipulation and intention in a novel corpus of Polish disinformation. In: Al-Onaizan, Y., Bansal, M., Chen, Y.N. (eds.) Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. pp. 19769–19785. Association for Computation... doi:10.18653/v1/2024.emnlp-main.1103
[23] Nakov, P., Barrón-Cedeño, A., Martino, G.D.S., Alam, F., et al.: Overview of the CLEF-2022 checkthat! lab task 1 on identifying relevant claims in tweets. In: Faggioli, G., Ferro, N., Hanbury, A., Potthast, M. (eds.) Proceedings of the Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum, Bologna, Italy, September 5th - to - 8th, 2022....
[24] NVIDIA Corporation: NIM Platform (2023), https://developer.nvidia.com/nim
[25] OpenAI, Achiam, J., Adler, S., Agarwal, S., et al.: Gpt-4 technical report (2024), https://arxiv.org/abs/2303.08774
[26] Repede, S.E., Brad, R.: LLaMA 3 vs. State-of-the-Art Large Language Models: Performance in Detecting Nuanced Fake News. Computers 13(11), 292 (2024). https://doi.org/10.3390/computers13110292
[27] Sawiński, M., Węcel, K., Księżniak, E., Stróżyna, M., et al.: Openfact at checkthat! 2023: Head-to-head gpt vs. bert - a comparative study of transformers language models for the detection of check-worthy claims. CLEF 2023: Conference and Labs of the Evaluation Forum (2023),
[28] Srivastava, A.K., Reddy, L.A.: Detection of fake news using logistic regression, decision tree, random forest, and gradient boosting algorithms. In: Lecture Notes in Networks and Systems. pp. 317–325. Springer (jan 2025). https://doi.org/10.1007/978-981-96-5238-9_28
[29] Sudhakar, M., Kaliyamurthie, K.P.: Detection of fake news from social media using support vector machine learning algorithms. Measurement: Sensors 32, 101028 (2024). https://doi.org/10.1016/j.measen.2024.101028
[30] Team, K., Bai, Y., Bao, Y., Chen, G., et al.: Kimi k2: Open agentic intelligence (2025), https://arxiv.org/abs/2507.20534
[31] Wu, Q., Bansal, G., Zhang, J., Wu, Y., et al.: Autogen: Enabling next-gen llm applications via multi-agent conversation (2023), https://arxiv.org/abs/2308.08155
[32] Yang, A., Li, A., Yang, B., Zhang, B., et al.: Qwen3 technical report (2025), https://arxiv.org/abs/2505.09388
