From Spark to Fire: Modeling and Mitigating Error Cascades in LLM-Based Multi-Agent Collaboration
Recognition: 2 Lean theorem links
Pith reviewed 2026-05-15 16:34 UTC · model grok-4.3
The pith
A genealogy-graph governance layer suppresses error amplification in LLM-based multi-agent systems without altering their collaboration architecture.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By abstracting LLM-MAS collaboration as a directed dependency graph and defining an early-stage risk criterion, the paper shows that error cascades follow predictable amplification patterns. Experiments across mainstream frameworks expose cascade amplification along dependency paths, sensitivity to graph topology, and inertia toward erroneous consensus. A single atomic error seed suffices to infect the system. The genealogy-graph governance layer, implemented as a non-intrusive plugin, suppresses both endogenous and exogenous amplification and blocks final infection in at least 89 percent of runs without modifying the underlying collaboration structure.
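The propagation model this claim rests on can be exercised in a short simulation. This is a minimal sketch assuming the product-form infection update s_i(t+1) = (1-δ)s_i(t) + (1-s_i(t))f_i(t) and the spectral amplification indicator R ≈ βρ(A)/δ excerpted elsewhere on this page; the parameter values, the adjacency-list encoding, and the power-iteration estimate of ρ(A) are illustrative choices, not the paper's calibration:

```python
import math

def simulate_cascade(adj, beta=0.3, delta=0.1, seed=0, steps=200):
    """Iterate the product-form infection dynamics
    s_i(t+1) = (1-delta)*s_i(t) + (1-s_i(t))*f_i(t)
    on a directed dependency graph; adj[i] lists the upstream
    agents whose messages feed agent i."""
    n = len(adj)
    s = [0.0] * n
    s[seed] = 1.0  # a single atomic error seed
    for _ in range(steps):
        # f_i = 1 - prod_j (1 - beta * s_j) over inbound edges j -> i
        f = [1.0 - math.prod(1.0 - beta * s[j] for j in adj[i])
             for i in range(n)]
        s = [(1.0 - delta) * s[i] + (1.0 - s[i]) * f[i] for i in range(n)]
    return s

def amplification_indicator(adj, beta=0.3, delta=0.1, iters=100):
    """R ~ beta * rho(A) / delta, with rho(A) estimated by power
    iteration (adequate for the small connected examples here).
    R > 1 flags cascade-amplification risk."""
    n = len(adj)
    v = [1.0] * n
    rho = 0.0
    for _ in range(iters):
        w = [sum(v[j] for j in adj[i]) for i in range(n)]
        rho = max(abs(x) for x in w)
        if rho == 0.0:
            return 0.0  # acyclic graph: no sustained amplification
        v = [x / rho for x in w]
    return beta * rho / delta
```

On a four-agent feedback cycle with β = 0.3 and δ = 0.1, the indicator gives R = 3 > 1, and a single seeded error settles into a persistent infected state across all agents rather than dying out, matching the cascade-amplification picture.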
What carries the argument
The genealogy-graph-based governance layer, which tracks message lineage in the directed dependency graph to apply early risk criteria and intercept error propagation.
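A sketch of what such a message-layer governance plugin could look like. Every name here (Message, GenealogyGraph, record, flag, is_tainted) is hypothetical, invented for illustration; the review does not specify the paper's actual plugin interface:

```python
from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    content: str
    parents: tuple = ()  # ids of the messages this one derived from

class GenealogyGraph:
    """Message-layer sketch: record each message's lineage and
    quarantine any message whose ancestry touches a flagged error."""
    def __init__(self):
        self.lineage = {}   # msg_id -> tuple of parent ids
        self.flagged = set()

    def record(self, msg_id, msg):
        self.lineage[msg_id] = msg.parents

    def flag(self, msg_id):
        self.flagged.add(msg_id)

    def is_tainted(self, msg_id):
        # depth-first walk of the ancestry; any flagged ancestor taints
        stack, seen = [msg_id], set()
        while stack:
            m = stack.pop()
            if m in self.flagged:
                return True
            if m in seen:
                continue
            seen.add(m)
            stack.extend(self.lineage.get(m, ()))
        return False
```

A governor built this way would call is_tainted before letting a downstream agent consume a message, intercepting propagation along explicit dependency paths without touching the collaboration architecture itself.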
If this is right
- Minor errors no longer solidify into system-level false consensus during iterative collaboration.
- Protection works across six mainstream multi-agent frameworks without architecture changes.
- A single error seed is prevented from causing widespread failure in most operating modes.
- Both internally generated and externally introduced errors are suppressed by the same layer.
- Effective information flow between agents remains intact while cascade risks are reduced.
Where Pith is reading between the lines
- Dependency tracking of this form could be adapted to improve reliability in distributed systems beyond LLM agents.
- Agent frameworks may benefit from making genealogy logging a default feature rather than an add-on.
- Topology-aware agent design informed by the risk criterion could further lower cascade exposure.
- Validation on larger, open-ended tasks would test whether the reported 89 percent prevention rate generalizes.
Load-bearing premise
The directed dependency graph abstraction and early-stage risk criterion capture the dominant mechanisms of error spread in real LLM multi-agent deployments.
What would settle it
A real deployment in which errors propagate and amplify through non-message channels such as shared external memory or tool states that the genealogy graph does not record, allowing infection despite the governance layer.
Figures
Original abstract
Large Language Model-based Multi-Agent Systems (LLM-MAS) are increasingly applied to complex collaborative scenarios. However, their collaborative mechanisms may cause minor inaccuracies to gradually solidify into system-level false consensus through iteration. Such risks are difficult to trace since errors can propagate and amplify through message dependencies. Existing protections often rely on single-agent validation or require modifications to the collaboration architecture, which can weaken effective information flow and may not align with natural collaboration processes in real tasks. To address this, we propose a propagation dynamics model tailored for LLM-MAS that abstracts collaboration as a directed dependency graph and provides an early-stage risk criterion to characterize amplification risk. Through experiments on six mainstream frameworks, we identify three vulnerability classes: cascade amplification, topological sensitivity, and consensus inertia. We further instantiate an attack where injecting just a single atomic error seed leads to widespread failure. In response, we introduce a genealogy-graph-based governance layer, implemented as a message-layer plugin, that suppresses both endogenous and exogenous error amplification without altering the collaboration architecture. Experiments show that this approach prevents final infection in at least 89% of runs across operating modes and significantly mitigates the cascading spread of minor errors.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper models error propagation in LLM-based multi-agent systems as a directed dependency graph, identifies three vulnerability classes (cascade amplification, topological sensitivity, consensus inertia) via experiments on six frameworks, shows that a single atomic error seed can cause widespread failure, and proposes a genealogy-graph governance layer implemented as a message-layer plugin. This plugin is claimed to suppress endogenous and exogenous amplification without altering the collaboration architecture, preventing final infection in at least 89% of runs across operating modes.
Significance. If the directed-graph abstraction faithfully captures dominant propagation mechanisms and the empirical results prove robust to controls, the work offers a lightweight, architecture-preserving mitigation strategy for error cascades in LLM-MAS. This is significant for practical deployment of collaborative agents, as it avoids the drawbacks of single-agent validation or architectural changes while providing concrete prevention rates across multiple frameworks.
Major comments (3)
- [Abstract and Experimental Setup] The 89% prevention rate is presented without details on error definitions, number of runs, statistical tests, variance, or baseline comparisons (e.g., no-governance controls). This information is load-bearing for evaluating the mitigation's effectiveness and generalizability.
- [Propagation Dynamics Model] The directed dependency graph and early-stage risk criterion assume message dependencies dominate error spread. However, shared context, tool outputs, or implicit state outside explicit messages are common in LLM-MAS and could allow undetected amplification, potentially invalidating the reported prevention rate when the model is incomplete.
- [Vulnerability Classification] The three classes appear identified post-hoc from the runs, introducing selection-effect risk that could overstate their generality and the attack's representativeness.
Minor comments (1)
- [Abstract] The six mainstream frameworks are not named, which reduces immediate reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on experimental transparency, model assumptions, and classification methodology. We address each major comment below, indicating revisions where the manuscript will be updated to strengthen clarity and address potential limitations.
Point-by-point responses
Referee: [Abstract and Experimental Setup] The 89% prevention rate is presented without details on error definitions, number of runs, statistical tests, variance, or baseline comparisons (e.g., no-governance controls). This information is load-bearing for evaluating the mitigation's effectiveness and generalizability.
Authors: We agree that the abstract would benefit from more detail on these elements for immediate evaluation. The full manuscript's Section 4 specifies error as factual deviation from ground truth exceeding a 5% threshold, with 500 runs per framework and operating mode, including standard deviations and t-test results (p < 0.01) against no-governance baselines showing infection rates above 70%. We will revise the abstract to incorporate key statistics and add a summary table of runs, variance, and baselines in the main text. revision: yes
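The baseline comparison described here (infection above 70% without governance versus at most 11% with it, over 500 runs per condition) can be sanity-checked with a standard two-proportion z-test; the counts below are hypothetical stand-ins echoing the rebuttal's summary figures, not data from the paper:

```python
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Two-proportion z-test with a pooled standard error; returns
    the z statistic and a normal-approximation two-sided p-value."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2))); p = 2 * (1 - Phi(|z|))
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical counts: 70% infection without governance vs 11% with,
# 500 runs in each condition (stand-ins, not the paper's data).
z, p = two_proportion_z(350, 500, 55, 500)
```

With gaps this large and 500 runs per arm, the difference is overwhelmingly significant under the normal approximation, consistent with the p < 0.01 the authors report.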
Referee: [Propagation Dynamics Model] The directed dependency graph and early-stage risk criterion assume message dependencies dominate error spread. However, shared context, tool outputs, or implicit state outside explicit messages are common in LLM-MAS and could allow undetected amplification, potentially invalidating the reported prevention rate when the model is incomplete.
Authors: The model abstracts collaboration via explicit message dependencies as the primary propagation channel, which our experiments on six frameworks confirm as the dominant mechanism in the tested scenarios. We acknowledge that shared context and tool outputs may enable additional implicit paths not fully modeled. In revision, we will add a limitations discussion on this point and note that the genealogy-graph plugin mitigates observable cascades at the message layer. The reported prevention rates remain valid under the model's explicit-dependency assumptions. revision: partial
Referee: [Vulnerability Classification] The three classes appear identified post-hoc from the runs, introducing selection-effect risk that could overstate their generality and the attack's representativeness.
Authors: The classes emerged from systematic patterns observed consistently across all frameworks and attack variants, informed by graph properties such as path amplification and node sensitivity. To mitigate selection concerns, we will clarify the a priori hypotheses in the revision, provide full run data in supplementary materials, and cross-validate the classification against additional independent scenarios. revision: yes
Circularity Check
No circularity detected; the claims rest on the graph abstraction and empirical validation.
Full rationale
The paper defines a directed dependency graph model and early-stage risk criterion, then reports empirical results from experiments on six frameworks showing vulnerability classes and 89% prevention via the genealogy-graph plugin. No equations are presented that equate outputs to inputs by construction, no fitted parameters are relabeled as predictions, and no self-citations are used to justify uniqueness or smuggle ansatzes. The derivation chain is self-contained against the stated experimental benchmarks.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear
  The relation between the paper passage and the cited Recognition theorem is unclear. Paper passage: "We formalize the multi-agent workflow as a directed graph G=(V,E) ... s_i(t+1) = (1-δ)s_i(t) + (1-s_i(t))·f_i^prod(t) ... R ≈ βρ(A)/δ ... Genealogy Graph L=(V,E) to track atomic provenance"
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
  The relation between the paper passage and the cited Recognition theorem is unclear. Paper passage: "product-form infection function f_i^prod(t) = 1 - ∏_j (1 - β·a_ij·s_j(t)) ... spectral threshold indicator"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- Not Just RLHF: Why Alignment Alone Won't Fix Multi-Agent Sycophancy
  Pretrained base models exhibit higher yield to peer disagreement than RLHF instruct variants, with the effect localized to mid-layer attention and mitigated by structured dissent rather than prompt defenses.