When Latent Agents Lie: KV-Cache Integrity in Multi-Agent LLM Collaboration

Carlos Baquero; Luís Brito

arxiv: 2606.28958 · v1 · pith:S3VT5IDBnew · submitted 2026-06-27 · 💻 cs.MA


Pith reviewed 2026-06-30 08:24 UTC · model grok-4.3


keywords multi-agent LLMs · KV-cache · latent collaboration · integrity verification · HMAC manifest · security · question answering


The pith

Tampering with hidden KV-cache state can degrade multi-agent LLM answers even when visible commitments look plausible.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

In multi-agent question-answering systems, specialist agents share both short visible commitments and their full KV-cache states with a coordinator model. This latent sharing improves exact-match and F1 scores over text-only collaboration on benchmarks such as HiddenBench and HotPotQA. A malicious specialist can alter the hidden KV state to reduce final performance while leaving the visible commitment unchanged and plausible. Text verifiers miss the attack, and simple magnitude checks on the state can be evaded by adaptive tampering. An HMAC-SHA256 manifest that binds specialist, session, model, visible commitment, tensor metadata, and payload digest accepts all honest payloads and rejects all recorded tampered ones.

Core claim

Specialists each see part of the evidence, send a short commitment, and pass full KV-cache state to a coordinator. In clean runs this latent collaboration improves over a matched text-only version, reaching EM/F1 of 0.338/0.486 versus 0.231/0.369 on transformed HiddenBench with Qwen3-4B. When one specialist is malicious, changing the hidden KV state collapses performance even when the visible commitment still looks plausible. A verifier that checks only text misses this failure mode. Simple magnitude checks catch some corruptions but adaptive attacks evade them. The most reliable fix is an HMAC-SHA256 manifest that binds the specialist, session, model, visible commitment, tensor metadata, and payload digest.

What carries the argument

HMAC-SHA256 manifest that binds specialist identity, session, model, visible commitment, tensor metadata, and payload digest to protect KV-cache during transport.
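A minimal sketch of how such a manifest could be built and checked, using only Python's standard `hmac`, `hashlib`, and `json` modules. The field names, the canonical-JSON encoding, and the helper names `sign_manifest` and `verify_manifest` are assumptions made for illustration; the paper's actual wire format is not described in this summary.

```python
import hashlib
import hmac
import json


def payload_digest(payload: bytes) -> str:
    """SHA-256 digest of the serialized KV-cache bytes."""
    return hashlib.sha256(payload).hexdigest()


def _canonical(manifest: dict) -> bytes:
    # Sorted keys and fixed separators so signer and verifier serialize identically.
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()


def sign_manifest(key: bytes, fields: dict, payload: bytes) -> dict:
    """Bind the metadata fields and the payload digest under one HMAC tag."""
    manifest = dict(fields, payload_sha256=payload_digest(payload))
    tag = hmac.new(key, _canonical(manifest), hashlib.sha256).hexdigest()
    return {"manifest": manifest, "tag": tag}


def verify_manifest(key: bytes, signed: dict, payload: bytes, commitment: str) -> bool:
    """Accept only if the tag, the payload digest, and the commitment all match."""
    manifest = signed["manifest"]
    expected = hmac.new(key, _canonical(manifest), hashlib.sha256).hexdigest()
    tag_ok = hmac.compare_digest(expected, signed["tag"])
    digest_ok = hmac.compare_digest(manifest["payload_sha256"], payload_digest(payload))
    commitment_ok = manifest.get("visible_commitment") == commitment
    return tag_ok and digest_ok and commitment_ok


# Hypothetical values, for illustration only.
key = b"demo-key-not-for-production"
fields = {
    "specialist": "specialist-1",
    "session": "session-42",
    "model": "Qwen3-4B",
    "visible_commitment": "The answer is Paris.",
    "tensor_meta": {"dtype": "float16", "shape": [2, 4]},
}
payload = bytes(range(16))  # stand-in for serialized KV tensors

signed = sign_manifest(key, fields, payload)
honest_ok = verify_manifest(key, signed, payload, fields["visible_commitment"])

# Flip one bit of the hidden payload while leaving the visible commitment alone.
tampered = bytes([payload[0] ^ 1]) + payload[1:]
tampered_ok = verify_manifest(key, signed, tampered, fields["visible_commitment"])
```

In this sketch the honest payload verifies and the bit-flipped one does not, because the digest stored in the manifest no longer matches the received bytes. A shared-key HMAC only helps when the party altering the payload does not hold the signing key.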

If this is right

Full-KV latent memory can improve multi-agent collaboration but must be treated as a security-sensitive object.
Visible text commitments alone cannot verify the integrity of shared hidden state.
Adaptive attacks can evade magnitude-based checks on KV tensors while still damaging answers.
Cryptographic binding of KV state to visible commitments preserves performance gains from latent sharing.
KV-cache exchanged between agents should be protected in transport rather than inspected after receipt.
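The gap between magnitude checks and digest checks can be illustrated with a toy example. The sketch below is not one of the paper's attacks: the tolerance, the vector values, and the helper names are hypothetical, and a short list of floats stands in for a KV tensor. It shows only that a norm-preserving change passes a norm-based check while still changing the bytes that a digest covers.

```python
import hashlib
import math
import struct


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def magnitude_check(values, reference_norm, tolerance=0.05):
    """Accept if the L2 norm is within a relative tolerance of the reference."""
    return abs(l2_norm(values) - reference_norm) <= tolerance * reference_norm


def digest(values):
    """SHA-256 over the little-endian float64 encoding of the values."""
    return hashlib.sha256(struct.pack(f"<{len(values)}d", *values)).hexdigest()


honest = [0.5, -1.25, 2.0, 0.75, -0.1, 1.5]  # toy stand-in for KV entries
reference = l2_norm(honest)

# Crude corruption: scaling inflates the norm, so the check rejects it.
crude = [v * 10.0 for v in honest]

# Norm-preserving corruption: reverse the order and flip every sign.
adaptive = [-v for v in reversed(honest)]

crude_passes = magnitude_check(crude, reference)
adaptive_passes = magnitude_check(adaptive, reference)
digest_changed = digest(adaptive) != digest(honest)
```

Here the scaled vector fails the magnitude check, the reordered vector passes it, and the SHA-256 digest of the reordered vector differs from the honest one.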

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Similar integrity mechanisms could apply to other forms of hidden state exchange in distributed LLM systems.
The measured performance lift from latent sharing suggests that secure KV protocols may be worth adopting in production agent frameworks.
Standardized manifests for model state could reduce the attack surface when multiple models exchange internal representations.

Load-bearing premise

The 295 recorded tampered payloads and the adaptive attacks tested represent realistic threats that could be mounted against deployed multi-agent LLM systems.

What would settle it

An adaptive attack that modifies KV-cache content, changes the coordinator's answer, and still produces a payload accepted by the HMAC manifest.

Figures

Figures reproduced from arXiv: 2606.28958 by Carlos Baquero, Luís Brito.

**Figure 2.** Exact match on the full 65-record HiddenBench

Original abstract

LLM agents can share more than text. In some systems, an agent can send a short visible message while also passing its full KV-cache state to another model. This hidden state can help the final model combine evidence from several agents, but it is also hard to inspect. A visible message may look harmless even if the hidden state has been changed. We study this problem in a multi-agent question-answering setup. Specialists each see part of the evidence, send a short commitment, and pass full KV-cache state to a coordinator. In clean runs, this latent collaboration improves over a matched text-only version. On transformed HiddenBench with Qwen3-4B, it reaches EM/F1 of 0.338/0.486, compared with 0.231/0.369 for text collaboration. Qwen3-8B and HotPotQA runs show the same direction of improvement. The problem appears when one specialist is malicious. Some false visible commitments can steer answers. More seriously, changing the hidden KV state can collapse performance even when the visible commitment still looks plausible. A verifier that checks only text misses this failure mode. Simple magnitude checks catch some obvious corruptions, but adaptive attacks can evade them while still damaging the final answer. The most reliable fix we find is not to guess whether hidden state looks normal, but to protect it in transport. We implement an HMAC-SHA256 manifest that binds the specialist, session, model, visible commitment, tensor metadata, and payload digest. It accepts all 774 honest replayed payloads and rejects all 295 recorded tampered payloads. The main lesson is that full-KV latent memory can be useful, but it should be treated as a security-sensitive object, not as ordinary internal model state.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper shows KV-cache tampering can hurt multi-agent answers without touching visible text, and their HMAC manifest catches the recorded attacks.


The paper shows that in multi-agent LLM setups, an attacker can change the hidden KV-cache to degrade the coordinator's answer even when the visible commitment looks fine. Their HMAC-SHA256 manifest, which binds specialist, session, model, visible commitment, tensor metadata, and payload digest, accepts all 774 honest replayed payloads and rejects all 295 recorded tampered ones.

What is new is isolating this latent attack surface from ordinary text tampering and showing that full KV state can improve over text-only collaboration in clean runs. On transformed HiddenBench with Qwen3-4B they report EM/F1 of 0.338/0.486 versus 0.231/0.369 for the text baseline, with similar direction on Qwen3-8B and HotPotQA. The mitigation is straightforward transport protection rather than trying to inspect or guess normal hidden state.
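For readers unfamiliar with the EM/F1 pair, the sketch below shows the SQuAD-style scoring that is common in question-answering work. This summary does not say which scorer or normalization the authors use, so the lowercase, punctuation-stripping, and article-dropping steps here are an assumption, and the strings are made-up examples.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(gold)
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


em_example = exact_match("The Eiffel Tower.", "eiffel tower")
f1_example = token_f1("Eiffel Tower in Paris", "Eiffel Tower")
```

F1 gives partial credit for overlapping tokens, which is why it sits above EM in every reported pair.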

The work is grounded in direct empirical tests rather than fitted models, which keeps the circularity burden low. The numbers on honest versus tampered cases are exact and the direction of the clean-run improvement is consistent across the reported settings.

The soft spot is coverage of the threat model. The central claim that the manifest is the most reliable fix rests on the 295 recorded tampered payloads being representative. The abstract notes that adaptive attacks can evade magnitude checks, but without details on how those 295 examples were generated or whether they include attempts to satisfy the manifest itself, it is hard to know how far the protection extends. No error bars or statistical tests are mentioned.

This is for people building or securing multi-agent LLM systems. A reader working on agent integrity or KV-cache sharing would get a clear empirical demonstration and a practical binding approach. It deserves peer review because it flags a plausible attack surface with supporting counts, even if more diverse attack testing would strengthen the case.

Referee Report

1 major / 1 minor

Summary. The paper examines security risks in multi-agent LLM systems where agents exchange full KV-cache states alongside short visible commitments in a question-answering setup. It reports that latent KV sharing improves exact match and F1 scores over text-only collaboration (e.g., 0.338/0.486 vs. 0.231/0.369 on HiddenBench with Qwen3-4B). It then demonstrates that tampering the hidden KV state can degrade coordinator performance even when the visible commitment appears plausible, that magnitude-based checks are evadable by adaptive attacks, and that an HMAC-SHA256 manifest binding specialist, session, model, commitment, tensor metadata, and payload digest perfectly separates 774 honest replayed payloads from 295 recorded tampered ones. The central recommendation is to treat KV-cache payloads as security-sensitive objects requiring cryptographic transport protection rather than relying on post-hoc inspection.

Significance. If the empirical separation holds under a representative threat model, the work is significant for highlighting an under-explored attack vector in latent multi-agent collaboration and for supplying a concrete, implementable cryptographic countermeasure. The exact acceptance/rejection counts and performance deltas provide clear, falsifiable evidence; the absence of free parameters or fitted models in the HMAC construction is a strength. The result bears on the design of any system passing internal model state between agents.

major comments (1)

[Abstract, attack evaluation paragraph] The claim that the HMAC-SHA256 manifest is the most reliable fix rests on its acceptance of all 774 honest payloads and rejection of all 295 recorded tampered payloads. The manuscript provides no evidence that these 295 examples adequately sample sophisticated adaptive KV-cache modifications that preserve the visible commitment while still damaging answers, leaving the superiority over inspection methods dependent on an unverified representativeness assumption.

minor comments (1)

[Abstract] The reported EM/F1 improvements are given without error bars, dataset transformation details, or statistical tests, which would make the utility claim more robust even if not central to the security argument.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and valuable feedback on our manuscript. We address the major comment point by point below.


Referee: [Abstract, attack evaluation paragraph] The claim that the HMAC-SHA256 manifest is the most reliable fix rests on its acceptance of all 774 honest payloads and rejection of all 295 recorded tampered payloads. The manuscript provides no evidence that these 295 examples adequately sample sophisticated adaptive KV-cache modifications that preserve the visible commitment while still damaging answers, leaving the superiority over inspection methods dependent on an unverified representativeness assumption.

Authors: We appreciate the referee pointing out this limitation in our evaluation. The 295 tampered payloads were produced by the adaptive attacks we implemented, which successfully evaded magnitude-based detection while degrading coordinator performance. We acknowledge that this finite set does not represent all conceivable sophisticated modifications that could preserve the visible commitment. However, the strength of the HMAC-SHA256 manifest lies in its cryptographic properties rather than empirical coverage: it binds the payload digest, so any change to the KV-cache alters the digest and invalidates the HMAC (provided the key remains secret). Therefore, the detection capability does not depend on the representativeness of the 295 examples. We will revise the abstract and the relevant evaluation paragraph to clarify this point, explicitly distinguishing the cryptographic guarantee from the empirical results and acknowledging the scope of the tested attacks.

Revision: yes

Circularity Check

0 steps flagged

No circularity; results are direct empirical measurements on recorded payloads


The paper reports an empirical evaluation: an HMAC-SHA256 manifest is implemented and tested on 774 honest replayed payloads (all accepted) and 295 recorded tampered payloads (all rejected). No derivation chain, equations, fitted parameters, or self-citations are present that reduce the central claim to its own inputs by construction. The representativeness of the 295 tampered examples is a validity question outside the scope of circularity analysis.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work rests on standard cryptographic assumptions and empirical payload testing; no free parameters, new entities, or ad-hoc axioms are introduced.

axioms (1)

[standard math] HMAC-SHA256 provides integrity when the key remains secret.
Implicit in the claim that the manifest rejects all tampered payloads.

