Latent Collaboration in Multi-Agent Systems

Gaotang Li; Hanghang Tong; James Zou; Jiaru Zou; Jingrui He; Katherine Tieu; Ke Shen; Ling Yang; Mengdi Wang; Pan Lu

arxiv: 2511.20639 · v3 · pith:XCFROIBSnew · submitted 2025-11-25 · 💻 cs.CL · cs.AI· cs.LG

Latent Collaboration in Multi-Agent Systems

Jiaru Zou , Ruizhong Qiu , Gaotang Li , Xiyuan Yang , Katherine Tieu , Pan Lu , Ke Shen , Hanghang Tong

show 5 more authors

Yejin Choi Jingrui He James Zou Mengdi Wang Ling Yang

This is my paper

classification 💻 cs.CL cs.AIcs.LG

keywords latentlatentmasagentsreasoningtext-basedagentcodecollaboration

0 comments

read the original abstract

Multi-agent systems (MAS) extend large language models (LLMs) from independent single-model reasoning to coordinative system-level intelligence. While existing LLM agents depend on text-based mediation for reasoning and communication, we take a step forward by enabling models to collaborate directly within the continuous latent space. We introduce LatentMAS, an end-to-end training-free framework that enables pure latent collaboration among LLM agents. In LatentMAS, each agent first performs auto-regressive latent thoughts generation through last-layer hidden embeddings instead of text. Then, a shared latent working memory preserves and transfers each agent's internal representations and latent thoughts, ensuring lossless information exchange without re-encoding. We provide detailed theoretical analyses showing that LatentMAS achieves higher expressiveness and lossless information preservation with lower overall complexity than standard text-based MAS. In addition, empirical evaluations across 9 comprehensive benchmarks spanning math and science reasoning, commonsense understanding, and code generation show that LatentMAS outperforms advanced single agents and text-based MAS baselines, achieving up to 14.6% higher accuracy, reducing output token usage by 70.8%-83.7%, and providing 4$\times$-4.3$\times$ faster end-to-end inference. Code and data are fully open-sourced at https://github.com/Gen-Verse/LatentMAS.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 17 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems
cs.AI 2026-05 unverdicted novelty 7.0

A survey that unifies prior work on multi-agent LLM systems via the LIFE framework, mapping dependencies across collaboration, failure attribution, and autonomous self-evolution while identifying cross-stage challenges.
Good Agentic Friends Do Not Just Give Verbal Advice: They Can Update Your Weights
cs.CL 2026-05 unverdicted novelty 7.0

TFlow enables multi-agent LLMs to collaborate via transient low-rank LoRA perturbations derived from sender activations, yielding up to 8.5 accuracy gains and 83% token reduction versus text-based baselines on Qwen3-4...
LaTER: Efficient Test-Time Reasoning via Latent Exploration and Explicit Verification
cs.CL 2026-05 unverdicted novelty 7.0

LaTER reduces LLM token usage 16-33% on reasoning benchmarks by exploring in latent space then switching to explicit CoT verification, with gains like 70% to 73.3% on AIME 2025 in the training-free version.
LACO: Adaptive Latent Communication for Collaborative Driving
cs.AI 2026-05 unverdicted novelty 6.0

LACO introduces Iterative Latent Deliberation, Cross-Horizon Saliency Attribution, and Structured Semantic Knowledge Distillation to enable low-latency latent communication in collaborative driving while preserving pe...
VerifyMAS: Hypothesis Verification for Failure Attribution in LLM Multi-Agent Systems
cs.CL 2026-05 unverdicted novelty 6.0

VerifyMAS improves failure attribution in LLM multi-agent systems via hypothesis verification on full trajectories, error taxonomy-based data construction, and fine-tuned verifier models, outperforming prior direct-pr...
KV-Fold: One-Step KV-Cache Recurrence for Long-Context Inference
cs.LG 2026-05 conditional novelty 6.0

KV-Fold turns frozen transformers into stable long-context models by folding the KV cache across sequence chunks in repeated forward passes.
MASPO: Joint Prompt Optimization for LLM-based Multi-Agent Systems
cs.AI 2026-05 unverdicted novelty 6.0

MASPO jointly optimizes prompts in multi-agent LLM systems via downstream-success evaluation and evolutionary beam search, delivering 2.9 average accuracy gains over prior methods across six tasks.
QKVShare: Quantized KV-Cache Handoff for Multi-Agent On-Device LLMs
cs.AI 2026-05 unverdicted novelty 6.0

QKVShare enables efficient quantized KV-cache handoff for on-device multi-agent LLMs, cutting TTFT versus re-prefill across tested contexts while adaptive quantization stays competitive with uniform baselines on GSM8K.
When Less is Enough: Efficient Inference via Collaborative Reasoning
cs.LG 2026-05 conditional novelty 6.0

A large model generates a compact reasoning signal that a small model uses to solve tasks, reducing the large model's output tokens by up to 60% on benchmarks like AIME and GPQA.
MeloTune: On-Device Arousal Learning and Peer-to-Peer Mood Coupling for Proactive Music Curation
cs.SD 2026-04 unverdicted novelty 6.0

MeloTune implements learned per-listener Personal Arousal Functions and mesh memory protocols on mobile devices to predict affective trajectories and enable peer-coupled proactive music selection, reporting 96.6% patt...
Representational Collapse in Multi-Agent LLM Committees: Measurement and Diversity-Aware Consensus
cs.LG 2026-04 conditional novelty 6.0

LLM agent committees exhibit representational collapse with mean cosine similarity of 0.888, and diversity-aware consensus reaches 87% accuracy on GSM8K versus 84% for self-consistency at lower cost.
Position: Zeroth-Order Optimization in Deep Learning Is Underexplored, Not Underpowered
cs.LG 2026-05 unverdicted novelty 5.0

Zeroth-order optimization is underexplored rather than underpowered in deep learning, with limitations stemming from full-space designs that can be addressed via subspace, spectral, and systems-aware approaches.
Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems
cs.AI 2026-05 conditional novelty 5.0

The survey proposes the LIFE framework to unify fragmented research on collaboration, failure attribution, and self-evolution in LLM multi-agent systems into a progression toward self-organizing intelligence.
Heterogeneous Scientific Foundation Model Collaboration
cs.AI 2026-04 unverdicted novelty 5.0

Eywa enables language-based agentic AI systems to collaborate with specialized scientific foundation models for improved performance on structured data tasks.
Token Economics for LLM Agents: A Dual-View Study from Computing and Economics
cs.AI 2026-05 unverdicted novelty 4.0

The paper delivers a unified survey of token economics for LLM agents, conceptualizing tokens as production factors, exchange mediums, and units of account across micro, meso, macro, and security dimensions using esta...
Reinforcement Learning for LLM-based Multi-Agent Systems through Orchestration Traces
cs.CL 2026-05 unverdicted novelty 4.0

This survey organizes RL for LLM multi-agent systems into reward families, credit units, and five orchestration sub-decisions, notes the absence of explicit stopping-decision training in its paper pool, and releases a...
From AI-Generated Content to Agentic Action: Security and Safety Threats in Generative AI
cs.CR 2026-05 unverdicted novelty 3.0

The paper analyzes evolving security and safety threats in generative AI from content generation to agentic actions, noting that attack surfaces expand faster than defenses and that many safeguards require institution...