Multi-Agent Collaboration Mechanisms: A Survey of LLMs

Khanh-Tung Tran, Dung Dao, Minh-Duong Nguyen, Quoc-Viet Pham, Barry O'Sullivan, Hoang D. Nguyen · 2025 · cs.AI · arXiv 2501.06322

Canonical reference. 100% of citing Pith papers cite this work as background.

68 Pith papers citing it

Background 100% of classified citations

abstract

With recent advances in Large Language Models (LLMs), Agentic AI has become phenomenal in real-world applications, moving toward multiple LLM-based agents to perceive, learn, reason, and act collaboratively. These LLM-based Multi-Agent Systems (MASs) enable groups of intelligent agents to coordinate and solve complex tasks collectively at scale, transitioning from isolated models to collaboration-centric approaches. This work provides an extensive survey of the collaborative aspect of MASs and introduces an extensible framework to guide future research. Our framework characterizes collaboration mechanisms based on key dimensions: actors (agents involved), types (e.g., cooperation, competition, or coopetition), structures (e.g., peer-to-peer, centralized, or distributed), strategies (e.g., role-based or model-based), and coordination protocols. Through a review of existing methodologies, our findings serve as a foundation for demystifying and advancing LLM-based MASs toward more intelligent and collaborative solutions for complex, real-world use cases. In addition, various applications of MASs across diverse domains, including 5G/6G networks, Industry 5.0, question answering, and social and cultural settings, are also investigated, demonstrating their wider adoption and broader impacts. Finally, we identify key lessons learned, open challenges, and potential research directions of MASs towards artificial collective intelligence.

citation-role summary

background 18

citation-polarity summary

background 18

representative citing papers

Soap2Soap: Long Cinematic Video Remaking via Multi-Agent Collaboration

cs.CV · 2026-05-17 · unverdicted · novelty 7.0

Soap2Soap uses a multi-agent system with dual-bridge consistency via JSON screenplays and visual anchors plus batch keyframe generation to achieve better long-term consistency in cinematic video remaking than commercial APIs.

Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems

cs.AI · 2026-05-14 · unverdicted · novelty 7.0 · 2 refs

A survey that unifies prior work on multi-agent LLM systems via the LIFE framework, mapping dependencies across collaboration, failure attribution, and autonomous self-evolution while identifying cross-stage challenges.

Hierarchical Attacks for Multi-Modal Multi-Agent Reasoning

cs.AI · 2026-05-13 · unverdicted · novelty 7.0

HAM³ achieves up to 78.3% attack success rate on the GQA benchmark by hierarchically attacking perception, communication, and reasoning layers in multi-modal multi-agent systems.

TADI: Tool-Augmented Drilling Intelligence via Agentic LLM Orchestration over Heterogeneous Wellsite Data

cs.AI · 2026-04-30 · unverdicted · novelty 7.0

TADI shows that domain-specialized tools orchestrated by an LLM over dual structured and semantic databases can convert heterogeneous wellsite data into evidence-grounded drilling intelligence, with tool design mattering more than model scale.

From Skills to Talent: Organising Heterogeneous Agents as a Real-World Company

cs.AI · 2026-04-24 · unverdicted · novelty 7.0

OMC framework turns multi-agent AI into self-organizing companies with Talents, Talent Market, and E²R search, achieving 84.67% success on PRDBench (15.48 points above prior art).

Learning to Interrupt in Language-based Multi-agent Communication

cs.CL · 2026-04-07 · unverdicted · novelty 7.0

HANDRAISER learns optimal interruption points in multi-agent LLM communication using estimated future reward and cost, achieving 32.2% lower communication cost with comparable or better task results across games, scheduling, and debate.

GraphBit: A Graph-based Agentic Framework for Non-Linear Agent Orchestration

cs.AI · 2026-03-08 · unverdicted · novelty 7.0

GraphBit is a DAG-based engine-orchestrated framework for agentic LLMs that achieves 67.6% accuracy with zero hallucinations on GAIA benchmarks.

M³KG-RAG: Multi-hop Multimodal Knowledge Graph-enhanced Retrieval-Augmented Generation

cs.CL · 2025-12-23 · unverdicted · novelty 7.0

M³KG-RAG improves multimodal reasoning in large language models by constructing multi-hop knowledge graphs and selectively pruning retrieved context with GRASP.

When Identity Skews Debate: Anonymization for Bias-Reduced Multi-Agent Reasoning

cs.AI · 2025-10-08 · unverdicted · novelty 7.0

Anonymization in multi-agent debate reduces identity bias by equalizing self and peer weights in a Bayesian update model, quantified by the Identity Bias Coefficient.

The PIMMUR Principles: Ensuring Validity in Collective Behavior of LLM Societies

cs.CL · 2025-09-22 · conditional · novelty 7.0

A systematic audit of LLM-based AI societies finds that 89.7% of 39 studies violate at least one of six PIMMUR validity principles, with reproductions showing that many claimed collective behaviors disappear when controls are tightened.

Beyond Alignment: Value Diversity as a Collective Property in Multicultural Agent Systems

cs.CL · 2026-06-04 · unverdicted · novelty 6.0

Multicultural multi-agent LLM systems exhibit substantially lower value diversity than human societies on the World Values Survey, with diversity uncorrelated to per-agent alignment and further reduced by agent interactions.

TRACER: Turn-level Regret Matching with Inner Reinforcement Credit for Cooperative Multi-LLM Reasoning

cs.AI · 2026-05-27 · unverdicted · novelty 6.0

TRACER combines a controller-regret layer using regret matching for speak/skip decisions with a generation-credit layer using GSPO rewards to enable learned collaboration in multi-LLM reasoning.

AutoScientists: Self-Organizing Agent Teams for Long-Running Scientific Experimentation

cs.AI · 2026-05-27 · unverdicted · novelty 6.0

Decentralized AI agent teams self-organize around hypotheses, critique proposals, and share knowledge to outperform single-agent baselines on biomedical ML, language-model optimization, and protein fitness tasks.

MACReD: A Multi-Agent Collaborative Reasoning Framework for Reaction Diagram Parsing

cs.AI · 2026-05-27 · unverdicted · novelty 6.0

MACReD is a multi-agent collaborative reasoning framework for reaction diagram parsing that reports state-of-the-art F1 scores of 75.2% and 84.6% on the RxnScribe benchmark.

How to Steer Your Multi-Agent System: Human-LLM Collaborative Planning

cs.MA · 2026-05-21 · unverdicted · novelty 6.0

Formalizes the design space for human-LLM collaborative planning along mode, scope, and level axes, and evaluates the AMBIPOM prototype through a user study and a benchmark, revealing hybrid workflows and trade-offs.

LCGuard: Latent Communication Guard for Safe KV Sharing in Multi-Agent Systems

cs.AI · 2026-05-21 · unverdicted · novelty 6.0

LCGuard applies adversarial training to transform KV cache artifacts in multi-agent LLMs, reducing reconstructable sensitive information while preserving task performance.

AgentCo-op: Retrieval-Based Synthesis of Interoperable Multi-Agent Workflows

cs.AI · 2026-05-19 · unverdicted · novelty 6.0

AgentCo-op retrieves and assembles existing agents and tools into interoperable workflows for open-world scientific tasks, showing effectiveness in genomics case studies and competitive benchmark results with lower costs.

Conflict-Resilient Multi-Agent Reasoning via Signed Graph Modeling

cs.AI · 2026-05-19 · unverdicted · novelty 6.0

SIGMA builds a signed relational graph among LLM agents and uses conflict-aware message passing plus weighted aggregation to produce more consistent predictions than prior cooperative-assumption baselines.

CHAL: Council of Hierarchical Agentic Language

cs.AI · 2026-05-12 · unverdicted · novelty 6.0

CHAL is a multi-agent dialectic system that performs structured belief optimization over defeasible domains using Bayesian-inspired graph representations and configurable meta-cognitive value system hyperparameters.

Rollout Cards: A Reproducibility Standard for Agent Research

cs.AI · 2026-05-12 · conditional · novelty 6.0

Rollout cards preserve complete agent rollout records and declare the reporting rules behind scores, enabling reproducible evaluation where changing only the rule can alter success rates by over 20 percentage points.

Conformity Generates Collective Misalignment in AI Agents Societies

physics.soc-ph · 2026-05-11 · unverdicted · novelty 6.0

Populations of individually aligned AI agents reach stable misaligned states through conformity, with small adversarial agents able to trigger irreversible tipping points.

STAR: Failure-Aware Markovian Routing for Multi-Agent Spatiotemporal Reasoning

cs.AI · 2026-05-11 · unverdicted · novelty 6.0 · 3 refs

STAR presents a failure-aware routing framework using a state-conditioned transition policy and an agent routing matrix combining expert routes with learned recoveries from execution traces to improve multi-agent spatiotemporal reasoning.

A Versatile AI Agent for Rare Disease Diagnosis and Risk Gene Prioritization

cs.AI · 2026-05-07 · unverdicted · novelty 6.0 · 2 refs

Hygieia is a new AI agent system that integrates phenotypes, genetics, and records to achieve superior rare disease diagnosis and gene prioritization with confidence scores.

Self-Adaptive Multi-Agent LLM-Based Security Pattern Selection for IoT Systems

cs.CR · 2026-05-01 · unverdicted · novelty 6.0

ASPO combines multi-agent LLM proposals with deterministic enforcement in a MAPE-K loop to select conflict-free, resource-feasible security patterns for IoT, delivering 100% safety invariants and 21-23% tail latency/energy reductions on testbed workloads.

citing papers explorer

Showing 18 of 68 citing papers.

When Cloud Agents Meet Device Agents: Lessons from Hybrid Multi-Agent Systems

cs.MA · 2026-05-28 · unverdicted · none · ref 2 · internal anchor

Empirical study of two hybrid MAS architectures finds that optimal cloud-device splits are task-dependent and extra frontier compute does not reliably improve results.

Reinforcement Learning for LLM-based Multi-Agent Systems through Orchestration Traces

cs.CL · 2026-05-04 · unverdicted · none · ref 58 · internal anchor

This survey organizes RL for LLM multi-agent systems into reward families, credit units, and five orchestration sub-decisions, notes the absence of explicit stopping-decision training in its paper pool, and releases a tagged corpus.

Fair Agents: Balancing Multistakeholder Alignment in Multi-Agent Personalization Systems

cs.IR · 2026-05-04 · unverdicted · none · ref 8 · internal anchor

The authors propose a conceptual framework integrating stakeholder-LLM alignment methods, social choice-based aggregation for collective decisions, and stakeholder-centric evaluations to achieve fair multi-agent personalization.

Towards Multi-Agent Autonomous Reasoning in Hydrodynamics

cs.AI · 2026-05-01 · unverdicted · none · ref 7 · internal anchor

A Layer Execution Graph multi-agent system for hydrodynamics achieves 93.6% factual precision and 100% pass rate on 37 queries while degrading gracefully under data loss.

Multi-Agent Systems: From Classical Paradigms to Large Foundation Model-Enabled Futures

cs.AI · 2026-04-20 · unverdicted · none · ref 8 · internal anchor

A survey comparing classical multi-agent systems with large foundation model-enabled multi-agent systems, showing how the latter enables semantic-level collaboration and greater adaptability.

Agentic Microphysics: A Manifesto for Generative AI Safety

cs.CY · 2026-04-16 · unverdicted · none · ref 43 · internal anchor

The authors introduce agentic microphysics and generative safety to link local agent interactions to population-level risks in agentic AI through a causally explicit framework.

Toward Self-Organizing Production Logistics: A Multi-Agent Approach

eess.SY · 2026-04-06 · unverdicted · none · ref 29 · 2 links · internal anchor

The paper derives system objectives for self-organizing production logistics and proposes a multi-agent architecture with embodied agents, event-driven coordination, and a three-phase demonstration roadmap.

From Script to Stage: Automating Experimental Design for Social Simulations with LLMs

cs.HC · 2025-10-22 · unverdicted · none · ref 18 · internal anchor

FSTS automates multi-agent social experiment design via LLM script generation across three phases, with tests indicating reproduction of real-world outcomes.

Lark: Biologically Inspired Neuroevolution for Multi-Stakeholder LLM Agents

cs.MA · 2025-10-19 · unverdicted · none · ref 6 · internal anchor

Lark is a biologically inspired neuroevolution framework for multi-stakeholder LLM agents that iteratively generates, refines, and selects strategies using plasticity, duplication/maturation, influence-weighted Borda scoring, and token penalties, achieving top-3 performance in 80% of 30-round trials.

Evolving Roles of LLMs in Scientific Innovation: Assistant, Collaborator, Scientist, and Evaluator

cs.DL · 2025-07-16 · unverdicted · none · ref 173 · internal anchor

The paper proposes a four-role framework for LLMs in scientific innovation and reviews methods, benchmarks, and limitations across Assistant, Collaborator, Scientist, and Evaluator roles.

A Survey of Scaling in Large Language Model Reasoning

cs.AI · 2025-04-02 · unverdicted · none · ref 197 · internal anchor

A survey categorizing scaling in LLM reasoning across input size, steps, rounds, training, and future directions, noting that scaling can negatively affect performance.

Survey of LLM Agent Communication with MCP: A Software Design Pattern Centric Review

cs.SE · 2025-05-26 · unverdicted · none · ref 11 · internal anchor

A survey of classical software design patterns applied to communication challenges in LLM agent systems using the Model Context Protocol.

Harnessing Multiple Large Language Models: A Survey on LLM Ensemble

cs.CL · 2025-02-25 · unverdicted · none · ref 53 · internal anchor

A systematic survey of LLM ensemble methods organized into a taxonomy of ensemble-before-inference, ensemble-during-inference, and ensemble-after-inference stages, with review of benchmarks, applications, and future directions.

AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration

cs.AI · 2026-05-19 · unreviewed · ref 15 · internal anchor

Coding Agent Is Good As World Simulator

cs.AI · 2026-05-14 · unreviewed · ref 40 · internal anchor

EvoAgent: An Evolvable Agent Framework with Skill Learning and Multi-Agent Delegation

cs.AI · 2026-04-22 · unreviewed · ref 9 · internal anchor

When Identity Overrides Incentives: Representational Choices as Governance Decisions in Multi-Agent LLM Systems

cs.MA · 2026-01-15 · unreviewed · ref 51 · internal anchor

PosterForest: Hierarchical Multi-Agent Collaboration for Scientific Poster Generation

cs.AI · 2025-08-29 · unreviewed · ref 19 · internal anchor
