hub

arXiv preprint arXiv:2502.02533 , year=

Multi-agent design: Optimizing agents with better prompts, topologies , author= · 2025 · arXiv 2502.02533

23 Pith papers cite this work. Polarity classification is still indexing.

23 Pith papers citing it

read on arXiv browse 23 citing papers

hub tools

JSON dossier citing papers JSON arXiv source

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

\textsc{MasFACT}: Continual Multi-Agent Topology Learning via Geometry-Aware Posterior Transfer

cs.LG · 2026-05-17 · unverdicted · novelty 7.0

MasFACT transfers historical topology priors across tasks via Fused Gromov-Wasserstein optimal transport and PAC-Bayes conservative adaptation to reduce topology forgetting in continual multi-agent settings.

Learning to Interrupt in Language-based Multi-agent Communication

cs.CL · 2026-04-07 · unverdicted · novelty 7.0

HANDRAISER learns optimal interruption points in multi-agent LLM communication using estimated future reward and cost, achieving 32.2% lower communication cost with comparable or better task results across games, scheduling, and debate.

MAS-PromptBench: When Does Prompt Optimization Improve Multi-Agent LLM Systems?

cs.LG · 2026-06-22 · unverdicted · novelty 6.0

A new benchmark study finds that prompt optimization can deliver significant gains in multi-agent LLM systems but its effectiveness varies strongly with task, workflow, communication protocol, and team size.

VTOS: Learning to Orchestrate Vision Tools by Co-Searching Solutions and Observers

cs.CV · 2026-06-17 · unverdicted · novelty 6.0

VTOS jointly searches solution and observer programs to adaptively orchestrate vision tools, outperforming static pipelines on dense object counting and zero-shot plant disease segmentation.

The Illusion of Multi-Agent Advantage

cs.AI · 2026-06-11 · unverdicted · novelty 6.0

Automatically generated multi-agent systems underperform CoT-SC on benchmarks and a new diagnostic dataset, exposing architectural bloat that fails to deliver functional utility.

StepFinder: A Temporal Semantic Framework for Failure Attribution in Multi-Agent Systems

cs.AI · 2026-06-02 · unverdicted · novelty 6.0

StepFinder turns execution logs into temporal semantic sequences via LLMs then uses temporal modeling plus attention to attribute failures to specific steps more accurately and 79% faster than direct LLM methods on the Who&When benchmark.

Evolve as a Team: Collaborative Self-Evolution for LLM-based Multi-Agent Systems

cs.MA · 2026-05-28 · unverdicted · novelty 6.0

Meta-Team is a collaborative self-evolution framework that turns multi-agent execution experience into reusable improvements at agent, coordination, and team levels, outperforming baselines on six benchmarks.

PACE: Two-Timescale Self-Evolution for Small Language Model Agents

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

PACE coordinates low-risk prompt evolution with validated higher-risk control-logic updates to improve frozen SLM agents on benchmarks without model retraining.

CANTANTE: Optimizing Agentic Systems via Contrastive Credit Attribution

cs.CL · 2026-05-13 · unverdicted · novelty 6.0

CANTANTE uses contrastive rollouts to attribute system rewards to individual agents, enabling better prompt optimization than prior methods on programming, math, and QA benchmarks.

The Bystander Effect in Multi-Agent Reasoning: Quantifying Cognitive Loafing in Collaborative Interactions

cs.MA · 2026-05-11 · unverdicted · novelty 6.0

Multi-agent LLM interactions induce cognitive loafing via a formalized Interaction Depth Limit and Sovereignty Gap, where models subjugate correct derivations to social compliance, with lead agent identity disproportionately affecting outcomes.

MASPO: Joint Prompt Optimization for LLM-based Multi-Agent Systems

cs.AI · 2026-05-07 · unverdicted · novelty 6.0

MASPO jointly optimizes prompts in multi-agent LLM systems via downstream-success evaluation and evolutionary beam search, delivering 2.9 average accuracy gains over prior methods across six tasks.

SkillGraph: Self-Evolving Multi-Agent Collaboration with Multimodal Graph Topology

cs.AI · 2026-04-19 · unverdicted · novelty 6.0

SkillGraph jointly evolves agent skills and collaboration topologies in multi-agent vision-language systems using a multimodal graph transformer and a skill designer, yielding consistent performance gains on benchmarks.

Self-Optimizing Multi-Agent Systems for Deep Research

cs.IR · 2026-04-03 · unverdicted · novelty 6.0

Multi-agent deep research systems self-optimize prompts through self-play to match or outperform expert-crafted versions.

Don't Trust Your Upstream: Exploiting LLM Multi-Agent System via Topology-Guided Adversarial Propagation

cs.CR · 2025-12-03 · unverdicted · novelty 6.0

A topology-aware attack propagates adversarial contamination across LLM multi-agent systems to achieve 40-85% success rates on frameworks and real applications, revealing overlooked vulnerabilities.

Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

cs.CL · 2025-11-25 · unverdicted · novelty 6.0

Evo-Memory is a new streaming benchmark and evaluation framework for self-evolving memory in LLM agents, unifying over ten memory modules and introducing the ReMem pipeline for continual improvement on multi-turn and reasoning datasets.

Dynamic Generation of Multi-LLM Agents Communication Topologies with Graph Diffusion Models

cs.CL · 2025-10-09 · unverdicted · novelty 6.0

GTD generates task-adaptive, sparse communication topologies for multi-LLM agents via guided iterative graph diffusion steered by a proxy model predicting accuracy, utility, and cost.

Decoupled Intelligence: A Multi-Agent LLM Framework for Controllable Traffic Scenario Generation in SUMO

cs.MA · 2026-05-26 · unverdicted · novelty 5.0

A multi-agent LLM system for SUMO decouples simulation tasks across Planner, Builder, Demand, Runner, and Analyst agents with MCP-based orchestration, yielding higher success rates than single-agent baselines in ablation studies.

ATOM: Instantiating Budget-Controllable Multi-Agent Collaboration via Nucleus-Electron Hierarchy

cs.MA · 2026-05-25 · unverdicted · novelty 5.0

ATOM uses a nucleus-electron hierarchy and task-driven RL to generate budget-controllable multi-agent collaboration graphs for LLMs, claiming SOTA performance with up to 30% better token efficiency on six benchmarks.

A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

cs.AI · 2025-08-10 · unverdicted · novelty 5.0

A comprehensive review of self-evolving AI agents that improve themselves over time, organized via a framework of inputs, agent system, environment, and optimizers, with domain-specific and safety discussions.

What Prompts Don't Say: Understanding and Managing Underspecification in LLM Prompts

cs.CL · 2025-05-19 · unverdicted · novelty 5.0

Underspecified LLM prompts cause fragile performance that doubles regression risk, and requirements-aware optimization improves average results by 4.8%.

Toward a Safe Internet of Agents

cs.MA · 2025-11-29 · unverdicted · novelty 4.0

The paper proposes a bottom-up framework for safe agentic AI systems that treats each component as a dual-use interface where added capabilities also expand attack surfaces across single agents, multi-agent systems, and interoperable ecosystems.

A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Super Intelligence

cs.AI · 2025-07-28 · accept · novelty 4.0

The paper delivers the first systematic review of self-evolving agents, structured around what components evolve, when adaptation occurs, and how it is implemented.

Language Model Networks: Supervision-Efficient Learning through Dense Communication

cs.AI · 2025-05-19

citing papers explorer

Showing 21 of 21 citing papers after filters.

\textsc{MasFACT}: Continual Multi-Agent Topology Learning via Geometry-Aware Posterior Transfer cs.LG · 2026-05-17 · unverdicted · none · ref 52
MasFACT transfers historical topology priors across tasks via Fused Gromov-Wasserstein optimal transport and PAC-Bayes conservative adaptation to reduce topology forgetting in continual multi-agent settings.
Learning to Interrupt in Language-based Multi-agent Communication cs.CL · 2026-04-07 · unverdicted · none · ref 47
HANDRAISER learns optimal interruption points in multi-agent LLM communication using estimated future reward and cost, achieving 32.2% lower communication cost with comparable or better task results across games, scheduling, and debate.
MAS-PromptBench: When Does Prompt Optimization Improve Multi-Agent LLM Systems? cs.LG · 2026-06-22 · unverdicted · none · ref 34
A new benchmark study finds that prompt optimization can deliver significant gains in multi-agent LLM systems but its effectiveness varies strongly with task, workflow, communication protocol, and team size.
VTOS: Learning to Orchestrate Vision Tools by Co-Searching Solutions and Observers cs.CV · 2026-06-17 · unverdicted · none · ref 14
VTOS jointly searches solution and observer programs to adaptively orchestrate vision tools, outperforming static pipelines on dense object counting and zero-shot plant disease segmentation.
The Illusion of Multi-Agent Advantage cs.AI · 2026-06-11 · unverdicted · none · ref 44
Automatically generated multi-agent systems underperform CoT-SC on benchmarks and a new diagnostic dataset, exposing architectural bloat that fails to deliver functional utility.
StepFinder: A Temporal Semantic Framework for Failure Attribution in Multi-Agent Systems cs.AI · 2026-06-02 · unverdicted · none · ref 55
StepFinder turns execution logs into temporal semantic sequences via LLMs then uses temporal modeling plus attention to attribute failures to specific steps more accurately and 79% faster than direct LLM methods on the Who&When benchmark.
Evolve as a Team: Collaborative Self-Evolution for LLM-based Multi-Agent Systems cs.MA · 2026-05-28 · unverdicted · none · ref 86
Meta-Team is a collaborative self-evolution framework that turns multi-agent execution experience into reusable improvements at agent, coordination, and team levels, outperforming baselines on six benchmarks.
PACE: Two-Timescale Self-Evolution for Small Language Model Agents cs.LG · 2026-05-21 · unverdicted · none · ref 24
PACE coordinates low-risk prompt evolution with validated higher-risk control-logic updates to improve frozen SLM agents on benchmarks without model retraining.
CANTANTE: Optimizing Agentic Systems via Contrastive Credit Attribution cs.CL · 2026-05-13 · unverdicted · none · ref 7
CANTANTE uses contrastive rollouts to attribute system rewards to individual agents, enabling better prompt optimization than prior methods on programming, math, and QA benchmarks.
The Bystander Effect in Multi-Agent Reasoning: Quantifying Cognitive Loafing in Collaborative Interactions cs.MA · 2026-05-11 · unverdicted · none · ref 40
Multi-agent LLM interactions induce cognitive loafing via a formalized Interaction Depth Limit and Sovereignty Gap, where models subjugate correct derivations to social compliance, with lead agent identity disproportionately affecting outcomes.
MASPO: Joint Prompt Optimization for LLM-based Multi-Agent Systems cs.AI · 2026-05-07 · unverdicted · none · ref 15
MASPO jointly optimizes prompts in multi-agent LLM systems via downstream-success evaluation and evolutionary beam search, delivering 2.9 average accuracy gains over prior methods across six tasks.
SkillGraph: Self-Evolving Multi-Agent Collaboration with Multimodal Graph Topology cs.AI · 2026-04-19 · unverdicted · none · ref 55
SkillGraph jointly evolves agent skills and collaboration topologies in multi-agent vision-language systems using a multimodal graph transformer and a skill designer, yielding consistent performance gains on benchmarks.
Self-Optimizing Multi-Agent Systems for Deep Research cs.IR · 2026-04-03 · unverdicted · none · ref 22
Multi-agent deep research systems self-optimize prompts through self-play to match or outperform expert-crafted versions.
Don't Trust Your Upstream: Exploiting LLM Multi-Agent System via Topology-Guided Adversarial Propagation cs.CR · 2025-12-03 · unverdicted · none · ref 30
A topology-aware attack propagates adversarial contamination across LLM multi-agent systems to achieve 40-85% success rates on frameworks and real applications, revealing overlooked vulnerabilities.
Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory cs.CL · 2025-11-25 · unverdicted · none · ref 60
Evo-Memory is a new streaming benchmark and evaluation framework for self-evolving memory in LLM agents, unifying over ten memory modules and introducing the ReMem pipeline for continual improvement on multi-turn and reasoning datasets.
Dynamic Generation of Multi-LLM Agents Communication Topologies with Graph Diffusion Models cs.CL · 2025-10-09 · unverdicted · none · ref 35
GTD generates task-adaptive, sparse communication topologies for multi-LLM agents via guided iterative graph diffusion steered by a proxy model predicting accuracy, utility, and cost.
Decoupled Intelligence: A Multi-Agent LLM Framework for Controllable Traffic Scenario Generation in SUMO cs.MA · 2026-05-26 · unverdicted · none · ref 20
A multi-agent LLM system for SUMO decouples simulation tasks across Planner, Builder, Demand, Runner, and Analyst agents with MCP-based orchestration, yielding higher success rates than single-agent baselines in ablation studies.
ATOM: Instantiating Budget-Controllable Multi-Agent Collaboration via Nucleus-Electron Hierarchy cs.MA · 2026-05-25 · unverdicted · none · ref 45
ATOM uses a nucleus-electron hierarchy and task-driven RL to generate budget-controllable multi-agent collaboration graphs for LLMs, claiming SOTA performance with up to 30% better token efficiency on six benchmarks.
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems cs.AI · 2025-08-10 · unverdicted · none · ref 123
A comprehensive review of self-evolving AI agents that improve themselves over time, organized via a framework of inputs, agent system, environment, and optimizers, with domain-specific and safety discussions.
What Prompts Don't Say: Understanding and Managing Underspecification in LLM Prompts cs.CL · 2025-05-19 · unverdicted · none · ref 4
Underspecified LLM prompts cause fragile performance that doubles regression risk, and requirements-aware optimization improves average results by 4.8%.
Toward a Safe Internet of Agents cs.MA · 2025-11-29 · unverdicted · none · ref 31
The paper proposes a bottom-up framework for safe agentic AI systems that treats each component as a dual-use interface where added capabilities also expand attack surfaces across single agents, multi-agent systems, and interoperable ecosystems.

arXiv preprint arXiv:2502.02533 , year=

hub tools

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer