When single-agent with skills replace multi-agent systems and when they fail. arXiv preprint arXiv:2601.04748, 2026

Xiaoxiao Li · 2026 · arXiv 2601.04748

8 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background: 1 · baseline: 1

citation-polarity summary

background: 1 · baseline: 1

representative citing papers

MAVEN: A Multi-stage Agentic Annotation Pipeline for Video Reasoning Tasks

cs.CV · 2026-05-21 · unverdicted · novelty 7.0

The MAVEN pipeline generates multi-scale spatio-temporal event descriptions from videos using agentic adaptation and refinement, then produces training data that lets a fine-tuned 8B model outperform Gemini baselines on private CCTV and AccidentBench tasks.

From Multi-Agent to Single-Agent: When Is Skill Distillation Beneficial?

cs.AI · 2026-04-02 · unverdicted · novelty 7.0

Metric Freedom (F), quantified via Mantel test on output diversity and score variance, predicts when single-agent skill distillation from multi-agent systems will succeed, enabling up to 8x cost and 15x latency reductions across tested tasks.

HMACE: Heterogeneous Multi-Agent Collaborative Evolution for Combinatorial Optimization

cs.AI · 2026-05-08 · unverdicted · novelty 6.0

HMACE deploys Proposer, Generator, Evaluator, and Reflector agents in an evolutionary loop to generate and refine heuristics for NP-hard problems, reporting lower optimality gaps and token costs than baselines on TSP and Online BPP.

Safe Bilevel Delegation (SBD): A Formal Framework for Runtime Delegation Safety in Multi-Agent Systems

cs.AI · 2026-04-30 · unverdicted · novelty 6.0

SBD is a bilevel optimization framework that learns context-dependent safety weights for runtime task delegation in hierarchical multi-agent systems, with continuous authority transfer alpha and theoretical guarantees on safety monotonicity, policy convergence, and accountability propagation.

A Reference Architecture for Agentic Hybrid Retrieval in Dataset Search

cs.IR · 2026-03-28 · unverdicted · novelty 6.0

The paper defines a bounded reference architecture for LLM-orchestrated hybrid retrieval in dataset search using BM25, dense embeddings, reciprocal rank fusion, and metadata augmentation with pseudo-queries.

ContractSkill: Repairable Contract-Based Skills for Multimodal Web Agents

cs.SE · 2026-03-20 · unverdicted · novelty 6.0

ContractSkill converts draft web agent skills into explicit executable contracts that enable deterministic verification, fault localization, and minimal local repair, improving stability on benchmarks like VisualWebArena.

Compiling Agentic Workflows into LLM Weights: Near-Frontier Quality at Two Orders of Magnitude Less Cost

cs.AI · 2026-05-21 · unverdicted · novelty 5.0

Compiling agentic workflows into LLM weights creates subterranean agents with near-frontier quality at two orders of magnitude less cost, validated empirically on travel booking, Zoom support, and insurance claims tasks.

Agent Skills for Large Language Models: Architecture, Acquisition, Security, and the Path Forward

cs.MA · 2026-02-12 · unverdicted · novelty 4.0

The paper surveys agent skills for LLMs across architecture, acquisition, deployment, and security, proposing a four-tier Skill Trust and Lifecycle Governance Framework to address vulnerabilities in community skills.

citing papers explorer


MAVEN: A Multi-stage Agentic Annotation Pipeline for Video Reasoning Tasks cs.CV · 2026-05-21 · unverdicted · none · ref 15
From Multi-Agent to Single-Agent: When Is Skill Distillation Beneficial? cs.AI · 2026-04-02 · unverdicted · none · ref 2
HMACE: Heterogeneous Multi-Agent Collaborative Evolution for Combinatorial Optimization cs.AI · 2026-05-08 · unverdicted · none · ref 27
Safe Bilevel Delegation (SBD): A Formal Framework for Runtime Delegation Safety in Multi-Agent Systems cs.AI · 2026-04-30 · unverdicted · none · ref 7
A Reference Architecture for Agentic Hybrid Retrieval in Dataset Search cs.IR · 2026-03-28 · unverdicted · none · ref 35
ContractSkill: Repairable Contract-Based Skills for Multimodal Web Agents cs.SE · 2026-03-20 · unverdicted · none · ref 14
Compiling Agentic Workflows into LLM Weights: Near-Frontier Quality at Two Orders of Magnitude Less Cost cs.AI · 2026-05-21 · unverdicted · none · ref 73
Agent Skills for Large Language Models: Architecture, Acquisition, Security, and the Path Forward cs.MA · 2026-02-12 · unverdicted · none · ref 21
