A survey that unifies prior work on multi-agent LLM systems via the LIFE framework, mapping dependencies across collaboration, failure attribution, and autonomous self-evolution while identifying cross-stage challenges.
arXiv preprint arXiv:2601.09822
4 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: background (1)
polarities: background (1)
citing papers explorer
- Beyond Individual Intelligence: Surveying Collaboration, Failure Attribution, and Self-Evolution in LLM-based Multi-Agent Systems
  A survey that unifies prior work on multi-agent LLM systems via the LIFE framework, mapping dependencies across collaboration, failure attribution, and autonomous self-evolution while identifying cross-stage challenges.
- PerfCodeBench: Benchmarking LLMs for System-Level High-Performance Code Optimization
  PerfCodeBench reveals that state-of-the-art LLMs produce functionally correct but significantly slower code than expert-optimized versions on system-level tasks, especially those involving parallelism and GPUs.
- PromptMN: Pseudo Prompting Language
  PromptMN is a pseudo-prompting DSL that adds compact typed directives to natural language to improve clarity, reusability, and reverse engineering of AI instructions.
- Code Broker: A Multi-Agent System for Automated Code Quality Assessment
  Code Broker deploys a five-agent hierarchy that combines LLM semantic analysis with static linting to generate actionable Python code quality reports.