BOLAA: Benchmark- ing and Orchestrating LLM-augmented Autonomous Agents

Zhiwei Liu, Weiran Yao, Jianguo Zhang, Le Xue, Shelby Heinecke, Rithesh Murthy, Yihao Feng, Zeyuan Chen, Juan Carlos Niebles, Devansh Arpit, Ran Xu, Phil Mui, Huan Wang, Caiming Xiong, Silvio Savarese · 2023 · arXiv 2308.05960

7 Pith papers cite this work. Polarity classification is still indexing.

7 Pith papers citing it

read on arXiv browse 7 citing papers

citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment

cs.CL · 2026-05-08 · unverdicted · novelty 7.0

An AI-agent social platform generated mostly neutral content whose use in fine-tuning reduced model truthfulness comparably to human Reddit data, suggesting limited unique harm but flagging tail risks like secret leaks.

From Standalone LLMs to Integrated Intelligence: A Survey of Compound Al Systems

cs.MA · 2025-06-05 · accept · novelty 7.0

A survey that defines Compound AI Systems, proposes a multi-dimensional taxonomy based on component roles and orchestration strategies, reviews four foundational paradigms, and identifies key challenges for future research.

WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?

cs.LG · 2024-03-12 · unverdicted · novelty 7.0

WorkArena benchmark shows LLM web agents achieve partial success on enterprise tasks but have a substantial gap to full automation and perform worse with open-source models.

GAIA: a benchmark for General AI Assistants

cs.CL · 2023-11-21 · unverdicted · novelty 7.0

GAIA benchmark shows humans at 92% accuracy on simple real-world questions far outperform current AI systems at 15%, proposing this gap as a key milestone for general AI.

A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration

cs.CL · 2023-10-03 · conditional · novelty 6.0

DyLAN automatically selects and dynamically organizes LLM agents for collaboration, outperforming fixed-agent baselines on code generation, reasoning, and decision tasks with up to 25% accuracy gains on some MMLU subjects.

A Survey on Large Language Model based Autonomous Agents

cs.AI · 2023-08-22 · accept · novelty 6.0

A survey of LLM-based autonomous agents that proposes a unified framework for their construction and reviews applications in social science, natural science, and engineering along with evaluation methods and future directions.

Large Language Models for Multi-Robot Systems: A Survey

cs.RO · 2025-02-06 · unverdicted · novelty 4.0

A survey that categorizes LLM uses in multi-robot systems across task allocation, motion planning, action generation, and human interaction, while noting challenges and future research opportunities.

citing papers explorer

Showing 7 of 7 citing papers.

The Moltbook Files: A Harmless Slopocalypse or Humanity's Last Experiment cs.CL · 2026-05-08 · unverdicted · none · ref 79
An AI-agent social platform generated mostly neutral content whose use in fine-tuning reduced model truthfulness comparably to human Reddit data, suggesting limited unique harm but flagging tail risks like secret leaks.
From Standalone LLMs to Integrated Intelligence: A Survey of Compound Al Systems cs.MA · 2025-06-05 · accept · none · ref 106
A survey that defines Compound AI Systems, proposes a multi-dimensional taxonomy based on component roles and orchestration strategies, reviews four foundational paradigms, and identifies key challenges for future research.
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? cs.LG · 2024-03-12 · unverdicted · none · ref 15
WorkArena benchmark shows LLM web agents achieve partial success on enterprise tasks but have a substantial gap to full automation and perform worse with open-source models.
GAIA: a benchmark for General AI Assistants cs.CL · 2023-11-21 · unverdicted · none · ref 56
GAIA benchmark shows humans at 92% accuracy on simple real-world questions far outperform current AI systems at 15%, proposing this gap as a key milestone for general AI.
A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration cs.CL · 2023-10-03 · conditional · none · ref 12
DyLAN automatically selects and dynamically organizes LLM agents for collaboration, outperforming fixed-agent baselines on code generation, reasoning, and decision tasks with up to 25% accuracy gains on some MMLU subjects.
A Survey on Large Language Model based Autonomous Agents cs.AI · 2023-08-22 · accept · none · ref 170
A survey of LLM-based autonomous agents that proposes a unified framework for their construction and reviews applications in social science, natural science, and engineering along with evaluation methods and future directions.
Large Language Models for Multi-Robot Systems: A Survey cs.RO · 2025-02-06 · unverdicted · none · ref 76
A survey that categorizes LLM uses in multi-robot systems across task allocation, motion planning, action generation, and human interaction, while noting challenges and future research opportunities.

BOLAA: Benchmark- ing and Orchestrating LLM-augmented Autonomous Agents

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer