It did not create the incident, did not create the change record, and did not send the required HTTP request to the next agent

After these two lookup failures, the model stopped rather than recovering with an alternative interpretation or proceeding with partial execution

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

browse 1 citing papers

representative citing papers

Beyond the All-in-One Agent: Benchmarking Role-Specialized Multi-Agent Collaboration in Enterprise Workflows

cs.MA · 2026-05-09 · unverdicted · novelty 7.0

EntCollabBench shows that today's LLM agents still struggle with delegation, context transfer, parameter grounding, workflow closure, and decision commitment when tested in a simulated enterprise with 11 role-specialized agents.

citing papers explorer

Showing 1 of 1 citing paper.

Beyond the All-in-One Agent: Benchmarking Role-Specialized Multi-Agent Collaboration in Enterprise Workflows cs.MA · 2026-05-09 · unverdicted · none · ref 72
EntCollabBench shows that today's LLM agents still struggle with delegation, context transfer, parameter grounding, workflow closure, and decision commitment when tested in a simulated enterprise with 11 role-specialized agents.

It did not create the incident, did not create the change record, and did not send the required HTTP request to the next agent

fields

years

verdicts

representative citing papers

citing papers explorer