SPIN enforces DAG-valid plans and prefix-based stopping for LLM agents, cutting executed tasks from 1061 to 623 and tool calls from 11.81 to 6.82 per run on AssetOpsBench while raising success from 0.638 to 0.706.
/benchmark/cods_track1/track1_result/trajectory
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.AI (2)
Years: 2026 (2)
Representative citing papers
MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments
MCP-Cosmos combines world models with MCP agents via a bring-your-own-world-model strategy and reports gains in tool success rate and parameter accuracy on benchmark tasks.