/benchmark/cods_track1/track1_result/trajectory

Yifan Zhang, Giridhar Ganapavarapu, Srideepika Jayaraman, Bhavna Agrawal, Dhaval Patel, Achille Fokoue · 2025 · arXiv 2512.23167

2 Pith papers cite this work. Polarity classification is still indexing.



representative citing papers

SPIN: Structural LLM Planning via Iterative Navigation for Industrial Tasks

cs.AI · 2026-05-13 · conditional · novelty 5.0

SPIN enforces DAG-valid plans and prefix-based stopping for LLM agents, cutting executed tasks from 1061 to 623 and tool calls from 11.81 to 6.82 per run on AssetOpsBench while raising success from 0.638 to 0.706.

MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments

cs.AI · 2026-05-09 · unverdicted · novelty 4.0

MCP-Cosmos combines world models with MCP agents via a bring-your-own-world-model strategy and reports gains in tool success rate and parameter accuracy on benchmark tasks.

citing papers explorer


MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments

cs.AI · 2026-05-09 · unverdicted · none · ref 13

MCP-Cosmos combines world models with MCP agents via a bring-your-own-world-model strategy and reports gains in tool success rate and parameter accuracy on benchmark tasks.
