ISE creates 23,132 execution-grounded multi-turn OS agent trajectories via intent simulation and live execution, improving agent performance on ClawEval from 19.3 to 37.7 pass@1 with Qwen3-8B.
arXiv preprint arXiv:2603.01940
2 Pith papers cite this work.
fields: cs.CL (2)
years: 2026 (2)
representative citing papers
WRIT is a synthesis pipeline that generates write-read intensive trajectories along axes of write-decision count and per-decision evidence burden, enabling a 4B model to outperform GPT-5.1 on τ²-bench with reduced inference tokens.
citing papers explorer
- ISE: An Execution-Grounded Recipe for Multi-Turn OS-Agent Trajectories
  ISE creates 23,132 execution-grounded multi-turn OS agent trajectories via intent simulation and live execution, improving agent performance on ClawEval from 19.3 to 37.7 pass@1 with Qwen3-8B.
- WRIT: Write-Read Intensive Trajectory Synthesis for Multi-Turn User-Facing Agents
  WRIT is a synthesis pipeline that generates write-read intensive trajectories along axes of write-decision count and per-decision evidence burden, enabling a 4B model to outperform GPT-5.1 on τ²-bench with reduced inference tokens.