PIVOT refines LLM agent trajectories through plan-inspect-evolve-verify stages using environment feedback, yielding up to 94% relative gains in constraint satisfaction and 3-5x token efficiency over prior refinement methods.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
fields
cs.AI 2years
2026 2verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
SFT followed by RLVR on Qwen2.5-3B-Instruct raises syntactic and execution correctness when generating Game Code World Models across 30 games.
citing papers explorer
-
PIVOT: Bridging Planning and Execution in LLM Agents via Trajectory Refinement
PIVOT refines LLM agent trajectories through plan-inspect-evolve-verify stages using environment feedback, yielding up to 94% relative gains in constraint satisfaction and 3-5x token efficiency over prior refinement methods.
-
Distilling Game Code World Model Generation into Lightweight Large Language Models
SFT followed by RLVR on Qwen2.5-3B-Instruct raises syntactic and execution correctness when generating Game Code World Models across 30 games.