Rocket-1: Mastering open-world interaction with visual-temporal context prompting
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
GROW: Aligning GRPO with State-Action Modeling for Open-World VLM Agents
GROW decomposes trajectories into state-action samples to enable GRPO for multi-turn VLM agents and reports state-of-the-art results on more than 800 Minecraft tasks.