A vision-language-aligned world model turns visuomotor MPC into a language-following planner that reaches 87% success on 288 unseen semantic tasks where standard VLAs drop to 22%.
EAGLE-2: Faster inference of language models with dynamic draft trees
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.RO 1years
2026 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
Grounded World Model for Semantically Generalizable Planning
A vision-language-aligned world model turns visuomotor MPC into a language-following planner that reaches 87% success on 288 unseen semantic tasks where standard VLAs drop to 22%.