A video foundation model trained on human demonstrations generates zero-shot plans that convert to executable robot actions on novel scenes and tasks.
Sage: Bridging semantic and actionable parts for generalizable articulated-object manipulation under language instructions, 2023
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.RO 1years
2025 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
Large Video Planner Enables Generalizable Robot Control
A video foundation model trained on human demonstrations generates zero-shot plans that convert to executable robot actions on novel scenes and tasks.