A video foundation model trained on human demonstrations generates zero-shot plans that convert to executable robot actions on novel scenes and tasks.
Denoising diffusion probabilistic models.Advances in neural information processing systems, 33:6840–6851, 2020
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.RO 1years
2025 1verdicts
CONDITIONAL 1representative citing papers
citing papers explorer
-
Large Video Planner Enables Generalizable Robot Control
A video foundation model trained on human demonstrations generates zero-shot plans that convert to executable robot actions on novel scenes and tasks.