Unified World Models couple video and action diffusion inside one transformer with independent timesteps, enabling pretraining on heterogeneous robot datasets that include action-free video and producing more generalizable policies than imitation learning alone.
High-resolution image synthesis with latent diffusion models
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
method 1
citation-polarity summary
fields
cs.RO 2verdicts
UNVERDICTED 2representative citing papers
RoboCasa supplies a large-scale kitchen simulator, generative assets, 100 tasks, and automated data pipelines that produce a clear scaling trend in imitation learning for generalist robots.
citing papers explorer
No citing papers match the current filters.