The work introduces video retrieval augmented generation (VRAG) with explicit global state conditioning to reduce compounding errors and improve spatiotemporal consistency in interactive video world models.
Genie 2: A large-scale foundation world model
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Learning World Models for Interactive Video Generation
The work introduces video retrieval augmented generation (VRAG) with explicit global state conditioning to reduce compounding errors and improve spatiotemporal consistency in interactive video world models.