Advances in Neural Information Processing Systems 33, 6840–6851 (2020)
2 Pith papers cite this work. Polarity classification is still indexing.
citation summary
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: method (1)
polarities: use method (1)
citing papers explorer
- GTA: Advancing Image-to-3D World Generation via Geometry Then Appearance Video Diffusion
  GTA generates 3D worlds from single images via a two-stage video diffusion process that prioritizes geometry before appearance to improve structural consistency.
- UniE2F: A Unified Diffusion Framework for Event-to-Frame Reconstruction with Video Foundation Models
  UniE2F conditions a pre-trained video diffusion model on event streams with inter-frame residual guidance to reconstruct, interpolate, and predict frames in a unified zero-shot framework.