DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3
citing papers explorer
- Diff4Splat: Controllable 4D Scene Generation with Latent Dynamic Reconstruction Models
A feed-forward video latent transformer that predicts time-varying 3D Gaussian primitives from one image to produce controllable 4D scenes with appearance, geometry, and motion.
- LIVE-GS: LLM Powers Interactive VR Experience with Physics-Aware Gaussian Splatting
LIVE-GS uses an LLM to predict physical parameters from static Gaussian assets in 10 seconds for physics-aware VR interactions, validated by interviews, baseline comparisons, and user studies.
- MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion
By fine-tuning DUSt3R to output per-timestep pointmaps on scarce dynamic video datasets, MonST3R achieves stronger video depth and pose estimation without explicit motion modeling.