NoPo4D is the first feed-forward system for dynamic 4D Gaussian splatting from unposed multi-view videos, using velocity decomposition supervised by optical flow and a bidirectional motion encoder.
hub
Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction
12 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
GS-Surrogate creates a canonical Gaussian field that is sequentially deformed by simulation parameters to enable real-time, controllable 3D exploration of ensemble data while separating simulation variations from visualization adjustments.
R5DGS augments physics-driven 4D Gaussian splatting with identity encodings and centroid-only rigid-body dynamics to enable semantic open-vocabulary retrieval and extrapolation that is 11 FPS faster.
RiGS decomposes scenes into static, rigid, and transient 4D Gaussians with an object-wise dynamic mask and scene flow guidance to model multi-scale motions and achieve SOTA novel view synthesis.
Velox compresses dynamic point clouds into latent tokens that support geometry via 4D surface modeling and appearance via 3D Gaussians, showing strong results on video-to-4D generation, tracking, and image-to-4D cloth simulation.
WARPED synthesizes realistic wrist-view observations from monocular egocentric human videos via foundation models, hand-object tracking, retargeting, and Gaussian Splatting to train visuomotor policies that match teleoperation success rates on five tabletop tasks with 5-8x less collection effort.
Skelebones compresses 4D Gaussian shapes into compact, controllable bones and skeletons, delivering 17.3% PSNR gains over LBS and 21.7% over BoB for unseen poses while preserving reconstruction quality.
GaussianDWM uses 3D Gaussians with embedded linguistic features, language-guided sampling, and dual-condition generation for unified scene understanding and multi-modal output in driving world models.
A feed-forward video latent transformer that predicts time-varying 3D Gaussian primitives from one image to produce controllable 4D scenes with appearance, geometry, and motion.
BulletGen enhances 4D dynamic scene reconstruction from monocular videos by supervising Gaussian optimization with diffusion-generated frames aligned at a bullet-time step, achieving SOTA on novel-view synthesis and tracking.
Structure-guided dynamic 3DGS methods deliver superior reconstruction fidelity and compactness on D-NeRF, while Gaussian-centric methods provide higher rendering speeds at the cost of quality variability and storage.
Dual-representation framework pairs fixed-topology meshes for physics with Gaussian splatting for rendering, but two conversion strategies from varying-topology reconstructions cause 65-80% geometric degradation and underperform native fixed-topology methods.
citing papers explorer
- GaussianDWM: 3D Gaussian Driving World Model for Unified Scene Understanding and Multi-Modal Generation
GaussianDWM uses 3D Gaussians with embedded linguistic features, language-guided sampling, and dual-condition generation for unified scene understanding and multi-modal output in driving world models.
- Diff4Splat: Controllable 4D Scene Generation with Latent Dynamic Reconstruction Models
A feed-forward video latent transformer that predicts time-varying 3D Gaussian primitives from one image to produce controllable 4D scenes with appearance, geometry, and motion.
- BulletGen: Improving 4D Reconstruction with Bullet-Time Generation
BulletGen enhances 4D dynamic scene reconstruction from monocular videos by supervising Gaussian optimization with diffusion-generated frames aligned at a bullet-time step, achieving SOTA on novel-view synthesis and tracking.