arXiv preprint arXiv:2106.13228 (2021)
13 Pith papers cite this work.
citing papers explorer
- ReconPhys: Reconstruct Appearance and Physical Attributes from Single Video
  ReconPhys is the first feedforward neural network that jointly reconstructs 3D geometry and appearance via Gaussian Splatting while estimating physical attributes from a single monocular video using self-supervised training.
- SplatWeaver: Learning to Allocate Gaussian Primitives for Generalizable Novel View Synthesis
  SplatWeaver uses cardinality Gaussian experts and pixel-level routing to dynamically allocate varying numbers of Gaussian primitives for generalizable novel view synthesis.
- CLOTH-HUGS: Cloth Aware Human Gaussian Splatting
  Cloth-HUGS uses layered Gaussians for body and cloth with SMPL-driven deformation and physics constraints to improve clothed human reconstruction over prior single-representation methods.
- Splats in Splats++: Robust and Generalizable 3D Gaussian Splatting Steganography
  Splats in Splats++ embeds messages into 3DGS via importance-graded SH encryption, hash-grid opacity mapping, and a gradient-gated consistency loss, achieving higher fidelity and robustness than prior methods.
- TokenGS: Decoupling 3D Gaussian Prediction from Pixels with Learnable Tokens
  TokenGS uses learnable Gaussian tokens in an encoder-decoder architecture to regress 3D means directly, achieving SOTA feed-forward reconstruction on static and dynamic scenes with better robustness.
- Realizing Immersive Volumetric Video: A Multimodal Framework for 6-DoF VR Engagement
  The paper presents a multimodal framework, dataset, and reconstruction pipeline to create immersive volumetric videos supporting large 6-DoF audiovisual interaction from real multi-view captures.
- Space-Time Forecasting of Dynamic Scenes with Motion-aware Gaussian Grouping
  MoGaF groups Gaussians by motion in 4D splatting representations to enable stable long-term forecasting of dynamic scenes.
- PerpetualWonder: Long-Horizon Action-Conditioned 4D Scene Generation
  PerpetualWonder introduces a closed-loop generative simulator with a unified physical-visual representation for long-horizon action-conditioned 4D scene generation from one image.
- PanopticQuery: Unified Query-Time Reasoning for 4D Scenes
  PanopticQuery lifts 2D semantic predictions into globally consistent 4D groundings via neural field optimization and sets new state-of-the-art results on complex language queries for attributes, actions, and interactions.
- HOIGS: Human-Object Interaction Gaussian Splatting
  HOIGS adds a cross-attention HOI module to Gaussian Splatting that combines HexPlane human features with Cubic Hermite Spline object features to model interaction-induced deformations.
- PhysMorph-GS: Render-Guided Volumetric Morphing with Differentiable Physics
  PhysMorph-GS injects visual supervision via deformation gradients in differentiable physics simulation and uses phased Chamfer-guided plasticity to reduce silhouette error by up to 49.9% compared to physics-only baselines.
- BulletGen: Improving 4D Reconstruction with Bullet-Time Generation
  BulletGen enhances 4D dynamic scene reconstruction from monocular videos by supervising Gaussian optimization with diffusion-generated frames aligned at a bullet-time step, achieving SOTA on novel-view synthesis and tracking.
- Bringing a Personal Point of View: Evaluating Dynamic 3D Gaussian Splatting for Egocentric Scene Reconstruction
  Dynamic 3DGS models achieve lower PSNR on egocentric videos than exocentric ones, with the gap arising from static content reconstruction.