TelePhysics: Physics-Grounded Multi-Object Scene Generation from a Single Image with Real-Time Interaction
abstract
Recent generative video models achieve impressive visual quality but remain constrained by limited physical consistency and controllability. Existing video generation methods provide minimal physical control, and single-image-to-3D conversion approaches often suffer from object interpenetration. Furthermore, physics-based scene-level 3D generation methods exhibit spatial misalignment, stylized artifacts, and inconsistencies with the input data, restricting their use in realistic interactive video synthesis. We propose TelePhysics, a training-free framework that converts a single image into a physically consistent and controllable video through holistic scene-level 3D reconstruction. By representing the full scene geometry in a unified spatial coordinate system, TelePhysics resolves object penetration and alignment ambiguity. Unlike prior methods, this formulation enables accurate scene-level multi-object interactions and introduces richer, more complex control types for advanced mechanics-based manipulation. By decoupling simulation from rendering, TelePhysics bypasses latency-heavy priors, achieving real-time physical interaction previews while preserving photorealistic visual fidelity. Experimental results demonstrate that TelePhysics substantially outperforms prior methods in physical fidelity, spatial coherence, and controllability. The open-source code is available at https://github.com/xinzhang007/TelePhysics.
fields: cs.CV
years: 2026
citing papers
SAM3D-Phys: Towards Multi-Object Interactive Simulation in Real World
SAM3D-Phys recovers complete simulatable object geometries from incomplete real-world scene reconstructions by combining SAM3D generative priors with physics-constrained spatial optimization and mask-guided appearance distillation.