VS3R: Robust Full-frame Video Stabilization via Deep 3D Reconstruction

Muhua Zhu; Tie Ji; Xinhao Jin; Xinping Wang; Yifei Xue; Yizhen Lao; Yu Zhang


Video stabilization aims to mitigate camera shake but faces a fundamental trade-off between geometric robustness and full-frame consistency. While 2D methods suffer from aggressive cropping, 3D techniques are often undermined by fragile optimization pipelines that fail under extreme motions. Novel view synthesis models suffer from structural artifacts and scale blindness. To bridge this gap, we propose VS3R, a framework that synergizes feed-forward 3D reconstruction with generative video diffusion. Our pipeline jointly estimates camera parameters, depth, and masks to ensure all-scenario reliability, and introduces a Hybrid Stabilized Rendering (HSR) module that fuses semantic and geometric cues to preliminarily address parallax occlusions caused by pose transformations while maintaining dynamic-static consistency. Finally, a Video Stabilization-Driven Diffusion Model (VSDM) leverages contextual information to restore disoccluded regions, jointly optimizing texture and temporal consistency. Collectively, VS3R achieves high-fidelity, full-frame stabilization across diverse camera models and significantly outperforms state-of-the-art methods in robustness and visual quality.
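To illustrate why rendering a frame from a stabilized pose leaves disoccluded regions that must then be restored, the following is a minimal sketch of generic depth-based forward warping. It is not the authors' HSR or VSDM implementation. The function name `forward_warp_mask`, the pinhole intrinsics `K`, the relative pose `(R, t)`, and the constant-depth toy scene are all assumptions made for illustration only.

```python
import numpy as np

def forward_warp_mask(depth, K, R, t):
    """Reproject every source pixel into a target camera.

    Returns a boolean (h, w) mask that is True where the target view
    received at least one source pixel and False where a hole remains.
    Hypothetical helper: assumes a pinhole camera with intrinsics K and
    a relative pose (R, t) mapping source-camera to target-camera coords.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    # Homogeneous pixel coordinates, shape 3 x N.
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)])
    # Back-project each pixel to a 3D point using its depth.
    pts = (np.linalg.inv(K) @ pix) * depth.ravel()
    # Move the points into the target camera frame and project them.
    proj = K @ (R @ pts + t.reshape(3, 1))
    z = proj[2]
    valid = z > 1e-6  # keep only points in front of the target camera
    u_t = np.round(proj[0, valid] / z[valid]).astype(int)
    v_t = np.round(proj[1, valid] / z[valid]).astype(int)
    inside = (u_t >= 0) & (u_t < w) & (v_t >= 0) & (v_t < h)
    covered = np.zeros((h, w), dtype=bool)
    covered[v_t[inside], u_t[inside]] = True
    return covered

# Toy scene (assumed values): a fronto-parallel plane at depth 2.
K = np.array([[50.0, 0.0, 16.0],
              [0.0, 50.0, 16.0],
              [0.0, 0.0, 1.0]])
depth = np.full((32, 32), 2.0)
# Pure sideways pose change of 0.2 units along x, no rotation.
covered = forward_warp_mask(depth, K, np.eye(3), np.array([0.2, 0.0, 0.0]))
holes = ~covered  # pixels with no source content after the pose change
```

In this toy setup the pose change shifts the whole image sideways, so a band of target pixels along one edge receives no source content. In a real scene with varying depth, similar holes also open up behind foreground objects, which is the kind of region the abstract describes a generative model filling in.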

