
arxiv: 1802.08091 · v1 · pith:OATMPFCRnew · submitted 2018-02-22 · 💻 cs.GR

Deep Online Video Stabilization

classification 💻 cs.GR
keywords stabilization, video, camera, path, network, neural, proposed, well

Video stabilization is essential for most hand-held captured videos because of their high-frequency shake. Several 2D-, 2.5D- and 3D-based stabilization techniques are well studied, but to our knowledge, no solutions based on deep neural networks have been proposed. The main reasons are the shortage of training data and the difficulty of modeling the problem with neural networks. In this paper, we solve the video stabilization problem using a convolutional neural network (ConvNet). Instead of offline, holistic camera path smoothing based on feature matching, we focus on low-latency, real-time camera path smoothing without explicitly representing the camera path. Our network, called StabNet, learns a transformation for each unsteady input frame progressively along the timeline, while creating a more stable latent camera path. To train the network, we create a dataset of synchronized steady/unsteady video pairs using well-designed hand-held hardware. Experimental results show that the proposed online method (which uses no future frames) performs comparably to traditional offline video stabilization methods while running about 30 times faster. Further, the proposed StabNet is able to handle night-time and blurry videos, where existing methods fail at robust feature matching.
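
The online setting described in the abstract (one transformation per incoming frame, computed from past output only, with no future frames) can be illustrated with a minimal Python sketch. This is not StabNet: the function `predict_shift` is a hypothetical stand-in that uses a simple intensity-centroid heuristic where the paper uses a learned ConvNet, and the warp is reduced to an integer translation. All names and the `history_len` parameter are assumptions made for illustration.

```python
from collections import deque

import numpy as np


def centroid(frame):
    """Intensity-weighted centroid (row, col) of a 2-D frame."""
    total = frame.sum()
    rows, cols = np.indices(frame.shape)
    return np.array([(rows * frame).sum() / total, (cols * frame).sum() / total])


def predict_shift(history, frame):
    """Hypothetical stand-in for the learned network (assumption).

    Returns an integer (d_row, d_col) shift that moves `frame` toward the
    mean centroid of the previously stabilized frames.
    """
    if not history:
        return (0, 0)
    target = np.mean([centroid(h) for h in history], axis=0)
    delta = np.rint(target - centroid(frame)).astype(int)
    return (int(delta[0]), int(delta[1]))


def stabilize_online(frames, history_len=5):
    """Process frames strictly in order, using only past stabilized output."""
    history = deque(maxlen=history_len)
    out = []
    for frame in frames:
        shift = predict_shift(history, frame)
        steady = np.roll(frame, shift, axis=(0, 1))  # translation-only warp
        history.append(steady)
        out.append(steady)
    return out


def make_frame(r, c, size=32):
    """Synthetic frame: a single bright pixel at (r, c)."""
    f = np.zeros((size, size))
    f[r, c] = 1.0
    return f


# A bright dot jittering around (16, 16) simulates hand shake.
jitter = [(0, 0), (2, -1), (-3, 2), (1, 1)]
shaky = [make_frame(16 + dr, 16 + dc) for dr, dc in jitter]
steady = stabilize_online(shaky)
positions = [tuple(int(v) for v in np.argwhere(f)[0]) for f in steady]
```

The loop keeps only a short buffer of already-stabilized frames, which is what makes the scheme low-latency: each output frame is available as soon as its input arrives. A learned model would replace `predict_shift` and would typically regress a richer warp than a translation.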

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. StableNet: Semi-Online, Multi-Scale Deep Video Stabilization

    cs.CV 2019-07 unverdicted novelty 5.0

    StableNet learns an online multi-scale network to predict stabilizing affine transforms per frame from a synthesized shaky-video dataset and reports outperforming prior methods on some samples while remaining comparab...