StableNet: Semi-Online, Multi-Scale Deep Video Stabilization

Chia-Hung Huang; Chi-Keung Tang; Hang Yin; Yu-Wing Tai


A multi-scale neural network learns to stabilize video frames by outputting affine transformations after training on synthesized shaky footage.

Reviewed by Pith at T0; open to challenge. T0 means a machine referee read the full paper against a public rubric.


T0 review · grok-4.3

2026-05-24 17:09 UTC pith:UHWL5HWY

Load-bearing objection: The paper gives a multi-scale learned affine predictor for online stabilization plus a synthetic dataset, but the results stay tied to that data with no clear real-world validation.

arXiv 1907.10283 v1 · pith:UHWL5HWY · submitted 2019-07-24 · cs.CV, eess.IV


classification: cs.CV, eess.IV

keywords: video stabilization, deep learning, affine transformation, multi-scale network, online processing, synthetic dataset, handheld video

verification ladder: T0 review · T1 audit · T2 compute · T3 formal · T4 reserved

The pith

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces StableNet, a data-driven method that processes each unsteady video frame progressively across scales from low to high resolution and predicts an affine transform to correct shake. The approach runs online, frame by frame, and learns the stabilization mapping implicitly from paired training data rather than relying on explicit feature tracking or optical flow. Because public stabilization datasets are scarce, the authors create their own by synthesizing unstable videos that vary in shake intensity to mimic handheld camera motion. Experiments indicate the resulting model matches or exceeds prior methods on several test clips and remains effective on complex scene content it never saw during training.

Core claim

The central claim is that an end-to-end multi-scale network can be trained to perform online video stabilization by directly regressing per-frame affine transformations from synthetic shaky-stable pairs, eliminating the need for separate motion estimation steps while generalizing to unseen complex footage.

What carries the argument

The multi-scale network that ingests an unsteady frame at successively higher resolutions and regresses an affine transformation matrix to stabilize it.
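
One detail a coarse-to-fine design has to handle is that an affine transform estimated on a downscaled frame is expressed in low-resolution pixels. The sketch below shows the standard conversion to full resolution. It is a minimal illustration under assumed names (`lift_affine`, `apply_affine`, and the 2x3 matrix layout are not taken from the paper), and it does not claim to reproduce how StableNet passes information between scales.

```python
import numpy as np

def lift_affine(affine_low: np.ndarray, scale: float) -> np.ndarray:
    """Convert a 2x3 affine estimated on a downscaled frame to full resolution.

    `scale` is the downscale factor (low-res size / full-res size, e.g. 0.25).
    The 2x2 linear part is resolution independent; only the translation,
    which is measured in pixels, has to be divided by `scale`.
    """
    lifted = affine_low.astype(float).copy()
    lifted[:, 2] /= scale
    return lifted

def apply_affine(affine: np.ndarray, points: np.ndarray) -> np.ndarray:
    """Apply a 2x3 affine to an (N, 2) array of (x, y) points."""
    return points @ affine[:, :2].T + affine[:, 2]

# Hypothetical example: a small rotation plus a shift, estimated at
# quarter resolution and then lifted to full resolution.
theta = np.deg2rad(2.0)
affine_low = np.array([[np.cos(theta), -np.sin(theta), 3.0],
                       [np.sin(theta),  np.cos(theta), -1.5]])
affine_full = lift_affine(affine_low, scale=0.25)
```

Transforming full-resolution points with `affine_full` gives the same result as downscaling the points, applying `affine_low`, and scaling back up, which is what makes per-scale estimates comparable.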

Load-bearing premise

Synthesized videos with varying shake extents accurately replicate real handheld camera motion so that training on them produces a model that works on genuine footage.
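
To make the premise concrete, the following sketch generates per-frame 2D affine perturbations at three shake levels. The simulated rebuttal later on this page describes the data as controlled 2D affine perturbations at low, medium, and high levels, but gives no parameter ranges. The ranges in `SHAKE_LEVELS`, the function name, and the uniform sampling are therefore all assumptions made for illustration, not the authors' procedure.

```python
import numpy as np

# Hypothetical shake levels: (max rotation in degrees, max shift in pixels).
# These values are illustrative only; the paper's ranges are not given here.
SHAKE_LEVELS = {"low": (0.5, 2.0), "medium": (1.5, 6.0), "high": (3.0, 12.0)}

def random_shake_affines(num_frames, level, frame_size, seed=0):
    """Return a (num_frames, 2, 3) array of per-frame affine perturbations.

    Each perturbation rotates about the frame centre and then shifts, with
    magnitudes drawn uniformly and independently for every frame.
    """
    max_deg, max_shift = SHAKE_LEVELS[level]
    rng = np.random.default_rng(seed)
    width, height = frame_size
    centre = np.array([width / 2.0, height / 2.0])
    affines = np.empty((num_frames, 2, 3))
    for i in range(num_frames):
        theta = np.deg2rad(rng.uniform(-max_deg, max_deg))
        shift = rng.uniform(-max_shift, max_shift, size=2)
        rot = np.array([[np.cos(theta), -np.sin(theta)],
                        [np.sin(theta),  np.cos(theta)]])
        affines[i, :, :2] = rot
        # Rotate about the centre: x' = R (x - c) + c + shift.
        affines[i, :, 2] = centre - rot @ centre + shift
    return affines
```

Drawing each frame's perturbation independently produces jitter with no temporal correlation and no parallax. That is exactly the kind of gap between synthetic and handheld motion that the premise above depends on being small.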

What would settle it

Measure stabilization quality on a set of real handheld videos captured independently of the synthesis process and compare against the performance reported on the synthetic test set.


If this is right

Stabilization becomes possible without separate feature tracking or optical-flow computation at runtime.
The method operates online, producing a stabilized output for each frame as soon as it arrives.
A single model trained on synthetic data can dampen shake in scene types it was never explicitly shown.
The same progressive multi-scale architecture could be applied to other per-frame geometric correction tasks.
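
The online, frame-by-frame operation listed above can be sketched as a generator that emits one corrected frame per input frame. This is a minimal sketch, not the paper's pipeline: `predict_affine` is a hypothetical stand-in for the trained network, pairing each frame with the previous unstable frame is an assumption based on the truncated Figure 2 caption, and the nearest-neighbour warp is chosen only to keep the example dependency-free.

```python
import numpy as np

def warp_affine_nearest(frame, affine):
    """Warp a (H, W) or (H, W, C) frame by a 2x3 affine, nearest-neighbour."""
    height, width = frame.shape[:2]
    # Inverse mapping: for each output pixel, look up its source pixel.
    inv = np.linalg.inv(np.vstack([affine, [0.0, 0.0, 1.0]]))
    ys, xs = np.mgrid[0:height, 0:width]
    src_x = np.rint(inv[0, 0] * xs + inv[0, 1] * ys + inv[0, 2]).astype(int)
    src_y = np.rint(inv[1, 0] * xs + inv[1, 1] * ys + inv[1, 2]).astype(int)
    valid = (src_x >= 0) & (src_x < width) & (src_y >= 0) & (src_y < height)
    out = np.zeros_like(frame)  # pixels with no source stay black
    out[valid] = frame[src_y[valid], src_x[valid]]
    return out

def stabilize_online(frames, predict_affine):
    """Yield one stabilized frame per input frame, as soon as it arrives.

    `predict_affine(previous, current)` is a hypothetical stand-in for the
    trained network and must return a 2x3 array.
    """
    previous = None
    for current in frames:
        # The first frame has no predecessor, so it is paired with itself.
        reference = current if previous is None else previous
        yield warp_affine_nearest(current, predict_affine(reference, current))
        previous = current
```

Because the generator only ever holds the current and previous frame, memory use stays constant regardless of video length, which is the practical appeal of an online method.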

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Real-time deployment on mobile devices becomes feasible once the network is quantized or distilled, because no external motion estimators are required.
Collecting a modest amount of real paired data could further close any remaining domain gap between synthetic and genuine camera motion.
The learned affine corrections might serve as a lightweight prior for more expressive stabilization models that also handle rolling-shutter or parallax effects.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit.

Desk Editor's Note

The paper gives a multi-scale learned affine predictor for online stabilization plus a synthetic dataset, but the results stay tied to that data with no clear real-world validation.


The paper's main contribution is StableNet, a semi-online network that processes each frame progressively across scales, from low to high resolution, and outputs an affine transform to stabilize it. The authors also create and release a paired synthetic dataset of shaky videos meant to mimic real camera motion. This is a legitimate engineering extension of prior deep stabilization work, moving away from explicit feature tracking or flow toward implicit learning from data. The multi-scale design is a sensible choice for handling varying shake strengths without separate modules. Releasing the dataset is useful since public stabilization data is scarce. The abstract notes outperformance on some unstable samples and partial robustness on complex content not seen in training, which at least shows the model can dampen motion to some degree.

The central weakness is that everything rests on the authors' own synthetic data. No details appear on how the shake was generated, whether it includes realistic parallax or frequency content from actual handheld paths, or how the model performs on external real footage or standard benchmarks. Without those, the gains could be circular with the data synthesis, and generalization remains unproven. Training losses, optimization, and full quantitative tables are also absent from the provided text, making it hard to judge reproducibility or compare fairly.

This is for CV engineers working on consumer video apps who want a learned online alternative to classical pipelines. A reader might try the multi-scale idea or the dataset for their own experiments, but the current evidence is too thin to shift practice. I would send it to peer review if a revised version adds real-data tests and ablations; otherwise it is still preliminary.

Referee Report

2 major / 1 minor

Summary. The paper proposes StableNet, a semi-online multi-scale deep network for video stabilization that progressively processes frames from low to high resolution to predict stabilizing affine transformations. It introduces a paired dataset of synthesized unstable videos with varying shake extents that are claimed to simulate real-life camera motion, and reports that the method outperforms prior approaches on several unstable samples while remaining comparable in general and showing robustness on complex content not seen during training.

Significance. If the synthetic data accurately reproduces the statistics of real handheld camera trajectories and the learned model generalizes, the implicit multi-scale affine prediction approach could provide an efficient online alternative to explicit feature-tracking or optical-flow methods. The design avoids per-frame feature extraction, which is a practical strength if the performance claims hold on real footage.

major comments (2)

[Abstract] Abstract and dataset section: The central performance and generalization claims rest on the assertion that 'synthesized unstable videos with different extent of shake ... simulate real-life camera movement,' yet the manuscript supplies no description of the generative process (2-D affine vs. 3-D paths, inclusion of parallax, frequency content of trajectories, or any quantitative match to real handheld statistics). This is load-bearing for the reported outperformance and robustness results.
[Experiments] Experiments section: The abstract states outperformance 'in several unstable samples' and robustness on untrained complex content, but without visible quantitative tables, standard metrics (cropping ratio, distortion, inter-frame consistency), held-out real-world test sets, or comparison against public benchmarks, it is impossible to assess whether gains are supported or affected by sample selection.

minor comments (1)

[Abstract] The term 'semi-online' is introduced in the title and abstract but is not explicitly defined relative to fully online or offline methods.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point by point below, indicating planned revisions where the manuscript requires strengthening.


Referee: [Abstract] Abstract and dataset section: The central performance and generalization claims rest on the assertion that 'synthesized unstable videos with different extent of shake ... simulate real-life camera movement,' yet the manuscript supplies no description of the generative process (2-D affine vs. 3-D paths, inclusion of parallax, frequency content of trajectories, or any quantitative match to real handheld statistics). This is load-bearing for the reported outperformance and robustness results.

Authors: We agree that the current manuscript provides insufficient detail on the data synthesis procedure, which weakens the support for the generalization claims. The unstable videos were generated by applying controlled 2D affine perturbations to stable source videos, with shake extent varied across low, medium, and high levels; however, no explicit frequency matching or parallax modeling was performed. In the revised manuscript we will expand the dataset section with a full description of the generative process, the exact affine parameter ranges, and any quantitative comparison to real handheld trajectories that can be added. revision: yes
Referee: [Experiments] Experiments section: The abstract states outperformance 'in several unstable samples' and robustness on untrained complex content, but without visible quantitative tables, standard metrics (cropping ratio, distortion, inter-frame consistency), held-out real-world test sets, or comparison against public benchmarks, it is impossible to assess whether gains are supported or affected by sample selection.

Authors: The experiments section currently emphasizes qualitative visual results on selected synthesized samples and a limited number of real videos. We acknowledge that the absence of standard quantitative metrics and systematic benchmark comparisons makes it difficult to evaluate the strength of the performance claims. In the revision we will add tables reporting cropping ratio, distortion, and inter-frame consistency, include results on additional held-out real-world sequences, and provide comparisons against public stabilization benchmarks to allow a more objective assessment. revision: yes
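
For readers unfamiliar with the metrics named in this exchange, the sketch below computes two simple per-frame quantities from a 2x3 stabilizing affine. Definitions of cropping ratio and distortion vary between papers, so both functions are assumptions chosen for illustration: `distortion_score` uses the ratio of eigenvalue magnitudes of the 2x2 linear part, and `coverage_ratio` is a proxy for cropping that counts how much of the output frame still has source pixels. Neither is claimed to match the definitions the authors would adopt.

```python
import numpy as np

def distortion_score(affine):
    """Ratio of the smaller to the larger eigenvalue magnitude of the 2x2 part.

    A value of 1.0 means no anisotropic stretch (pure rotation or uniform
    scale); smaller values mean the frame is stretched more along one axis.
    """
    magnitudes = np.sort(np.abs(np.linalg.eigvals(affine[:, :2])))
    return float(magnitudes[0] / magnitudes[1])

def coverage_ratio(affine, width, height):
    """Fraction of output pixels whose source location lies inside the frame.

    This is a stand-in for a cropping ratio, not a standard definition.
    """
    inv = np.linalg.inv(np.vstack([affine, [0.0, 0.0, 1.0]]))
    ys, xs = np.mgrid[0:height, 0:width]
    src_x = inv[0, 0] * xs + inv[0, 1] * ys + inv[0, 2]
    src_y = inv[1, 0] * xs + inv[1, 1] * ys + inv[1, 2]
    inside = ((src_x >= 0) & (src_x <= width - 1) &
              (src_y >= 0) & (src_y <= height - 1))
    return float(inside.mean())
```

Reporting such scores per video, rather than on hand-picked clips, is what would address the referee's concern about sample selection.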

Circularity Check

0 steps flagged

No significant circularity; empirical training on synthetic data does not reduce claims to self-definition


The paper presents an empirical, data-driven neural network for video stabilization trained on author-synthesized unstable/stable pairs. No derivation chain, equations, or first-principles steps are described that reduce a claimed prediction or result to its own inputs by construction. The abstract explicitly states the network learns the process implicitly from training data and reports outperformance on samples from that process plus robustness on untrained complex content; this is standard supervised learning rather than any of the enumerated circularity patterns (self-definitional, fitted-input-called-prediction, self-citation load-bearing, etc.). No load-bearing self-citations, uniqueness theorems, or ansatzes are invoked. The evaluation uses the authors' own synthetic distribution, but the paper does not claim external real-world generalization as a derived theorem; the result remains self-contained as an empirical demonstration on the generated data.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the representativeness of the synthetic shake model and on the sufficiency of affine transforms; both are domain assumptions rather than derived quantities.

axioms (2)

domain assumption: Affine transformations are sufficient to model the dominant motion between consecutive frames in handheld video.
The network outputs only an affine transform; this modeling choice is invoked in the abstract description of the output.
domain assumption: Synthetic shake added to steady video produces training pairs whose distribution matches real handheld camera motion.
The abstract states that unstable videos were synthesized to simulate real-life camera movement and used for training.
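
The first axiom can be stated compactly. In the notation below, which is introduced here for illustration and is not taken from the paper, a general 2D affine transform maps a pixel at $(x, y)$ to $(x', y')$ using six parameters:

```latex
\begin{pmatrix} x' \\ y' \end{pmatrix}
=
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
+
\begin{pmatrix} t_x \\ t_y \end{pmatrix}
```

The same six parameters are applied to every pixel in the frame, and none of them depends on scene depth. A single affine transform therefore cannot represent parallax, where near and far objects move by different amounts, which is why this axiom is an assumption about the footage rather than a derived property.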

pith-pipeline@v0.9.0 · 5701 in / 1409 out tokens · 17305 ms · 2026-05-24T17:09:00.105298+00:00 · methodology


Original abstract

Video stabilization algorithms are of greater importance nowadays with the prevalence of hand-held devices which unavoidably produce videos with undesirable shaky motions. In this paper we propose a data-driven online video stabilization method along with a paired dataset for deep learning. The network processes each unsteady frame progressively in a multi-scale manner, from low resolution to high resolution, and then outputs an affine transformation to stabilize the frame. Different from conventional methods which require explicit feature tracking or optical flow estimation, the underlying stabilization process is learned implicitly from the training data, and the stabilization process can be done online. Since there are limited public video stabilization datasets available, we synthesized unstable videos with different extent of shake that simulate real-life camera movement. Experiments show that our method is able to outperform other stabilization methods in several unstable samples while remaining comparable in general. Also, our method is tested on complex contents and found robust enough to dampen these samples to some extent even it was not explicitly trained in the contents.

Figures

Figures reproduced from arXiv: 1907.10283 by Chia-Hung Huang, Chi-Keung Tang, Hang Yin, Yu-Wing Tai.

**Figure 2.** Network Architecture. The Multi-scale StableNet is based on Siamese architecture. Two consecutive unstable frames will be …

**Figure 3.** Implementation Details. All padding are in VALID …

**Figure 4.** Frames from our dataset. The dataset consists of about 420 pairs of steady and synthesized shaky videos with three extents of …

**Figure 5.** Fidelity experiments results. The fidelity is measured by calculating the average interframe PSNR (in dB): (a) shows the eval …

**Figure 6.** Stability experiments results. Stability is measured based on the minimum energy percentage in rotation, horizontal translation and …

**Figure 7.** Fidelity and stability for (a) zooming and (b) parallax videos. Although there is no scaling or grid warping in the output affine …
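
The Figure 5 caption names average interframe PSNR as the fidelity measure. The sketch below shows the textbook PSNR formula averaged over consecutive frame pairs. It is a minimal illustration: the function names and the 8-bit `peak` default are assumptions, and the caption is truncated, so any preprocessing the authors apply before computing PSNR is unknown.

```python
import numpy as np

def psnr(frame_a, frame_b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized frames."""
    diff = frame_a.astype(np.float64) - frame_b.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return float(10.0 * np.log10(peak ** 2 / mse))

def mean_interframe_psnr(frames, peak=255.0):
    """Average PSNR over all consecutive frame pairs of a video."""
    scores = [psnr(a, b, peak) for a, b in zip(frames[:-1], frames[1:])]
    return float(np.mean(scores))
```

A steadier video has more similar consecutive frames and therefore a higher average interframe PSNR, although scene motion and black borders introduced by warping also affect the score.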


