
arxiv: 1708.03088 · v1 · submitted 2017-08-10 · 💻 cs.CV


Semantic Video CNNs through Representation Warping

keywords: architectures, video, warping, available, CNNs, computational, cost, different

In this work, we propose a technique to convert CNN models for semantic segmentation of static images into CNNs for video data. We describe a warping method that can be used to augment existing architectures at very little extra computational cost. This module is called NetWarp, and we demonstrate its use for a range of network architectures. The main design principle is to use the optical flow between adjacent frames to warp internal network representations across time. A key insight of this work is that fast optical flow methods can be combined with many different CNN architectures for improved performance and end-to-end training. Experiments validate that the proposed approach incurs only a small extra computational cost while improving performance when video streams are available. We achieve new state-of-the-art results on the CamVid and Cityscapes benchmark datasets and show consistent improvements over different baseline networks. Our code and models will be available at http://segmentation.is.tue.mpg.de
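
The warping idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' NetWarp implementation: the function names `warp_features` and `combine`, the flow convention (displacements pointing from the current frame to the previous one), the border clamping, and the per-channel weighted sum used to fuse the two representations are all assumptions made for illustration. The abstract itself only states that optical flow is used to warp internal representations across time.

```python
import numpy as np

def warp_features(prev_feat, flow):
    """Bilinearly sample prev_feat (C, H, W) at positions shifted by flow (2, H, W).

    Assumed convention: flow[0] is the horizontal (x) displacement and flow[1]
    the vertical (y) displacement, in pixels, from the current frame to the
    previous frame. Sampling positions are clamped to the feature-map border.
    """
    C, H, W = prev_feat.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Sampling locations in the previous frame's feature map.
    x = np.clip(xs + flow[0], 0, W - 1)
    y = np.clip(ys + flow[1], 0, H - 1)
    # Integer corners surrounding each sampling location.
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    # Fractional offsets used as bilinear interpolation weights.
    wx = x - x0
    wy = y - y0
    top = prev_feat[:, y0, x0] * (1 - wx) + prev_feat[:, y0, x1] * wx
    bottom = prev_feat[:, y1, x0] * (1 - wx) + prev_feat[:, y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def combine(curr_feat, warped_prev, w_curr, w_prev):
    """Hypothetical fusion step: per-channel weighted sum of the two maps."""
    return w_curr[:, None, None] * curr_feat + w_prev[:, None, None] * warped_prev

# Usage: warp a previous-frame feature map by one pixel and fuse it
# with the current-frame features using equal weights.
prev_feat = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
curr_feat = np.ones((2, 3, 4))
flow = np.zeros((2, 3, 4))
flow[0] = 1.0  # every location looks one pixel to the right in the previous frame
warped = warp_features(prev_feat, flow)
fused = combine(curr_feat, warped, np.full(2, 0.5), np.full(2, 0.5))
```

Because bilinear sampling is differentiable with respect to both the features and the flow, a step of this kind can sit inside a network that is trained end to end, which is consistent with the end-to-end training mentioned in the abstract. In a trained model the fusion weights would be learned rather than fixed.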
