
arxiv: 1902.07383 · v2 · pith:XWGTDRZT · submitted 2019-02-20 · eess.IV

Neural Video Compression using Spatio-Temporal Priors

classification 📡 eess.IV
keywords priors, temporal, video, coding, compression, neural, jointly, residuals


The pursuit of higher compression efficiency continuously drives advances in video coding technologies. Fundamentally, we wish to find better "predictions" or "priors" from previously reconstructed content, both to remove signal dependency efficiently and to model the signal distribution accurately for entropy coding. In this work, we propose a neural video compression framework that leverages spatial and temporal priors, independently and jointly, to exploit the correlations in intra texture, optical-flow-based temporal motion, and residuals. Spatial priors are generated from downscaled low-resolution features, while temporal priors (from previous reference frames and residuals) are captured by a convolutional long short-term memory (ConvLSTM) structure in a temporally recurrent fashion. All of these parts are connected and trained jointly toward optimal rate-distortion performance. Compared with the High Efficiency Video Coding (HEVC) Main Profile (MP), our method demonstrates an average 38% Bjontegaard-Delta Rate (BD-Rate) improvement on standard common test sequences, where distortion is measured by multi-scale structural similarity (MS-SSIM).
