
arxiv: 2603.15129 · v3 · pith:2TEUNCE7new · submitted 2026-03-16 · 💻 cs.CV

Next-Frame Decoding for Ultra-Low-Bitrate Image Compression with Video Diffusion Priors

classification 💻 cs.CV
keywords image, decoding, anchor, compression, frame, temporal, achieves, diffusion

We present a novel paradigm for ultra-low-bitrate image compression (ULB-IC) that exploits the "temporal" evolution in generative image compression. Specifically, we define an explicit intermediate state during decoding: a compact anchor frame, which preserves the scene geometry and semantic layout while discarding high-frequency details. We then reinterpret generative decoding as a virtual temporal transition from this anchor to the final reconstructed image. To model this progression, we leverage a pretrained video diffusion model (VDM) as a temporal prior: the anchor frame serves as the initial frame and the original image as the target frame, transforming the decoding process into a next-frame prediction task. In contrast to image diffusion-based ULB-IC models, our decoding proceeds from a visible, semantically faithful anchor, which improves both fidelity and realism for perceptual image compression. Extensive experiments demonstrate that our method achieves superior rate-distortion performance. On the CLIC2020 test set, our method achieves over 50% bitrate savings across LPIPS, DISTS, FID, and KID compared to DiffC, while also delivering a decoding speedup of up to 5$\times$. Code will be released at https://github.com/UnoC-727/NeFIC.
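The anchor-then-next-frame idea can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the anchor is built, so block averaging is used here purely as an assumed stand-in for "keep layout, discard high-frequency detail", and `make_anchor`, `decode_next_frame`, and `identity_predictor` are hypothetical names. The predictor argument marks the place where a pretrained video diffusion model would sit.

```python
from typing import Callable, List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def make_anchor(image: Image, block: int = 2) -> Image:
    """Toy anchor frame: average each block x block tile, keeping the size.

    Assumption for illustration only: coarse layout survives, fine detail
    is discarded. The paper's real anchor construction may differ.
    """
    h, w = len(image), len(image[0])
    anchor = [[0.0] * w for _ in range(h)]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            ys = range(by, min(by + block, h))
            xs = range(bx, min(bx + block, w))
            mean = sum(image[y][x] for y in ys for x in xs) / (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    anchor[y][x] = mean
    return anchor


def decode_next_frame(anchor: Image, predictor: Callable[[Image], Image]) -> Image:
    """Treat decoding as predicting the frame that follows the anchor."""
    return predictor(anchor)


def identity_predictor(frame: Image) -> Image:
    """Hypothetical placeholder for a video diffusion prior; copies the frame."""
    return [row[:] for row in frame]


image = [
    [0.0, 2.0, 4.0, 6.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 1.0, 3.0, 3.0],
    [1.0, 1.0, 3.0, 3.0],
]
anchor = make_anchor(image, block=2)
reconstruction = decode_next_frame(anchor, identity_predictor)
```

With the identity placeholder the reconstruction simply equals the anchor. In the setting the abstract describes, the predictor would instead restore the discarded detail, with the anchor acting as the visible first frame of a two-frame "video".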
