H3ae: High compression, high speed, and high quality autoencoder for video diffusion models

Yushu Wu, Yanyu Li, Ivan Skorokhodov, Anil Kag, Willi Menapace, Sharath Girish, Aliaksandr Siarohin, Yanzhi Wang, Sergey Tulyakov · 2025 · arXiv 2504.10567

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

read on arXiv browse 2 citing papers

citation-role summary

background 2

citation-polarity summary

background 2

representative citing papers

Latent-Compressed Variational Autoencoder for Video Diffusion Models

cs.CV · 2026-04-12 · unverdicted · novelty 6.0

A frequency-based latent compression method for video VAEs yields higher reconstruction quality than channel-reduction baselines at fixed compression ratios.

Video Generation with Predictive Latents

cs.CV · 2026-05-04 · unverdicted · novelty 5.0

PV-VAE improves video latent spaces for generation by unifying reconstruction with future-frame prediction, reporting 52% faster convergence and 34.42 FVD gain over Wan2.2 VAE on UCF101.

citing papers explorer

Showing 2 of 2 citing papers.

Latent-Compressed Variational Autoencoder for Video Diffusion Models cs.CV · 2026-04-12 · unverdicted · none · ref 47
A frequency-based latent compression method for video VAEs yields higher reconstruction quality than channel-reduction baselines at fixed compression ratios.
Video Generation with Predictive Latents cs.CV · 2026-05-04 · unverdicted · none · ref 54
PV-VAE improves video latent spaces for generation by unifying reconstruction with future-frame prediction, reporting 52% faster convergence and 34.42 FVD gain over Wan2.2 VAE on UCF101.

H3ae: High compression, high speed, and high quality autoencoder for video diffusion models

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer