OD-VAE: An omni-dimensional video compressor for improving latent video diffusion model

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 2

citation-polarity summary

years

2026: 5 · 2024: 1

verdicts

UNVERDICTED 6

polarities

background 2

representative citing papers

Efficient Video Diffusion Models: Advancements and Challenges

cs.CV · 2026-04-17 · unverdicted · novelty 7.0

A survey that groups efficient video diffusion methods into four paradigms—step distillation, efficient attention, model compression, and cache/trajectory optimization—and outlines open challenges for practical use.

Task-Oriented Communication for Human Action Understanding via Edge-Cloud Co-Inference

eess.SP · 2026-05-08 · unverdicted · novelty 5.0

TOAU compresses human motion videos to 9 bits per frame using pose estimation and a VQ-VAE, then aligns the resulting tokens to a vision-language model via a lightweight projector. It achieves 1% of the transmission payload and 20% of the latency of video codecs while maintaining comparable action-understanding accuracy.

Video Generation with Predictive Latents

cs.CV · 2026-05-04 · unverdicted · novelty 5.0

PV-VAE improves video latent spaces for generation by unifying reconstruction with future-frame prediction, reporting 52% faster convergence and 34.42 FVD gain over Wan2.2 VAE on UCF101.

HunyuanVideo: A Systematic Framework For Large Video Generative Models

cs.CV · 2024-12-03 · unverdicted · novelty 5.0

HunyuanVideo presents a 13B-parameter open-source video generative model with integrated data, architecture, training, and inference systems whose professional evaluations show it outperforming prior SOTA models including Runway Gen-3 and Luma 1.6.
