Structural pruning for diffusion models
6 Pith papers cite this work.
Representative citing papers
Z-Image Turbo++ narrows the quality gap to 8-step generation via three distillation techniques tailored for the 2-step regime.
OFA-Diffusion Compression trains diffusion models once to yield multiple size-specific compressed subnetworks via restricted candidate spaces, importance-based channel allocation, and reweighting.
2ndMatch finetunes pruned diffusion models via second-order Jacobian matching inspired by Finite-Time Lyapunov Exponents to reduce the quality gap with dense models on image generation tasks.
DVG dynamically selects content-aware spatio-temporal acceleration strategies for diffusion-based video generation, delivering up to 7× speedup with near-lossless quality on models like HunyuanVideo.
Citing papers explorer
- DisCa: Accelerating Video Diffusion Transformers with Distillation-Compatible Learnable Feature Caching
  DisCa replaces heuristic feature caching with a lightweight learnable neural predictor compatible with distillation, achieving 11.8× acceleration on video diffusion transformers with preserved generation quality.
- High-Fidelity Two-Step Image Generation via Teacher-Aligned End-to-End Distillation
  Z-Image Turbo++ narrows the quality gap to 8-step generation via three distillation techniques tailored for the 2-step regime.
- 2ndMatch: Finetuning Pruned Diffusion Models via Second-Order Jacobian Matching
  2ndMatch finetunes pruned diffusion models via second-order Jacobian matching inspired by Finite-Time Lyapunov Exponents to reduce the quality gap with dense models on image generation tasks.
- Dynamic Video Generation: Shaping Video Generation Across Time and Space
  DVG dynamically selects content-aware spatio-temporal acceleration strategies for diffusion-based video generation, delivering up to 7× speedup with near-lossless quality on models like HunyuanVideo.