hub
Fora: Fast-forward caching in diffusion transformer acceleration. arXiv preprint arXiv:2407.01425
13 Pith papers cite this work. Polarity classification is still indexing.
roles: background 2
polarities: background 2
citing papers explorer
- VDE: Training-Free Accelerating Rectified Flow Model via Velocity Decomposition and Estimation
  VDE accelerates rectified flow models like Flux by 3.22x with LPIPS of 0.069 via velocity decomposition into parallel/orthogonal components plus periodic full-pass anchoring.
- ORBIS: Output-Guided Token Reduction with Distribution-Aware Matching for Video Diffusion Acceleration
  ORBIS uses output-guided token reduction and DATM to achieve 2x higher token reduction than AsymRnR, with up to 4.5x speedup and 79.3% energy savings versus A100 GPU for video DiT models.
- Test-time Sparsity for Extreme Fast Action Diffusion
  Test-time sparsity with a parallel pipeline and omnidirectional feature reuse accelerates action diffusion by 5x to 47.5 Hz while cutting FLOPs 92% with no performance loss.
- SpecEdit: Training-Free Acceleration for Diffusion based Image Editing via Semantic Locking
  SpecEdit accelerates diffusion-based image editing up to 10x by using a low-resolution draft to identify edit-relevant tokens via semantic discrepancies for selective high-resolution denoising.
- 1.x-Distill: Breaking the Diversity, Quality, and Efficiency Barrier in Distribution Matching Distillation
  1.x-Distill achieves better quality and diversity than prior few-step distillation methods at 1.67 and 1.74 effective NFEs on SD3 models with up to 33x speedup.
- DisCa: Accelerating Video Diffusion Transformers with Distillation-Compatible Learnable Feature Caching
  DisCa replaces heuristic feature caching with a lightweight learnable neural predictor compatible with distillation, achieving 11.8× acceleration on video diffusion transformers with preserved generation quality.
- FIS-DiT: Breaking the Few-Step Video Inference Barrier via Training-Free Frame Interleaved Sparsity
  FIS-DiT achieves 2.11-2.41x speedup on video DiT models in few-step regimes with negligible quality loss by exploiting frame-wise sparsity and consistency through a training-free interleaved execution strategy.
- FlashClear: Ultra-Fast Image Content Removal via Efficient Step Distillation and Feature Caching
  FlashClear delivers up to 122x faster object removal than prior diffusion models via adversarial step distillation and asymmetric attention caching while preserving visual quality.
- Motion-Aware Caching for Efficient Autoregressive Video Generation
  MotionCache accelerates autoregressive video generation up to 6.28x by motion-weighted cache reuse based on inter-frame differences, with negligible quality loss on SkyReels-V2 and MAGI-1.
- Block-wise Adaptive Caching for Accelerating Diffusion Policy
  BAC accelerates transformer-based Diffusion Policy up to 3x by block-level adaptive feature caching using an Adaptive Caching Scheduler and Bubbling Union Algorithm to control error propagation.
- Focused Forcing: Content-Aware Per-Frame KV Selection for Efficient Autoregressive Video Diffusion
  Focused Forcing is a training-free per-frame KV selection method that combines attention scores with diversity metrics and head-importance estimation to accelerate autoregressive video diffusion up to 1.48x while improving quality.
- AdaCorrection: Adaptive Offset Cache Correction for Accurate Diffusion Transformers
  AdaCorrection adaptively corrects offset caches in DiT inference via on-the-fly spatio-temporal validity checks to maintain near-original FID with moderate acceleration.
- Sparse ActionGen: Accelerating Diffusion Policy with Real-time Pruning
  Sparse ActionGen accelerates diffusion policies up to 4x for robot control via rollout-adaptive pruning and zig-zag activation reuse without performance loss.