RT-Lynx shifts DiT sparsity from weights to activations, reports up to 1.55x linear-layer speedup while preserving generation quality across multiple diffusion models.
Available: https://arxiv.org/abs/2312.00858
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
OTCache uses optimal transport to interpolate caching schedules between a graph-based reference and an Optuna-optimized anchor, delivering 3.66x-4.7x speedups on FLUX.1, Qwen-Image and HunyuanVideo with improved fidelity.
Causal probing of attention in audio separation transformers identifies dual pathways and asynchronous convergence, enabling a training-free Layer-Selective Attention Caching method that reduces self-attention computation by ~25% with negligible quality loss.
citing papers explorer
-
RT-Lynx: Putting the GEMM Sparsity In a Right Way for Diffusion Models
RT-Lynx shifts DiT sparsity from weights to activations, reports up to 1.55x linear-layer speedup while preserving generation quality across multiple diffusion models.
-
OTCache: Optimal Transport for Geometry-Aware Caching in Diffusion Models
OTCache uses optimal transport to interpolate caching schedules between a graph-based reference and an Optuna-optimized anchor, delivering 3.66x-4.7x speedups on FLUX.1, Qwen-Image and HunyuanVideo with improved fidelity.
-
Inside the Latent Flow: Causal Deciphering of Attention Dynamics in Audio Separation Foundation Models
Causal probing of attention in audio separation transformers identifies dual pathways and asynchronous convergence, enabling a training-free Layer-Selective Attention Caching method that reduces self-attention computation by ~25% with negligible quality loss.