RDKV derives per-token and per-channel weights from attention distortion, then uses reverse water-filling to assign bit-widths from full precision to zero after prefilling, recovering 97.81% accuracy with 2.48% cache retention on LongBench.
Optimal brain damage. Advances in Neural Information Processing Systems, 2
2 Pith papers cite this work.
Efficient One-Step Diffusion Restoration Model with Compact Token Compression and Linear Attention
SANA-SR uses 32x deep compression autoencoding and linear-attention DiT to deliver competitive real-world image super-resolution at 0.019s inference after pruning.