Not all heads matter: A head-level kv cache compression method with integrated retrieval and reasoning. International Conference on Learning Representations, 2025

Yu Fu, Zefan Cai, Abedelkadir Asi, Wayne Xiong, Yue Dong, Wen Xiao · 2025

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

cs.DS · 2026-05-07 · unverdicted · novelty 8.0

ε-coresets for attention exist of size O(√d e^{ρ+o(ρ)}/ε) for unit-norm keys/values and queries of norm ≤ρ, nearly matching the Ω(√d e^ρ/ε) lower bound.

Nearly Optimal Attention Coresets · cs.DS · 2026-05-07 · unverdicted · none · ref 20