A Simple and Effective L_2 Norm-Based Strategy for KV Cache Compression

4 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

EntmaxKV: Support-Aware Decoding for Entmax Attention

cs.LG · 2026-05-20 · conditional · novelty 8.0

EntmaxKV enables exact sparse KV-cache decoding for entmax attention via support-aware page selection and a Gaussian threshold estimator, matching full attention quality at a fraction of the cache size with up to 5.43x speedup.

citing papers explorer

Showing 2 of 2 citing papers after filters.

  • EntmaxKV: Support-Aware Decoding for Entmax Attention cs.LG · 2026-05-20 · conditional · none · ref 5

    EntmaxKV enables exact sparse KV-cache decoding for entmax attention via support-aware page selection and a Gaussian threshold estimator, matching full attention quality at a fraction of the cache size with up to 5.43x speedup.

  • Value-Aware Stochastic KV Cache Eviction for Reasoning Models cs.LG · 2026-06-02 · unverdicted · none · ref 24

    VaSE improves KV cache eviction accuracy for reasoning models by over 4% versus prior eviction methods at 4x compression through value-magnitude protection and stochastic diversity.