RotateKV: Accurate and robust 2-bit KV cache quantization for LLMs via outlier-aware adaptive rotations

Zunhai Su, Hanyu Wei, Zhe Chen, Wang Shen, Linge Li, Huangqi Yu, Kehong Yuan · 2025 · DOI 10.24963/ijcai.2025/690

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

RoPE-Aware Bit Allocation for KV-Cache Quantization

cs.LG · 2026-06-23 · unverdicted · novelty 7.0

Block-GTQ performs RoPE-aware greedy bit allocation on KV caches using per-block energy scores, cutting logit MAE 32-80% versus uniform TQ-MSE and lifting long-context task scores substantially at 2-3 bits per dimension.
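
The general idea of greedy bit allocation driven by per-block energy can be sketched as follows. This is a minimal illustration and not the Block-GTQ algorithm itself: it assumes the textbook high-resolution model in which each extra bit cuts a block's quantization distortion by a factor of four, and the function name `greedy_bit_allocation`, its parameters, and the example energies are all hypothetical.

```python
import heapq


def greedy_bit_allocation(energies, total_bits, max_bits=8):
    """Hand out `total_bits` one at a time to the block whose modeled
    distortion is currently largest.

    Assumption (not from the cited paper): distortion of block i with
    b bits is energies[i] * 4**(-b).
    """
    bits = [0] * len(energies)
    # heapq is a min-heap, so store negated distortion to pop the largest.
    heap = [(-float(e), i) for i, e in enumerate(energies)]
    heapq.heapify(heap)
    for _ in range(total_bits):
        if not heap:  # every block already sits at max_bits
            break
        neg_distortion, i = heapq.heappop(heap)
        bits[i] += 1
        if bits[i] < max_bits:
            # One more bit divides the modeled distortion by four.
            heapq.heappush(heap, (neg_distortion / 4.0, i))
    return bits


# Four blocks with uneven energy and a budget of 7 bits in total.
allocation = greedy_bit_allocation([16.0, 4.0, 1.0, 1.0], total_bits=7)
print(allocation)
```

High-energy blocks receive more bits than low-energy ones, while the total stays within the budget. A RoPE-aware variant would presumably define blocks and energy scores around the rotary frequency pairs, but that detail is not given here.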

citing papers explorer

RoPE-Aware Bit Allocation for KV-Cache Quantization

cs.LG · 2026-06-23 · unverdicted · none · ref 33