SkipKV performs sentence-level KV eviction using similarity scoring and dynamically adjusts hidden states via a steering vector to produce shorter, accurate CoT outputs, delivering up to 26.7% higher accuracy and 1.7x throughput versus prior eviction methods.
1 Pith paper cites this work.
Fields: cs.AI (1)
Years: 2025 (1)
Verdicts: CONDITIONAL (1)
Representative citing paper:
SkipKV: Selective Skipping of KV Generation and Storage for Efficient Inference with Large Reasoning Models