MatryoshkaKV: Adaptive KV Compression via Trainable Orthogonal Projection

5 Pith papers cite this work. Polarity classification is still indexing.

Citing papers by year: 2026 (4), 2025 (1)

representative citing papers

Quantization Dominates Rank Reduction for KV-Cache Compression

cs.LG · 2026-04-13 · conditional · novelty 6.0

At matched storage budgets, quantizing the KV cache beats rank reduction by 4-364 perplexity (PPL): removing dimensions can flip which tokens attention selects under softmax, whereas bounded quantization noise usually preserves the ordering of attention scores.

OjaKV: Context-Aware Online Low-Rank KV Cache Compression

cs.CL · 2025-09-25 · unverdicted · novelty 6.0

OjaKV introduces hybrid full-rank storage for key tokens combined with online low-rank KV cache compression via Oja's algorithm to support memory-efficient long-context LLM inference.
