KVCache cache in the wild: Characterizing and optimizing KVCache cache at a large cloud provider. arXiv preprint arXiv:2506.02634, 2025

7 Recency/Frequency Adaptive KV Caching for Large Language Model Serving Wang, J · 2025 · arXiv 2506.02634

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Fail-Closed Lowering of Resident KV Claims onto LLM Serving Runtimes

cs.DC · 2026-05-31 · unverdicted · novelty 6.0

Introduces fail-closed lowering semantics for Resident KV Claims in LLM serving runtimes, along with a conformance checker, descriptor format, and classification of existing systems.

Recency/Frequency Adaptive KV Caching for Large Language Model Serving

cs.DC · 2026-06-19 · unverdicted · novelty 5.0

Presents a recency/frequency adaptive KV caching approach that achieves up to 10.8% higher hit rate and 12.6% lower TTFT compared to vLLM on synthetic workloads.

citing papers explorer

Fail-Closed Lowering of Resident KV Claims onto LLM Serving Runtimes

cs.DC · 2026-05-31 · unverdicted · none · ref 15

Introduces fail-closed lowering semantics for Resident KV Claims in LLM serving runtimes, along with a conformance checker, descriptor format, and classification of existing systems.

Recency/Frequency Adaptive KV Caching for Large Language Model Serving

cs.DC · 2026-06-19 · unverdicted · none · ref 10

Presents a recency/frequency adaptive KV caching approach that achieves up to 10.8% higher hit rate and 12.6% lower TTFT compared to vLLM on synthetic workloads.
