Latent Cache Flow uses small adapters to jointly translate and compress KV caches between LLMs, enabling accurate communication even with mismatched contexts and outperforming both prior cache adapters and text in early tests.
Proceedings of Machine Learning and Systems
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Latent Cache Flow: Model-to-Model Communication Without Text