Latent Cache Flow uses small adapters to jointly translate and compress KV caches between LLMs, enabling accurate communication even with mismatched contexts and outperforming both prior cache adapters and text in early tests.
International Conference on Learning Representations, year=
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Latent Cache Flow: Model-to-Model Communication Without Text