Harsh Jhamtani and Taylor Berg-Kirkpatrick

Lightvlm: Accelerating large multimodal models with pyramid token merging, kv cache compression · 2018 · arXiv 2509.00419

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

HybridKV: Hybrid KV Cache Compression for Efficient Multimodal Large Language Model Inference

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

HybridKV reduces KV cache memory by up to 7.9x and speeds decoding by 1.52x in MLLMs with almost no performance loss by classifying heads into static and dynamic types and compressing them differently.
