PiKV proposes expert-sharded KV storage, PiKV routing, adaptive scheduling, and compression modules to reduce overhead in multi-GPU MoE inference.
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background 1
citation-polarity summary
verdicts
UNVERDICTED 2
roles
background 1
polarities
background 1
representative citing papers
The paper consolidates existing research on Mamba models, their architecture variants, adaptations to different data modalities, and applications across domains.
citing papers explorer
- PiKV: KV Cache Management System for Mixture of Experts
PiKV proposes expert-sharded KV storage, PiKV routing, adaptive scheduling, and compression modules to reduce overhead in multi-GPU MoE inference.
- A Survey of Mamba
The paper consolidates existing research on Mamba models, their architecture variants, adaptations to different data modalities, and applications across domains.