SpikingMamba distills Mamba into an SNN LLM achieving 4.76x energy savings with a 4.78% zero-shot accuracy gap that narrows to 2.23% after RL.
The sequence length is fixed at 2048 tokens, and the embedding layer remains frozen throughout training.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.NE (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
SpikingMamba: Towards Energy-Efficient Large Language Models via Knowledge Distillation from Mamba