QSLM automates tiered quantization of spike-driven language models via sensitivity analysis and multi-objective search, delivering up to 86.5% memory reduction and 20% power savings while keeping accuracy close to the full-precision baseline.
A programmable event-driven architecture for evaluating spiking neural networks,
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.NE (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
QSLM: A Performance- and Memory-aware Quantization Framework with Tiered Search Strategy for Spike-driven Language Models