Dynamic int8 quantization via Quanto on Whisper-small reduces size by 57% and improves WER on LibriSpeech test sets compared to the unquantized baseline.
However, this accuracy comes at a cost: models with hundreds of millions of parameters are difficult to deploy on edge devices and embedded systems, or in latency-sensitive applications.
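To make the size trade-off concrete, here is a minimal sketch of int8 weight quantization, assuming symmetric per-tensor scaling. It is not Quanto's implementation: the helper names `quantize_int8` and `dequantize_int8` and the random stand-in weights are hypothetical, and it covers only weight storage, not the runtime activation quantization that "dynamic" implies.

```python
import random
import struct


def quantize_int8(weights):
    """Symmetric per-tensor quantization (assumed scheme): floats -> ints in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float weights from the int8 codes and the shared scale."""
    return [v * scale for v in q]


# Hypothetical stand-in for one weight tensor; not real Whisper weights.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(1024)]

q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Storage cost: 4 bytes per fp32 weight vs. 1 byte per int8 weight plus one fp32 scale.
fp32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(q)}b", *q)) + 4

# Rounding to the nearest code bounds the per-weight error by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"fp32: {fp32_bytes} bytes, int8: {int8_bytes} bytes")
print(f"max reconstruction error: {max_error:.6f} (step size {scale:.6f})")
```

For a single tensor this cuts storage to roughly a quarter. The 57% whole-model reduction quoted above is smaller than that, which is what one would expect if some parameters stay in higher precision, though the source does not give a breakdown.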
Cited by 1 Pith paper (field: eess.AS; year: 2025; verdict: UNVERDICTED):
Quantizing Whisper-small: How design choices affect ASR performance