Empirical tests show that 8-bit weight-only quantization is lossless on both models, whereas 4-bit works for the 7B model but degrades the 1B model on reasoning, math, and code tasks; settings of 2 bits or lower cause performance to collapse.
arXiv preprint arXiv:2409.16694 (2024)
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
An Empirical Study of OpenPangu Quantization on Ascend NPUs