Reducing precision from 16-bit to 8/4-bit in multi-hop reasoning creates a quantization trap that raises net energy consumption and degrades accuracy, breaking linear scaling laws.
Restoring precision removes the software-emulated de-quantization bottleneck, increasing throughput such that ∂ESI ∂p >0
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.AI 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
The Quantization Trap: Breaking Linear Scaling Laws in Multi-Hop Reasoning
Reducing precision from 16-bit to 8/4-bit in multi-hop reasoning creates a quantization trap that raises net energy consumption and degrades accuracy, breaking linear scaling laws.