EdgeRazor uses structural mixed-precision quantization, layer-adaptive feature distillation, and entropy-aware KL divergence to achieve 1.88-bit LLMs that outperform prior 2-bit and 3-bit baselines with 4-10x lower training budget.
TGIF: A new dataset and benchmark on animated gif description
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
EdgeRazor: A Lightweight Framework for Large Language Models via Mixed-Precision Quantization-Aware Distillation
EdgeRazor uses structural mixed-precision quantization, layer-adaptive feature distillation, and entropy-aware KL divergence to achieve 1.88-bit LLMs that outperform prior 2-bit and 3-bit baselines with 4-10x lower training budget.