61: Push the real limit of extremely low-bit post-training quantization methods for large language models , author=

Jiaqi Zhao, Miao Zhang, Ming Wang, Yuzhang Shang, Kaihao Zhang, Weili Guan, Yaowei Wang, Min Zhang · arXiv 2502.13179

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

representative citing papers

LBLLM: Lightweight Binarization of Large Language Models via Three-Stage Distillation

cs.LG · 2026-04-21 · unverdicted · novelty 6.0

LBLLM achieves better accuracy than prior binarization methods for LLMs by decoupling weight and activation quantization through initialization, layer-wise distillation, and learnable activation scaling.

GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling

cs.CL · 2026-04-20 · unverdicted · novelty 6.0 · 2 refs

GSQ uses Gumbel-Softmax to optimize scalar quantization grids for LLMs, closing most of the accuracy gap to vector methods like QTIP at 2-3 bits per parameter while using symmetric scalar grids compatible with existing kernels.

citing papers explorer

Showing 2 of 2 citing papers.

LBLLM: Lightweight Binarization of Large Language Models via Three-Stage Distillation cs.LG · 2026-04-21 · unverdicted · none · ref 94
LBLLM achieves better accuracy than prior binarization methods for LLMs by decoupling weight and activation quantization through initialization, layer-wise distillation, and learnable activation scaling.
GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling cs.CL · 2026-04-20 · unverdicted · none · ref 36 · 2 links
GSQ uses Gumbel-Softmax to optimize scalar quantization grids for LLMs, closing most of the accuracy gap to vector methods like QTIP at 2-3 bits per parameter while using symmetric scalar grids compatible with existing kernels.

61: Push the real limit of extremely low-bit post-training quantization methods for large language models , author=

fields

years

verdicts

representative citing papers

citing papers explorer