What makes quantization for large language models hard? An empirical study from the lens of perturbation. CoRR

Zhuocheng Gong, Jiahao Liu, Jingang Wang, Xunliang Cai, Dongyan Zhao, Rui Yan · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

RUQuant: Towards Refining Uniform Quantization for Large Language Models

cs.CL · 2026-04-05 · unverdicted · novelty 6.0

RUQuant uses block-wise composite orthogonal matrices built from Householder reflections and Givens rotations, plus a fine-tuned global reflection, to retain 99.8% of full-precision accuracy at W6A6 and 97% at W4A4 for 13B LLMs in about one minute.

citing papers explorer

Showing 1 of 1 citing paper.

RUQuant: Towards Refining Uniform Quantization for Large Language Models

cs.CL · 2026-04-05 · unverdicted · none · ref 17