RUQuant uses block-wise composite orthogonal matrices from Householder reflections and Givens rotations plus a fine-tuned global reflection to achieve 99.8% full-precision accuracy at W6A6 and 97% at W4A4 for 13B LLMs in about one minute.
Slicegpt - openai GPT-3.5 llm, blockchain and non-fungible token enabled intelligent 5g/6g network slice broker and marketplace
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
RUQuant: Towards Refining Uniform Quantization for Large Language Models
RUQuant uses block-wise composite orthogonal matrices from Householder reflections and Givens rotations plus a fine-tuned global reflection to achieve 99.8% full-precision accuracy at W6A6 and 97% at W4A4 for 13B LLMs in about one minute.