OPTQ: Accurate quantization for generative pre-trained transformers
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
High-rate quantization theory yields accurate approximations for the distortion of absmax INT and FP schemes in generic weight-plus-activation matrix multiplication.
citing papers explorer
- High-Rate Quantized Matrix Multiplication II
With known covariance, waterfilling improves GPTQ and WaterSIC reaches within 0.25 bit/entry of the rate-distortion limit while being basis-independent.
- High-Rate Quantized Matrix Multiplication I
High-rate quantization theory yields accurate approximations for the distortion of absmax INT and FP schemes in generic weight-plus-activation matrix multiplication.