Jerry Chee, Yaohui Cai, Volodymyr Kuleshov, and Christopher M. De Sa
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3
citing papers explorer
- The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
  GPTQ is equivalent to Babai's nearest plane algorithm for the closest vector problem (CVP) on the Hessian lattice of layer inputs, yielding a geometric interpretation, inherited error bounds, and improved clipping-free quantization with GPU kernels.
- High-Rate Quantized Matrix Multiplication II
  With known covariance, waterfilling improves GPTQ, and WaterSIC reaches within 0.25 bit/entry of the rate-distortion limit while being basis-independent.
- High-Rate Quantized Matrix Multiplication I
  High-rate quantization theory yields accurate approximations for the distortion of absmax INT and FP schemes in generic weight-plus-activation matrix multiplication.