GPTQ-intrinsic LoRA augments GPTQ with intrinsic low-rank compensation via Hessian modification to achieve layer-wise reconstruction bounds that match information-theoretic lower bounds under structural assumptions.
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED: 4
Representative citing papers
Derives non-asymptotic 2-norm and infinity-norm error bounds for deterministic and stochastic variants of OPTQ and Qronos PTQ algorithms.
With known covariance, waterfilling improves GPTQ and WaterSIC reaches within 0.25 bit/entry of the rate-distortion limit while being basis-independent.
High-rate quantization theory yields accurate approximations for the distortion of absmax INT and FP schemes in generic weight-plus-activation matrix multiplication.
Citing papers explorer
- GPTQ-intrinsic LoRA: A Near-optimal Algorithm for Low-precision Quantization with Low-rank Adaptation
  GPTQ-intrinsic LoRA augments GPTQ with intrinsic low-rank compensation via Hessian modification to achieve layer-wise reconstruction bounds that match information-theoretic lower bounds under structural assumptions.
- High-Rate Quantized Matrix Multiplication II
  With known covariance, waterfilling improves GPTQ and WaterSIC reaches within 0.25 bit/entry of the rate-distortion limit while being basis-independent.
- High-Rate Quantized Matrix Multiplication I
  High-rate quantization theory yields accurate approximations for the distortion of absmax INT and FP schemes in generic weight-plus-activation matrix multiplication.