GPTQ quantizes 175B-parameter GPT models to 3-4 bits per weight in one shot using approximate second-order information, achieving negligible accuracy degradation and 3-4x inference speedups.
…(2016) and OPT-125M on WikiText2, which is one of the largest models to which OBQ can be reasonably applied.
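Since the summary names the core mechanism (one-shot quantization driven by approximate second-order information), here is a minimal NumPy sketch of a GPTQ-style column-by-column update under stated assumptions: a precomputed layer Hessian from calibration inputs, a simple symmetric per-row grid, and an illustrative damping factor. The function name and these choices are assumptions for illustration, not the authors' reference implementation.

```python
# Minimal sketch of GPTQ-style quantization (illustrative, not the reference code).
# Assumes H = 2 * X @ X.T was accumulated from calibration inputs X for this layer.
import numpy as np

def quantize_layer_gptq(W: np.ndarray, H: np.ndarray, bits: int = 4) -> np.ndarray:
    """Quantize weight matrix W (rows x cols) one column at a time,
    compensating not-yet-quantized columns with second-order information."""
    W = W.astype(np.float64).copy()
    n = W.shape[1]

    # Symmetric per-row scale for a b-bit grid (illustrative assumption).
    qmax = 2 ** (bits - 1) - 1
    scale = np.maximum(np.abs(W).max(axis=1, keepdims=True) / qmax, 1e-12)

    # Damped inverse Hessian; use its upper-triangular Cholesky factor,
    # as in the paper's formulation of the update.
    damp = 0.01 * np.mean(np.diag(H))
    Hinv = np.linalg.inv(H + damp * np.eye(n))
    Hinv_chol = np.linalg.cholesky(Hinv).T  # upper triangular

    Q = np.zeros_like(W)
    for i in range(n):
        # Round column i to the quantization grid.
        q = np.clip(np.round(W[:, i:i+1] / scale), -qmax - 1, qmax) * scale
        Q[:, i:i+1] = q
        # Propagate the scaled quantization error to the remaining columns.
        err = (W[:, i:i+1] - q) / Hinv_chol[i, i]
        W[:, i+1:] -= err @ Hinv_chol[i:i+1, i+1:]
    return Q
```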
1 Pith paper cites this work. Polarity classification is still indexing.

Fields: cs.LG (1)
Years: 2022 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers: 1

Citing papers explorer
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers