DL-QAT: Weight-decomposed low-rank quantization-aware training for large language models

Wenjin Ke, Zhe Li, Dong Li, Lu Tian, Emad Barsoum · 2025 · arXiv 2504.09223

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

HCInfer: An Efficient Inference System via Error Compensation for Resource-Constrained Devices

cs.LG · 2026-05-07 · unverdicted · novelty 5.0

HCInfer recovers up to 5.2% accuracy over compressed LLMs and delivers 10.4x speedup versus full-precision models by offloading compensation parameters to CPU with async execution on resource-limited hardware.

