arXiv preprint arXiv:2402.04902 (2024)

Jeon, H · 2024 · arXiv 2402.04902

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

LLM Compression with Jointly Optimizing Architectural and Quantization choices

cs.LG · 2026-06-02 · unverdicted · novelty 6.0

A differentiable NAS framework jointly optimizes LLM architecture and mixed-precision quantization for linear layers, yielding up to 1.4x faster inference or 6% higher accuracy than sequential baselines on reasoning tasks.

citing papers explorer

Showing 1 of 1 citing paper.

LLM Compression with Jointly Optimizing Architectural and Quantization choices
cs.LG · 2026-06-02 · unverdicted · none · ref 17
A differentiable NAS framework jointly optimizes LLM architecture and mixed-precision quantization for linear layers, yielding up to 1.4x faster inference or 6% higher accuracy than sequential baselines on reasoning tasks.
