Importantly, the key experts (i.e., the peaks in the error-estimation curves) with large estimated errors are consistently identified across different samples.

As shown in the figures, GEMQ is relatively robust to sampling noise: the estimated error curves largely overlap even though only 128 sequences are used for calibration, achieving an average Pearson correlation over 0 · 2048
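The robustness check described above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's code: the error curves are synthetic, and the names `mean_pairwise_pearson`, `runs`, and `peak_ids` are hypothetical. Each row of `runs` stands in for one per-expert error curve estimated from a different calibration sample.

```python
import numpy as np

def mean_pairwise_pearson(curves):
    """Average Pearson correlation over all distinct pairs of rows."""
    curves = np.asarray(curves, dtype=float)
    corr = np.corrcoef(curves)                 # rows are treated as variables
    upper = np.triu_indices(corr.shape[0], k=1)  # each distinct pair once
    return float(corr[upper].mean())

# Synthetic stand-in data (assumption): 64 experts, 4 calibration samples.
rng = np.random.default_rng(0)
n_experts = 64
peak_ids = [5, 23, 41]                         # hypothetical "key experts"
base = rng.uniform(0.0, 1.0, size=n_experts)
base[peak_ids] += 5.0                          # peaks in the error curve

# Each run is the shared curve plus small sampling noise.
runs = np.stack(
    [base + rng.normal(scale=0.05, size=n_experts) for _ in range(4)]
)

score = mean_pairwise_pearson(runs)
# Top-3 experts by estimated error in each run.
top_sets = [set(np.argsort(r)[-3:].tolist()) for r in runs]
print(f"mean pairwise Pearson correlation: {score:.4f}")
print("same key experts in every run:", all(s == set(peak_ids) for s in top_sets))
```

With real data, `runs` would hold the estimated error curves from separate calibration draws. A high mean correlation together with identical top-k sets would correspond to the overlap and the consistent peak identification reported in the excerpt.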

1 Pith paper cites this work.

representative citing papers

GEMQ: Global Expert-Level Mixed-Precision Quantization for MoE LLMs

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

GEMQ applies global LP-based expert importance estimation and router fine-tuning within progressive quantization to cut memory and speed inference in MoE LLMs with little accuracy loss.

citing papers explorer

GEMQ: Global Expert-Level Mixed-Precision Quantization for MoE LLMs cs.LG · 2026-05-21 · unverdicted · none · ref 35