In the right figure, we observe that using more calibration samples can further reduce perplexity on the test set, but the improvement is marginal.

As shown in the left figure, since routers contain only a small number of parameters, training converges within a single epoch in under 2 minutes. · 2024

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

GEMQ: Global Expert-Level Mixed-Precision Quantization for MoE LLMs

cs.LG · 2026-05-21 · unverdicted · novelty 6.0

GEMQ applies global LP-based expert importance estimation and router fine-tuning within progressive quantization to cut memory and speed inference in MoE LLMs with little accuracy loss.

citing papers explorer


GEMQ: Global Expert-Level Mixed-Precision Quantization for MoE LLMs

cs.LG · 2026-05-21 · unverdicted · none · ref 37
