Distributed learning of mixtures of experts

2023 · arXiv 2312.09877

1 Pith paper cites this work. Polarity classification is still being indexed.

representative citing papers

Mix-MoE: Improving Multilingual Machine Translation of Large Language Models through Mixed MoEs

cs.CL · 2026-05-23 · unverdicted · novelty 4.0

Mix-MoE applies separate LM and MT expert groups in two post-pretraining stages with Fourier-enhanced routing to reduce parameter interference and improve multilingual MT over baselines.

citing papers explorer

Mix-MoE: Improving Multilingual Machine Translation of Large Language Models through Mixed MoEs

cs.CL · 2026-05-23 · unverdicted · none · ref 30
