The authors cast auxiliary-loss-free load balancing as a primal-dual assignment solver, prove structural properties in deterministic and online regimes, and report experiments on 1B-parameter DeepSeekMoE models.
Optimal and stable distributed bipartite load balancing. arXiv preprint arXiv:2411.17103.
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: math.OC (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
A Theoretical Framework for Auxiliary-Loss-Free Load Balancing of Sparse Mixture-of-Experts in Large-Scale AI Models