pith. sign in

Towards understanding the mixture-of-experts layer in deep learning

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

fields

cs.LG 1

years

2025 1

verdicts

UNVERDICTED 1

representative citing papers

Tight Clusters Make Specialized Experts

cs.LG · 2025-02-21 · unverdicted · novelty 6.0

Introduces Adaptive Clustering router for MoE models that scales features to identify tight expert clusters, yielding faster convergence, robustness to corruption, and performance gains.

citing papers explorer

Showing 1 of 1 citing paper.

  • Tight Clusters Make Specialized Experts cs.LG · 2025-02-21 · unverdicted · none · ref 7

    Introduces Adaptive Clustering router for MoE models that scales features to identify tight expert clusters, yielding faster convergence, robustness to corruption, and performance gains.