Mixture of Experts in Image Classification: What’s the Sweet Spot?, October 2025

Mathurin Videau, Alessandro Leite, Marc Schoenauer, Olivier Teytaud · 2025 · arXiv 2411.18322

2 Pith papers cite this work. Polarity classification is still being indexed.

representative citing papers

Beyond Routing: Characterising Expert Tuning and Representation in Vision Mixture-of-Experts

cs.CV · 2026-05-20 · unverdicted · novelty 7.0

Expert specialization in vision MoE models is dominated by a stable animate-inanimate distinction visible from gating to readout, with broader tuning to continuous visual and semantic dimensions rather than narrow categorical preferences.

When Does Sparse MoE Help in Vision? The Role of Backbone Compute Leverage in Sparse Routing

cs.CV · 2026-05-15 · unverdicted · novelty 5.0

Sparse MoE vision models show positive accuracy gaps only when routing a substantial compute fraction ρ and using k≥2 experts at large scale; batch-axis dispatch is identified as a key failure mode.

citing papers explorer

Showing 2 of 2 citing papers.

Beyond Routing: Characterising Expert Tuning and Representation in Vision Mixture-of-Experts · cs.CV · 2026-05-20 · unverdicted · none · ref 15

When Does Sparse MoE Help in Vision? The Role of Backbone Compute Leverage in Sparse Routing · cs.CV · 2026-05-15 · unverdicted · none · ref 61
