Routing topology in sparse Mixture-of-Experts models does not determine asymptotic language modeling perplexity; multiple variants including cosine-similarity routing achieve statistically equivalent performance.
RoMA: Routing manifold alignment improves generalization of mixture-of-experts LLMs
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2representative citing papers
CoGR-MoE improves VQA by using concept-guided expert routing with option feature reweighting and contrastive learning to achieve consistent yet flexible reasoning across answer options.
citing papers explorer
-
Equifinality in Mixture of Experts: Routing Topology Does Not Determine Language Modeling Quality
Routing topology in sparse Mixture-of-Experts models does not determine asymptotic language modeling perplexity; multiple variants including cosine-similarity routing achieve statistically equivalent performance.
-
CoGR-MoE: Concept-Guided Expert Routing with Consistent Selection and Flexible Reasoning for Visual Question Answering
CoGR-MoE improves VQA by using concept-guided expert routing with option feature reweighting and contrastive learning to achieve consistent yet flexible reasoning across answer options.