org/CorpusID:249240535

URL https://api · 2023 · arXiv 2303.01610

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Lynx: Enabling Efficient MoE Inference through Dynamic Batch-Aware Expert Selection

cs.LG · 2024-11-13 · unverdicted · novelty 6.0

Lynx exploits training-induced batch-level expert activation skews via AffinityBinning to reduce invoked experts per batch, delivering up to 1.30x throughput with under 1% accuracy loss across four model families.

Does a Global Perspective Help Prune Sparse MoEs Elegantly?

cs.CL · 2026-04-08 · unverdicted · novelty 5.0

GRAPE is a global redundancy-aware pruning strategy for sparse MoEs that dynamically allocates pruning budgets across layers and improves average accuracy by 1.40% over the best local baseline across tested models and settings.

FLEX-MoE: Federated Mixture-of-Experts with Load-balanced Expert Assignment for Edge Computing

cs.LG · 2025-12-28 · unverdicted · novelty 5.0

FLEX-MoE proposes client-expert fitness scores and an optimization algorithm to jointly maximize specialization and enforce balanced expert utilization in federated MoE for edge computing under non-IID data and capacity constraints.

citing papers explorer

Showing 3 of 3 citing papers.

Lynx: Enabling Efficient MoE Inference through Dynamic Batch-Aware Expert Selection · cs.LG · 2024-11-13 · unverdicted · none · ref 4
Does a Global Perspective Help Prune Sparse MoEs Elegantly? · cs.CL · 2026-04-08 · unverdicted · none · ref 5
FLEX-MoE: Federated Mixture-of-Experts with Load-balanced Expert Assignment for Edge Computing · cs.LG · 2025-12-28 · unverdicted · none · ref 1
