ELDR reduces median TPOT by 5.9-13.9% in PD-disaggregated MoE serving by routing decode requests via prefill-derived expert signatures and K-means locality partitioning over load-balancing baselines.
Title resolution pending
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.DC 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
ELDR: Expert-Locality-Aware Decode Routing for PD-Disaggregated MoE Serving
ELDR reduces median TPOT by 5.9-13.9% in PD-disaggregated MoE serving by routing decode requests via prefill-derived expert signatures and K-means locality partitioning over load-balancing baselines.