Optimizing mixture-of-experts inference time combining model deployment and communication scheduling
2 Pith papers cite this work.
Fields: cs.DC (2)
Citing papers
- SpaceMoE: Realizing Distributed Mixture-of-Experts Inference over Space Networks
  SpaceMoE partitions MoE layers across orbiting satellite subnets in a ring and optimizes expert placement by activation probability and path latency, yielding at least 3x lower inference latency in thousand-satellite simulations versus random baselines.
- Patterns behind Chaos: Forecasting Data Movement for Efficient Large-Scale MoE LLM Inference
  Comprehensive profiling of expert selection in frontier MoE models reveals temporal and spatial patterns that enable 6.6x speedup on wafer-scale GPUs and 1.25x on existing systems via targeted optimizations.