{"paper":{"title":"SpaceMoE: Realizing Distributed Mixture-of-Experts Inference over Space Networks","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"A ring-subnet architecture for mixture-of-experts models in satellite constellations cuts token-generation latency by at least threefold.","cross_cats":["cs.AI","cs.NI"],"primary_cat":"cs.DC","authors_text":"Huiling Yang, Kaibin Huang, Khaled B. Letaief, Min Sheng, Zhanwei Wang","submitted_at":"2026-05-01T08:40:31Z","abstract_excerpt":"Leveraging continuous solar energy harvesting at high efficiency, space data centers are envisioned as a promising platform for executing energy-intensive large language models (LLMs). Recognizing this advantage, space and AI conglomerates (e.g., SpaceX, Google) are actively investing in this vision. One key challenge, however, is the efficient distributed deployment of a large-scale LLM in a satellite network due to the limited onboard computing and communication resources. This gives rise to a placement problem that involves partitioning and mapping model components to satellites such that t"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Experiments over a thousand-satellite constellation show that Space-XNet achieves at least a threefold latency reduction compared with conventional random and ablation-based placement strategies.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The assumption that autoregressive inference follows a ring-like communication pattern that can be directly exploited by partitioning the constellation into ring subnets, and that the intra-layer optimization based on activation probabilities and expected latency fully captures real performance without hidden costs such as accuracy degradation or unmodeled interference.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Space-XNet partitions satellite constellations into ring subnets for MoE layers and maps high-activation experts to low-latency satellites, yielding at least 3x lower inference latency than random or ablation placements in 1000-satellite simulations.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"A ring-subnet architecture for mixture-of-experts models in satellite constellations cuts token-generation latency by at least threefold.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"351a935f5a8c4824d38dc860849c008c60a87975df361ecef1b43bb31c28d507"},"source":{"id":"2605.00515","kind":"arxiv","version":2},"verdict":{"id":"5574ef4c-42af-4478-a6de-6cae00f4146f","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-09T19:15:59.565881Z","strongest_claim":"Experiments over a thousand-satellite constellation show that Space-XNet achieves at least a threefold latency reduction compared with conventional random and ablation-based placement strategies.","one_line_summary":"Space-XNet partitions satellite constellations into ring subnets for MoE layers and maps high-activation experts to low-latency satellites, yielding at least 3x lower inference latency than random or ablation placements in 1000-satellite simulations.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The assumption that autoregressive inference follows a ring-like communication pattern that can be directly exploited by partitioning the constellation into ring subnets, and that the intra-layer optimization based on activation probabilities and expected latency fully captures real performance without hidden costs such as accuracy degradation or unmodeled interference.","pith_extraction_headline":"A ring-subnet architecture for mixture-of-experts models in satellite constellations cuts token-generation latency by at least threefold."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2605.00515/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"ai_meta_artifact","ran_at":"2026-05-20T19:40:38.790379Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T18:04:40.782136Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"ae962ed134e73649a330a847232b3e3815879ce1d8c1aae61d3d75baf23cb076"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}