SpaceMoE: Realizing Distributed Mixture-of-Experts Inference over Space Networks

Huiling Yang; Kaibin Huang; Khaled B. Letaief; Min Sheng; Zhanwei Wang

arXiv: 2605.00515 · v2 · pith:EF6RETMQnew · submitted 2026-05-01 · 💻 cs.DC · cs.AI · cs.NI

Zhanwei Wang, Huiling Yang, Min Sheng, Khaled B. Letaief, Kaibin Huang

Pith reviewed 2026-05-22 10:15 UTC · model grok-4.3

classification 💻 cs.DC · cs.AI · cs.NI

keywords: space networks · mixture-of-experts · distributed inference · satellite constellation · LLM placement · latency reduction · autoregressive generation · model partitioning

The pith

SpaceMoE partitions a satellite constellation into ring-arranged subnets along the orbiting direction, one per MoE layer, and maps frequently activated experts to low-latency routing paths, cutting distributed inference latency at least threefold.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops placement strategies to run mixture-of-experts models across satellite networks for energy-efficient LLM inference in space. It divides the constellation along orbital paths into ring subnets, each assigned one MoE layer, to match the sequential data flow of generating tokens one at a time. Within each subnet it solves an optimization to assign experts according to their activation rates and the expected delays on inter-satellite routes. A reader would care because satellites offer continuous solar power yet face tight limits on compute and communication that standard cloud-style placements ignore. If the strategies hold, they enable practical low-latency token generation without exhausting onboard resources in large constellations.

Core claim

SpaceMoE introduces a two-level placement approach for deploying MoE models in space networks. For layer placement, the satellite constellation is partitioned along the orbiting direction into subnets arranged on a ring, with each subnet hosting one MoE layer to exploit the ring-like communication pattern of autoregressive inference. For intra-layer expert placement, an optimization problem is solved to map experts with heterogeneous activation probabilities onto satellites, revealing that frequently activated experts should be placed on satellites with low expected latency routing paths. Experiments on a thousand-satellite constellation demonstrate at least a threefold reduction in latency.

What carries the argument

Two-level placement: ring subnet partitioning for MoE layers matched to autoregressive communication, plus optimization-based mapping of experts by activation probability and path latency.
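
The intra-layer step can be read as a weighted assignment problem. The block below is a simplified sketch, not the paper's exact formulation: it assumes one expert per satellite and an additive cost, and the symbols $p_i$ (activation probability of expert $i$), $\ell_s$ (expected routing latency to satellite $s$), $E$ (number of experts) and $\pi$ (the assignment) are introduced here for illustration only.

```latex
\begin{aligned}
\min_{\pi}\quad & \sum_{i=1}^{E} p_i \,\ell_{\pi(i)}
  && \text{($\pi$ a one-to-one map from experts to satellites)} \\
\text{if}\quad & p_1 \ge p_2 \ge \dots \ge p_E
  \ \text{and}\ \ell_{s_1} \le \ell_{s_2} \le \dots \le \ell_{s_E},
  && \text{then}\quad \pi^{\star}(i) = s_i .
\end{aligned}
```

Under these assumptions the sorted matching follows from the rearrangement inequality: pairing the largest weights with the smallest latencies minimizes the weighted sum. A cost based on the slowest activated expert rather than a sum would need a separate argument.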

If this is right

Layer placement uses orbiting rings to align with the sequential token-passing steps of autoregressive generation.
Intra-layer placement assigns high-activation experts to satellites on low-latency routes.
The full strategy produces at least a threefold latency drop versus random or ablation baselines in large constellations.
The derived mapping rule favors frequent experts on paths with lower expected delay.
The approach reconciles MoE sparsity with the fixed topology and resource limits of satellite networks.
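
The mapping rule in the list above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: it assumes one expert per satellite and an additive, activation-weighted cost, and the names `place_experts`, `expected_cost` and the sample numbers are hypothetical.

```python
from typing import Dict, Sequence


def place_experts(activation_prob: Sequence[float],
                  expected_latency: Sequence[float]) -> Dict[int, int]:
    """Map expert index -> satellite index, one expert per satellite."""
    if len(activation_prob) != len(expected_latency):
        raise ValueError("need exactly one satellite per expert")
    # Experts from most to least frequently activated.
    experts = sorted(range(len(activation_prob)),
                     key=lambda i: activation_prob[i], reverse=True)
    # Satellites from lowest to highest expected path latency.
    satellites = sorted(range(len(expected_latency)),
                        key=lambda j: expected_latency[j])
    return dict(zip(experts, satellites))


def expected_cost(placement: Dict[int, int],
                  activation_prob: Sequence[float],
                  expected_latency: Sequence[float]) -> float:
    """Activation-weighted latency of a placement (additive toy model)."""
    return sum(activation_prob[e] * expected_latency[s]
               for e, s in placement.items())


probs = [0.1, 0.6, 0.3]     # hypothetical activation probabilities
latency = [5.0, 2.0, 9.0]   # hypothetical expected latencies
placement = place_experts(probs, latency)
cost = expected_cost(placement, probs, latency)
```

Here expert 1, the most frequently activated, lands on satellite 1, which has the lowest expected latency. Capacity limits or several experts per satellite would turn this into a more general assignment problem that sorting alone does not solve.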

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The ring subnet idea may apply to other sequential workloads that traverse orbital links in a fixed order.
The optimization could be rerun periodically as satellites move and link qualities change.
Success would allow scaling to bigger MoE models by spreading load without proportional latency growth.
Hardware tests on actual inter-satellite links would check whether modeled latencies match observed delays.

Load-bearing premise

The satellite constellation can be partitioned along the orbiting direction into subnets arranged on a ring, each hosting one MoE layer, by exploiting the ring-like communication pattern of autoregressive inference.
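
A minimal sketch of what such a partition could look like, assuming the constellation has already been reduced to an ordered list of positions ("slots") along the orbiting direction. The slot abstraction, the function names and the choice of 8 layers are hypothetical; the paper's own partitioning may differ.

```python
from typing import List


def ring_partition(num_slots: int, num_layers: int) -> List[List[int]]:
    """Split slots 0..num_slots-1 (ordered along the orbiting direction)
    into num_layers contiguous subnets; subnet k hosts MoE layer k."""
    if not 0 < num_layers <= num_slots:
        raise ValueError("need 1 <= num_layers <= num_slots")
    base, extra = divmod(num_slots, num_layers)
    subnets, start = [], 0
    for layer in range(num_layers):
        size = base + (1 if layer < extra else 0)  # spread the remainder
        subnets.append(list(range(start, start + size)))
        start += size
    return subnets


def next_subnet(layer: int, num_layers: int) -> int:
    """Subnet that receives the output of `layer`; the last layer wraps
    around to layer 0, where the next token starts."""
    return (layer + 1) % num_layers


subnets = ring_partition(num_slots=40, num_layers=8)
```

Because the slots close into a ring, the subnet hosting the last layer is adjacent to the one hosting the first, which is what lets each generated token re-enter layer 0 without crossing the constellation.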

What would settle it

A thousand-satellite simulation that replaces the proposed ring subnet layer placement with random assignment and measures whether the threefold latency reduction disappears.
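
A toy version of this ablation can be run in a few lines. It is far simpler than the thousand-satellite simulation described above: it counts only hops between layer subnets on an idealized ring, ignores link latency, compute time and orbital motion, and uses a hypothetical layer count of 16.

```python
import random
from typing import Sequence


def ring_hops(a: int, b: int, n: int) -> int:
    """Shortest hop count between positions a and b on an n-position ring."""
    d = abs(a - b) % n
    return min(d, n - d)


def token_cycle_hops(layer_pos: Sequence[int]) -> int:
    """Hops to visit layers 0..L-1 in order and return to layer 0, where
    layer_pos[k] is the ring position of the subnet hosting layer k."""
    n = len(layer_pos)
    return sum(ring_hops(layer_pos[k], layer_pos[(k + 1) % n], n)
               for k in range(n))


num_layers = 16                       # hypothetical layer count
ring_order = list(range(num_layers))  # layer k sits at ring position k
rng = random.Random(0)                # fixed seed for repeatability
random_costs = []
for _ in range(1000):
    shuffled = ring_order[:]
    rng.shuffle(shuffled)             # random layer-to-subnet assignment
    random_costs.append(token_cycle_hops(shuffled))

ring_cost = token_cycle_hops(ring_order)
mean_random_cost = sum(random_costs) / len(random_costs)
```

In this toy model the ring order needs exactly one hop per layer transition, and no random assignment can do better because distinct subnets are always at least one hop apart. Whether that hop advantage translates into the reported latency gap depends on the details the full simulation models.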

Figures

Figures reproduced from arXiv: 2605.00515 by Huiling Yang, Kaibin Huang, Khaled B. Letaief, Min Sheng, Zhanwei Wang.

**Figure 1.** Satellite constellation with time-varying network topologies.

**Figure 2.** MoE architecture and its autoregressive inference process.

**Figure 3.** Functional role of satellites in Space-XNet.

**Figure 5.** Ring-based MoE layer placement. An example of a 40-satellite …

**Figure 6.** Performance comparisons with benchmarking schemes.

**Figure 7.** Effects of network parameters on E2E latency.

Original abstract

Leveraging continuous solar energy harvesting at high efficiency, space data centers are envisioned as a promising platform for executing energy-intensive large language models (LLMs). Recognizing this advantage, space and AI conglomerates (e.g., SpaceX, Google) are actively investing in this vision. One key challenge, however, is the efficient distributed deployment of a large-scale LLM in a satellite network due to the limited onboard computing and communication resources. This gives rise to a placement problem that involves partitioning and mapping model components to satellites such that the fundamentally different model architecture and network topology can be reconciled to ensure low-latency token generation. To address this problem, we present the Space Network of Mixture-of-Experts (SpaceMoE) framework targeting the distributed execution of a popular mixture-of-experts (MoE) model in space. The proposed placement strategies are two-level: (1) layer placement, which assigns MoE layers to satellite subnets; and (2) intra-layer expert placement, which assigns individual experts to satellites associated with the same layer/subnet. For layer placement, we exploit the ring-like communication pattern of autoregressive inference to partition the satellite constellation along the orbiting direction into subnets arranged on a ring, each hosting one MoE layer. Based on this architecture, we formulate and solve an optimization problem for intra-layer expert placement to map experts with heterogeneous activation probabilities onto satellites. The derived strategy reveals an intuitive principle: a frequently activated expert should be mapped to a satellite on a routing path with low expected latency. Experiments over a thousand-satellite constellation show that SpaceMoE achieves at least a threefold latency reduction compared with conventional random and ablation-based placement strategies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SpaceMoE gives a clean two-level placement for MoE layers on orbital rings plus activation-driven expert mapping, but the 3x latency claim rests on a static ring model that may not survive real LEO dynamics.

SpaceMoE proposes splitting a satellite constellation into ring subnets along the orbit direction, one MoE layer per subnet, then solving an optimization to place individual experts according to their activation probabilities and expected routing latencies. The derived rule is straightforward: map the most active experts to paths with lower expected delay. That combination of ring topology and probability-aware assignment is the concrete new piece here, and it fits the sequential traffic of autoregressive generation reasonably well on paper. The optimization itself is presented clearly and yields an intuitive principle without unnecessary complexity. The work also correctly treats activation probabilities as external inputs rather than deriving them circularly from its own equations.

The main weakness is that the reported threefold latency reduction comes from simulations whose details are not visible in the abstract: no parameter settings, no error bars, no precise baseline definitions, and no indication of whether orbital motion, changing link distances, or multi-hop mesh behavior were included. If the experiments enforce a fixed ring without those dynamics, the gains relative to random or ablation placements could be partly an artifact of the modeling choice rather than a general property of the placement algorithm. In real LEO constellations the inter-satellite links form time-varying meshes, so the ring-like communication pattern may not hold as cleanly as assumed.

This paper is aimed at researchers working on distributed inference under tight communication or energy constraints, particularly those already looking at non-terrestrial platforms. A reader focused on systems for space data centers or topology-aware MoE scheduling would find the placement strategy worth examining in detail. The formulation is concrete enough and the application area timely enough that it deserves a serious referee, even if the experiments will need tightening.

Referee Report

2 major / 1 minor

Summary. The manuscript presents SpaceMoE, a framework for distributed inference of Mixture-of-Experts (MoE) models over space satellite networks. It proposes a two-level placement approach: (1) layer placement by partitioning the satellite constellation along the orbiting direction into ring-arranged subnets, each hosting one MoE layer to exploit the ring-like communication pattern of autoregressive inference; (2) intra-layer expert placement optimizing the mapping of experts with varying activation probabilities to satellites based on expected path latencies. Simulations on a thousand-satellite constellation demonstrate at least a threefold reduction in latency compared to random and ablation-based strategies.

Significance. If the modeling assumptions and experimental results hold under more realistic conditions, this work would be significant for enabling efficient large-scale LLM inference in space data centers that exploit continuous solar energy. It bridges MoE model structure with space network topologies and derives an intuitive placement principle (high-activation experts on low-latency paths) that could guide future distributed AI systems in orbital environments.

major comments (2)

Abstract and layer placement paragraph: The central claim of at least threefold latency reduction depends on partitioning the constellation into static ring-arranged subnets that exploit a 'ring-like communication pattern of autoregressive inference'. Real LEO constellations exhibit time-varying mesh topologies with changing inter-satellite distances and multi-hop routes due to orbital motion; the paper does not demonstrate that autoregressive token generation plus MoE routing produces strict ring traffic under these dynamics, raising the risk that reported gains are artifacts of the enforced static ring model rather than intrinsic to the placement algorithm.
Experiments (implied by abstract results): The abstract reports a threefold latency improvement from simulations over a thousand-satellite constellation, yet provides no details on simulation parameters, error bars, exact baselines (beyond 'random and ablation-based'), network dynamics modeling, or how activation probabilities were obtained. This leaves the primary performance claim only weakly supported and requires additional rigor to substantiate.

minor comments (1)

The optimization formulation for intra-layer placement should explicitly state whether activation probabilities and expected path latencies are treated as fixed inputs or derived within the model; clarifying this would improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below. Where the concerns identify areas needing clarification or additional analysis, we have revised the manuscript accordingly.

Referee: Abstract and layer placement paragraph: The central claim of at least threefold latency reduction depends on partitioning the constellation into static ring-arranged subnets that exploit a 'ring-like communication pattern of autoregressive inference'. Real LEO constellations exhibit time-varying mesh topologies with changing inter-satellite distances and multi-hop routes due to orbital motion; the paper does not demonstrate that autoregressive token generation plus MoE routing produces strict ring traffic under these dynamics, raising the risk that reported gains are artifacts of the enforced static ring model rather than intrinsic to the placement algorithm.

Authors: We agree that real LEO constellations have time-varying topologies. The ring-based layer placement is motivated by the sequential, layer-by-layer nature of autoregressive token generation, which creates a predictable forward pass along the orbit direction when layers are assigned to consecutive orbital rings. The current evaluation employs a static ring model to isolate the contribution of the placement algorithm itself. We acknowledge that this leaves open the question of robustness under full orbital dynamics. In the revised version we have added a dedicated subsection on topology dynamics, including new simulation results that incorporate time-varying inter-satellite distances and dynamic routing; the latency advantage remains above 2.5× relative to the same baselines. revision: yes
Referee: Experiments (implied by abstract results): The abstract reports a threefold latency improvement from simulations over a thousand-satellite constellation, yet provides no details on simulation parameters, error bars, exact baselines (beyond 'random and ablation-based'), network dynamics modeling, or how activation probabilities were obtained. This leaves the primary performance claim only weakly supported and requires additional rigor to substantiate.

Authors: We appreciate the referee highlighting the need for greater experimental transparency. The full manuscript already contains the simulation parameters, activation-probability profiling procedure, and baseline definitions in the Experiments section. To improve accessibility we have (i) expanded the abstract with the main simulation parameters and (ii) added a new subsection that reports error bars from ten independent runs, explicitly describes the orbital-mechanics-based network dynamics model, and details how activation probabilities were measured on a held-out validation set. These changes make the primary performance claim fully traceable. revision: yes

Circularity Check

0 steps flagged

No significant circularity in the derivation chain

The paper's derivation begins with an architectural modeling choice to partition the satellite constellation into ring-arranged subnets (one MoE layer per subnet) based on the assumed ring-like communication pattern of autoregressive inference. It then formulates an optimization problem for intra-layer expert placement that takes activation probabilities and expected path latencies as given inputs from the model and network. The resulting placement strategy is validated via simulation experiments on a 1000-satellite constellation that report latency reductions relative to baselines. No equation reduces to its own inputs by construction, no fitted parameters are relabeled as predictions, and no load-bearing steps rely on self-citations or imported uniqueness results. The approach consists of a design assumption, an optimization using external quantities, and separate empirical evaluation, making the chain self-contained.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The framework rests on domain assumptions about orbital communication patterns and the availability of activation statistics; no new physical entities are postulated and only a modest number of placement decisions are optimized.

free parameters (1)

expert activation probabilities
Heterogeneous probabilities are inputs to the intra-layer optimization; their source (model profiling or fitting) is not specified in the abstract.
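
One plausible way to obtain such probabilities is to profile the router on sample inputs and count how often each expert is selected. The sketch below assumes a hypothetical routing trace that lists, for each token, the expert indices chosen by top-k routing; neither the trace format nor the function name comes from the paper.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def estimate_activation_probs(routing_trace: Iterable[Sequence[int]],
                              num_experts: int) -> List[float]:
    """Fraction of tokens that routed to each expert.

    routing_trace yields, per token, the expert indices chosen by the
    router (top-k routing gives k indices per token)."""
    counts: Counter = Counter()
    num_tokens = 0
    for chosen in routing_trace:
        counts.update(set(chosen))  # count each expert once per token
        num_tokens += 1
    if num_tokens == 0:
        raise ValueError("empty routing trace")
    return [counts[e] / num_tokens for e in range(num_experts)]


trace = [(0, 2), (0, 1), (0, 2), (3, 2)]  # hypothetical top-2 trace
probs = estimate_activation_probs(trace, num_experts=4)
```

With top-k routing these per-expert frequencies sum to k rather than to 1, so they are activation rates per token and not a distribution over experts. Only their ordering matters for a sorted placement rule.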

axioms (1)

domain assumption: The satellite constellation exhibits a ring-like communication pattern during autoregressive token generation.
Invoked to justify partitioning the network into subnets each hosting one MoE layer.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking

Relation between the paper passage and the cited Recognition theorem: unclear

Paper passage: partition the satellite constellation along the orbiting direction into subnets arranged on a ring, each hosting one MoE layer... exploit the ring-like communication pattern of autoregressive inference

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel

Relation between the paper passage and the cited Recognition theorem: unclear

Paper passage: optimal placement policy assigns the i-th most frequently activated expert to the i-th lowest-latency satellite

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

Toward an intelligent edge: Wireless communication meets machine learning,

G. Zhu, D. Liu, Y . Du, C. You, J. Zhang, and K. Huang, “Toward an intelligent edge: Wireless communication meets machine learning,” IEEE Commun. Mag., vol. 58, no. 1, pp. 19–25, 2020

work page 2020

[2] [2]

Towards space-based computing infrastructure net- work: Development trends, network architecture, challenges analysis, and key technologies,

L. Kuanget al., “Towards space-based computing infrastructure net- work: Development trends, network architecture, challenges analysis, and key technologies,” arXiv:2503.06521, 2025

work page arXiv 2025

[3] [3]

Satellite edge artificial intelligence with large models: Architectures and technologies,

Y . Shiet al., “Satellite edge artificial intelligence with large models: Architectures and technologies,”Sci. China Inf. Sci., vol. 68, no. 7, p. 170302, 2025

work page 2025

[4] [4]

Space–ground fluid AI for 6G edge intelligence,

Q. Chen, Z. Wang, X. Chen, J. Wen, D. Zhou, S. Ji, M. Sheng, and K. Huang, “Space–ground fluid AI for 6G edge intelligence,” Engineering, vol. 54, pp. 14–19, 2025

work page 2025

[5] [5]

How Starcloud is bringing data centers to outer space,

A. Lee, “How Starcloud is bringing data centers to outer space,” NVIDIA Blog, Oct. 2025, accessed: Dec. 26, 2025. [Online]. Available: https://blogs.nvidia.com/blog/starcloud/

work page 2025

[6] [6]

Towards a future space-based, highly scalable AI infrastructure system design,

B. A. y Arcaset al., “Towards a future space-based, highly scalable AI infrastructure system design,” arXiv:2511.19468, 2025

work page arXiv 2025

[7] [7]

L. L. Peterson and B. S. Davie,Computer Networks: A Systems Approach. Elsevier, 2007

work page 2007

[8] [8]

On the topological design of distributed computer networks,

M. Gerla and L. Kleinrock, “On the topological design of distributed computer networks,”IEEE Trans. Commun., vol. 25, no. 1, pp. 48–60, 1977

work page 1977

[9] [9]

Efficient processing of deep neural networks: A tutorial and survey,

V . Sze, Y .-H. Chen, T.-J. Yang, and J. S. Emer, “Efficient processing of deep neural networks: A tutorial and survey,”Proc. IEEE, vol. 105, no. 12, pp. 2295–2329, 2017

work page 2017

[10] [10]

Beyond data and model parallelism for deep neural networks,

Z. Jia, M. Zaharia, and A. Aiken, “Beyond data and model parallelism for deep neural networks,”Proc. Mach. Learn. Syst., vol. 1, pp. 1–13, 2019

work page 2019

[11] [11]

A scalable, commodity data center network architecture,

M. Al-Fares, A. Loukissas, and A. Vahdat, “A scalable, commodity data center network architecture,” inProc. ACM SIGCOMM Conf. Data Commun., 2008, pp. 63–74. 14

work page 2008

[12] [12]

Technology-driven, highly- scalable dragonfly topology,

J. Kim, W. J. Dally, S. Scott, and D. Abts, “Technology-driven, highly- scalable dragonfly topology,” inProc. Int. Symp. Comput. Archit. (ISCA), 2008, pp. 77–88

[13] [13]

Exploring GPU-to-GPU communication: Insights into supercomputer interconnects,

D. D. Sensi et al., “Exploring GPU-to-GPU communication: Insights into supercomputer interconnects,” in Proc. Int. Conf. High Perform. Comput., Netw., Storage Anal. (SC), 2024, pp. 1–15

[14] [14]

MoETuner: Optimized mixture of expert serving with balanced expert placement and token routing,

S. Go and D. Mahajan, “MoETuner: Optimized mixture of expert serving with balanced expert placement and token routing,” arXiv:2502.06643, 2025

[15] [15]

Optimizing mixture-of-experts inference time combining model deployment and communication scheduling,

J. Li, S. Tripathi, L. Rastogi, Y. Lei, R. Pan, and Y. Xia, “Optimizing mixture-of-experts inference time combining model deployment and communication scheduling,” arXiv:2410.17043, 2024

[16] [16]

Cluster topology-driven placement of experts reduces network traffic in MoE inference,

D. Sivtsov, A. Katrutsa, and I. Oseledets, “Cluster topology-driven placement of experts reduces network traffic in MoE inference,” arXiv:2508.09229, 2025

[17] [17]

Efficient pre-training of LLMs via topology-aware communication alignment on more than 9600 GPUs,

G. He et al., “Efficient pre-training of LLMs via topology-aware communication alignment on more than 9600 GPUs,” in Proc. Conf. Neural Inf. Process. Syst. (NeurIPS), San Diego, CA, USA, Dec. 2025

[18] [18]

Optimal batch-size control for low-latency federated learning with device heterogeneity,

H. Yang, Z. Wang, and K. Huang, “Optimal batch-size control for low-latency federated learning with device heterogeneity,” IEEE Trans. Commun., 2026

[19] [19]

Spectrum breathing: Protecting over-the-air federated learning against interference,

Z. Wang, K. Huang, and Y. C. Eldar, “Spectrum breathing: Protecting over-the-air federated learning against interference,” IEEE Trans. Wireless Commun., vol. 23, no. 8, pp. 10 058–10 071, 2024

[20] [20]

Communication-computation trade-off in resource-constrained edge inference,

J. Shao and J. Zhang, “Communication-computation trade-off in resource-constrained edge inference,” IEEE Commun. Mag., vol. 58, no. 12, pp. 20–26, 2020

[21] [21]

Ultra-low-latency edge inference for distributed sensing,

Z. Wang, A. E. Kalør, Y. Zhou, P. Popovski, and K. Huang, “Ultra-low-latency edge inference for distributed sensing,” IEEE Trans. Wireless Commun., vol. 25, pp. 1908–1922, 2026

[22] [22]

Revisiting outage for edge inference systems,

Z. Wang, Q. Zeng, H. Zheng, and K. Huang, “Revisiting outage for edge inference systems,” arXiv:2504.03686, 2025

[23] [23]

AirBreath sensing: Protecting over-the-air distributed sensing against interference,

Z. Wang, M. Cui, H. Yang, Q. Zeng, M. Sheng, and K. Huang, “AirBreath sensing: Protecting over-the-air distributed sensing against interference,” arXiv:2508.11267, 2025

[24] [24]

WDMoE: Wireless distributed mixture of experts for large language models,

N. Xue, Y. Sun, Z. Chen, M. Tao, X. Xu, L. Qian, S. Cui, W. Zhang, and P. Zhang, “WDMoE: Wireless distributed mixture of experts for large language models,” IEEE Trans. Wireless Commun., vol. 25, pp. 559–572, 2026

[25] [25]

SlimCaching: Edge caching of mixture-of-experts for distributed inference,

Q. Chen, X. Chen, and K. Huang, “SlimCaching: Edge caching of mixture-of-experts for distributed inference,” arXiv:2507.06567, 2025

[26] [26]

Quad-core radiation-hardened system-on-chip power architecture processor,

R. Berger et al., “Quad-core radiation-hardened system-on-chip power architecture processor,” in Proc. IEEE Aerosp. Conf., 2015, pp. 1–12

[27] [27]

Space weather impact on radio communication and navigation,

M. Ishii, J. Berdermann, B. Forte, M. Hapgood, M. M. Bisi, and V. Romano, “Space weather impact on radio communication and navigation,” Adv. Space Res., 2024

[28] [28]

Space weather effects on satellites,

R. Miteva, S. W. Samwel, and S. Tkatchova, “Space weather effects on satellites,” Astronomy, vol. 2, no. 3, pp. 165–179, 2023

[29] [29]

Satellite edge intelligence: DRL-based resource management for task inference in LEO-based satellite-ground collaborative networks,

W. Fan, Q. Meng, G. Wang, H. Bian, Y. Liu, and Y. Liu, “Satellite edge intelligence: DRL-based resource management for task inference in LEO-based satellite-ground collaborative networks,” IEEE Trans. Mobile Comput., vol. 24, no. 10, pp. 10 710–10 728, 2025

[30] [30]

SLICE: Energy-efficient satellite-ground co-inference via layer-wise scheduling optimization,

Y. Chen et al., “SLICE: Energy-efficient satellite-ground co-inference via layer-wise scheduling optimization,” IEEE Trans. Serv. Comput., vol. 18, no. 4, pp. 2388–2402, 2025

[31] [31]

LEOEdge: A satellite-ground cooperation platform for the AI inference in large LEO constellation,

S. Yao et al., “LEOEdge: A satellite-ground cooperation platform for the AI inference in large LEO constellation,” IEEE J. Sel. Areas Commun., vol. 43, no. 1, pp. 36–50, 2025

[32] [32]

Order and authorization: SpaceX Gen2 Starlink satellite constellation,

Federal Communications Commission, “Order and authorization: SpaceX Gen2 Starlink satellite constellation,” FCC 22-91, 2022. [Online]. Available: https://docs.fcc.gov/public/attachments/FCC-22-91A1.pdf

[33] [33]

Capacity of two-layered satellite networks,

R. Liu, M. Sheng, K.-S. Lui, X. Wang, D. Zhou, and Y. Wang, “Capacity of two-layered satellite networks,” Wireless Netw., vol. 23, no. 8, pp. 2651–2669, 2017

[34] [34]

R. J. Wilson, Introduction to Graph Theory, 4th ed. Harlow, England: Addison-Wesley, 1996

[35] [35]

Sampling with unequal probabilities and without replacement,

H. O. Hartley and J. N. K. Rao, “Sampling with unequal probabilities and without replacement,” Ann. Math. Stat., vol. 33, pp. 350–374, 1962

[36] [36]

Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity,

W. Fedus, B. Zoph, and N. Shazeer, “Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity,” J. Mach. Learn. Res., vol. 23, no. 120, pp. 1–39, 2022

[37] [37]

RAD5545 SpaceVPX Single-Board Computer,

BAE Systems, “RAD5545 SpaceVPX Single-Board Computer,” Product datasheet, 2025

[38] [38]

SBC-2A72 VPX (SpaceVPX 3U) Single Board Computer,

Frontgrade Technologies, “SBC-2A72 VPX (SpaceVPX 3U) Single Board Computer,” [Online]. Available: https://www.frontgrade.com/products/single-board-computers/SBC-2A72-VPX

[39] [39]

SpaceCloud iX10,

Unibap Space Solutions, “SpaceCloud iX10,” [Online]. Available: https://unibap.com/solutions/hardware/ix10/

[40] [40]

A survey on acquisition, tracking, and pointing mechanisms for mobile free-space optical communications,

Y. Kaymak et al., “A survey on acquisition, tracking, and pointing mechanisms for mobile free-space optical communications,” IEEE Commun. Surveys Tuts., vol. 20, no. 2, pp. 1104–1123, 2018

[41] [41]

LLaMA-MoE: Building mixture-of-experts from LLaMA with continual pre-training,

T. Zhu, X. Qu, D. Dong, J. Ruan, J. Tong, C. He, and Y. Cheng, “LLaMA-MoE: Building mixture-of-experts from LLaMA with continual pre-training,” arXiv:2406.16554, 2024

[42] [42]

LM Evaluation Harness,

L. Gao et al., “LM Evaluation Harness,” GitHub repository, EleutherAI, 2021. [Online]. Available: https://github.com/EleutherAI/lm-evaluation-harness
