Semantic Communication for Multi-Satellite Massive MIMO Transmission: A Mixture of Cooperative Modes Framework

Bj\"orn Ottersten; Rui Ding; Symeon Chatzinotas; Vu Nguyen Ha; Wenjin Wang; Yafei Wang; Yiming Zhu; Yuchen Zhang

arxiv: 2605.09013 · v1 · submitted 2026-05-09 · 📡 eess.SP

Semantic Communication for Multi-Satellite Massive MIMO Transmission: A Mixture of Cooperative Modes Framework

Yafei Wang , Yuchen Zhang , Yiming Zhu , Vu Nguyen Ha , Rui Ding , Wenjin Wang , Symeon Chatzinotas , Bj\"orn Ottersten This is my paper

Pith reviewed 2026-05-12 02:21 UTC · model grok-4.3

classification 📡 eess.SP

keywords semantic communicationmulti-satellite MIMOcoherent transmissionnon-coherent transmissionimage transmissionmassive MIMOchannel state informationneural network architectures

0 comments

The pith

A mixture of cooperative modes framework dynamically switches between coherent and non-coherent semantic transmission in multi-satellite massive MIMO to balance performance and complexity for image tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops semantic communication frameworks for multi-satellite massive MIMO systems serving multi-antenna users with image transmission. It creates separate designs for coherent joint transmission and non-coherent transmission, each with tailored neural networks. Building on these, it introduces a mixture framework that uses a permutation-invariant network to choose the mode based on statistical channel information. A sympathetic reader would care because this approach could enable more efficient and effective data delivery in satellite networks where full coordination is costly. The simulations indicate gains in semantic quality under practical setups.

Core claim

The paper establishes that integrating semantic communications for image transmission into multi-satellite cooperative massive MIMO is feasible through MSCT and MSNCT frameworks, and that a mixture of these modes via a dynamic switching network using multi-satellite statistical CSI provides a practical balance between semantic performance and system complexity, as demonstrated by simulation results.

What carries the argument

The mixture of cooperative modes (MoCM) framework, consisting of a permutation-invariant network that dynamically switches between multi-satellite coherent transmission (MSCT) and non-coherent transmission (MSNCT) semantic communication modes based on statistical channel state information.

If this is right

The MSCT framework with hybrid Swin-Transformer CNN architecture enables symmetric encoding and decoding for coherent satellite transmission.
The MSNCT framework supports cross-stream semantic interference exploitation through global attention in its Transformer backbone.
Overall, the proposed frameworks achieve performance gains in semantic metrics for image transmission compared to conventional approaches under practical satellite configurations.
Dynamic mode switching in MoCM allows trading off semantic fidelity against computational complexity depending on channel conditions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Extending the dynamic switching idea could help in other distributed communication systems where full synchronization is intermittent.
Future hardware implementations would need to verify if the statistical CSI-based switching holds when real-time impairments like timing offsets occur.
Applying similar semantic-aware mode selection might benefit terrestrial massive MIMO networks facing similar cooperation costs.

Load-bearing premise

The neural network architectures and mode-switching logic will continue to deliver semantic performance advantages when deployed in actual satellite channels that include synchronization impairments and other real-world effects not fully captured in simulations.

What would settle it

A hardware-in-the-loop test or field experiment showing that the semantic similarity scores or reconstruction quality of the MoCM framework drop below those of a fixed-mode baseline when realistic synchronization errors are present would falsify the practical advantage.

Figures

Figures reproduced from arXiv: 2605.09013 by Bj\"orn Ottersten, Rui Ding, Symeon Chatzinotas, Vu Nguyen Ha, Wenjin Wang, Yafei Wang, Yiming Zhu, Yuchen Zhang.

**Figure 2.** Figure 2: The proposed multi-satellite SemCom frameworks. [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗

**Figure 3.** Figure 3: Structure of the HSTC block. M(1) k = MSCSNEnc,1 M(0) k ∈ R P1×D1 , (16) Zk = MSCSNEnc,2 [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: Permutation-invariant mode-switching network. [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗

**Figure 2.** Figure 2: This paper focuses on the tradeoff between a task-level [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗

**Figure 5.** Figure 5: Visualization of one Monte Carlo realization in the simulated LEO [PITH_FULL_IMAGE:figures/full_fig_p010_5.png] view at source ↗

**Figure 6.** Figure 6: Average PSNR versus transmit power under different UT array sizes and compression ratios. [PITH_FULL_IMAGE:figures/full_fig_p011_6.png] view at source ↗

**Figure 7.** Figure 7: Average SSIM (dB) versus transmit power under different UT array sizes and compression ratios. [PITH_FULL_IMAGE:figures/full_fig_p011_7.png] view at source ↗

**Figure 8.** Figure 8: Reconstructed images and decoder attention maps at Layers [PITH_FULL_IMAGE:figures/full_fig_p012_8.png] view at source ↗

read the original abstract

This paper investigates semantic communications (SemComs) for multi-satellite cooperative massive multiple-input multiple-output (MIMO) transmission, where multiple massive-MIMO satellites jointly serve a common set of multi-antenna user terminals. For the first time, SemComs with image transmission task are integrated into satellite massive MIMO and multi-satellite cooperative transmission. For the two representative cooperative modes, namely coherent transmission (CT) and non-coherent transmission (NCT), we develop multi-satellite CT (MSCT) and multi-satellite NCT (MSNCT) SemCom frameworks, respectively. MSCT adopts a symmetric architecture, whereas MSNCT introduces transmitter-side stream allocation and a two-stage receiver design that combines per-stream semantic extraction with cross-stream semantic-interference exploitation. To instantiate MSCT, we further design a symmetric encoder and decoder network based on hybrid Swin-Transformer and lightweight bottleneck convolutional neural network (CNN) blocks, termed HSTC, where Swin Transformer provides scalable computation and the CNN branch improves performance and convergence. For MSNCT, a Transformer-based backbone is employed to support cross-stream interference exploitation through global attention. Building on these two frameworks, we propose a mixture of cooperative modes (MoCM) framework, in which a permutation-invariant network dynamically switches between MSCT and MSNCT using multi-satellite statistical channel state information, thereby balancing semantic performance and complexity. Simulation results under practical configurations demonstrate the performance gains of the proposed frameworks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces concrete semantic comm frameworks for multi-satellite massive MIMO image transmission with a dynamic coherent/non-coherent switcher, but the simulation claims need checking against real satellite impairments.

read the letter

The core takeaway is that this work brings semantic communication for images into multi-satellite massive MIMO and adds a mixture-of-modes switch that picks between coherent and non-coherent transmission based on statistical CSI. That combination looks new for the satellite setting and gives a practical way to trade performance against complexity when full phase alignment across satellites is difficult.

Referee Report

2 major / 2 minor

Summary. The paper integrates semantic communications for image transmission into multi-satellite cooperative massive MIMO systems. It develops an MSCT framework with a symmetric HSTC architecture (hybrid Swin-Transformer and lightweight CNN blocks) for coherent transmission, an MSNCT framework with a Transformer backbone and cross-stream attention for non-coherent transmission, and a MoCM framework that uses a permutation-invariant network to dynamically switch between MSCT and MSNCT based on multi-satellite statistical CSI, claiming measurable semantic performance gains and complexity benefits under practical simulation configurations.

Significance. If the simulation results hold under realistic conditions, the work is significant for being the first integration of semantic communications with satellite massive MIMO and multi-satellite cooperation. The explicit design of HSTC for scalable computation, the two-stage receiver in MSNCT for interference exploitation, and the data-driven MoCM switcher represent concrete technical contributions that could improve efficiency in satellite networks by trading off semantic fidelity against complexity.

major comments (2)

[§VI] §VI (Simulation results): The central claim of performance gains for MSCT, MSNCT, and MoCM rests on simulations under 'practical configurations,' yet the channel model description omits satellite-specific impairments including Doppler spread, residual frequency offset, phase noise, and inter-satellite timing misalignment. These directly corrupt received feature maps prior to semantic decoding and before the mode-selection network receives its statistical CSI input; because the architectures rely on global attention and precise cross-stream processing, even modest phase errors could invalidate the reported advantages.
[§VI] §VI and abstract: No information is supplied on the baselines (e.g., conventional MIMO or non-semantic schemes), semantic performance metrics, error bars, data exclusion rules, or exact channel models (including any Doppler or synchronization effects). This prevents verification of whether the math and data support the claim that the proposed frameworks deliver gains.

minor comments (2)

[Abstract] Abstract: The phrase 'practical configurations' is used without elaboration; adding one sentence on the key parameters (carrier frequency, orbital geometry, number of satellites/users) would improve clarity.
[§II] Notation: The distinction between 'semantic performance' and conventional metrics (e.g., PSNR vs. semantic similarity) should be defined explicitly when first introduced in §II or §III.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important aspects of simulation rigor and reproducibility that we will address in the revision. Below we respond point-by-point to the major comments.

read point-by-point responses

Referee: [§VI] §VI (Simulation results): The central claim of performance gains for MSCT, MSNCT, and MoCM rests on simulations under 'practical configurations,' yet the channel model description omits satellite-specific impairments including Doppler spread, residual frequency offset, phase noise, and inter-satellite timing misalignment. These directly corrupt received feature maps prior to semantic decoding and before the mode-selection network receives its statistical CSI input; because the architectures rely on global attention and precise cross-stream processing, even modest phase errors could invalidate the reported advantages.

Authors: We thank the referee for this important observation. The simulations employ a multi-satellite channel model drawn from 3GPP NTN specifications that includes large-scale fading and Rician small-scale fading under the assumption of ideal synchronization and Doppler compensation (standard in many satellite MIMO studies to isolate the communication scheme). We acknowledge that explicit modeling of residual Doppler spread, phase noise, and inter-satellite timing misalignment is omitted. In the revised manuscript we will expand Section III to state these assumptions explicitly, add a paragraph discussing the sensitivity of global-attention-based architectures to phase errors, and note the omission as a limitation for future work that incorporates realistic synchronization impairments. This will clarify the scope of the reported gains without overstating generality. revision: yes
Referee: [§VI] §VI and abstract: No information is supplied on the baselines (e.g., conventional MIMO or non-semantic schemes), semantic performance metrics, error bars, data exclusion rules, or exact channel models (including any Doppler or synchronization effects). This prevents verification of whether the math and data support the claim that the proposed frameworks deliver gains.

Authors: We apologize for the insufficient detail. The baselines consist of (i) conventional massive-MIMO transmission with separate JPEG source coding plus LDPC channel coding and (ii) non-semantic end-to-end MIMO schemes. Performance is quantified via PSNR, SSIM, and a learned semantic similarity score; all results are averaged over 1000 independent Monte-Carlo channel realizations with error bars showing one standard deviation. No data points are excluded. The channel model is the Rician fading model with parameters listed in Table I (including the Doppler spread value used for the statistical CSI input to MoCM). In the revision we will insert a dedicated “Simulation Setup” subsection in Section VI that tabulates all baselines, metrics, statistical procedures, and exact channel parameters (including the Doppler and synchronization assumptions). This will enable full verification and reproducibility of the claimed gains. revision: yes

Circularity Check

0 steps flagged

No significant circularity; frameworks constructed from standard neural components with external statistical CSI inputs

full rationale

The paper proposes MSCT (HSTC symmetric encoder-decoder with Swin-Transformer + CNN blocks), MSNCT (Transformer backbone with cross-stream attention), and MoCM (permutation-invariant dynamic switcher) as new semantic communication frameworks for multi-satellite MIMO. These are instantiated via standard architectures (Transformers, CNNs) and use multi-satellite statistical CSI for mode selection rather than any fitted parameter or self-referential definition. No equations or derivations reduce the claimed performance gains to inputs by construction; simulation results serve as empirical validation under stated configurations. No load-bearing self-citations, uniqueness theorems, or ansatzes imported from prior author work are invoked in the provided text to force the central claims. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, the paper introduces no explicit free parameters, axioms, or invented entities; it relies on standard neural network building blocks and simulation-based claims.

pith-pipeline@v0.9.0 · 5586 in / 1307 out tokens · 57529 ms · 2026-05-12T02:21:35.881341+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

a permutation-invariant network dynamically switches between MSCT and MSNCT using multi-satellite statistical channel state information
IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

hybrid Swin-Transformer and lightweight bottleneck convolutional neural network (CNN) blocks, termed HSTC

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

48 extracted references · 48 canonical work pages · 1 internal anchor

[1]

A vision, survey, and roadmap toward space communications in the 6G and beyond era,

K. Ntontin, E. Lagunas, J. Querol, J. ur Rehmanet al., “A vision, survey, and roadmap toward space communications in the 6G and beyond era,” Proc. IEEE, pp. 1–37, Jan. 2025

work page 2025
[2]

Toward mobile satellite internet: The fundamental limitation of wireless transmission and enabling technologies,

W. Wang, Y . Zhu, Y . Wang, R. Dinget al., “Toward mobile satellite internet: The fundamental limitation of wireless transmission and enabling technologies,”Engineering, 2025. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2095809925003698

work page 2025
[3]

Large-scale MIMO enabled satellite communications: Concepts, technologies, and challenges,

Y . Wu, L. Xiao, J. Zhou, M. Fenget al., “Large-scale MIMO enabled satellite communications: Concepts, technologies, and challenges,”IEEE Commun. Mag., 2024. 13

work page 2024
[4]

Bluewalker 3,

AST SpaceMobile, “Bluewalker 3,” 2026, available: https://ast- science.com/spacemobile-network/bluewalker-3/. Accessed: Mar. 27, 2026

work page 2026
[5]

Statistical CSI- based distributed precoding design for OFDM-cooperative multi-satellite systems,

Y . Wang, V . N. Ha, K. Ntontin, H. Yanet al., “Statistical CSI- based distributed precoding design for OFDM-cooperative multi-satellite systems,”IEEE J. Sel. Areas Commun., vol. 44, pp. 3219–3236, Jan. 2026

work page 2026
[6]

Multi-LEO satellite coop- erative transmission: A spatial-temporal-frequency perspective,

Y . Wang, X. Xu, Y . Zhu, W. Wanget al., “Multi-LEO satellite coop- erative transmission: A spatial-temporal-frequency perspective,”IEEE Wireless Commun., pp. 1–8, 2026

work page 2026
[7]

Massive MIMO downlink transmission for multiple LEO satellite communication,

Z. Xiang, X. Gao, K.-X. Li, and X.-G. Xia, “Massive MIMO downlink transmission for multiple LEO satellite communication,”IEEE Trans. Commun., vol. 72, no. 6, pp. 3352–3364, Jun. 2024

work page 2024
[8]

Deep learning enabled semantic communication systems,

H. Xie, Z. Qin, G. Y . Li, and B.-H. Juang, “Deep learning enabled semantic communication systems,”IEEE Trans. Signal Process., vol. 69, pp. 2663–2675, 2021

work page 2021
[9]

Semantic communication: A survey on research landscape, challenges, and future directions,

T. M. Getu, G. Kaddoum, and M. Bennis, “Semantic communication: A survey on research landscape, challenges, and future directions,”Proc. IEEE, vol. 112, no. 11, pp. 1649–1685, Nov. 2024

work page 2024
[10]

Joint coding and modulation for robust semantic communication in satellite communications,

Z. Lin, H. Lin, Y . Sun, S. Basheeret al., “Joint coding and modulation for robust semantic communication in satellite communications,”IEEE Internet Things J., vol. 13, no. 1, pp. 339–346, Jan. 2026

work page 2026
[11]

Massive MIMO transmission for LEO satellite communications,

L. You, K.-X. Li, J. Wang, X. Gaoet al., “Massive MIMO transmission for LEO satellite communications,”IEEE J. Sel. Areas Commun., vol. 38, no. 8, pp. 1851–1865, Aug. 2020

work page 2020
[12]

Downlink transmit design for massive MIMO LEO satellite communications,

K.-X. Li, L. You, J. Wang, X. Gaoet al., “Downlink transmit design for massive MIMO LEO satellite communications,”IEEE Trans. Commun., vol. 70, no. 2, pp. 1014–1028, Feb. 2021

work page 2021
[13]

Energy and computational efficient precoding for LEO satellite communications,

S. Wu, Y . Wang, G. Sun, L. Youet al., “Energy and computational efficient precoding for LEO satellite communications,” inProc. IEEE Glob. Commun. Conf. (GLOBECOM), Kuala Lumpur, Malaysia, Dec. 2023, pp. 1872–1877

work page 2023
[14]

Hybrid analog/digital precoding for downlink massive MIMO LEO satellite communications,

L. You, X. Qiang, K.-X. Li, C. G. Tsinoset al., “Hybrid analog/digital precoding for downlink massive MIMO LEO satellite communications,” IEEE Trans. Wireless Commun., vol. 21, no. 8, pp. 5962–5976, 2022

work page 2022
[15]

User-centric beam selection and precoding design for coordinated multiple-satellite systems,

V . N. Ha, D. H. N. Nguyen, J. C.-M. Duncan, J. L. Gonzalez-Rios et al., “User-centric beam selection and precoding design for coordinated multiple-satellite systems,” inProc. IEEE 35th Int. Symp. Pers., Indoor Mobile Radio Commun. (PIMRC), Valencia, Spain, Sep. 2024, pp. 1–6

work page 2024
[16]

Multi-satellite coordinated beam hopping for interference mitigation under tilted beam effects: A graph-theoretic approach,

Z. Liu, Y . Wang, W. Wang, Y . Sunet al., “Multi-satellite coordinated beam hopping for interference mitigation under tilted beam effects: A graph-theoretic approach,”IEEE Wireless Commun. Lett., vol. 15, pp. 2313–2317, 2026

work page 2026
[17]

Carrier aggregation in satellite communications: Impact and performance study,

M. G. Kibria, E. Lagunas, N. Maturo, H. Al-Hraishawiet al., “Carrier aggregation in satellite communications: Impact and performance study,” IEEE Open J. Commun. Soc., vol. 1, pp. 1390–1402, 2020

work page 2020
[18]

Multi-satellite MIMO systems for direct satellite-to-device communications: A survey,

Z. M. Bakhsh, Y . Omid, G. Chen, F. Kayhanet al., “Multi-satellite MIMO systems for direct satellite-to-device communications: A survey,” IEEE Commun. Surveys Tuts., vol. 27, no. 3, pp. 1536–1564, Jun. 2025

work page 2025
[19]

Multi-satellite multi-stream beamspace massive MIMO transmission,

Y . Wang, Y . Zhu, V . N. Ha, W. Wanget al., “Multi-satellite multi-stream beamspace massive MIMO transmission,”arXiv preprint arXiv:2512.21998, 2025

work page arXiv 2025
[20]

Distributed beamforming for multiple LEO satellites with imperfect delay and Doppler compensa- tions: Modeling and rate analysis,

S. Wu, Y . Wang, G. Sun, W. Wanget al., “Distributed beamforming for multiple LEO satellites with imperfect delay and Doppler compensa- tions: Modeling and rate analysis,”IEEE Trans. Veh. Technol., vol. 74, no. 9, pp. 14 978–14 984, Sep. 2025

work page 2025
[21]

Enabling scalable distributed beam- forming via networked LEO satellites toward 6G,

Y . Zhang and T. Y . Al-Naffouri, “Enabling scalable distributed beam- forming via networked LEO satellites toward 6G,”IEEE Trans. Wireless Commun., vol. 25, pp. 6666–6680, 2026

work page 2026
[22]

Decentralized Cooperative Beamforming for Networked LEO Satellites with Statistical CSI

Y . Zhang, E. Lagunas, X. X. Zheng, S. Chatzinotaset al., “Decentralized cooperative beamforming for networked LEO satellites with statistical CSI,”arXiv preprint arXiv:2512.18890, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[23]

Deep learning-based multi- satellite massive MIMO transmission: Centralized or decentralized?

W. Cao, Y . Wang, J. Zhang, X. Xuet al., “Deep learning-based multi- satellite massive MIMO transmission: Centralized or decentralized?” arXiv preprint arXiv:2603.20862, 2026

work page arXiv 2026
[24]

Deep joint source- channel coding for adaptive image transmission over MIMO channels,

H. Wu, Y . Shao, C. Bian, K. Mikolajczyket al., “Deep joint source- channel coding for adaptive image transmission over MIMO channels,” IEEE Trans. Wireless Commun., vol. 23, no. 11, pp. 15 002–15 017, 2024

work page 2024
[25]

Semantic satellite communications for synchronized audiovisual reconstruction,

F. Liu, P. Jiang, W. Wang, C.-K. Wenet al., “Semantic satellite communications for synchronized audiovisual reconstruction,”arXiv preprint arXiv:2603.10791, 2026

work page arXiv 2026
[26]

Semantic image encoding and communication for earth observation with LEO satellites,

V .-P. Bui, T. Q. Dinh, I. Leyva-Mayorga, S. R. Pandeyet al., “Semantic image encoding and communication for earth observation with LEO satellites,”IEEE Trans. Cogn. Commun. Netw., vol. 11, no. 2, pp. 1210– 1224, 2025

work page 2025
[27]

EuroSAT: A novel dataset and deep learning benchmark for land use and land cover classification,

P. Helber, B. Bischke, A. Dengel, and D. Borth, “EuroSAT: A novel dataset and deep learning benchmark for land use and land cover classification,”IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 12, no. 7, pp. 2217–2226, 2019

work page 2019
[28]

Joint source and channel coding for multi-modal satellite-to-ground semantic communications,

Y . Yin, S. Liu, D. Wen, Y . Wuet al., “Joint source and channel coding for multi-modal satellite-to-ground semantic communications,” inProc. IEEE Wireless Commun. Networking Conf. (WCNC), 2025, pp. 1–6

work page 2025
[29]

Deepma: End-to- end deep multiple access for wireless image transmission in semantic communication,

W. Zhang, K. Bai, S. Zeadally, H. Zhanget al., “Deepma: End-to- end deep multiple access for wireless image transmission in semantic communication,”IEEE Trans. Cogn. Commun. Netw., vol. 10, no. 2, pp. 387–402, 2024

work page 2024
[30]

Optimization of image transmission in cooperative semantic communication networks,

W. Zhang, Y . Wang, M. Chen, T. Luoet al., “Optimization of image transmission in cooperative semantic communication networks,”IEEE Trans. Wireless Commun., vol. 23, no. 2, pp. 861–877, 2024

work page 2024
[31]

Joint beam alignment and doppler estimation for fast time-varying wideband mmWave channels,

H. Hou, Y . Wang, X. Yi, W. Wanget al., “Joint beam alignment and doppler estimation for fast time-varying wideband mmWave channels,” IEEE Trans. Wireless Commun., vol. 23, no. 9, pp. 10 895–10 910, Sep. 2024

work page 2024
[32]

Joint channel estimation and prediction for massive MIMO with frequency hopping sounding,

Y . Zhu, J. Zhuang, G. Sun, H. Houet al., “Joint channel estimation and prediction for massive MIMO with frequency hopping sounding,”IEEE Trans. Commun., vol. 73, no. 7, pp. 5139–5154, Jul. 2025

work page 2025
[33]

Toward multi- satellite cooperative transmission: A joint framework for CSI acquisition, feedback, and phase synchronization,

Y . Zhu, Y . Wang, C. Amatetti, A. Vanelli-Coralliet al., “Toward multi- satellite cooperative transmission: A joint framework for CSI acquisition, feedback, and phase synchronization,”arXiv preprint arXiv:2603.28195, 2026

work page arXiv 2026
[34]

Near optimal timing and fre- quency offset estimation for 5G integrated LEO satellite communication system,

W. Wang, Y . Tong, L. Li, A.-A. Luet al., “Near optimal timing and fre- quency offset estimation for 5G integrated LEO satellite communication system,”IEEE Access, vol. 7, pp. 113 298–113 310, 2019

work page 2019
[35]

Architectures and synchronization techniques for distributed satellite systems: A survey,

L. M. Marrero, J. C. Merlano-Duncan, J. Querol, S. Kumaret al., “Architectures and synchronization techniques for distributed satellite systems: A survey,”IEEE Access, vol. 10, pp. 45 375–45 409, 2022

work page 2022
[36]

Swin transformer: Hierarchical vision transformer using shifted windows,

Z. Liu, Y . Lin, Y . Cao, H. Huet al., “Swin transformer: Hierarchical vision transformer using shifted windows,” inProc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Oct. 2021, pp. 10 012–10 022

work page 2021
[37]

How do vision transformers work?

N. Park and S. Kim, “How do vision transformers work?” inProc. Int. Conf. Learn. Represent. (ICLR), Apr. 2022

work page 2022
[38]

Learned image compression with mixed transformer-CNN architectures,

J. Liu, H. Sun, and J. Katto, “Learned image compression with mixed transformer-CNN architectures,” inProc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2023, pp. 14 388–14 397

work page 2023
[39]

An image is worth 16x16 words: Transformers for image recognition at scale,

A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenbornet al., “An image is worth 16x16 words: Transformers for image recognition at scale,” inProc. Int. Conf. Learn. Represent. (ICLR), May 2021

work page 2021
[40]

Weighted MMSE precoding for constructive interference region,

Y . Wang, W. Wang, L. You, C. G. Tsinoset al., “Weighted MMSE precoding for constructive interference region,”IEEE Wireless Commun. Lett., vol. 11, no. 12, pp. 2605–2609, 2022

work page 2022
[41]

GLaM: Efficient scaling of language models with mixture-of-experts,

N. Du, Y . Huang, A. M. Dai, S. Tonget al., “GLaM: Efficient scaling of language models with mixture-of-experts,” inProc. 39th Int. Conf. Mach. Learn. (ICML), vol. 162, 2022, pp. 5547–5569

work page 2022
[42]

Toward unified AI models for MU-MIMO communications: A tensor equivariance framework,

Y . Wang, H. Hou, X. Yi, W. Wanget al., “Toward unified AI models for MU-MIMO communications: A tensor equivariance framework,”IEEE Trans. Wireless Commun., vol. 24, no. 12, pp. 10 517–10 533, Dec. 2025

work page 2025
[43]

Set transformer: A framework for attention-based permutation-invariant neural networks,

J. Lee, Y . Lee, J. Kim, A. R. Kosioreket al., “Set transformer: A framework for attention-based permutation-invariant neural networks,” inProc. 36th Int. Conf. Mach. Learn. (ICML), vol. 97, Jun. 2019, pp. 3744–3753

work page 2019
[44]

Federal communications commission; amendment to pending applica- tion for the SpaceX Gen2 NGSO satellite system,

“Federal communications commission; amendment to pending applica- tion for the SpaceX Gen2 NGSO satellite system,” FCC, Washington, D.C., Tech. Rep. File No. SAT-AMD-2021, August 2021, available: https://fcc.report/IBFS/SAT-AMD-20210818-00105/12943361.pdf

work page 2021
[45]

TR 38.811 v15.4.0: Study on new radio (NR) to support non- terrestrial networks,

3GPP, “TR 38.811 v15.4.0: Study on new radio (NR) to support non- terrestrial networks,” 3GPP, Tech. Rep. TR 38.811 V15.4.0, Sep. 2020

work page 2020
[46]

TR 38.821 v16.2.0: Solutions for NR to support non-terrestrial networks (NTN),

——, “TR 38.821 v16.2.0: Solutions for NR to support non-terrestrial networks (NTN),” 3GPP, Tech. Rep. TR 38.821 V16.2.0, Mar. 2023

work page 2023
[47]

ImageNet: A large-scale hierarchical image database,

J. Deng, W. Dong, R. Socher, L.-J. Liet al., “ImageNet: A large-scale hierarchical image database,” inProc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2009, pp. 248–255

work page 2009
[48]

On the role of ViT and CNN in semantic communications: Analysis and prototype validation,

H. Yoo, L. Dai, S. Kim, and C.-B. Chae, “On the role of ViT and CNN in semantic communications: Analysis and prototype validation,”IEEE Access, vol. 11, pp. 71 528–71 541, Jul. 2023

work page 2023

[1] [1]

A vision, survey, and roadmap toward space communications in the 6G and beyond era,

K. Ntontin, E. Lagunas, J. Querol, J. ur Rehmanet al., “A vision, survey, and roadmap toward space communications in the 6G and beyond era,” Proc. IEEE, pp. 1–37, Jan. 2025

work page 2025

[2] [2]

Toward mobile satellite internet: The fundamental limitation of wireless transmission and enabling technologies,

W. Wang, Y . Zhu, Y . Wang, R. Dinget al., “Toward mobile satellite internet: The fundamental limitation of wireless transmission and enabling technologies,”Engineering, 2025. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S2095809925003698

work page 2025

[3] [3]

Large-scale MIMO enabled satellite communications: Concepts, technologies, and challenges,

Y . Wu, L. Xiao, J. Zhou, M. Fenget al., “Large-scale MIMO enabled satellite communications: Concepts, technologies, and challenges,”IEEE Commun. Mag., 2024. 13

work page 2024

[4] [4]

Bluewalker 3,

AST SpaceMobile, “Bluewalker 3,” 2026, available: https://ast- science.com/spacemobile-network/bluewalker-3/. Accessed: Mar. 27, 2026

work page 2026

[5] [5]

Statistical CSI- based distributed precoding design for OFDM-cooperative multi-satellite systems,

Y . Wang, V . N. Ha, K. Ntontin, H. Yanet al., “Statistical CSI- based distributed precoding design for OFDM-cooperative multi-satellite systems,”IEEE J. Sel. Areas Commun., vol. 44, pp. 3219–3236, Jan. 2026

work page 2026

[6] [6]

Multi-LEO satellite coop- erative transmission: A spatial-temporal-frequency perspective,

Y . Wang, X. Xu, Y . Zhu, W. Wanget al., “Multi-LEO satellite coop- erative transmission: A spatial-temporal-frequency perspective,”IEEE Wireless Commun., pp. 1–8, 2026

work page 2026

[7] [7]

Massive MIMO downlink transmission for multiple LEO satellite communication,

Z. Xiang, X. Gao, K.-X. Li, and X.-G. Xia, “Massive MIMO downlink transmission for multiple LEO satellite communication,”IEEE Trans. Commun., vol. 72, no. 6, pp. 3352–3364, Jun. 2024

work page 2024

[8] [8]

Deep learning enabled semantic communication systems,

H. Xie, Z. Qin, G. Y . Li, and B.-H. Juang, “Deep learning enabled semantic communication systems,”IEEE Trans. Signal Process., vol. 69, pp. 2663–2675, 2021

work page 2021

[9] [9]

Semantic communication: A survey on research landscape, challenges, and future directions,

T. M. Getu, G. Kaddoum, and M. Bennis, “Semantic communication: A survey on research landscape, challenges, and future directions,”Proc. IEEE, vol. 112, no. 11, pp. 1649–1685, Nov. 2024

work page 2024

[10] [10]

Joint coding and modulation for robust semantic communication in satellite communications,

Z. Lin, H. Lin, Y . Sun, S. Basheeret al., “Joint coding and modulation for robust semantic communication in satellite communications,”IEEE Internet Things J., vol. 13, no. 1, pp. 339–346, Jan. 2026

work page 2026

[11] [11]

Massive MIMO transmission for LEO satellite communications,

L. You, K.-X. Li, J. Wang, X. Gaoet al., “Massive MIMO transmission for LEO satellite communications,”IEEE J. Sel. Areas Commun., vol. 38, no. 8, pp. 1851–1865, Aug. 2020

work page 2020

[12] [12]

Downlink transmit design for massive MIMO LEO satellite communications,

K.-X. Li, L. You, J. Wang, X. Gaoet al., “Downlink transmit design for massive MIMO LEO satellite communications,”IEEE Trans. Commun., vol. 70, no. 2, pp. 1014–1028, Feb. 2021

work page 2021

[13] [13]

Energy and computational efficient precoding for LEO satellite communications,

S. Wu, Y . Wang, G. Sun, L. Youet al., “Energy and computational efficient precoding for LEO satellite communications,” inProc. IEEE Glob. Commun. Conf. (GLOBECOM), Kuala Lumpur, Malaysia, Dec. 2023, pp. 1872–1877

work page 2023

[14] [14]

Hybrid analog/digital precoding for downlink massive MIMO LEO satellite communications,

L. You, X. Qiang, K.-X. Li, C. G. Tsinoset al., “Hybrid analog/digital precoding for downlink massive MIMO LEO satellite communications,” IEEE Trans. Wireless Commun., vol. 21, no. 8, pp. 5962–5976, 2022

work page 2022

[15] [15]

User-centric beam selection and precoding design for coordinated multiple-satellite systems,

V . N. Ha, D. H. N. Nguyen, J. C.-M. Duncan, J. L. Gonzalez-Rios et al., “User-centric beam selection and precoding design for coordinated multiple-satellite systems,” inProc. IEEE 35th Int. Symp. Pers., Indoor Mobile Radio Commun. (PIMRC), Valencia, Spain, Sep. 2024, pp. 1–6

work page 2024

[16] [16]

Multi-satellite coordinated beam hopping for interference mitigation under tilted beam effects: A graph-theoretic approach,

Z. Liu, Y . Wang, W. Wang, Y . Sunet al., “Multi-satellite coordinated beam hopping for interference mitigation under tilted beam effects: A graph-theoretic approach,”IEEE Wireless Commun. Lett., vol. 15, pp. 2313–2317, 2026

work page 2026

[17] [17]

Carrier aggregation in satellite communications: Impact and performance study,

M. G. Kibria, E. Lagunas, N. Maturo, H. Al-Hraishawiet al., “Carrier aggregation in satellite communications: Impact and performance study,” IEEE Open J. Commun. Soc., vol. 1, pp. 1390–1402, 2020

work page 2020

[18] [18]

Multi-satellite MIMO systems for direct satellite-to-device communications: A survey,

Z. M. Bakhsh, Y . Omid, G. Chen, F. Kayhanet al., “Multi-satellite MIMO systems for direct satellite-to-device communications: A survey,” IEEE Commun. Surveys Tuts., vol. 27, no. 3, pp. 1536–1564, Jun. 2025

work page 2025

[19] [19]

Multi-satellite multi-stream beamspace massive MIMO transmission,

Y . Wang, Y . Zhu, V . N. Ha, W. Wanget al., “Multi-satellite multi-stream beamspace massive MIMO transmission,”arXiv preprint arXiv:2512.21998, 2025

work page arXiv 2025

[20] [20]

Distributed beamforming for multiple LEO satellites with imperfect delay and Doppler compensa- tions: Modeling and rate analysis,

S. Wu, Y . Wang, G. Sun, W. Wanget al., “Distributed beamforming for multiple LEO satellites with imperfect delay and Doppler compensa- tions: Modeling and rate analysis,”IEEE Trans. Veh. Technol., vol. 74, no. 9, pp. 14 978–14 984, Sep. 2025

work page 2025

[21] [21]

Enabling scalable distributed beam- forming via networked LEO satellites toward 6G,

Y . Zhang and T. Y . Al-Naffouri, “Enabling scalable distributed beam- forming via networked LEO satellites toward 6G,”IEEE Trans. Wireless Commun., vol. 25, pp. 6666–6680, 2026

work page 2026

[22] [22]

Decentralized Cooperative Beamforming for Networked LEO Satellites with Statistical CSI

Y . Zhang, E. Lagunas, X. X. Zheng, S. Chatzinotaset al., “Decentralized cooperative beamforming for networked LEO satellites with statistical CSI,”arXiv preprint arXiv:2512.18890, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[23] [23]

Deep learning-based multi- satellite massive MIMO transmission: Centralized or decentralized?

W. Cao, Y . Wang, J. Zhang, X. Xuet al., “Deep learning-based multi- satellite massive MIMO transmission: Centralized or decentralized?” arXiv preprint arXiv:2603.20862, 2026

work page arXiv 2026

[24] [24]

Deep joint source- channel coding for adaptive image transmission over MIMO channels,

H. Wu, Y . Shao, C. Bian, K. Mikolajczyket al., “Deep joint source- channel coding for adaptive image transmission over MIMO channels,” IEEE Trans. Wireless Commun., vol. 23, no. 11, pp. 15 002–15 017, 2024

work page 2024

[25] [25]

Semantic satellite communications for synchronized audiovisual reconstruction,

F. Liu, P. Jiang, W. Wang, C.-K. Wenet al., “Semantic satellite communications for synchronized audiovisual reconstruction,”arXiv preprint arXiv:2603.10791, 2026

work page arXiv 2026

[26] [26]

Semantic image encoding and communication for earth observation with LEO satellites,

V .-P. Bui, T. Q. Dinh, I. Leyva-Mayorga, S. R. Pandeyet al., “Semantic image encoding and communication for earth observation with LEO satellites,”IEEE Trans. Cogn. Commun. Netw., vol. 11, no. 2, pp. 1210– 1224, 2025

work page 2025

[27] [27]

EuroSAT: A novel dataset and deep learning benchmark for land use and land cover classification,

P. Helber, B. Bischke, A. Dengel, and D. Borth, “EuroSAT: A novel dataset and deep learning benchmark for land use and land cover classification,”IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 12, no. 7, pp. 2217–2226, 2019

work page 2019

[28] [28]

Joint source and channel coding for multi-modal satellite-to-ground semantic communications,

Y . Yin, S. Liu, D. Wen, Y . Wuet al., “Joint source and channel coding for multi-modal satellite-to-ground semantic communications,” inProc. IEEE Wireless Commun. Networking Conf. (WCNC), 2025, pp. 1–6

work page 2025

[29] [29]

Deepma: End-to- end deep multiple access for wireless image transmission in semantic communication,

W. Zhang, K. Bai, S. Zeadally, H. Zhanget al., “Deepma: End-to- end deep multiple access for wireless image transmission in semantic communication,”IEEE Trans. Cogn. Commun. Netw., vol. 10, no. 2, pp. 387–402, 2024

work page 2024

[30] [30]

Optimization of image transmission in cooperative semantic communication networks,

W. Zhang, Y . Wang, M. Chen, T. Luoet al., “Optimization of image transmission in cooperative semantic communication networks,”IEEE Trans. Wireless Commun., vol. 23, no. 2, pp. 861–877, 2024

work page 2024

[31] [31]

Joint beam alignment and doppler estimation for fast time-varying wideband mmWave channels,

H. Hou, Y . Wang, X. Yi, W. Wanget al., “Joint beam alignment and doppler estimation for fast time-varying wideband mmWave channels,” IEEE Trans. Wireless Commun., vol. 23, no. 9, pp. 10 895–10 910, Sep. 2024

work page 2024

[32] [32]

Joint channel estimation and prediction for massive MIMO with frequency hopping sounding,

Y . Zhu, J. Zhuang, G. Sun, H. Houet al., “Joint channel estimation and prediction for massive MIMO with frequency hopping sounding,”IEEE Trans. Commun., vol. 73, no. 7, pp. 5139–5154, Jul. 2025

work page 2025

[33] [33]

Toward multi- satellite cooperative transmission: A joint framework for CSI acquisition, feedback, and phase synchronization,

Y . Zhu, Y . Wang, C. Amatetti, A. Vanelli-Coralliet al., “Toward multi- satellite cooperative transmission: A joint framework for CSI acquisition, feedback, and phase synchronization,”arXiv preprint arXiv:2603.28195, 2026

work page arXiv 2026

[34] [34]

Near optimal timing and fre- quency offset estimation for 5G integrated LEO satellite communication system,

W. Wang, Y . Tong, L. Li, A.-A. Luet al., “Near optimal timing and fre- quency offset estimation for 5G integrated LEO satellite communication system,”IEEE Access, vol. 7, pp. 113 298–113 310, 2019

work page 2019

[35] [35]

Architectures and synchronization techniques for distributed satellite systems: A survey,

L. M. Marrero, J. C. Merlano-Duncan, J. Querol, S. Kumaret al., “Architectures and synchronization techniques for distributed satellite systems: A survey,”IEEE Access, vol. 10, pp. 45 375–45 409, 2022

work page 2022

[36] [36]

Swin transformer: Hierarchical vision transformer using shifted windows,

Z. Liu, Y . Lin, Y . Cao, H. Huet al., “Swin transformer: Hierarchical vision transformer using shifted windows,” inProc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Oct. 2021, pp. 10 012–10 022

work page 2021

[37] [37]

How do vision transformers work?

N. Park and S. Kim, “How do vision transformers work?” inProc. Int. Conf. Learn. Represent. (ICLR), Apr. 2022

work page 2022

[38] [38]

Learned image compression with mixed transformer-CNN architectures,

J. Liu, H. Sun, and J. Katto, “Learned image compression with mixed transformer-CNN architectures,” inProc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2023, pp. 14 388–14 397

work page 2023

[39] [39]

An image is worth 16x16 words: Transformers for image recognition at scale,

A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenbornet al., “An image is worth 16x16 words: Transformers for image recognition at scale,” inProc. Int. Conf. Learn. Represent. (ICLR), May 2021

work page 2021

[40] [40]

Weighted MMSE precoding for constructive interference region,

Y . Wang, W. Wang, L. You, C. G. Tsinoset al., “Weighted MMSE precoding for constructive interference region,”IEEE Wireless Commun. Lett., vol. 11, no. 12, pp. 2605–2609, 2022

work page 2022

[41] [41]

GLaM: Efficient scaling of language models with mixture-of-experts,

N. Du, Y . Huang, A. M. Dai, S. Tonget al., “GLaM: Efficient scaling of language models with mixture-of-experts,” inProc. 39th Int. Conf. Mach. Learn. (ICML), vol. 162, 2022, pp. 5547–5569

work page 2022

[42] [42]

Toward unified AI models for MU-MIMO communications: A tensor equivariance framework,

Y . Wang, H. Hou, X. Yi, W. Wanget al., “Toward unified AI models for MU-MIMO communications: A tensor equivariance framework,”IEEE Trans. Wireless Commun., vol. 24, no. 12, pp. 10 517–10 533, Dec. 2025

work page 2025

[43] [43]

Set transformer: A framework for attention-based permutation-invariant neural networks,

J. Lee, Y . Lee, J. Kim, A. R. Kosioreket al., “Set transformer: A framework for attention-based permutation-invariant neural networks,” inProc. 36th Int. Conf. Mach. Learn. (ICML), vol. 97, Jun. 2019, pp. 3744–3753

work page 2019

[44] [44]

Federal communications commission; amendment to pending applica- tion for the SpaceX Gen2 NGSO satellite system,

“Federal communications commission; amendment to pending applica- tion for the SpaceX Gen2 NGSO satellite system,” FCC, Washington, D.C., Tech. Rep. File No. SAT-AMD-2021, August 2021, available: https://fcc.report/IBFS/SAT-AMD-20210818-00105/12943361.pdf

work page 2021

[45] [45]

TR 38.811 v15.4.0: Study on new radio (NR) to support non- terrestrial networks,

3GPP, “TR 38.811 v15.4.0: Study on new radio (NR) to support non- terrestrial networks,” 3GPP, Tech. Rep. TR 38.811 V15.4.0, Sep. 2020

work page 2020

[46] [46]

TR 38.821 v16.2.0: Solutions for NR to support non-terrestrial networks (NTN),

——, “TR 38.821 v16.2.0: Solutions for NR to support non-terrestrial networks (NTN),” 3GPP, Tech. Rep. TR 38.821 V16.2.0, Mar. 2023

work page 2023

[47] [47]

ImageNet: A large-scale hierarchical image database,

J. Deng, W. Dong, R. Socher, L.-J. Liet al., “ImageNet: A large-scale hierarchical image database,” inProc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2009, pp. 248–255

work page 2009

[48] [48]

On the role of ViT and CNN in semantic communications: Analysis and prototype validation,

H. Yoo, L. Dai, S. Kim, and C.-B. Chae, “On the role of ViT and CNN in semantic communications: Analysis and prototype validation,”IEEE Access, vol. 11, pp. 71 528–71 541, Jul. 2023

work page 2023