Transformer-Based Hybrid Beamforming with Reconfigurable Pixel Antenna for HAPS Communications

Keke Ying; Ruiqi Wang; Zhen Gao; Ziwei Wan

arxiv: 2605.17858 · v1 · pith:EDWEENMAnew · submitted 2026-05-18 · 💻 cs.IT · math.IT

Pith reviewed 2026-05-20 01:29 UTC · model grok-4.3

classification 💻 cs.IT math.IT

keywords: hybrid beamforming · reconfigurable pixel antenna · transformer encoder · HAPS · massive MIMO · spectral efficiency · pattern reconfiguration

The pith

Transformer-based network selects radiation patterns and precoders for reconfigurable pixel antennas in HAPS massive MIMO.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces PR-HBFNet, a framework that combines a Transformer encoder with residual learning to handle hybrid beamforming for reconfigurable pixel antennas in high-altitude platform station systems. The encoder chooses a radiation pattern for each antenna element from channel state information, after which a second network refines analog and digital precoders starting from singular value decomposition. The result is a design that delivers spectral efficiency close to an exhaustive greedy benchmark while requiring far less computation.

Core claim

The proposed PR-HBFNet, consisting of a pattern reconfigurable network with a Transformer encoder and a hybrid beamforming network with model-driven residual learning, achieves spectral efficiency close to that of the greedy benchmark while significantly reducing computational complexity in RPA-equipped massive MIMO for HAPS communications.

What carries the argument

Transformer encoder that maps channel state information to radiation patterns for each reconfigurable pixel antenna element, paired with model-driven residual learning over SVD initializations to obtain analog and digital precoders.
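As a rough illustration of what an SVD-based starting point for hybrid precoding can look like, the sketch below takes the dominant right singular vectors of the channel as the unconstrained precoder, keeps only their phases for the analog stage, and fits the digital stage by least squares. The phase-extraction step, the function name `svd_hybrid_init`, and the array shapes are assumptions made for illustration; this summary does not say how the paper constructs its initialization, and the learned residual correction is not modeled here.

```python
import numpy as np

def svd_hybrid_init(H, n_rf):
    """Hypothetical SVD-based starting point for hybrid precoders.

    H    : (K, N) complex channel, K users/streams, N transmit antennas.
    n_rf : number of RF chains (also the number of streams), n_rf <= min(K, N).
    Returns F_rf of shape (N, n_rf) and F_bb of shape (n_rf, n_rf).
    """
    N = H.shape[1]
    # Unconstrained precoder: dominant right singular vectors of H.
    _, _, Vh = np.linalg.svd(H, full_matrices=False)
    F_opt = Vh.conj().T[:, :n_rf]
    # Analog stage: constant-modulus entries, phases copied from F_opt.
    F_rf = np.exp(1j * np.angle(F_opt)) / np.sqrt(N)
    # Digital stage: least-squares fit so that F_rf @ F_bb approximates F_opt.
    F_bb = np.linalg.lstsq(F_rf, F_opt, rcond=None)[0]
    # Scale so the total transmit power ||F_rf F_bb||_F^2 equals n_rf.
    F_bb *= np.sqrt(n_rf) / np.linalg.norm(F_rf @ F_bb, "fro")
    return F_rf, F_bb
```

A residual network in the spirit described above would then output small corrections to `F_rf` and `F_bb` rather than predicting them from scratch.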

If this is right

Pattern reconfiguration can be performed in real time for each channel realization without exhaustive search.
Hybrid precoding that starts from SVD and uses residual learning remains effective even when each element has multiple possible radiation patterns.
The overall approach scales to large antenna arrays typical of HAPS deployments while keeping onboard processing feasible.
Spectral efficiency near the greedy optimum supports high-rate links from stratospheric platforms.
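To make the cost of searching over patterns concrete, here is a minimal sketch of one possible greedy search: a coordinate-wise sweep that tries every pattern on one element at a time and keeps any change that raises the sum spectral efficiency. The sweep order, the matched-filter precoder used for scoring, the `(P, K, N)` channel layout, and all function names are assumptions; the summary does not describe how the paper's greedy benchmark is built.

```python
import numpy as np

def mf_sum_se(H, noise_var=1.0):
    """Sum SE (bits/s/Hz) with matched-filter precoding; H is (K, N)."""
    F = H.conj().T
    F = F * np.sqrt(H.shape[0]) / np.linalg.norm(F, "fro")
    G = np.abs(H @ F) ** 2              # G[k, j]: power at user k from stream j
    sig = np.diag(G)
    return float(np.sum(np.log2(1.0 + sig / (G.sum(axis=1) - sig + noise_var))))

def effective_channel(H_all, patterns):
    """H_all: (P, K, N) channel under each of P patterns; patterns: (N,) indices.
    Column n of the result is taken from the channel under pattern patterns[n]."""
    N = H_all.shape[2]
    return H_all[patterns, :, np.arange(N)].T   # (K, N)

def greedy_patterns(H_all, noise_var=1.0):
    """Coordinate-wise greedy pattern search (illustrative, not the paper's)."""
    P, K, N = H_all.shape
    patterns = np.zeros(N, dtype=int)
    best = mf_sum_se(effective_channel(H_all, patterns), noise_var)
    evals = 1
    for n in range(N):
        for p in range(1, P):
            trial = patterns.copy()
            trial[n] = p
            se = mf_sum_se(effective_channel(H_all, trial), noise_var)
            evals += 1
            if se > best:
                best, patterns = se, trial
    return patterns, best, evals
```

Under these assumptions the sweep needs 1 + N(P - 1) objective evaluations per channel realization, against P^N for exhaustive search; a trained network replaces the sweep with a single forward pass.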

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same Transformer-plus-residual structure could be tested on other reconfigurable antennas such as those used in low-Earth-orbit satellite systems.
If pattern selection generalizes across frequencies, the framework might reduce the number of RF chains needed in future HAPS designs.
Direct comparison against measured HAPS channels rather than simulated ones would reveal whether the learned patterns remain near-optimal under real propagation conditions.

Load-bearing premise

The Transformer encoder can reliably learn and output near-optimal radiation patterns for each reconfigurable pixel antenna element given the channel state information in the HAPS massive MIMO setting.

What would settle it

Simulations or hardware tests that measure the gap in achievable spectral efficiency between PR-HBFNet and the greedy benchmark together with the ratio of their runtimes; a large efficiency gap or no meaningful runtime reduction would falsify the central performance claim.
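The two quantities named above, the spectral-efficiency gap and the runtime ratio, can be measured with a small harness like the following sketch. It assumes single-antenna users, a sum-rate metric with interference treated as noise, and precoder functions that map a `(K, N)` channel to an `(N, K)` precoder; `matched_filter` is only a stand-in method, and every name here is hypothetical rather than taken from the paper.

```python
import time
import numpy as np

def sum_se(H, F, noise_var=1.0):
    """Sum SE (bits/s/Hz); H is (K, N), F is (N, K), column k serves user k."""
    G = np.abs(H @ F) ** 2
    signal = np.diag(G)
    interference = G.sum(axis=1) - signal
    return float(np.sum(np.log2(1.0 + signal / (interference + noise_var))))

def matched_filter(H):
    """Stand-in precoder with total power K."""
    F = H.conj().T
    return F * np.sqrt(H.shape[0]) / np.linalg.norm(F, "fro")

def gap_and_speedup(candidate, benchmark, channels, noise_var=1.0):
    """Return (relative SE gap to the benchmark, benchmark/candidate runtime ratio)."""
    se = {"cand": 0.0, "bench": 0.0}
    elapsed = {"cand": 0.0, "bench": 0.0}
    for H in channels:
        for name, fn in (("cand", candidate), ("bench", benchmark)):
            t0 = time.perf_counter()
            F = fn(H)
            elapsed[name] += time.perf_counter() - t0
            se[name] += sum_se(H, F, noise_var)
    gap = (se["bench"] - se["cand"]) / se["bench"]
    speedup = elapsed["bench"] / max(elapsed["cand"], 1e-12)
    return gap, speedup
```

A gap near zero together with a speedup well above one would support the central claim; wall-clock timing depends on hardware, so FLOP counts are a useful complement.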

Figures

Figures reproduced from arXiv: 2605.17858 by Keke Ying, Ruiqi Wang, Zhen Gao, Ziwei Wan.

**Figure 2.** Proposed Pattern Reconfigurable Hybrid Beamforming Network.

**Figure 3.** Reconfigurable radiation patterns. Text captured with the caption: “… modes that offer improved spatial adaptability compared to conventional fixed-pattern arrays. The generated channel dataset is partitioned into training, validation, and testing subsets, containing 102,400, 10,240, and 10,240 samples, respectively. The proposed PR-HBFNet is implemented on a system with an NVIDIA GeForce GTX 2080Ti GPU. Training is conducted with a batch s…”

**Figure 4.** SE performance comparison versus Pt. Plot text captured with the caption: x-axis “Number of UE”; y-axis “Spectral Efficiency (bits/Hz)”; legend: Proposed algorithm, Greedy algorithm, Fixed pattern 1 to 4, Random pattern.

**Figure 5.** SE performance comparison versus number of UEs.

Original abstract

This paper proposes a Transformer-based hybrid beamforming framework for reconfigurable pixel antenna (RPA)-equipped massive multiple-input multiple-output (MIMO) in high-altitude platform station (HAPS) communications. The proposed pattern reconfigurable hybrid beamforming network (PR-HBFNet) comprises two key components: 1) a pattern reconfigurable network that leverages a Transformer encoder to determine the radiation pattern for each antenna element, and 2) a hybrid beamforming network that employs model-driven residual learning to compute analog and digital precoders over SVD-based initializations. Simulation results demonstrate that the proposed PR-HBFNet closely approaches the spectral efficiency of a greedy benchmark while significantly reducing computational complexity.
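To show the shape of the CSI-to-pattern mapping, the sketch below treats each antenna element as one token, applies a single self-attention layer, and picks the highest-scoring pattern per element. It is a deliberately reduced stand-in: one head, no layer normalization or feed-forward block, and random untrained weights. The token construction, the dimensions, and the names `select_patterns` and `random_params` are all assumptions, not the paper's architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def random_params(K, d, P, rng):
    """Untrained weights for the illustrative single-layer encoder."""
    def w(rows, cols):
        return rng.standard_normal((rows, cols)) / np.sqrt(rows)
    return {"W_in": w(2 * K, d), "W_q": w(d, d), "W_k": w(d, d),
            "W_v": w(d, d), "W_out": w(d, P)}

def select_patterns(H, params):
    """H: (K, N) complex CSI. Returns an (N,) array of pattern indices."""
    # One token per element: real and imaginary parts of its K channel gains.
    X = np.concatenate([H.real, H.imag], axis=0).T          # (N, 2K)
    Z = X @ params["W_in"]                                   # (N, d)
    Q, Km, V = Z @ params["W_q"], Z @ params["W_k"], Z @ params["W_v"]
    A = softmax(Q @ Km.T / np.sqrt(Q.shape[1]))              # (N, N) attention
    Z = Z + A @ V                                            # residual connection
    logits = Z @ params["W_out"]                             # (N, P)
    return logits.argmax(axis=1)
```

The hard `argmax` is not differentiable, so training such a selector end to end needs some relaxation or surrogate gradient; which one the paper uses is not stated in this summary.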

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper's new angle is pairing a Transformer encoder for per-element RPA pattern selection with residual learning on SVD initializations for hybrid beamforming in HAPS MIMO.

The main contribution is PR-HBFNet, which uses a Transformer to pick a radiation pattern for each reconfigurable pixel antenna element and then applies model-driven residual learning to SVD initializations to obtain the analog and digital precoders. That specific combination for the RPA-HAPS setting looks new rather than a direct repeat of earlier Transformer beamforming work or standard reconfigurable antenna papers. It targets a real issue: keeping spectral efficiency high while cutting the heavy computation that comes with large arrays and per-element pattern choices on high-altitude platforms. The residual step keeps the design anchored in conventional beamforming instead of treating everything as a black-box fit, which is a reasonable design choice. The simulation claim that it gets close to a greedy benchmark at lower complexity is the sort of practical outcome that could matter for extending coverage in remote or emergency scenarios.

On the soft side, the abstract gives no numbers on the actual gap, no training details, no error bars, and no tests across varied channel models, so the performance edge is hard to judge from what is stated. The central bet that the Transformer reliably outputs near-optimal patterns from CSI also rests on simulation evidence that needs more unpacking in the full text.

This is aimed at people working on aerial wireless systems and reconfigurable antennas rather than a broad audience; a reader already following hybrid beamforming or HAPS papers would get the most out of it. I would send it for peer review because the problem is relevant, the architecture is grounded, and the approach shows clear thinking even if the current results stay preliminary.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a Transformer-based hybrid beamforming framework (PR-HBFNet) for reconfigurable pixel antenna (RPA) equipped massive MIMO systems in high-altitude platform station (HAPS) communications. The architecture consists of (1) a pattern reconfigurable network that uses a Transformer encoder to determine the radiation pattern for each antenna element based on channel state information, and (2) a hybrid beamforming network that applies model-driven residual learning to compute analog and digital precoders initialized from SVD. The central claim, supported by simulations, is that PR-HBFNet achieves spectral efficiency close to that of a greedy benchmark while substantially lowering computational complexity.

Significance. If the simulation results are robust, the work demonstrates a viable machine-learning-assisted approach to joint pattern reconfiguration and hybrid precoding in HAPS massive MIMO, where the combination of Transformer-based pattern selection and residual learning from conventional SVD initializations offers a practical trade-off between performance and complexity. This could be relevant for systems requiring real-time adaptation under the unique propagation conditions of high-altitude platforms.

major comments (2)

[§4] §4 (Simulation Results): The central performance claim that PR-HBFNet 'closely approaches' the spectral efficiency of the greedy benchmark is stated without quantitative gaps (e.g., percentage difference or absolute SE values), error bars, or explicit complexity reduction ratios (e.g., FLOPs or runtime). These metrics are load-bearing for the claim that the method offers a meaningful complexity-performance trade-off; their absence makes it impossible to judge how close 'closely' actually is or whether the reduction is sufficient to justify the added training overhead.
[§3.2] §3.2 (Hybrid Beamforming Network): The residual learning module is initialized from SVD and trained to refine analog/digital precoders, but the manuscript does not specify the loss function, training dataset size, or convergence criteria. Because the overall performance rests on the learned residual correction being near-optimal for the HAPS channel distribution, the lack of these training details leaves the reliability of the model-driven component unverified.
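The quantitative reporting requested in the first major comment can be sketched in a few lines: per-realization relative gaps to the benchmark, summarized by a mean and a normal-approximation 95% confidence half-width. The function names and the choice of a z-interval are assumptions for illustration; the inputs would be per-realization spectral efficiencies from Monte Carlo runs.

```python
import math
import statistics

def relative_gaps(se_method, se_benchmark):
    """Per-realization relative SE gap (benchmark - method) / benchmark."""
    return [(b - m) / b for m, b in zip(se_method, se_benchmark)]

def mean_with_ci(samples, z=1.96):
    """Sample mean and half-width of a normal-approximation confidence interval."""
    n = len(samples)
    mean = statistics.mean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(n)
    return mean, half_width
```

Reporting the gap as mean ± half-width, alongside absolute SE values, would let a reader judge how close "closely approaches" actually is.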

minor comments (2)

[Abstract] The abstract and introduction use 'approaches' and 'significantly reducing' without defining the quantitative thresholds; adding explicit numerical targets would improve clarity.
[§3] Notation for the Transformer encoder output (radiation pattern indices) and the residual precoder updates should be introduced with a single consistent symbol table to avoid ambiguity when reading §3.1 and §3.2 together.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the presentation of our results and methods. We address each major comment below and will revise the manuscript to incorporate the requested details.

Referee: [§4] §4 (Simulation Results): The central performance claim that PR-HBFNet 'closely approaches' the spectral efficiency of the greedy benchmark is stated without quantitative gaps (e.g., percentage difference or absolute SE values), error bars, or explicit complexity reduction ratios (e.g., FLOPs or runtime). These metrics are load-bearing for the claim that the method offers a meaningful complexity-performance trade-off; their absence makes it impossible to judge how close 'closely' actually is or whether the reduction is sufficient to justify the added training overhead.

Authors: We agree that explicit quantitative metrics would strengthen the central claim. In the revised version of §4, we will report the absolute spectral efficiency values for PR-HBFNet and the greedy benchmark across the simulated SNR range, the relative percentage gaps, error bars computed from multiple Monte Carlo channel realizations, and explicit complexity metrics including FLOP counts and runtime comparisons on standard hardware. These additions will allow a precise evaluation of the performance-complexity trade-off.
Revision: yes
Referee: [§3.2] §3.2 (Hybrid Beamforming Network): The residual learning module is initialized from SVD and trained to refine analog/digital precoders, but the manuscript does not specify the loss function, training dataset size, or convergence criteria. Because the overall performance rests on the learned residual correction being near-optimal for the HAPS channel distribution, the lack of these training details leaves the reliability of the model-driven component unverified.

Authors: We acknowledge that these training specifics are necessary for assessing the reliability of the residual learning component. In the revised §3.2, we will explicitly state the loss function used to train the residual corrections, the number of channel realizations in the training dataset, and the convergence criteria (including any early stopping rules). These details were part of our implementation but will now be documented to improve reproducibility.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external benchmarks

The proposed PR-HBFNet architecture combines a Transformer encoder for RPA pattern selection with model-driven residual hybrid beamforming initialized from SVD. Performance is validated via simulations against an independent greedy benchmark for spectral efficiency and complexity. No equations or claims reduce by construction to fitted inputs, self-citations, or renamed known results; the central results are empirical and externally falsifiable.

Axiom & Free-Parameter Ledger

1 free parameter · 0 axioms · 1 invented entity

The central claim rests on the effectiveness of a learned Transformer for pattern selection and residual corrections starting from SVD; these are introduced by the paper without external benchmarks or proofs supplied in the abstract.

free parameters (1)

Transformer encoder weights and residual network parameters
Neural network parameters are fitted during training on simulated channels; exact count and initialization not stated in abstract.

invented entities (1)

PR-HBFNet (no independent evidence)
purpose: Joint pattern reconfiguration and hybrid precoding for RPA-MIMO HAPS
The named two-stage network is the paper's proposed construct.

pith-pipeline@v0.9.0 · 5645 in / 1273 out tokens · 40848 ms · 2026-05-20T01:29:35.015575+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · relation: unclear
Paper passage: “The proposed pattern reconfigurable hybrid beamforming network (PR-HBFNet) comprises two key components: 1) a pattern reconfigurable network that leverages a Transformer encoder to determine the radiation pattern for each antenna element, and 2) a hybrid beamforming network that employs model-driven residual learning to compute analog and digital precoders over SVD-based initializations.”

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · relation: unclear
Paper passage: “Simulation results demonstrate that the proposed PR-HBFNet closely approaches the spectral efficiency of a greedy benchmark while significantly reducing computational complexity.”

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

14 extracted references · 14 canonical work pages · 1 internal anchor

[1] G. K. Kurt et al., “A Vision and Framework for the High Altitude Platform Station (HAPS) Networks of the Future,” IEEE Commun. Surveys Tuts., vol. 23, no. 2, pp. 729–779, 2021.

[2] S. C. Arum, D. Grace, and P. D. Mitchell, “A review of wireless communication using high-altitude platforms for extended coverage and capacity,” Comput. Commun., vol. 157, pp. 232–256, 2020.

[3] ITU-R, “World Radiocommunication Conference 2019 (WRC-19) Final Acts,” Geneva, Switzerland, 2019.

[4] R. Wang, Z. Gao, K. Ying, Z. Wan, S. Chatzinotas, and M.-S. Alouini, “Transformer-Based Hybrid Beamforming With Dynamic Subarray for Near-Space Airship-Borne Communications,” IEEE Wireless Commun. Lett., vol. 15, pp. 1876–1880, 2026.

[5] O. El Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, “Spatially Sparse Precoding in Millimeter Wave MIMO Systems,” IEEE Trans. Wireless Commun., vol. 13, no. 3, pp. 1499–1513, Mar. 2014.

[6] Y. Zhang et al., “A Highly Pattern-Reconfigurable Planar Antenna With 360° Single- and Multi-Beam Steering,” IEEE Trans. Antennas Propag., vol. 70, no. 8, pp. 6490–6504, Aug. 2022.

[7] K. Ying et al., “Reconfigurable Massive MIMO: Precoding Design and Channel Estimation in the Electromagnetic Domain,” IEEE Trans. Commun., vol. 72, no. 12, pp. 7609–7625, Dec. 2024.

[8] S. Shen, Y. Sun, S. Song, D. P. Palomar, and R. D. Murch, “Successive Boolean Optimization of Planar Pixel Antennas,” IEEE Trans. Antennas Propag., vol. 65, no. 2, pp. 920–925, Feb. 2017.

[9] Z. Lian, L. Jiang, and C. He, “A 3-D GBSM Based on Isotropic and Non-Isotropic Scattering for HAP-MIMO Channel,” IEEE Commun. Lett., vol. 22, no. 5, pp. 1046–1049, May 2018.

[10] S. Jaeckel et al., “QuaDRiGa - Quasi Deterministic Radio Channel Generator, User Manual and Documentation,” Heinrich Hertz Institute, Tech. Rep. v2.6.1, 2021.

[11] A. Vaswani et al., “Attention Is All You Need,” in Proc. Int. Conf. Adv. Neural Inf. Process. Syst. (NeurIPS), Long Beach, CA, USA, Dec. 2017, pp. 5998–6008.

[12] Y. Wang et al., “Transformer-Empowered 6G Intelligent Networks: From Massive MIMO Processing to Semantic Communication,” IEEE Wireless Commun., vol. 30, no. 6, pp. 127–135, Dec. 2023.

[13] Q. Hu, Y. Cai, Q. Shi, K. Xu, G. Yu, and Z. Ding, “Iterative Algorithm Induced Deep-Unfolding Neural Networks: Precoding Design for Multiuser MIMO Systems,” IEEE Trans. Wireless Commun., vol. 20, no. 2, pp. 1394–1410, Feb. 2021.

[14] Y. Bengio, N. Léonard, and A. Courville, “Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation,” arXiv preprint arXiv:1308.3432, 2013. (internal anchor)
