pith. machine review for the scientific record. sign in

arxiv: 2604.11979 · v1 · submitted 2026-04-13 · 📡 eess.SP

AI-Empowered Resource Allocation for Wirelessly Powered Pinching-Antenna Systems

Pith reviewed 2026-05-10 15:24 UTC · model grok-4.3

classification 📡 eess.SP
keywords pinching antennawireless power transferNOMAenergy efficiencydeep reinforcement learningresource allocationantenna positioning
0
0 comments X

The pith

A pinching-antenna array combined with deep reinforcement learning boosts energy efficiency in wirelessly powered NOMA systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines a multi-user system in which users harvest energy from a base station before transmitting information over non-orthogonal multiple access. A pinching antenna array is introduced because it can move to match changing propagation conditions and thereby improve both energy transfer and data delivery. The resulting joint optimization of antenna positions, transmit powers, and time-switching ratios is non-convex and must accommodate nonlinear harvesting and location uncertainties, so the authors develop a deep reinforcement learning algorithm that learns allocation policies directly from experience. Simulations show the learned policies produce markedly higher energy efficiency than fixed-antenna baselines. The work therefore demonstrates how adaptive hardware and learning-based control can together reduce the energy cost of sustaining wireless users.

Core claim

The central claim is that a deep reinforcement learning algorithm can autonomously learn near-optimal joint policies for pinching-antenna positioning, power control, and time-switching in a wirelessly powered NOMA system, delivering substantial energy-efficiency gains over conventional fixed-antenna schemes even when energy harvesting is nonlinear and user locations and battery states are uncertain.

What carries the argument

The deep reinforcement learning algorithm that learns resource-allocation policies for joint antenna positioning, transmit-power control, and time-switching ratio selection under dynamic propagation and battery conditions.

Load-bearing premise

The deep reinforcement learning algorithm can autonomously learn near-optimal resource allocation policies despite non-convexity, nonlinear energy harvesting, and uncertainties in user locations and battery states.

What would settle it

A side-by-side simulation or hardware test in which the learned policy produces equal or lower energy efficiency than a fixed-antenna scheme under rapid user movement and high battery uncertainty would falsify the reported gains.

Figures

Figures reproduced from arXiv: 2604.11979 by Fang Fang, Ming Zeng, Mohsen Ahmadzadeh, Saeid Pakravan, Xingwang Li.

Figure 1
Figure 1. Figure 1: System model and operation protocol of the proposed PA-assisted [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Performance evaluation of the proposed DRL-enabled adaptive PA positioning framework: (a) convergence of average reward; (b) EE versus the [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗
read the original abstract

This paper considers a multi-user system, where the users first harvest energy from the base station and then use the harvested energy to transmit information via non-orthogonal multiple access (NOMA). A pinching antenna array is adopted to assist the energy transfer and information transmission, owing to its ability to adapt to dynamic propagation conditions. To enhance the system's energy efficiency (EE), we formulate a joint optimization problem involving antenna positioning, transmit power control, and time-switching ratio selection. The problem is non-convex due to the coupled variables, nonlinear energy-harvesting characteristics, and uncertainties in user locations and battery states. To effectively solve this problem, a deep reinforcement learning-based algorithm is proposed to autonomously learn near-optimal resource allocation policies in dynamic environments. Simulation results demonstrate that the proposed PA-assisted scheme achieves significant gains in EE compared with conventional fixed-antenna schemes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript considers a multi-user wirelessly powered NOMA system in which users harvest energy from the base station before transmitting information. A pinching-antenna array is used to adapt to dynamic channels. The authors formulate a joint non-convex optimization problem over antenna positions, transmit powers, and time-switching ratios to maximize energy efficiency, incorporating nonlinear energy harvesting and uncertainties in user locations and battery states. They propose a deep reinforcement learning algorithm to learn near-optimal policies and report that simulations show significant EE gains relative to conventional fixed-antenna schemes.

Significance. If the DRL policies are shown to generalize across the stated uncertainties, the work would provide a concrete demonstration of reconfigurable-antenna-assisted resource allocation in energy-harvesting NOMA systems. The combination of pinching antennas with DRL for a coupled positioning/power/time-switching problem under nonlinear EH is a timely contribution to adaptive wireless-powered networks, provided the simulation evidence is made reproducible and the generalization claims are substantiated.

major comments (2)
  1. [Abstract and Simulation Results] Abstract and Simulation Results section: the central claim that the DRL agent 'autonomously learn[s] near-optimal resource allocation policies in dynamic environments' rests on simulation results, yet no information is supplied on the state/action spaces, reward design, training episode generation, baseline algorithms, number of Monte-Carlo runs, or how location and battery-state uncertainties are sampled. Without these details the reported EE gains cannot be assessed for robustness or generalization.
  2. [Problem Formulation and DRL Algorithm] Problem Formulation and DRL Algorithm sections: user locations and battery states are modeled as random variables, but the manuscript does not specify the probability distributions, the range of variation, or whether training episodes are drawn from a single fixed distribution versus a broader ensemble. If the training distribution is narrow, the learned policy may overfit and the claimed performance advantage under 'dynamic environments' would not hold.
minor comments (2)
  1. [System Model] Notation for the time-switching ratio and the nonlinear EH model should be introduced with explicit equations and parameter definitions in the System Model section to improve readability.
  2. [Abstract] The abstract would benefit from a brief statement of the number of users, the pinching-antenna array size, and the key simulation parameters that produce the reported EE gains.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help improve the clarity and reproducibility of our work. We will revise the manuscript to address the concerns regarding implementation details and uncertainty modeling.

read point-by-point responses
  1. Referee: [Abstract and Simulation Results] Abstract and Simulation Results section: the central claim that the DRL agent 'autonomously learn[s] near-optimal resource allocation policies in dynamic environments' rests on simulation results, yet no information is supplied on the state/action spaces, reward design, training episode generation, baseline algorithms, number of Monte-Carlo runs, or how location and battery-state uncertainties are sampled. Without these details the reported EE gains cannot be assessed for robustness or generalization.

    Authors: We agree that these details are necessary for evaluating robustness and generalization. In the revised manuscript, we will add a dedicated subsection under Simulation Results that explicitly describes the state and action spaces, reward function, training episode generation process, baseline algorithms, number of Monte-Carlo runs, and the sampling procedures for user locations and battery-state uncertainties. This addition will directly support the claims made in the abstract. revision: yes

  2. Referee: [Problem Formulation and DRL Algorithm] Problem Formulation and DRL Algorithm sections: user locations and battery states are modeled as random variables, but the manuscript does not specify the probability distributions, the range of variation, or whether training episodes are drawn from a single fixed distribution versus a broader ensemble. If the training distribution is narrow, the learned policy may overfit and the claimed performance advantage under 'dynamic environments' would not hold.

    Authors: We acknowledge the need for this clarification. The revised manuscript will specify the probability distributions (including ranges of variation) for user locations and battery states in the Problem Formulation section. We will also detail in the DRL Algorithm section how training episodes are generated, confirming use of a distribution that reflects the considered dynamic environments, and discuss steps taken to mitigate overfitting risks. revision: yes

Circularity Check

0 steps flagged

No significant circularity; DRL solver and simulation gains are independent of inputs

full rationale

The paper formulates a joint non-convex optimization over PA positioning, power, and time-switching under nonlinear EH and location/battery uncertainty, then applies a standard DRL algorithm as solver. Simulation results compare EE against fixed-antenna baselines. No derivation step reduces by construction to a fitted parameter, self-citation chain, or renamed input; the reported gains are empirical outputs from the proposed solver, not tautological. The approach is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on abstract only: no explicit free parameters, axioms, or invented entities are introduced beyond standard wireless channel models and DRL frameworks.

pith-pipeline@v0.9.0 · 5452 in / 1021 out tokens · 20700 ms · 2026-05-10T15:24:48.910111+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages

  1. [1]

    Energy-efficient power allocation in uplink mmwave massive MIMO with NOMA,

    M. Zenget al., “Energy-efficient power allocation in uplink mmwave massive MIMO with NOMA,”IEEE Trans. V eh. Technol., vol. 68, no. 3, pp. 3000–3004, Mar. 2019

  2. [2]

    Resource allocation for downlink NOMA systems: Key techniques and open issues,

    S. M. R. Islamet al., “Resource allocation for downlink NOMA systems: Key techniques and open issues,”IEEE Wireless Commun., vol. 25, no. 2, pp. 40–47, Apr. 2018

  3. [3]

    Practical non-linear energy harvesting model and resource allocation for SWIPT systems,

    E. Boshkovskaet al., “Practical non-linear energy harvesting model and resource allocation for SWIPT systems,”IEEE Commun. Lett., vol. 19, no. 12, pp. 2082–2085, Dec. 2015

  4. [4]

    Performance of a multiuser cooperative IoT NOMA network with battery-assisted energy harvesting,

    K. Agrawalet al., “Performance of a multiuser cooperative IoT NOMA network with battery-assisted energy harvesting,”IEEE Trans. Ind. Informat., vol. 20, no. 2, pp. 2307–2319, Feb. 2024

  5. [5]

    Energy-efficient resource allocation for NOMA- assisted uplink pinching-antenna systems,

    M. Zenget al., “Energy-efficient resource allocation for NOMA- assisted uplink pinching-antenna systems,”IEEE Wireless Commun. Lett., vol. 14, no. 11, pp. 3695 – 3699, Nov. 2025

  6. [6]

    Delay minimization in pinching-antenna-enabled NOMA- MEC networks,

    Y . Aiet al., “Delay minimization in pinching-antenna-enabled NOMA- MEC networks,”IEEE Commun. Lett., vol. 30, pp. 962 – 966, Jan. 2026

  7. [7]

    Performance analysis of wireless-powered pinching antenna systems,

    K. Caoet al., “Performance analysis of wireless-powered pinching antenna systems,”arXiv preprint arXiv:2511.03401, Nov. 2025

  8. [8]

    Pinching- antenna systems (PASS)-enabled secure wireless communications,

    G. Zhu, X. Mu, L. Guo, S. Xu, Y . Liu, and N. Al-Dhahir, “Pinching- antenna systems (PASS)-enabled secure wireless communications,” IEEE Trans. Commun., vol. 74, pp. 490–505, Oct. 2025

  9. [9]

    Pinching-antenna systems-enabled multi-user communi- cations: Transmission structures and beamforming optimization,

    J. Zhaoet al., “Pinching-antenna systems-enabled multi-user communi- cations: Transmission structures and beamforming optimization,”IEEE Trans. Commun., vol. 74, pp. 2316–2330, Dec. 2025

  10. [10]

    Pinching antenna-aided wireless powered communication networks,

    Y . Liet al., “Pinching antenna-aided wireless powered communication networks,”IEEE Wireless Commun. Lett., vol. 15, pp. 255 – 259, Oct. 2025

  11. [11]

    Modeling and beamforming optimization for pinching- antenna systems,

    Z. Wanget al., “Modeling and beamforming optimization for pinching- antenna systems,”IEEE Trans. Commun., vol. 73, no. 12, pp. 13 904– 13 919, Dec. 2025

  12. [12]

    Antenna activation for NOMA-assisted pinching- antenna systems,

    K. Wanget al., “Antenna activation for NOMA-assisted pinching- antenna systems,”IEEE Wireless Commun. Lett., vol. 14, no. 5, pp. 1526–1530, May. 2025

  13. [13]

    Array gain for pinching-antenna systems (PASS),

    C. Ouyanget al., “Array gain for pinching-antenna systems (PASS),” IEEE Commun. Lett., vol. 29, no. 6, pp. 1471–1475, May. 2025

  14. [14]

    AI-based secure NOMA and cognitive radio- enabled green communications: Channel state information and battery value uncertainties,

    S. Sheikhzadehet al., “AI-based secure NOMA and cognitive radio- enabled green communications: Channel state information and battery value uncertainties,”IEEE Trans. Green Commun. Netw., vol. 6, no. 2, pp. 1037–1054, Dec. 2021

  15. [15]

    Spectral and energy-efficient resource allocation for multi-carrier uplink NOMA systems,

    M. Zenget al., “Spectral and energy-efficient resource allocation for multi-carrier uplink NOMA systems,”IEEE Trans. V eh. Technol., vol. 68, no. 9, pp. 9293–9296, Mar. 2019

  16. [16]

    No-pain no-gain: DRL assisted optimization in energy- constrained CR-NOMA networks,

    Z. Dinget al., “No-pain no-gain: DRL assisted optimization in energy- constrained CR-NOMA networks,”IEEE Trans. Commun., vol. 69, no. 9, pp. 5917–5932, Sep. 2021

  17. [17]

    Fluid antenna-assisted uplink NOMA networks under imperfect SIC,

    S. Pakravanet al., “Fluid antenna-assisted uplink NOMA networks under imperfect SIC,”IEEE Trans. V eh. Technol., vol. 75, no. 1, pp. 1689– 1694, Jan. 2026

  18. [18]

    An evaluation of DDPG, TD3, SAC, and PPO: Deep reinforce- ment learning algorithms for controlling continuous system,

    S. Liu, “An evaluation of DDPG, TD3, SAC, and PPO: Deep reinforce- ment learning algorithms for controlling continuous system,” inProc. Int. Conf. Data Sci., Adv. Algorithm Intell. Comput. (DAI). Atlantis Press, Feb. 2024, pp. 15–24

  19. [19]

    AI-based fluid antenna design for client selection in over-the-air federated learning,

    M. Ahmadzadehet al., “AI-based fluid antenna design for client selection in over-the-air federated learning,” pp. 42 549–42 558, Oct. 2025