AI-Empowered Resource Allocation for Wirelessly Powered Pinching-Antenna Systems
Pith reviewed 2026-05-10 15:24 UTC · model grok-4.3
The pith
A pinching-antenna array combined with deep reinforcement learning boosts energy efficiency in wirelessly powered NOMA systems.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a deep reinforcement learning algorithm can autonomously learn near-optimal joint policies for pinching-antenna positioning, power control, and time-switching in a wirelessly powered NOMA system, delivering substantial energy-efficiency gains over conventional fixed-antenna schemes even when energy harvesting is nonlinear and user locations and battery states are uncertain.
What carries the argument
The deep reinforcement learning algorithm that learns resource-allocation policies for joint antenna positioning, transmit-power control, and time-switching ratio selection under dynamic propagation and battery conditions.
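To make the shape of such a joint policy concrete, here is a minimal sketch of how a raw policy output could be mapped to a feasible action. The paper's actual action space is not given in the source, so the waveguide length, power cap, time-switching bounds, and the names `Action` and `decode_action` are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Action:
    positions_m: List[float]  # pinching-antenna positions along the waveguide
    tx_power_w: float         # transmit power
    tau: float                # fraction of each slot spent on energy transfer


def decode_action(
    raw: Sequence[float],
    waveguide_len_m: float = 10.0,  # assumed waveguide length
    p_max_w: float = 1.0,           # assumed transmit-power cap
    tau_min: float = 0.05,          # assumed bounds that keep both phases non-empty
    tau_max: float = 0.95,
) -> Action:
    """Map a raw vector (N positions, then power, then tau) to a feasible action."""
    if len(raw) < 3:
        raise ValueError("need at least one position, one power and one tau entry")
    # Clip each entry to [-1, 1], then rescale to [0, 1].
    unit = [(min(1.0, max(-1.0, x)) + 1.0) / 2.0 for x in raw]
    *pos_unit, power_unit, tau_unit = unit
    # Sorting keeps the antenna ordering along the waveguide unambiguous.
    positions = sorted(u * waveguide_len_m for u in pos_unit)
    return Action(
        positions_m=positions,
        tx_power_w=power_unit * p_max_w,
        tau=tau_min + tau_unit * (tau_max - tau_min),
    )
```

Clipping before rescaling means any bounded or unbounded policy output yields an action inside the assumed feasible set, which is one common way to handle box constraints in continuous-control DRL.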
Load-bearing premise
The deep reinforcement learning algorithm can autonomously learn near-optimal resource allocation policies despite non-convexity, nonlinear energy harvesting, and uncertainties in user locations and battery states.
What would settle it
A side-by-side simulation or hardware test in which the learned policy produces equal or lower energy efficiency than a fixed-antenna scheme under rapid user movement and high battery uncertainty would falsify the reported gains.
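One way to run such a side-by-side test is a paired comparison over common random scenarios. The sketch below assumes the energy-efficiency values for both schemes have already been measured on the same scenarios; the function name `paired_ee_gain` and the normal-approximation interval are assumptions for illustration, not the paper's evaluation protocol.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_ee_gain(
    ee_learned: Sequence[float],
    ee_fixed: Sequence[float],
    z: float = 1.96,  # roughly a 95% interval under a normal approximation
) -> Tuple[float, float, float]:
    """Return (mean gain, lower bound, upper bound) of learned minus fixed EE."""
    if len(ee_learned) != len(ee_fixed) or len(ee_learned) < 2:
        raise ValueError("need two equally long samples with at least 2 scenarios")
    # Pairing on identical scenarios removes scenario-to-scenario variance.
    diffs = [a - b for a, b in zip(ee_learned, ee_fixed)]
    mean = statistics.fmean(diffs)
    half_width = z * statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean, mean - half_width, mean + half_width
```

Under this sketch, an upper bound at or below zero on scenarios with rapid user movement and high battery uncertainty would count as evidence against the reported gains.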
Original abstract
This paper considers a multi-user system, where the users first harvest energy from the base station and then use the harvested energy to transmit information via non-orthogonal multiple access (NOMA). A pinching antenna array is adopted to assist the energy transfer and information transmission, owing to its ability to adapt to dynamic propagation conditions. To enhance the system's energy efficiency (EE), we formulate a joint optimization problem involving antenna positioning, transmit power control, and time-switching ratio selection. The problem is non-convex due to the coupled variables, nonlinear energy-harvesting characteristics, and uncertainties in user locations and battery states. To effectively solve this problem, a deep reinforcement learning-based algorithm is proposed to autonomously learn near-optimal resource allocation policies in dynamic environments. Simulation results demonstrate that the proposed PA-assisted scheme achieves significant gains in EE compared with conventional fixed-antenna schemes.
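For readers unfamiliar with the information-transmission phase, the following sketch computes per-user uplink NOMA rates under successive interference cancellation (SIC). It is a textbook model with perfect SIC and strongest-first decoding, both of which are assumptions here; the paper's exact rate expression and decoding order are not given in the source, and `uplink_noma_rates` is a hypothetical name.

```python
import math
from typing import List, Sequence


def uplink_noma_rates(rx_powers_w: Sequence[float], noise_w: float) -> List[float]:
    """Per-user rates in bit/s/Hz, returned in the same order as the input.

    rx_powers_w[k] is user k's received power (transmit power times channel gain).
    """
    if noise_w <= 0:
        raise ValueError("noise power must be positive")
    # Decode the strongest received signal first (assumed decoding order).
    order = sorted(range(len(rx_powers_w)), key=lambda k: rx_powers_w[k], reverse=True)
    rates = [0.0] * len(rx_powers_w)
    residual = sum(rx_powers_w)
    for k in order:
        # Signals not yet decoded act as interference for user k.
        residual -= rx_powers_w[k]
        rates[k] = math.log2(1.0 + rx_powers_w[k] / (residual + noise_w))
    return rates
```

With perfect SIC the per-user rates sum to `log2(1 + total received power / noise)` regardless of decoding order, which gives a quick sanity check on any implementation.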
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript considers a multi-user wirelessly powered NOMA system in which users harvest energy from the base station before transmitting information. A pinching-antenna array is used to adapt to dynamic channels. The authors formulate a joint non-convex optimization problem over antenna positions, transmit powers, and time-switching ratios to maximize energy efficiency, incorporating nonlinear energy harvesting and uncertainties in user locations and battery states. They propose a deep reinforcement learning algorithm to learn near-optimal policies and report that simulations show significant EE gains relative to conventional fixed-antenna schemes.
Significance. If the DRL policies are shown to generalize across the stated uncertainties, the work would provide a concrete demonstration of reconfigurable-antenna-assisted resource allocation in energy-harvesting NOMA systems. The combination of pinching antennas with DRL for a coupled positioning/power/time-switching problem under nonlinear EH is a timely contribution to adaptive wireless-powered networks, provided the simulation evidence is made reproducible and the generalization claims are substantiated.
major comments (2)
- [Abstract and Simulation Results] The central claim that the DRL agent 'autonomously learn[s] near-optimal resource allocation policies in dynamic environments' rests on simulation results, yet no information is supplied on the state/action spaces, reward design, training episode generation, baseline algorithms, number of Monte-Carlo runs, or how location and battery-state uncertainties are sampled. Without these details the reported EE gains cannot be assessed for robustness or generalization.
- [Problem Formulation and DRL Algorithm] User locations and battery states are modeled as random variables, but the manuscript does not specify the probability distributions, the range of variation, or whether training episodes are drawn from a single fixed distribution or a broader ensemble. If the training distribution is narrow, the learned policy may overfit and the claimed performance advantage under 'dynamic environments' would not hold.
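The kind of specification these comments ask for can be sketched as follows: explicit, seedable distributions for user locations and battery states, with a wider evaluation distribution held apart from the training one. Every distribution, range, and name below (`ScenarioDist`, `sample_episode`, `TRAIN`, `SHIFTED`) is a placeholder assumption, since the manuscript's values are not stated in the source.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ScenarioDist:
    x_range_m: Tuple[float, float]        # user position along the waveguide axis
    y_range_m: Tuple[float, float]        # user position across the service area
    battery_range_j: Tuple[float, float]  # initial stored energy


def sample_episode(dist: ScenarioDist, num_users: int, rng: random.Random) -> List[Dict[str, float]]:
    """Draw one episode's initial user states from uniform distributions."""
    return [
        {
            "x_m": rng.uniform(*dist.x_range_m),
            "y_m": rng.uniform(*dist.y_range_m),
            "battery_j": rng.uniform(*dist.battery_range_j),
        }
        for _ in range(num_users)
    ]


# Hypothetical training distribution and a wider one for out-of-distribution tests.
TRAIN = ScenarioDist((0.0, 10.0), (-2.0, 2.0), (0.2, 0.8))
SHIFTED = ScenarioDist((0.0, 20.0), (-5.0, 5.0), (0.0, 1.0))
```

Evaluating a policy trained on `TRAIN` against episodes drawn from `SHIFTED`, with fixed seeds, would show whether the reported advantage survives a distribution shift or reflects overfitting to a narrow training set.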
minor comments (2)
- [System Model] Notation for the time-switching ratio and the nonlinear EH model should be introduced with explicit equations and parameter definitions in the System Model section to improve readability.
- [Abstract] The abstract would benefit from a brief statement of the number of users, the pinching-antenna array size, and the key simulation parameters that produce the reported EE gains.
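As an example of the explicit definitions requested for the System Model, the sketch below implements a logistic (saturating) nonlinear energy-harvesting model of the kind widely used in the SWIPT literature, together with a harvest-then-transmit power budget under time switching. Whether the paper uses this exact model is not stated in the source; the parameter values, the slot length, and the function names are illustrative assumptions.

```python
import math


def harvested_power_w(p_in_w: float, m_w: float = 0.024, a: float = 150.0, b: float = 0.014) -> float:
    """Logistic EH model: zero output at zero input, saturating at m_w.

    m_w is the saturation power; a and b shape the curve (illustrative values).
    """
    if p_in_w < 0:
        raise ValueError("input power must be non-negative")
    omega = 1.0 / (1.0 + math.exp(a * b))
    psi = m_w / (1.0 + math.exp(-a * (p_in_w - b)))
    return (psi - m_w * omega) / (1.0 - omega)


def uplink_power_budget_w(p_in_w: float, tau: float, stored_energy_j: float = 0.0, slot_s: float = 1.0) -> float:
    """Average transmit power available if all energy is spent in the uplink phase.

    tau is the time-switching ratio: a fraction tau of the slot is used for
    harvesting and the remaining (1 - tau) for information transmission.
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    energy_j = stored_energy_j + tau * slot_s * harvested_power_w(p_in_w)
    return energy_j / ((1.0 - tau) * slot_s)
```

The saturation at `m_w` is what makes the harvesting model nonlinear: beyond a certain received power, a longer harvesting phase helps more than a stronger signal, which couples the time-switching ratio to the power and positioning decisions.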
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help improve the clarity and reproducibility of our work. We will revise the manuscript to address the concerns regarding implementation details and uncertainty modeling.
Point-by-point responses
- Referee: [Abstract and Simulation Results] The central claim that the DRL agent 'autonomously learn[s] near-optimal resource allocation policies in dynamic environments' rests on simulation results, yet no information is supplied on the state/action spaces, reward design, training episode generation, baseline algorithms, number of Monte-Carlo runs, or how location and battery-state uncertainties are sampled. Without these details the reported EE gains cannot be assessed for robustness or generalization.
  Authors: We agree that these details are necessary for evaluating robustness and generalization. In the revised manuscript, we will add a dedicated subsection under Simulation Results that explicitly describes the state and action spaces, reward function, training episode generation process, baseline algorithms, number of Monte-Carlo runs, and the sampling procedures for user locations and battery-state uncertainties. This addition will directly support the claims made in the abstract.
  Revision: yes
- Referee: [Problem Formulation and DRL Algorithm] User locations and battery states are modeled as random variables, but the manuscript does not specify the probability distributions, the range of variation, or whether training episodes are drawn from a single fixed distribution or a broader ensemble. If the training distribution is narrow, the learned policy may overfit and the claimed performance advantage under 'dynamic environments' would not hold.
  Authors: We acknowledge the need for this clarification. The revised manuscript will specify the probability distributions (including ranges of variation) for user locations and battery states in the Problem Formulation section. We will also detail in the DRL Algorithm section how training episodes are generated, confirming use of a distribution that reflects the considered dynamic environments, and discuss steps taken to mitigate overfitting risks.
  Revision: yes
Circularity Check
No significant circularity; DRL solver and simulation gains are independent of inputs
The paper formulates a joint non-convex optimization over PA positioning, power, and time-switching under nonlinear EH and location/battery uncertainty, then applies a standard DRL algorithm as solver. Simulation results compare EE against fixed-antenna baselines. No derivation step reduces by construction to a fitted parameter, self-citation chain, or renamed input; the reported gains are empirical outputs from the proposed solver, not tautological. The approach is self-contained against external benchmarks.