Hierarchical Learning for IRS-Assisted MEC Systems with Rate-Splitting Multiple Access
Pith reviewed 2026-05-23 08:28 UTC · model grok-4.3
The pith
A hierarchical deep reinforcement learning algorithm jointly optimizes IRS beamforming, user power allocation, task offloading, and RSMA parameters to minimize average delay in MEC systems.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The hierarchical DRL architecture solves the coupled continuous-discrete optimization of IRS-assisted MEC with RSMA from an uplink perspective, producing lower average delay than benchmarks while handling both beamforming and rate-splitting variables.
What carries the argument
Hierarchical deep reinforcement learning policy and evaluation networks that combine convolutional neural networks with densely connected convolutional networks to extract channel features and separately manage continuous and discrete optimization variables.
If this is right
- RSMA decoding order and public-private ratio choices become additional degrees of freedom that reduce contention latency in uplink offloading.
- Passive IRS phase shifts can be coordinated with active beamforming and power control to improve overall MEC throughput under interference.
- The separation of continuous and discrete actions inside the hierarchical learner allows stable training even when the number of users or IRS elements grows.
- Feature extraction via combined CNN and DenseNet layers improves policy quality by capturing spatial correlations across IRS elements and user channels.
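The second implication, coordinating passive IRS phase shifts with the rest of the link, can be illustrated with a toy single-antenna, single-user example. This is a minimal sketch under simplifying assumptions that are not taken from the paper: Rayleigh-distributed channels, a cascaded model of the form h_d + sum_n g_n e^{j theta_n} f_n, and continuous phase shifts. The names `effective_gain` and `aligned_phases` are hypothetical.

```python
import cmath
import math
import random

def effective_gain(h_direct, g, f, phases):
    """Return |h_d + sum_n g_n * exp(j*theta_n) * f_n|^2 for a single-antenna link."""
    reflected = sum(gn * cmath.exp(1j * th) * fn for gn, fn, th in zip(g, f, phases))
    return abs(h_direct + reflected) ** 2

def aligned_phases(h_direct, g, f):
    """Rotate every reflected path onto the phase of the direct path."""
    target = cmath.phase(h_direct)
    return [target - cmath.phase(gn * fn) for gn, fn in zip(g, f)]

def random_channel(n, rng):
    """Draw n unit-variance complex Gaussian coefficients (assumed Rayleigh fading)."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2) for _ in range(n)]

rng = random.Random(0)
N = 16                          # number of IRS elements (illustrative)
h_d = random_channel(1, rng)[0] # user -> BS direct path
g = random_channel(N, rng)      # IRS -> BS
f = random_channel(N, rng)      # user -> IRS

random_theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(N)]
gain_random = effective_gain(h_d, g, f, random_theta)
gain_aligned = effective_gain(h_d, g, f, aligned_phases(h_d, g, f))
print(f"random phases: {gain_random:.3f}, aligned phases: {gain_aligned:.3f}")
```

In this toy setting the aligned configuration reaches the triangle-inequality upper bound on the gain. The multi-user, multi-antenna problem in the paper has no such closed form, which is part of why a learned policy is used there.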
Where Pith is reading between the lines
- The same hierarchical structure could be tested on downlink MEC scenarios or on systems that combine RSMA with other multiple-access schemes.
- If real hardware imposes phase noise or limited feedback, the learned policy may need online fine-tuning layers that the current offline training does not include.
- Extending the state space to include mobility or energy-harvesting constraints would reveal whether the reported convergence speed holds under time-varying conditions.
Load-bearing premise
The non-convex joint optimization problem with highly coupled continuous and discrete variables can be solved to near-optimality by the proposed hierarchical DRL architecture without explicit guarantees on generalization beyond the simulated scenarios.
What would settle it
A set of channel realizations drawn from a distribution different from the training set in which the learned policy produces higher average delay than a well-tuned successive convex approximation baseline.
Original abstract
Intelligent reflecting surface (IRS)-assisted mobile edge computing (MEC) systems have shown notable improvements in efficiency, such as reduced latency, higher data rates, and better energy efficiency. However, resource competition among users leads to uneven allocation, increased latency, and lower throughput. Fortunately, the rate-splitting multiple access (RSMA) technique has emerged as a promising solution for managing interference and optimizing resource allocation in MEC systems. This paper studies an IRS-assisted MEC system with RSMA, aiming to jointly optimize the passive beamforming of the IRS, the active beamforming of the base station, the task offloading allocation, the transmit power of users, the ratios of public and private information allocation, and the decoding order of the RSMA to minimize the average delay from a novel uplink transmission perspective. Since the formulated problem is non-convex and the optimization variables are highly coupled, we propose a hierarchical deep reinforcement learning-based algorithm to optimize both continuous and discrete variables of the problem. Additionally, to better extract channel features, we design a novel network architecture within the policy and evaluation networks of the proposed algorithm, combining convolutional neural networks and a densely connected convolutional network for feature extraction. Simulation results indicate that the proposed algorithm not only exhibits excellent convergence performance but also outperforms various benchmarks.
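To make the role of the splitting ratio and the decoding order concrete, the following is a minimal sketch of uplink rate splitting with successive interference cancellation (SIC) on a Gaussian multiple-access channel. It is not the paper's system model: the scalar received powers, the 0.6/0.4 power split, the unit noise power, and the function name `sic_rates` are all illustrative assumptions.

```python
import math

def sic_rates(streams, order, noise=1.0):
    """Achievable rate (bit/s/Hz) of each stream under SIC.

    streams: dict mapping stream name -> received power.
    order:   list of stream names, first entry is decoded first.
    """
    if sorted(order) != sorted(streams):
        raise ValueError("order must list every stream exactly once")
    remaining = sum(streams.values())
    rates = {}
    for name in order:
        p = streams[name]
        remaining -= p  # streams decoded later still act as interference
        rates[name] = math.log2(1.0 + p / (remaining + noise))
    return rates

# User 1 splits its received power 4.0 into two sub-messages (ratio 0.6 / 0.4);
# user 2 sends a single message with received power 2.0. All values are assumed.
streams = {"u1_a": 0.6 * 4.0, "u1_b": 0.4 * 4.0, "u2": 2.0}

rates_a = sic_rates(streams, ["u1_a", "u2", "u1_b"])  # user 2 decoded between the halves
rates_b = sic_rates(streams, ["u2", "u1_a", "u1_b"])  # user 2 decoded first

for label, r in (("order A", rates_a), ("order B", rates_b)):
    user1 = r["u1_a"] + r["u1_b"]
    print(f"{label}: user1={user1:.3f}, user2={r['u2']:.3f}, sum={sum(r.values()):.3f}")
```

In this scalar setting the sum rate is the same for every decoding order, while the per-user split changes. That is the extra degree of freedom a delay-minimizing allocation can exploit, since each user's offloading delay depends on its own rate.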
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper formulates a non-convex mixed-integer optimization problem for an IRS-assisted MEC system employing RSMA, jointly optimizing IRS passive beamforming, BS active beamforming, task offloading ratios, user transmit powers, public/private rate-splitting ratios, and RSMA decoding order to minimize average delay. It proposes a hierarchical DRL algorithm (actor-critic with separate handling of discrete/continuous actions) whose policy and value networks incorporate a CNN-DenseNet feature extractor. Simulation results are presented to show convergence behavior and outperformance relative to several benchmarks.
Significance. If the reported performance gains are reproducible and statistically reliable, the work provides a concrete demonstration that hierarchical DRL can tractably address the coupled continuous-discrete variables arising in IRS-MEC-RSMA resource allocation. The CNN-DenseNet extractor is a modest but domain-specific architectural choice that may be of interest to researchers applying deep RL to wireless optimization problems with structured channel inputs.
major comments (3)
- [§V (Simulation Results)] The simulation results (abstract and §V) assert that the proposed algorithm “outperforms various benchmarks,” yet the manuscript supplies no information on the channel model (e.g., Rician factors, path-loss exponents), the number of Monte-Carlo realizations, confidence intervals, or statistical tests. Without these details the central empirical claim cannot be evaluated.
- [§V (Simulation Results)] It is not stated whether the benchmark schemes (e.g., conventional DRL, alternating optimization, or heuristic baselines) received equivalent hyper-parameter search or computational budget as the proposed hierarchical agent. This omission directly affects the validity of the reported delay reductions.
- [§IV (Proposed Algorithm)] The problem statement treats the decoding order as a discrete optimization variable, yet the hierarchical architecture description does not specify how the discrete action head is trained or how its exploration is balanced with the continuous beamforming and power actions; this coupling is load-bearing for the claimed near-optimality.
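The statistical reporting requested in the first major comment can be sketched as follows. This is a minimal illustration, not the paper's evaluation code: the delay samples are synthetic placeholders drawn from an assumed Gaussian and are not results from the paper, the 95% interval uses a normal approximation, and `mean_ci` is a hypothetical helper name.

```python
import math
import random
import statistics

def mean_ci(samples, z=1.96):
    """Return (sample mean, half-width of a normal-approximation 95% interval)."""
    m = statistics.fmean(samples)
    half = z * statistics.stdev(samples) / math.sqrt(len(samples))
    return m, half

rng = random.Random(1)
# Synthetic placeholder delays in seconds; NOT results from the paper.
proposed = [rng.gauss(0.80, 0.10) for _ in range(1000)]
baseline = [rng.gauss(0.90, 0.10) for _ in range(1000)]

# Paired differences assume both schemes see the same channel realization per run.
diffs = [b - p for b, p in zip(baseline, proposed)]

for label, data in (("proposed", proposed), ("baseline", baseline), ("difference", diffs)):
    m, half = mean_ci(data)
    print(f"{label}: {m:.4f} +/- {half:.4f}")
```

Reporting the interval on the paired difference, rather than on each scheme separately, directly answers whether a delay reduction is distinguishable from run-to-run noise.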
minor comments (2)
- [§II] Notation for the public/private rate-splitting ratios and the IRS phase-shift matrix is introduced without an explicit table of symbols; a compact notation table would improve readability.
- [§V] Figure captions for the convergence and delay plots should include the exact parameter settings (number of users, IRS elements, SNR range) used to generate each curve.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which highlight important aspects for improving reproducibility and clarity. We address each major comment below and will revise the manuscript to incorporate the necessary details and explanations.
- Referee: [§V (Simulation Results)] The simulation results (abstract and §V) assert that the proposed algorithm “outperforms various benchmarks,” yet the manuscript supplies no information on the channel model (e.g., Rician factors, path-loss exponents), the number of Monte-Carlo realizations, confidence intervals, or statistical tests. Without these details the central empirical claim cannot be evaluated.
  Authors: We agree that these details are essential for evaluating the empirical claims. The channel model parameters (Rician factors and path-loss exponents) are defined in the simulation setup section but were not presented with sufficient prominence. In the revised manuscript we will add an explicit subsection in §V that lists all channel parameters and the number of Monte-Carlo realizations (results are averaged over 1000 independent runs), and we will report mean delays accompanied by standard deviations or confidence intervals across multiple random seeds. This revision will directly address the concern and strengthen the reproducibility of the results.
  Revision: yes
- Referee: [§V (Simulation Results)] It is not stated whether the benchmark schemes (e.g., conventional DRL, alternating optimization, or heuristic baselines) received equivalent hyper-parameter search or computational budget as the proposed hierarchical agent. This omission directly affects the validity of the reported delay reductions.
  Authors: The referee correctly identifies a potential source of bias in the comparisons. While the benchmarks were implemented according to standard practices in the literature, a systematic hyper-parameter search with matched computational budget was not documented. In the revision we will include a new paragraph in §V describing the hyper-parameter selection procedure applied to each benchmark (grid search or random search within the same total training steps) and, where feasible, present additional results obtained under equivalent tuning effort to confirm the validity of the reported gains.
  Revision: yes
- Referee: [§IV (Proposed Algorithm)] The problem statement treats the decoding order as a discrete optimization variable, yet the hierarchical architecture description does not specify how the discrete action head is trained or how its exploration is balanced with the continuous beamforming and power actions; this coupling is load-bearing for the claimed near-optimality.
  Authors: We acknowledge that the current description of the discrete action head is insufficiently detailed. The hierarchical actor employs a dedicated categorical policy head for the decoding order, updated via policy-gradient loss with a temperature-controlled softmax for exploration; continuous actions use a Gaussian policy with additive noise. The two heads share the CNN-DenseNet feature extractor and are jointly optimized under the same critic, with the overall reward signal providing the coupling. In the revised §IV we will expand the architecture description with these specifics, including the loss formulations and exploration schedule, to clarify how the discrete-continuous interaction is handled.
  Revision: yes
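The two-headed policy described in the last response can be sketched at the sampling level. This is a minimal illustration only: the logits, means, and log standard deviations below are made-up numbers standing in for network outputs, the three-stream permutation set is an assumption, and `softmax` and `sample_hybrid_action` are hypothetical names. It does not reproduce the authors' loss formulations or exploration schedule.

```python
import itertools
import math
import random

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a flatter distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_hybrid_action(order_logits, mean, log_std, temperature, rng):
    """Sample a discrete decoding order and a continuous action vector.

    Returns (order_index, continuous_action, joint_log_prob).
    """
    probs = softmax(order_logits, temperature)
    idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    log_prob = math.log(probs[idx])
    action = []
    for mu, ls in zip(mean, log_std):
        sigma = math.exp(ls)
        a = rng.gauss(mu, sigma)
        action.append(a)
        # Log-density of a univariate Gaussian, added to the joint log-probability.
        log_prob += -0.5 * ((a - mu) / sigma) ** 2 - ls - 0.5 * math.log(2.0 * math.pi)
    return idx, action, log_prob

orders = list(itertools.permutations(range(3)))  # 6 candidate decoding orders (assumed)
rng = random.Random(0)
logits = [0.5, -1.0, 2.0, 0.0, 0.3, -0.2]        # stand-in for the categorical head output
idx, cont, logp = sample_hybrid_action(logits, [0.0, 0.0], [-1.0, -1.0], 1.0, rng)
print("decoding order:", orders[idx], "continuous action:", cont, "log-prob:", logp)
```

Because the joint log-probability is the sum of the categorical and Gaussian terms, a single policy-gradient update driven by one critic can adjust both heads, which is one plausible reading of the coupling the authors describe.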
Circularity Check
No significant circularity; empirical claims rest on standard simulation validation
The paper proposes a hierarchical DRL algorithm to solve a non-convex mixed-integer optimization problem in an IRS-assisted MEC system with RSMA. Its central claims concern convergence behavior and relative performance versus benchmarks, supported solely by simulation results within the modeled environment. No derivation chain, first-principles result, or prediction is asserted that reduces by construction to fitted inputs or self-citations. The simulations constitute external validation against other methods inside the same model, which is the appropriate and non-circular evidence type for this class of algorithmic contribution. No load-bearing self-citation, ansatz smuggling, or self-definitional steps are present.