Aerial IRS Deployment-Aided Secure Computation Offloading Against DISCO Jamming Attacks

Hongliang Zhang; Huan Huang; Jiayang Xiao; Minghui Min; Peng Zhang; Ruixin Yang; Shiyin Li; Zhu Han

arxiv: 2604.10558 · v1 · submitted 2026-04-12 · 📡 eess.SP

Minghui Min, Peng Zhang, Jiayang Xiao, Ruixin Yang, Shiyin Li, Huan Huang, Hongliang Zhang, Zhu Han

Pith reviewed 2026-05-10 16:35 UTC · model grok-4.3

classification 📡 eess.SP

keywords Aerial IRS · Secure computation offloading · DISCO jamming · Two-timescale optimization · Dual-agent DRL · Edge computing · Antijamming

The pith

An aerial IRS with slow-timescale deployment and dual-agent DRL maximizes secure offloading utility against DISCO jamming.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how an aerial intelligent reflective surface can counter dynamic jamming during edge computation offloading. It separates the problem into a two-timescale structure: the aerial IRS position changes slowly to improve overall resistance, while offloading ratios and reflection phases adapt rapidly to the current jammed channels. A dual-agent deep reinforcement learning algorithm learns the joint policy to maximize the secure utility metric. Simulations confirm higher performance than conventional schemes that lack aerial deployment or the timescale split.
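The two-timescale split described above can be sketched as a nested loop: an outer step that fixes the AIRS position for a whole frame, and an inner step that reacts to each jammed slot. This is a minimal illustration only. The candidate positions, the placeholder policies, and the toy utility are all assumptions made for the sketch and stand in for the paper's DRL agents and utility function.

```python
import random

# Hypothetical candidate AIRS positions (x, y, z) in metres; not from the paper.
CANDIDATE_POSITIONS = [(0.0, 0.0, 50.0), (20.0, 0.0, 50.0), (0.0, 20.0, 50.0)]


def slow_policy(frame, rng):
    """Placeholder for the long-timescale agent: pick one AIRS position per frame."""
    return rng.choice(CANDIDATE_POSITIONS)


def fast_policy(position, jam_level):
    """Placeholder for the short-timescale agent: offload less when jamming is strong."""
    return max(0.0, min(1.0, 1.0 - jam_level))


def toy_utility(position, jam_level, offload_ratio):
    """Toy reward (an assumption, not the paper's utility)."""
    return offload_ratio * (1.0 - jam_level)


def run_two_timescale(num_frames, slots_per_frame, seed=0):
    """Run the nested loop and return one (position, per-slot utilities) pair per frame."""
    rng = random.Random(seed)
    history = []
    for frame in range(num_frames):
        # Long timescale: the position is chosen once and held for the whole frame.
        position = slow_policy(frame, rng)
        utilities = []
        for _ in range(slots_per_frame):
            # Short timescale: a new jammed-channel observation arrives every slot.
            jam_level = rng.random()
            ratio = fast_policy(position, jam_level)
            utilities.append(toy_utility(position, jam_level, ratio))
        history.append((position, utilities))
    return history
```

Calling `run_two_timescale(3, 5)` yields three frames of five slot utilities each. In a learning setup, the two placeholder policies would be replaced by trained agents, with the outer agent rewarded on a frame-level aggregate.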

Core claim

An aerial IRS, with its deployment fixed on a long timescale and its phase shifts plus user offloading ratios optimized on a short timescale, raises the achievable secure offloading rate when facing fully-passive IRS jammers that apply random time-varying phase shifts; the dual-agent DRL scheme solves the resulting non-convex stochastic optimization and yields measurable utility gains over benchmarks.

What carries the argument

The two-timescale framework together with the DDADSO dual-agent DRL scheme that separately learns slow AIRS placement and fast offloading-plus-phase decisions.

If this is right

Slow-timescale AIRS repositioning raises antijamming margin without requiring continuous physical movement.
Fast-timescale phase and offloading adaptation tracks the random phase shifts of the jammer.
Separating the two agents reduces the joint action space and improves learning stability.
The resulting utility improvement holds across varying jammer strengths and user densities in the tested scenarios.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same timescale split could be tested with multiple aerial platforms or with ground-based IRS arrays.
If the DRL agents are retrained on real channel traces, the scheme might extend to other passive-jammer types.
Deployment cost and energy for the aerial platform become new variables that future work would need to trade against the observed utility gain.

Load-bearing premise

The channel and jamming models used in simulation match the statistics of real DISCO attacks closely enough for the learned policies to transfer.

What would settle it

Field trials in which the DDADSO policy produces lower secure offloading utility than a static-deployment benchmark under measured DISCO jamming.

Figures

Figures reproduced from arXiv: 2604.10558 by Hongliang Zhang, Huan Huang, Jiayang Xiao, Minghui Min, Peng Zhang, Ruixin Yang, Shiyin Li, Zhu Han.

**Figure 2.** The specific example of 2Ts framework. AIRS $\Upsilon_A (x_A, y_A, z_A)$. We formulate the secure offloading utility maximization problem as follows:

$$
\begin{aligned}
\max_{\chi,\,\theta_A,\,\Upsilon_A} \quad & \sum_{k=1}^{K} U_k \\
\text{s.t.} \quad & \text{(a)}: R_k \ge R_{\min}, \quad k \in \{1, 2, \ldots, K\}, \\
& \text{(b)}: \chi_k \in [0, 1], \quad k \in \{1, 2, \ldots, K\}, \\
& \text{(c)}: \Upsilon_A \in \Upsilon, \\
& \text{(d)}: \theta_{A,1}, \ldots, \theta_{A,N_A} \in [0, 2\pi),
\end{aligned}
\tag{21}
$$

where $R_{\min}$ in (21a) signifies the minimum offloading rate threshold at each UE [17]. A higher offloading r…

**Figure 3.** We propose a dual-agent DRL-based scheme that…

**Figure 4.** SINR versus power control under different jamming…

**Figure 6.** The variation in offloading utility with different AIRS…

**Figure 8.** Impact of the number of reflecting elements on the…

**Figure 9.** Impact of the number of DIRS reflecting elements…

**Figure 10.** System performance with a growing number of UEs.
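The constraints of problem (21), quoted in the Figure 2 excerpt above, can be read as a plain feasibility test on a candidate solution. The sketch below is a hedged illustration: the function names are hypothetical, and the deployment region Υ is assumed to be an axis-aligned box, which the excerpt does not state.

```python
import math


def is_feasible(rates, offload_ratios, position, phases, r_min, region):
    """Check constraints (21a)-(21d) for one candidate solution.

    region is ((x_lo, x_hi), (y_lo, y_hi), (z_lo, z_hi)); modelling the
    deployment set as a box is an assumption made for this sketch.
    """
    (x_lo, x_hi), (y_lo, y_hi), (z_lo, z_hi) = region
    x, y, z = position
    rate_ok = all(r >= r_min for r in rates)                      # (21a)
    ratio_ok = all(0.0 <= c <= 1.0 for c in offload_ratios)       # (21b)
    position_ok = (x_lo <= x <= x_hi and y_lo <= y <= y_hi
                   and z_lo <= z <= z_hi)                         # (21c)
    phase_ok = all(0.0 <= t < 2.0 * math.pi for t in phases)      # (21d)
    return rate_ok and ratio_ok and position_ok and phase_ok


def objective(utilities):
    """Sum utility over all K UEs, the quantity maximized in (21)."""
    return sum(utilities)
```

A learning agent never calls such a check explicitly, but its action bounds and reward penalties need to encode the same four conditions.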

read the original abstract

With the rapid growth of Multi-access Edge Computing (MEC), secure and efficient computation offloading from user equipment (UEs) to edge access points (APs) is critical. However, DISCO intelligent reflective surface-based fully-passive jammers (DIRS-based FPJs) use random time-varying phase shifts to launch DISCO jamming attacks, disrupting offloading performance. This paper leverages an aerial intelligent reflective surface (AIRS) to enable secure computation offloading against DISCO jamming by jointly optimizing offloading ratios, AIRS phase shifts, and deployment. A two-timescale (2Ts) framework is proposed to address the optimization challenge caused by the distinct update frequencies of different strategies. Specifically, AIRS deployment is adjusted on a long timescale to boost antijamming capability due to the impracticality of frequent physical adjustment, while offloading ratios and phase shifts are optimized on a short timescale to adapt to DIRS-jammed dynamic channel conditions. We propose a dual-agent deep reinforcement learning (DRL)-based AIRS deployment-aided secure computation offloading (DDADSO) scheme to maximize the secure offloading utility under DISCO jamming. Simulation results verify that the proposed DDADSO scheme outperforms benchmark schemes, demonstrating the effectiveness of AIRS deployment in improving offloading performance against DISCO jamming attacks.
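To see why random time-varying phase shifts disrupt a link, consider the cascaded channel through an N-element reflective surface, modelled here as the sum over elements of incoming channel × phase rotation × outgoing channel. The sketch below is a toy model under stated assumptions: phases are drawn independently and uniformly from [0, 2π) in every slot, and the underlying channels are held static. The paper's actual DIRS phase distribution and channel model may differ.

```python
import cmath
import math
import random


def cascaded_gain(h_in, h_out, phases):
    """Effective complex channel through the surface for one phase configuration."""
    return sum(ho * cmath.exp(1j * p) * hi
               for hi, ho, p in zip(h_in, h_out, phases))


def disco_gains(h_in, h_out, num_slots, rng):
    """Power gain per slot when the surface re-draws random phases every slot."""
    gains = []
    for _ in range(num_slots):
        # Assumption: i.i.d. uniform phases, re-drawn each slot.
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in h_in]
        gains.append(abs(cascaded_gain(h_in, h_out, phases)) ** 2)
    return gains
```

Even with static `h_in` and `h_out`, the returned gains fluctuate from slot to slot, so any strategy tuned to one slot's channel is stale in the next. This is the channel aging effect that the short-timescale adaptation is meant to track.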

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper gives a two-timescale dual-agent DRL scheme for aerial IRS deployment in secure MEC offloading against DISCO jamming, with simulation gains over baselines but no robustness checks on the models.

The main point is that the authors combine an aerial IRS with a dual-agent DRL method under a two-timescale split: slow updates for physical deployment of the IRS and fast updates for offloading ratios plus phase shifts. This targets DISCO jamming from fully passive surfaces that apply random time-varying phases. The simulations report higher secure offloading utility than the benchmarks they chose, which is a reasonable result for this setup. The timescale separation is a practical touch because frequent repositioning of an aerial platform is unrealistic. The DRL agents appear to handle the dynamic channel conditions created by the jammer in their tests. That part of the work is straightforward and fits the problem.

The evaluation is the clear weak point. All claims rest on how well the simulated channels and random-phase jamming match actual DISCO attacks. There is no sensitivity analysis for phase errors, non-ideal mobility, or changes in jammer behavior, and no theoretical bounds or convergence results for the learning. If the environment model is off, the learned policies lose value quickly. The paper stays within its simulation world without external validation.

This is for people already working on IRS-assisted edge computing and wireless security. A specialist in anti-jamming offloading could pick up the framework and try extending it, but the results will not shift the wider field.

I would send it for peer review. The idea is a clear extension of existing IRS and DRL tools to a specific jammer type, and the simulations give enough initial evidence to justify referee time, even though the authors will need to add robustness tests.

Referee Report

3 major / 3 minor

Summary. The manuscript proposes a two-timescale (2Ts) framework and dual-agent deep reinforcement learning (DRL) scheme (DDADSO) for secure computation offloading in MEC systems. AIRS deployment is optimized on a long timescale while offloading ratios and phase shifts are optimized on a short timescale to maximize secure offloading utility against DISCO jamming from DIRS-based fully-passive jammers that apply random time-varying phase shifts. The central claim, supported solely by simulation results, is that DDADSO outperforms benchmark schemes.

Significance. If the simulation results prove robust, the work could meaningfully advance secure MEC designs by demonstrating how aerial IRS deployment can counter advanced passive jamming. The 2Ts separation directly addresses the practical constraint that physical AIRS repositioning cannot occur at the same rate as phase or offloading adjustments. The dual-agent DRL formulation is a reasonable way to handle the resulting mixed-timescale non-convex problem. However, the absence of theoretical bounds, convergence analysis, or sensitivity studies on the jamming and channel models limits the result to an empirical demonstration whose broader applicability remains unproven.

major comments (3)

[Simulation Results] Simulation Results section: The reported outperformance of DDADSO is presented without specifying the number of independent Monte Carlo runs, confidence intervals on the utility values, or any statistical significance testing of the gains versus benchmarks. Because the central claim rests entirely on these numerical comparisons, the lack of reproducibility details and variance reporting undermines confidence in the magnitude and consistency of the reported improvements.
[System Model] System Model section: The DISCO jamming model is defined via fully passive DIRS with random time-varying phase shifts, yet no sensitivity analysis is provided with respect to phase-shift correlation time, hardware phase errors, or partial channel knowledge at the jammer. Any mismatch between this idealized model and real DISCO attack dynamics would invalidate the learned policies and the claimed security gains.
[Proposed Scheme] Proposed Scheme section: The dual-agent DRL algorithm is introduced without convergence guarantees, regret bounds, or even basic training hyperparameters (learning rates, network architectures, replay buffer sizes). In the absence of such analysis, it is impossible to determine whether the 2Ts framework reliably approaches the claimed maximum secure offloading utility or simply overfits the specific simulation environment.
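The hardware phase-error concern in the second major comment can be probed with a small Monte Carlo estimate. The sketch below measures how much ideal coherent combining power is lost when every reflecting element is off by a bounded random phase error. It is an illustration only: the uniform error model, unit-magnitude channels, and the function name are assumptions, and it does not reproduce the paper's system model.

```python
import cmath
import math
import random


def coherent_gain_loss_db(num_elements, max_error_deg, trials, rng):
    """Average combining power relative to the ideal N^2, in dB (<= 0).

    Each element's phase is perturbed by an independent error drawn uniformly
    from [-max_error_deg, +max_error_deg]; this error model is an assumption.
    """
    ideal_power = num_elements ** 2
    total_power = 0.0
    for _ in range(trials):
        combined = sum(
            cmath.exp(1j * math.radians(rng.uniform(-max_error_deg, max_error_deg)))
            for _ in range(num_elements)
        )
        total_power += abs(combined) ** 2
    return 10.0 * math.log10((total_power / trials) / ideal_power)
```

Sweeping `max_error_deg` and plotting the returned loss would give one axis of the sensitivity study the referee asks for; correlation time and jammer-side knowledge would need separate experiments.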

minor comments (3)

[Abstract] The acronym DIRS is introduced in the abstract without an explicit expansion; a parenthetical definition on first use would improve readability.
[Simulation Results] Figure captions in the simulation section could explicitly list the key parameter values (e.g., AIRS altitude, jammer power, number of reflecting elements) used to generate each curve.
[Introduction] A short discussion of related work on aerial IRS deployment or timescale-separated optimization would help situate the novelty of the 2Ts + dual-agent approach.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We sincerely thank the referee for the constructive and detailed review of our manuscript. We have carefully addressed each major comment to improve reproducibility, robustness, and clarity. Point-by-point responses follow, with revisions incorporated where appropriate to strengthen the empirical demonstration of the DDADSO scheme.

Referee: [Simulation Results] Simulation Results section: The reported outperformance of DDADSO is presented without specifying the number of independent Monte Carlo runs, confidence intervals on the utility values, or any statistical significance testing of the gains versus benchmarks. Because the central claim rests entirely on these numerical comparisons, the lack of reproducibility details and variance reporting undermines confidence in the magnitude and consistency of the reported improvements.

Authors: We agree that these details are necessary for rigorous evaluation of the simulation results. In the revised manuscript, we specify that all results are averaged over 1000 independent Monte Carlo runs, include 95% confidence intervals in the figures and tables, and report the outcomes of paired t-tests confirming statistical significance of the performance gains over the benchmarks at the 0.01 level.
Revision: yes

Referee: [System Model] System Model section: The DISCO jamming model is defined via fully passive DIRS with random time-varying phase shifts, yet no sensitivity analysis is provided with respect to phase-shift correlation time, hardware phase errors, or partial channel knowledge at the jammer. Any mismatch between this idealized model and real DISCO attack dynamics would invalidate the learned policies and the claimed security gains.

Authors: We acknowledge the value of sensitivity analysis for validating the DISCO jamming model. The original model captures the core threat of fully passive, random phase-shift jamming. In the revision, we add a dedicated sensitivity study varying phase-shift correlation time and introducing hardware phase errors up to 10 degrees, showing that the DDADSO policy retains its performance advantage. Regarding partial channel knowledge at the jammer, we clarify that the fully-passive nature precludes active channel estimation, but we discuss this as a potential extension in the revised text.
Revision: partial

Referee: [Proposed Scheme] Proposed Scheme section: The dual-agent DRL algorithm is introduced without convergence guarantees, regret bounds, or even basic training hyperparameters (learning rates, network architectures, replay buffer sizes). In the absence of such analysis, it is impossible to determine whether the 2Ts framework reliably approaches the claimed maximum secure offloading utility or simply overfits the specific simulation environment.

Authors: We agree that implementation details must be fully specified. The revised manuscript includes a new table listing all hyperparameters (learning rates, network architectures with layer sizes, replay buffer size of 10^5, etc.) and training procedures. While theoretical convergence guarantees and regret bounds for the dual-agent DRL under mixed timescales are not derived (as they remain an open challenge for this non-convex setting), we add training reward curves demonstrating empirical convergence within 2000 episodes across multiple random seeds, supporting that the learned policies generalize beyond the specific simulation environment.
Revision: partial
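The statistical reporting promised in the first response (averages over independent runs, 95% confidence intervals, paired t-tests) can be sketched with the standard library alone. This is a minimal illustration, not the authors' evaluation code: the helper names are hypothetical, and the interval uses a normal approximation (factor 1.96), which is an assumption that is reasonable only for large run counts.

```python
import math
import statistics


def mean_ci95(samples):
    """Sample mean and half-width of a normal-approximation 95% confidence interval."""
    n = len(samples)
    mean = statistics.fmean(samples)
    half_width = 1.96 * statistics.stdev(samples) / math.sqrt(n)
    return mean, half_width


def paired_t_statistic(scheme_a, scheme_b):
    """Paired t statistic for per-run utilities of two schemes evaluated on the same runs."""
    diffs = [a - b for a, b in zip(scheme_a, scheme_b)]
    n = len(diffs)
    return statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
```

Pairing matters here: both schemes should be evaluated on the same channel and jammer realizations in each run, so that the differences isolate the effect of the scheme. The resulting statistic is then compared against a t distribution with n − 1 degrees of freedom.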

Circularity Check

0 steps flagged

No circularity; the DRL optimization learns policies from a simulated environment without reducing predictions to input fits.

The paper's central contribution is a dual-agent DRL scheme (DDADSO) within a 2Ts framework that jointly optimizes AIRS deployment, offloading ratios, and phase shifts to maximize secure offloading utility. This is validated via simulation outperformance against benchmarks. No derivation chain reduces a claimed prediction or first-principles result to its own inputs by construction, self-definition, or self-citation load-bearing. The DRL agents learn from an environment model rather than fitting parameters that are then renamed as predictions. Self-citations (if present for channel/jamming models) are not load-bearing for the optimization claim itself, which remains empirically falsifiable through simulation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work relies on standard wireless propagation and optimization assumptions without introducing new entities or many free parameters beyond typical DRL hyperparameters.

axioms (1)

domain assumption: Wireless channel models and random phase-shift jamming behaviors accurately capture real DISCO attacks.
Invoked to justify the simulation environment and DRL training for both long and short timescales.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages

[1] J. Luo, Q. Song, F. Guo, H. Wu, H. M. Som, S. Alahmari, and A. N. Hoshyar, “Joint deep reinforcement learning strategy in MEC for smart Internet of vehicles edge computing networks,” Sustain Comput-infor, vol. 46, p. 101121, Jun. 2025.

[2] Y. Wu, X. Fang, G. Min, H. Chen, and C. Luo, “Intelligent offloading balance for vehicular edge computing and networks,” IEEE Trans. Intell. Transp. Syst., vol. 26, no. 5, pp. 5792–5803, May 2025.

[3] J. Bo and X. Zhao, “Vehicle edge computing task offloading strategy based on multi-agent deep reinforcement learning,” J. Grid Comput., vol. 23, no. 2, pp. 1–32, Mar. 2025.

[4] H. Huang, L. Dai, H. Zhang, C. Zhang, Z. Tian, Y. Cai, A. L. Swindlehurst, and Z. Han, “DISCO might not be funky: Random intelligent reflective surface configurations that attack,” IEEE Wireless Commun., vol. 31, no. 5, pp. 76–82, Oct. 2024.

[5] H. Huang, L. Dai, H. Zhang, Z. Tian, Y. Cai, C. Zhang, A. L. Swindlehurst, and Z. Han, “Anti-jamming precoding against disco intelligent reflecting surfaces based fully-passive jamming attacks,” IEEE Trans. Wireless Commun., vol. 23, no. 8, pp. 9315–9329, Aug. 2024.

[6] L. Jia, N. Qi, Z. Su, F. Chu, S. Fang, K.-K. Wong, and C.-B. Chae, “Game theory and reinforcement learning for anti-jamming defense in wireless communications: Current research, challenges, and solutions,” IEEE Commun. Surv. Tutor., vol. 27, no. 3, pp. 1798–1838, Jun. 2025.

[7] Z. U. A. Tariq, E. Baccour, A. Erbad, and M. Hamdi, “Reinforcement learning for resilient aerial-irs assisted wireless communications networks in the presence of multiple jammers,” IEEE Open J. Commun. Soc., vol. 5, pp. 15–37, Dec. 2023.

[8] H. Huang, H. Zhang, Y. Cai, Y. Zhang, A. L. Swindlehurst, and Z. Han, “IRS-enhanced anti-jamming precoding against DISCO physical layer jamming attacks,” in Proc. IEEE Int. Conf. Commun., Denver, CO, Aug. 2024.

[9] S. Zeng, H. Zhang, B. Di, Z. Han, and L. Song, “Reconfigurable intelligent surface RIS assisted wireless coverage extension: RIS orientation and location optimization,” IEEE Commun. Lett., vol. 25, no. 1, pp. 269–273, Sep. 2021.

[10] Y. Cheng, W. Peng, C. Huang, G. C. Alexandropoulos, C. Yuen, and M. Debbah, “RIS-aided wireless communications: Extra degrees of freedom via rotation and location optimization,” IEEE Trans. Wireless Commun., vol. 21, no. 8, pp. 6656–6671, Feb. 2022.

[11] Y. Gao, Y. Zhang, H. Geng, X. Li, D. B. da Costa, and M. Zeng, “Aerial-IRS-assisted securing communications against eavesdropping: Joint trajectory and resource allocation,” IEEE Internet Things J., vol. 11, no. 7, pp. 11974–11985, Apr. 2024.

[12] W. Jiang, B. Ai, M. Li, W. Wu, Y. Pei, and X. Shen, “Aerial-IRSs-assisted energy-efficient task offloading and computing,” IEEE Internet Things J., vol. 11, no. 11, pp. 20178–20193, Jun. 2024.

[13] D. Kim, S. Jeong, and J. Kang, “Energy-efficient secure offloading system designed via UAV-mounted intelligent reflecting surface for resilience enhancement,” IEEE Internet Things J., vol. 11, no. 3, pp. 3768–3778, Feb. 2024.

[14] X. Gan, C. Zhong, C. Huang, and Z. Zhang, “RIS-assisted multi-user MISO communications exploiting statistical CSI,” IEEE Trans. Commun., vol. 69, no. 10, pp. 6781–6792, Jul. 2021.

[15] H. Niu, Z. Chu, Z. Zhu, and F. Zhou, “Aerial intelligent reflecting surface for secure wireless networks: Secrecy capacity and optimal trajectory strategy,” Intell. Converg. Netw., vol. 3, no. 1, pp. 119–133, Mar. 2022.

[16] J. Liu, Y. Mao, J. Zhang, and K. B. Letaief, “Delay-optimal computation task scheduling for mobile-edge computing systems,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Barcelona, Spain, Aug. 2016.

[17] Y. Yao, B. Zhao, J. Zhao, F. Shu, Y. Wu, and X. Cheng, “Anti-jamming technique for IRS aided JRC system in mobile vehicular networks,” IEEE Trans. Intell. Transp. Syst., vol. 25, no. 9, pp. 12550–12560, Sep. 2024.

[18] C. Zhang, M. Juraschek, and C. Herrmann, “Deep reinforcement learning-based dynamic scheduling for resilient and sustainable manufacturing: A systematic review,” J. Manuf. Syst., vol. 77, pp. 962–989, Dec. 2024.

[19] T. Wang, Y. Jiang, K. Zhao, and X. Liu, “Energy-efficient edge-cloud collaborative intelligent computing: A two-timescale approach,” in Proc. IEEE Int. Conf. Services Comput. (SCC), Barcelona, Spain, Jul. 2022.

[20] W. Zhang, Z. Shen, M. Qin, and G. Zhang, “Two time-scale energy and spectrum allocation for mec networks with hybrid energy supplies,” IEEE Trans. Netw. Sci. Eng., vol. 10, no. 6, pp. 3529–3542, Apr. 2023.

[21] T. N. Larsen, E. R. Barlaug, and A. Rasheed, “Variational autoencoders for exteroceptive perception in reinforcement learning-based collision avoidance,” in Proc. Offshore Technol., Singapore, Singapore, Jun. 2024.

[22] Z. Wang, Y. Wei, Z. Feng, F. R. Yu, and Z. Han, “Resource management and reflection optimization for intelligent reflecting surface assisted multi-access edge computing using deep reinforcement learning,” IEEE Trans. Wireless Commun., vol. 22, no. 2, pp. 1175–1186, Oct. 2022.

[23] B. Shang, H. V. Poor, and L. Liu, “Aerial reconfigurable intelligent surfaces meet mobile edge computing,” IEEE Wirel. Commun., vol. 29, no. 6, pp. 104–111, May 2022.

[24] Z. Yu, X. Hu, C. Liu, and M. Peng, “IRS-aided non-orthogonal ISAC systems: Performance analysis and beamforming design,” IEEE Trans. Green Commun. Networking, vol. 8, no. 4, pp. 1930–1942, Dec. 2024.

[25] Y. Yao, J. Zhang, P. Miao, L. Zhang, G. Chen, F. Shu, and K.-K. Wong, “Hybrid RIS-enhanced ISAC secure systems: Joint optimization in the presence of an extended target,” IEEE Trans. Commun., vol. 73, no. 12, pp. 15688–15704, Dec. 2025.

[26] Y. Yao, Z. Zhu, P. Miao, X. Cheng, F. Shu, and J. Wang, “Optimizing hybrid RIS-aided ISAC systems in V2X networks: A deep reinforcement learning method for anti-eavesdropping techniques,” IEEE Trans. Veh. Technol., vol. 74, no. 6, pp. 9224–9239, Jun. 2025.

[27] L. Yang, S. Ma, S. Shen, G. Xu, and S. Li, “Joint robust beamforming and orientation optimizing for ARIS-aided communication with ARIS location uncertainty,” IEEE Commun. Lett., vol. 28, no. 5, pp. 1097–1101, May 2024.

[28] S. Arzykulov, A. Celik, G. Nauryzbayev, and A. M. Eltawil, “Aerial RIS-aided physical layer security: Optimal deployment and partitioning,” IEEE Trans. Cognit. Commun. Netw., vol. 10, no. 5, pp. 1867–1882, Oct. 2024.

[29] P. Zhang, M. Min, J. Xiao, S. Li, and H. Zhang, “IRS-aided mobile edge computing for mine IoT networks using deep reinforcement learning,” in Proc. IEEE/CIC Int. Conf. Commun. in China (ICCC), Dalian, China, Aug. 2023.

[30] S. Yu, X. Chen, Z. Zhou, X. Gong, and D. Wu, “When deep reinforcement learning meets federated learning: Intelligent multitimescale resource management for multiaccess edge computing in 5G ultradense network,” IEEE Internet Things J., vol. 8, no. 4, pp. 2238–2251, Sep. 2021.

[31] G. Sun, W. Xie, D. Niyato, F. Mei, J. Kang, H. Du, and S. Mao, “Generative AI for deep reinforcement learning: Framework, analysis, and use cases,” IEEE Wireless Commun., vol. 32, no. 3, pp. 186–195, Jun. 2025.

[32] T. Bai, C. Pan, Y. Deng, M. Elkashlan, A. Nallanathan, and L. Hanzo, “Latency minimization for intelligent reflecting surface aided mobile edge computing,” IEEE J. Sel. Areas. Commun., vol. 38, no. 11, pp. 2666–2682, Jul. 2020.

[33] H. Q. Ngo, E. G. Larsson, and T. L. Marzetta, “Energy and spectral efficiency of very large multiuser MIMO systems,” IEEE Trans. Commun., vol. 61, no. 4, pp. 1436–1449, Apr. 2013.

[34] Q. Wu and R. Zhang, “Intelligent reflecting surface enhanced wireless network via joint active and passive beamforming,” IEEE Trans. Wireless Commun., vol. 18, no. 11, pp. 5394–5409, Nov. 2019.

[35] J. Liu, G. Yang, Y.-C. Liang, and C. Yuen, “Max-min fairness in ris-assisted anti-jamming communications: Optimization versus deep reinforcement learning approaches,” IEEE Trans. Commun., vol. 72, no. 7, pp. 4476–4492, Jul. 2024.

[36] J. Zhang, M. Sheng, C. Xing, J. Liu, N. Zhao, and G. K. Karagiannidis, “Generative-adversarial-network-enhanced DRL for ISAC with double active RISs,” IEEE Internet Things J., vol. 12, no. 10, pp. 13487–13499, May 2025.

[37] S. Chen, J. Li, Q. Yuan, H. He, S. Li, and J. Yang, “Two-timescale joint optimization of task scheduling and resource scaling in multi-data center system based on multi-agent deep reinforcement learning,” IEEE Trans. Parallel Distrib. Syst., vol. 35, no. 12, pp. 2331–2346, Sep. 2024.

[38] X. Hu, C. Masouros, and K.-K. Wong, “Reconfigurable intelligent surface aided mobile edge computing: From optimization-based to location-only learning-based solutions,” IEEE Trans. Commun., vol. 69, no. 6, pp. 3709–3725, Mar. 2021.

[39] X. Wang, P. Wu, X. Yuan, Y. Hu, and Y. Zhang, “Joint trajectory and user scheduling design for UAV-assisted secure communication,” in Proc. 2024 Int. Symp. on Wireless Commun. Syst., Rio de Janeiro, Brazil, Jul. 2024.

[40] S. Liu, H. Yang, L. Xiao, M. Zheng, H. Lu, and Z. Xiong, “Learning-based resource management optimization for UAV-assisted MEC against jamming,” IEEE Trans. Commun., vol. 72, no. 8, pp. 4873–4886, Aug. 2024.

[41] K. Muramatsu, Y. Uematsu, S. Okamoto, and N. Yamanaka, “Edge computing that utilizes in-network CPUs to achieve high capacity and interruption tolerance with fewer edge servers,” in Proc. Int. Conf. on Computing, Netw. and Commun. (ICNC), Big Island, HI, Jun. 2024.
