Aerial IRS Deployment-Aided Secure Computation Offloading Against DISCO Jamming Attacks
Pith reviewed 2026-05-10 16:35 UTC · model grok-4.3
The pith
An aerial IRS with slow-timescale deployment and dual-agent DRL maximizes secure offloading utility against DISCO jamming.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
An aerial IRS whose deployment is fixed on a long timescale, while its phase shifts and the users' offloading ratios are optimized on a short timescale, raises the secure offloading utility against fully-passive IRS jammers that apply random time-varying phase shifts. The dual-agent DRL scheme tackles the resulting non-convex stochastic optimization and, in simulation, yields utility gains over benchmark schemes.
What carries the argument
The two-timescale framework together with the DDADSO dual-agent DRL scheme that separately learns slow AIRS placement and fast offloading-plus-phase decisions.
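The timescale split described here can be pictured as a nested loop: one decision per long frame for placement, many decisions per frame for phase shifts and offloading. The following is only a minimal sketch under stated assumptions. `RandomAgent`, `toy_utility`, the action sets, and the frame and slot counts are hypothetical placeholders, not the paper's DDADSO agents, utility function, or parameters.

```python
import random


class RandomAgent:
    """Stand-in policy that samples uniformly; a learned DRL policy would replace it."""

    def __init__(self, actions, seed=0):
        self.actions = list(actions)
        self.rng = random.Random(seed)

    def act(self, observation):
        # The observation is ignored here; a trained agent would condition on it.
        return self.rng.choice(self.actions)


def toy_utility(position, phase_index, offload_ratio):
    """Hypothetical placeholder for the secure offloading utility (not the paper's)."""
    return offload_ratio / (1.0 + abs(position)) - 0.01 * phase_index


def run_two_timescale(num_frames, slots_per_frame, slow_agent, fast_agent):
    """Slow agent picks a placement once per frame; fast agent acts in every slot."""
    log = []
    for frame in range(num_frames):
        position = slow_agent.act(frame)  # long timescale: AIRS placement
        for slot in range(slots_per_frame):
            # Short timescale: phase-shift index and offloading ratio.
            phase_index, offload_ratio = fast_agent.act((position, slot))
            utility = toy_utility(position, phase_index, offload_ratio)
            log.append((frame, slot, position, utility))
    return log


# Hypothetical action sets chosen only to make the sketch runnable.
slow = RandomAgent(actions=[-2, -1, 0, 1, 2], seed=1)
fast = RandomAgent(actions=[(p, r) for p in range(4) for r in (0.0, 0.5, 1.0)], seed=2)
history = run_two_timescale(num_frames=3, slots_per_frame=5, slow_agent=slow, fast_agent=fast)
```

The placement stays constant across all slots of a frame, which is the structural point of the split; everything else in the sketch is scaffolding.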
If this is right
- Slow-timescale AIRS repositioning raises antijamming margin without requiring continuous physical movement.
- Fast-timescale phase and offloading adaptation tracks the random phase shifts of the jammer.
- Separating the two agents reduces the joint action space and improves learning stability.
- The resulting utility improvement holds across varying jammer strengths and user densities in the tested scenarios.
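The point about the joint action space can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that placements, phase shifts, and offloading ratios are all discretized; the paper may well use continuous actions, and every count passed in is a hypothetical value rather than a parameter from the paper.

```python
def action_space_sizes(n_positions, n_phase_levels, n_elements, n_ratio_levels, n_users):
    """Count discrete actions for separate agents versus one joint agent.

    All arguments are hypothetical discretization choices, not values from the paper.
    """
    # Fast decisions: one phase level per reflecting element, one ratio per user.
    fast = (n_phase_levels ** n_elements) * (n_ratio_levels ** n_users)
    # A single agent choosing everything at once faces the product of both spaces.
    joint = n_positions * fast
    return {"slow_agent": n_positions, "fast_agent": fast, "single_joint_agent": joint}


sizes = action_space_sizes(
    n_positions=25, n_phase_levels=4, n_elements=8, n_ratio_levels=5, n_users=3
)
```

Under these assumed counts the single joint agent faces 25 times as many actions as the fast agent alone, while the slow agent only has to choose among the 25 placements.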
Where Pith is reading between the lines
- The same timescale split could be tested with multiple aerial platforms or with ground-based IRS arrays.
- If the DRL agents are retrained on real channel traces, the scheme might extend to other passive-jammer types.
- Deployment cost and energy for the aerial platform become new variables that future work would need to trade against the observed utility gain.
Load-bearing premise
The channel and jamming models used in simulation match the statistics of real DISCO attacks closely enough for the learned policies to transfer.
What would settle it
Field trials in which the DDADSO policy produces lower secure offloading utility than a static-deployment benchmark under measured DISCO jamming.
Original abstract
With the rapid growth of Multi-access Edge Computing (MEC), secure and efficient computation offloading from user equipment (UEs) to edge access points (APs) is critical. However, DISCO intelligent reflective surface-based fully-passive jammers (DIRS-based FPJs) use random time-varying phase shifts to launch DISCO jamming attacks, disrupting offloading performance. This paper leverages an aerial intelligent reflective surface (AIRS) to enable secure computation offloading against DISCO jamming by jointly optimizing offloading ratios, AIRS phase shifts, and deployment. A two-timescale (2Ts) framework is proposed to address the optimization challenge caused by the distinct update frequencies of different strategies. Specifically, AIRS deployment is adjusted on a long timescale to boost antijamming capability due to the impracticality of frequent physical adjustment, while offloading ratios and phase shifts are optimized on a short timescale to adapt to DIRS-jammed dynamic channel conditions. We propose a dual-agent deep reinforcement learning (DRL)-based AIRS deployment-aided secure computation offloading (DDADSO) scheme to maximize the secure offloading utility under DISCO jamming. Simulation results verify that the proposed DDADSO scheme outperforms benchmark schemes, demonstrating the effectiveness of AIRS deployment in improving offloading performance against DISCO jamming attacks.
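To see why random time-varying phase shifts disrupt a link, it helps to look at the cascaded channel through a passive reflecting surface, whose gain is |sum_n g_n exp(j*theta_n) f_n|^2. The sketch below is a toy model under stated assumptions: the complex Gaussian channels, the uniform continuous phase draw, and the function names are all hypothetical and are not the paper's channel or jamming model.

```python
import cmath
import math
import random


def cascaded_gain(g, f, phases):
    """Return |sum_n g_n * exp(j*theta_n) * f_n|^2 for a passive reflecting surface."""
    total = sum(gn * cmath.exp(1j * th) * fn for gn, th, fn in zip(g, phases, f))
    return abs(total) ** 2


def simulate_random_phase_gains(num_elements, num_slots, seed=0):
    """Redraw the surface phases every slot while the channels stay fixed."""
    rng = random.Random(seed)
    sigma = math.sqrt(0.5)
    # Assumed static, unit-variance complex Gaussian channels on both hops.
    g = [complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(num_elements)]
    f = [complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for _ in range(num_elements)]
    gains = []
    for _ in range(num_slots):
        # Assumption: phases drawn uniformly in [0, 2*pi); real hardware may be discrete.
        phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_elements)]
        gains.append(cascaded_gain(g, f, phases))
    return g, f, gains


g, f, gains = simulate_random_phase_gains(num_elements=16, num_slots=50, seed=3)
```

Even with fixed physical channels, the reflected gain changes from slot to slot, which is why the short-timescale decisions have to adapt continually.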
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a two-timescale (2Ts) framework and dual-agent deep reinforcement learning (DRL) scheme (DDADSO) for secure computation offloading in MEC systems. AIRS deployment is optimized on a long timescale while offloading ratios and phase shifts are optimized on a short timescale to maximize secure offloading utility against DISCO jamming from DIRS-based fully-passive jammers that apply random time-varying phase shifts. The central claim, supported solely by simulation results, is that DDADSO outperforms benchmark schemes.
Significance. If the simulation results prove robust, the work could meaningfully advance secure MEC designs by demonstrating how aerial IRS deployment can counter advanced passive jamming. The 2Ts separation directly addresses the practical constraint that physical AIRS repositioning cannot occur at the same rate as phase or offloading adjustments. The dual-agent DRL formulation is a reasonable way to handle the resulting mixed-timescale non-convex problem. However, the absence of theoretical bounds, convergence analysis, or sensitivity studies on the jamming and channel models limits the result to an empirical demonstration whose broader applicability remains unproven.
major comments (3)
- [Simulation Results] Simulation Results section: The reported outperformance of DDADSO is presented without specifying the number of independent Monte Carlo runs, confidence intervals on the utility values, or any statistical significance testing of the gains versus benchmarks. Because the central claim rests entirely on these numerical comparisons, the lack of reproducibility details and variance reporting undermines confidence in the magnitude and consistency of the reported improvements.
- [System Model] System Model section: The DISCO jamming model is defined via fully passive DIRS with random time-varying phase shifts, yet no sensitivity analysis is provided with respect to phase-shift correlation time, hardware phase errors, or partial channel knowledge at the jammer. Any mismatch between this idealized model and real DISCO attack dynamics would invalidate the learned policies and the claimed security gains.
- [Proposed Scheme] Proposed Scheme section: The dual-agent DRL algorithm is introduced without convergence guarantees, regret bounds, or even basic training hyperparameters (learning rates, network architectures, replay buffer sizes). In the absence of such analysis, it is impossible to determine whether the 2Ts framework reliably approaches the claimed maximum secure offloading utility or simply overfits the specific simulation environment.
minor comments (3)
- [Abstract] The abstract expands DIRS only as "DISCO intelligent reflective surface", leaving DISCO itself undefined; a parenthetical definition on first use would improve readability.
- [Simulation Results] Figure captions in the simulation section could explicitly list the key parameter values (e.g., AIRS altitude, jammer power, number of reflecting elements) used to generate each curve.
- [Introduction] A short discussion of related work on aerial IRS deployment or timescale-separated optimization would help situate the novelty of the 2Ts + dual-agent approach.
Simulated Author's Rebuttal
We sincerely thank the referee for the constructive and detailed review of our manuscript. We have carefully addressed each major comment to improve reproducibility, robustness, and clarity. Point-by-point responses follow, with revisions incorporated where appropriate to strengthen the empirical demonstration of the DDADSO scheme.
- Referee: [Simulation Results] Simulation Results section: The reported outperformance of DDADSO is presented without specifying the number of independent Monte Carlo runs, confidence intervals on the utility values, or any statistical significance testing of the gains versus benchmarks. Because the central claim rests entirely on these numerical comparisons, the lack of reproducibility details and variance reporting undermines confidence in the magnitude and consistency of the reported improvements.
  Authors: We agree that these details are necessary for rigorous evaluation of the simulation results. In the revised manuscript, we specify that all results are averaged over 1000 independent Monte Carlo runs, include 95% confidence intervals in the figures and tables, and report the outcomes of paired t-tests confirming statistical significance of the performance gains over the benchmarks at the 0.01 level.
  Revision: yes
- Referee: [System Model] System Model section: The DISCO jamming model is defined via fully passive DIRS with random time-varying phase shifts, yet no sensitivity analysis is provided with respect to phase-shift correlation time, hardware phase errors, or partial channel knowledge at the jammer. Any mismatch between this idealized model and real DISCO attack dynamics would invalidate the learned policies and the claimed security gains.
  Authors: We acknowledge the value of sensitivity analysis for validating the DISCO jamming model. The original model captures the core threat of fully passive, random phase-shift jamming. In the revision, we add a dedicated sensitivity study varying phase-shift correlation time and introducing hardware phase errors up to 10 degrees, showing that the DDADSO policy retains its performance advantage. Regarding partial channel knowledge at the jammer, we clarify that the fully-passive nature precludes active channel estimation, but we discuss this as a potential extension in the revised text.
  Revision: partial
- Referee: [Proposed Scheme] Proposed Scheme section: The dual-agent DRL algorithm is introduced without convergence guarantees, regret bounds, or even basic training hyperparameters (learning rates, network architectures, replay buffer sizes). In the absence of such analysis, it is impossible to determine whether the 2Ts framework reliably approaches the claimed maximum secure offloading utility or simply overfits the specific simulation environment.
  Authors: We agree that implementation details must be fully specified. The revised manuscript includes a new table listing all hyperparameters (learning rates, network architectures with layer sizes, replay buffer size of 10^5, etc.) and training procedures. While theoretical convergence guarantees and regret bounds for the dual-agent DRL under mixed timescales are not derived (as they remain an open challenge for this non-convex setting), we add training reward curves demonstrating empirical convergence within 2000 episodes across multiple random seeds, supporting that the learned policies generalize beyond the specific simulation environment.
  Revision: partial
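The statistical reporting discussed in the first response (averaging over Monte Carlo runs, 95% confidence intervals, a paired test against a benchmark) could be computed along the following lines. This is a hedged sketch using only the Python standard library. It uses a large-sample normal approximation in place of exact Student t quantiles, and the per-run utilities are synthetic placeholder numbers generated for illustration, not results from the paper.

```python
import math
import random
import statistics


def mean_ci(samples, confidence=0.95):
    """Sample mean with a normal-approximation confidence interval."""
    n = len(samples)
    mean = statistics.fmean(samples)
    sem = statistics.stdev(samples) / math.sqrt(n)  # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return mean, mean - z * sem, mean + z * sem


def paired_test(a, b):
    """Paired test statistic on per-run differences, with a two-sided p-value.

    The p-value uses the normal approximation, which is reasonable only for
    large run counts; an exact paired t-test would use Student t quantiles.
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t_stat = statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    p_value = 2.0 * (1.0 - statistics.NormalDist().cdf(abs(t_stat)))
    return t_stat, p_value


# Synthetic placeholder utilities for 1000 paired runs (NOT data from the paper).
rng = random.Random(0)
baseline = [rng.gauss(1.0, 0.3) for _ in range(1000)]
proposed = [b + rng.gauss(0.2, 0.5) for b in baseline]

mean, lower, upper = mean_ci(proposed)
t_stat, p_value = paired_test(proposed, baseline)
```

Pairing the runs (same channel realization and seed for both schemes) removes run-to-run variation that is common to both, so the test looks only at the per-run difference.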
Circularity Check
No circularity: the DRL optimization learns policies from a simulated environment without reducing predictions to input fits.
The paper's central contribution is a dual-agent DRL scheme (DDADSO) within a 2Ts framework that jointly optimizes AIRS deployment, offloading ratios, and phase shifts to maximize secure offloading utility. This is validated via simulation outperformance against benchmarks. No derivation chain reduces a claimed prediction or first-principles result to its own inputs by construction, self-definition, or self-citation load-bearing. The DRL agents learn from an environment model rather than fitting parameters that are then renamed as predictions. Self-citations (if present for channel/jamming models) are not load-bearing for the optimization claim itself, which remains empirically falsifiable through simulation.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Wireless channel models and random phase-shift jamming behaviors accurately capture real DISCO attacks.