AISC deployment in dynamic UAV-assisted MEC network: a reinforcement learning method based on heterogeneous graph attention neural network
Pith reviewed 2026-06-27 23:38 UTC · model grok-4.3
The pith
A double deep attention Q-network on heterogeneous graphs enables effective AISC deployment in dynamic UAV-assisted MEC networks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that modeling the UAV-assisted mobile edge computing (UMEC) environment and its relationships with AI service chains (AISCs) as a heterogeneous graph, and embedding attention mechanisms inside a double deep Q-network, allows the reinforcement learning agent to produce deployment decisions that adapt to UAV mobility. The claimed payoff is shorter AISC completion times, higher completion rates, improved load balancing across UAVs, and lower energy consumption.
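For readers unfamiliar with the "double" part of a double deep Q-network, the following is a minimal sketch of the standard double-DQN bootstrap target: the online network picks the next action and the target network scores it. It is not the paper's implementation. The function name, the Q-value lists, the reward, and the discount factor are all hypothetical values chosen for illustration.

```python
from typing import Sequence


def double_dqn_target(reward: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float],
                      gamma: float = 0.99,
                      done: bool = False) -> float:
    """Standard double-DQN target for a single transition.

    The online network selects the next action (argmax); the target
    network evaluates that action. Terminal transitions do not bootstrap.
    """
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical example: three candidate UAVs for placing the next VNF.
q_online_next = [1.0, 2.5, 2.0]  # online network prefers UAV 1
q_target_next = [1.2, 1.8, 3.0]  # target network alone would prefer UAV 2
y = double_dqn_target(reward=-0.4,
                      q_online_next=q_online_next,
                      q_target_next=q_target_next,
                      gamma=0.9)
```

In this example the target uses the target network's value for UAV 1 (the online choice) rather than its own maximum, which is the mechanism double DQN relies on to reduce overestimation.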
What carries the argument
The double deep attention Q-network based on heterogeneous graph neural networks, which encodes diverse UMEC and AISC relationships in the graph and uses attention to weight critical nodes and links during policy learning.
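As a rough illustration of attention over a heterogeneous graph, the sketch below aggregates a node's neighbors separately for each relation type and then combines the per-relation results. It is a simplification under stated assumptions, not the paper's architecture: scores are plain scaled dot products with no learned parameters, relation types are averaged uniformly instead of being weighted by a second attention level, and the relation names `vnf_to_uav` and `vnf_to_vnf` are hypothetical.

```python
import math
from typing import Dict, List


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: List[float], neighbors: List[List[float]]) -> List[float]:
    """Attention-weighted sum of neighbor features (scaled dot-product scores)."""
    scale = math.sqrt(len(query))
    scores = [sum(q * x for q, x in zip(query, n)) / scale for n in neighbors]
    weights = softmax(scores)
    dim = len(neighbors[0])
    return [sum(w * n[i] for w, n in zip(weights, neighbors)) for i in range(dim)]


def hetero_embed(query: List[float],
                 neighbors_by_relation: Dict[str, List[List[float]]]) -> List[float]:
    """Attend within each relation type, then average across relation types.

    Assumption: uniform averaging across relations stands in for a learned
    relation-level weighting.
    """
    per_relation = [attend(query, nbrs)
                    for nbrs in neighbors_by_relation.values() if nbrs]
    if not per_relation:
        raise ValueError("node has no neighbors under any relation")
    dim = len(per_relation[0])
    return [sum(vec[i] for vec in per_relation) / len(per_relation)
            for i in range(dim)]


# Hypothetical 2-dimensional features for one VNF node and its neighbors.
vnf = [1.0, 0.0]
neighbors = {
    "vnf_to_uav": [[1.0, 0.0], [0.0, 1.0]],  # candidate UAV hosts
    "vnf_to_vnf": [[0.5, 0.5]],              # adjacent VNF in the chain
}
embedding = hetero_embed(vnf, neighbors)
```

The UAV whose features align with the VNF's query receives the larger attention weight, so the resulting embedding leans toward that UAV's features.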
If this is right
- AISC completion time decreases because the agent can reassign virtual network functions (VNFs) in response to current UAV positions and loads.
- AISC completion rate rises under the same energy and balancing constraints.
- Load is distributed more evenly across the UAV fleet.
- Total energy consumed by the UAVs for inference and communication drops.
- Quality of the delivered AI service improves through shorter and more reliable chain execution.
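The outcomes listed above can be made concrete with a small metrics sketch. The paper's exact metric definitions are not given in the text reviewed here, so everything below is an assumption: `None` marks a chain that did not complete, and Jain's fairness index is swapped in as one common way to quantify load balance.

```python
from typing import List, Optional


def completion_rate(times: List[Optional[float]]) -> float:
    """Fraction of chains that finished (None marks a failed or dropped chain)."""
    if not times:
        return 0.0
    return sum(t is not None for t in times) / len(times)


def mean_completion_time(times: List[Optional[float]]) -> float:
    """Mean completion time over the chains that finished."""
    finished = [t for t in times if t is not None]
    return sum(finished) / len(finished) if finished else float("inf")


def jain_fairness(loads: List[float]) -> float:
    """Jain's index: 1.0 for perfectly even load, 1/n if one UAV carries it all."""
    squares = sum(x * x for x in loads)
    if squares == 0:
        return 1.0  # no load anywhere counts as balanced
    total = sum(loads)
    return total * total / (len(loads) * squares)


# Hypothetical episode: four chains, one of which failed; four UAVs.
times = [1.2, None, 0.8, 1.0]
rate = completion_rate(times)
avg_time = mean_completion_time(times)
balanced = jain_fairness([2.0, 2.0, 2.0, 2.0])
skewed = jain_fairness([8.0, 0.0, 0.0, 0.0])
```

Reporting these per episode for both the proposed agent and each baseline would let a reader check the listed claims side by side.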
Where Pith is reading between the lines
- The same graph-plus-attention structure could be tested on other rapidly changing wireless infrastructures such as vehicular or satellite edge networks.
- Adding a short-term mobility predictor to the state representation might reduce the frequency of policy updates needed.
- Scaling the heterogeneous graph construction to hundreds of UAVs would require checking whether attention still prevents policy instability.
Load-bearing premise
That representing the UMEC environment and AISC relationships as a heterogeneous graph plus attention will let the RL agent keep useful policies when UAVs move fast enough to change the topology frequently.
What would settle it
A simulation in which UAV speeds are increased until topology changes occur several times per episode, with the proposed method failing to show lower average AISC completion time or higher completion rate than a standard deep Q-network baseline.
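A sketch of the first half of such a test, under loudly stated assumptions: UAVs move in straight lines inside a square area and bounce off its edges, a link exists whenever two UAVs are within a fixed communication range, and a "topology change" is any step at which the link set differs from the previous step. The mobility model, area size, range, and speeds are all hypothetical and are not taken from the paper. The comparison between the proposed agent and a deep Q-network baseline would be run at each speed setting; only the change counting is shown here.

```python
import math
import random
from typing import List, Set, Tuple


def links(positions: List[List[float]], comm_range: float) -> Set[Tuple[int, int]]:
    """Set of UAV pairs currently within communication range."""
    n = len(positions)
    return {(i, j) for i in range(n) for j in range(i + 1, n)
            if math.dist(positions[i], positions[j]) <= comm_range}


def count_topology_changes(speed: float, n_uavs: int = 6, steps: int = 50,
                           area: float = 100.0, comm_range: float = 30.0,
                           seed: int = 0) -> int:
    """Number of steps in one episode at which the link set changes."""
    rng = random.Random(seed)
    pos = [[rng.uniform(0, area), rng.uniform(0, area)] for _ in range(n_uavs)]
    headings = [rng.uniform(0, 2 * math.pi) for _ in range(n_uavs)]
    vel = [[speed * math.cos(h), speed * math.sin(h)] for h in headings]
    prev = links(pos, comm_range)
    changes = 0
    for _ in range(steps):
        for p, v in zip(pos, vel):
            for k in range(2):
                p[k] += v[k]
                if p[k] < 0.0 or p[k] > area:
                    # Bounce: reverse direction and clamp to the area.
                    v[k] = -v[k]
                    p[k] = min(max(p[k], 0.0), area)
        cur = links(pos, comm_range)
        if cur != prev:
            changes += 1
        prev = cur
    return changes


# Sweep hypothetical speeds (distance units per step) for one seeded episode.
sweep = {speed: count_topology_changes(speed) for speed in (0.0, 1.0, 5.0, 20.0)}
```

The speed at which `sweep` first reports several changes per episode marks where the comparison becomes informative for the premise above.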
Original abstract
Unmanned aerial vehicles-assisted mobile edge computing (UMEC) can execute compute-intensive and latency-critical artificial intelligence (AI) services, which can be provided by multiple UAVs collaborating in the air to perform inference tasks. Completing an AI service requires multiple inferences, each of which is implemented by an AI service chain consisting of multiple virtual network functions (VNFs). The application of AISC relies on an efficient AISC deployment strategy to determine which UAV to deploy VNF on. However, the UMEC network topology is highly dynamic due to the high-speed movement of UAVs or their departure/arrival, which makes the AISC deployment in the UMEC network challenging. In addition, the intricate relationships between UMEC environment and AISC, as well as between individual VNFs in an AISC, can also affect the effectiveness of AISC deployment strategy. Moreover, under the constraints of energy consumption and load balancing, it is also difficult to optimize the AISC strategy to minimize AISC completion time for enhancing the quality of AI service. To address the above challenges, this paper proposes a double deep attention Q-network based on heterogeneous graph neural networks, which incorporates heterogeneous graph to capture diverse relationships in UMEC and utilizes attention mechanisms to adaptively focus on critical nodes and links for intelligent AISC deployment. The experimental results demonstrate that the proposed algorithm performs excellently in AISC completion time, AISC completion rate, load balancing and energy consumption.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a double deep attention Q-network (DDAQN) based on heterogeneous graph neural networks for AISC deployment in dynamic UAV-assisted mobile edge computing (UMEC) networks. It models intricate UMEC-AISC and VNF relationships via heterogeneous graphs, uses attention to focus on critical nodes/links, and optimizes under energy and load-balancing constraints to minimize completion time. The abstract asserts that the method 'performs excellently' on completion time, completion rate, load balancing, and energy consumption.
Significance. If the quantitative claims hold under rigorous testing, the work would offer a concrete RL approach for adaptive VNF placement in high-mobility UAV edge environments, potentially improving service quality for latency-critical AI inference chains. The use of heterogeneous graph attention to capture diverse relationships is a plausible direction, but the absence of reported metrics, baselines, or topology-change protocols in the provided text limits assessment of whether the gains are load-bearing or generalizable.
major comments (2)
- [Abstract] Abstract: the central claim of 'excellent' performance in AISC completion time, rate, load balancing, and energy is asserted without any numerical results, baseline comparisons, statistical significance tests, or description of how high-speed UAV movement and arrival/departure events are simulated during training or evaluation. This makes it impossible to verify whether the heterogeneous-graph attention mechanism actually stabilizes the policy under topology dynamics.
- [Abstract] Abstract (and implied method): no evidence is supplied that the learned policy was stress-tested under topology change rates materially higher than the training distribution or that online adaptation (versus periodic retraining) was evaluated. If graph embeddings or attention weights become stale faster than double-DQN updates can compensate, the reported gains would not generalize to the stated dynamic UMEC setting.
minor comments (1)
- [Abstract] Abstract: the acronym 'AISC' is introduced without expansion on first use; 'UMEC' is expanded but the relationship to 'UAV-assisted MEC' should be clarified for readers outside the subfield.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We respond point-by-point below and will revise the manuscript to improve clarity on the abstract claims and evaluation of dynamics.
- Referee: [Abstract] Abstract: the central claim of 'excellent' performance in AISC completion time, rate, load balancing, and energy is asserted without any numerical results, baseline comparisons, statistical significance tests, or description of how high-speed UAV movement and arrival/departure events are simulated during training or evaluation. This makes it impossible to verify whether the heterogeneous-graph attention mechanism actually stabilizes the policy under topology dynamics.
  Authors: The abstract provides a high-level summary as is standard; the full manuscript (Sections 4 and 5) reports the numerical results, baseline comparisons (including DQN variants and heuristics), and the simulation protocol for UAV mobility and topology events. To address the concern directly, we will revise the abstract to include a concise statement of the key quantitative gains and a brief reference to the dynamic simulation setup.
  revision: yes
- Referee: [Abstract] Abstract (and implied method): no evidence is supplied that the learned policy was stress-tested under topology change rates materially higher than the training distribution or that online adaptation (versus periodic retraining) was evaluated. If graph embeddings or attention weights become stale faster than double-DQN updates can compensate, the reported gains would not generalize to the stated dynamic UMEC setting.
  Authors: The experiments evaluate performance under the dynamic conditions (UAV movement, arrivals/departures) specified in Section 4.2. Explicit stress-testing at materially higher change rates or direct comparison of online adaptation versus periodic retraining is not reported. We agree this limits claims about extreme generalization and will add a limitations paragraph plus future-work discussion on this point in the revised manuscript.
  revision: partial
Circularity Check
No circularity: method proposal and experimental claims are independent of self-referential definitions or fitted inputs
The abstract and description present a proposed RL architecture (double deep attention Q-network on heterogeneous graph) whose performance is evaluated via simulation experiments on completion time, rate, load balance and energy. No equations, parameter-fitting steps, or self-citations are quoted that would reduce any claimed prediction or result to a quantity defined by the model itself. The central claim rests on empirical outcomes rather than a derivation that is tautological by construction; therefore the paper is self-contained against external benchmarks and receives the default non-circularity finding.