Latency-Aware Service Placement using Neural Combinatorial Optimisers for Edge--Cloud Systems
Pith reviewed 2026-06-25 20:15 UTC · model grok-4.3
The pith
EP-NCO uses dual-graph neural networks and reinforcement learning to place microservices across edge-cloud systems and reduces total response time by 46 to 50 percent versus genetic algorithms.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
EP-NCO employs a dual-graph model to capture resource relationships and service dependencies within both computing infrastructure and application structure. Graph neural networks learn structural embeddings of infrastructure nodes and service components, whereas reinforcement learning policies construct feasible placements that account for execution latency, communication link delays, and bandwidth-sharing effects. Extensive simulations across multiple system scales demonstrate that EP-NCO consistently achieves high-quality placement decisions, reducing the total service response time by 46 percent to 50 percent compared with metaheuristics and by 25 percent to 35 percent compared with contr
What carries the argument
The dual-graph model that encodes both the computing infrastructure and the application service dependencies, whose node embeddings from graph neural networks are fed to reinforcement learning policies that build complete placements while respecting latency and bandwidth constraints.
If this is right
- Enables fast online placement decisions after training for systems with hundreds of nodes and thousands of applications.
- Accounts for execution latency, communication delays, and bandwidth sharing when constructing placements.
- Maintains performance improvements across different simulated system sizes compared with both metaheuristics and other reinforcement learning methods.
Where Pith is reading between the lines
- The same dual-graph plus reinforcement learning structure could be tested on related assignment problems such as task scheduling in data centers.
- Real deployments would need additional mechanisms to handle workload changes that were not present in the training simulations.
- The fast inference property may allow the method to replace periodic re-optimization loops in orchestration platforms that currently rely on slower search procedures.
Load-bearing premise
The simulated system scales and workload patterns used in the experiments are representative of real heterogeneous edge-cloud infrastructures with dynamic arrivals and bandwidth-sharing effects.
What would settle it
Direct measurement of end-to-end service response times when the same placement algorithm is run on a physical multi-node edge-cloud testbed using real microservice workloads and live network traffic.
Figures
read the original abstract
The growth of Internet of Things (IoT) applications and latency-sensitive services has increased the demand for efficient service placement across compute continuum platforms, such as edge--cloud systems. Modern applications are decomposed into interdependent microservices deployed over heterogeneous infrastructures, making placement under resource and network constraints an intractable NP-hard combinatorial optimisation problem. This study proposes a latency-aware Edge Placement Neural Combinatorial Optimiser (EP-NCO), a learning-based framework for service placement in compute continuum platforms. EP-NCO employs a dual-graph model to capture resource relationships and service dependencies within both computing infrastructure and application structure. Graph neural networks (GNNs) learn structural embeddings of infrastructure nodes and service components, whereas reinforcement learning policies construct feasible placements that account for execution latency, communication link delays, and bandwidth-sharing effects. Extensive simulations across multiple system scales demonstrate that EP-NCO consistently achieves high-quality placement decisions, reducing the total service response time by 46%--50% compared with metaheuristics (genetic algorithm and particle swarm optimisation) and by 25%--35% compared with controlled RL ablation baselines. Once trained, EP-NCO enables fast online inference, making it a practical solution for dynamic large-scale edge--cloud environments with hundreds of computing nodes, hosting thousands of applications, which is significantly beyond the capability of current scheduling systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes EP-NCO, a latency-aware neural combinatorial optimizer for service placement in edge-cloud systems. It uses a dual-graph model with GNNs to embed infrastructure nodes and service dependencies, then applies RL policies to construct placements accounting for execution latency, link delays, and bandwidth sharing. Extensive simulations across system scales are reported to yield 46-50% reductions in total service response time versus genetic algorithm and particle swarm optimization baselines, and 25-35% versus controlled RL ablations, with fast online inference suitable for large deployments.
Significance. If the simulation results prove robust, the work addresses a practically relevant NP-hard problem in compute-continuum platforms and supplies a scalable learning-based alternative to metaheuristics. The dual-graph formulation and explicit modeling of bandwidth-sharing effects constitute clear technical strengths; fast inference after training is a useful property for dynamic environments.
major comments (2)
- [§4] §4 (Experimental Setup) and the abstract: the workload arrival processes, bandwidth-sharing model, and node heterogeneity parameters are not shown to match production edge-cloud traces or real IoT workloads. Because the headline 46-50% and 25-35% gains rest entirely on these synthetic simulations, the lack of fidelity validation is load-bearing for the central empirical claim.
- [§4] §4, performance tables: no information is supplied on the number of independent runs, statistical tests, variance, or exact baseline implementations (e.g., how GA/PSO hyperparameters were tuned or how the RL ablations were controlled). This prevents assessment of whether the reported percentage improvements are statistically reliable or reproducible.
minor comments (2)
- [§3] Notation for the dual-graph model and GNN embedding dimensions could be introduced with an explicit diagram or table in §3 to improve readability.
- The manuscript would benefit from a short discussion of training time versus inference time trade-offs, even if only in the supplementary material.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on the experimental setup. We agree that additional transparency is required to support the empirical claims and will revise the manuscript to address both points.
read point-by-point responses
-
Referee: [§4] §4 (Experimental Setup) and the abstract: the workload arrival processes, bandwidth-sharing model, and node heterogeneity parameters are not shown to match production edge-cloud traces or real IoT workloads. Because the headline 46-50% and 25-35% gains rest entirely on these synthetic simulations, the lack of fidelity validation is load-bearing for the central empirical claim.
Authors: We acknowledge that the simulations rely on synthetic workloads without direct validation against specific production traces. Publicly available edge-cloud datasets with the required granularity for service dependencies and bandwidth sharing are scarce. In the revised manuscript we will expand §4 with a justification of parameter choices drawn from established models in the literature (exponential inter-arrival times, node capacities 1-10 cores, bandwidth sharing factors from prior edge studies). We will also add a limitations subsection explicitly noting the synthetic nature of the evaluation and identifying real-trace validation as future work. This provides necessary context while preserving the reported simulation results. revision: partial
-
Referee: [§4] §4, performance tables: no information is supplied on the number of independent runs, statistical tests, variance, or exact baseline implementations (e.g., how GA/PSO hyperparameters were tuned or how the RL ablations were controlled). This prevents assessment of whether the reported percentage improvements are statistically reliable or reproducible.
Authors: We agree these details are essential and were omitted. The revised manuscript will update §4 and the tables to report: results from 30 independent runs with means and standard deviations; statistical significance via Wilcoxon signed-rank tests (p < 0.05); full GA settings (population size 50, 100 generations, mutation 0.1); PSO settings (swarm size 30, 200 iterations); and identical training protocols for all RL ablations. These additions will allow readers to assess reliability and reproducibility. revision: yes
Circularity Check
No circularity in derivation chain
full rationale
The provided abstract and context describe a proposed EP-NCO framework using GNNs and RL policies to solve a combinatorial placement problem, with performance evaluated against external metaheuristics (GA, PSO) and RL ablations. No equations, fitted parameters renamed as predictions, self-definitional constructs, or load-bearing self-citations are quoted or evident that would reduce claimed latency reductions to internal definitions by construction. The central claims rest on simulation comparisons to independent baselines rather than tautological reductions, making the derivation self-contained.
Axiom & Free-Parameter Ledger
invented entities (1)
-
EP-NCO
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Ai- driven service placement in fog and edge computing environments: a systematic review, taxonomy and future directions.Cluster Comput- ing, 28(16):1–39, 2025
Thatikonda Supraja, Priyanka Chawla, and Sukhpal Singh Gill. Ai- driven service placement in fog and edge computing environments: a systematic review, taxonomy and future directions.Cluster Comput- ing, 28(16):1–39, 2025
2025
-
[2]
A survey on services placement algorithms in integrated cloud-fog/edge com- puting.ACM Computing Surveys, 57(11):1–36, 2025
Imane Taleb, Jean-Loup Guillaume, and Benjamin Duthil. A survey on services placement algorithms in integrated cloud-fog/edge com- puting.ACM Computing Surveys, 57(11):1–36, 2025
2025
-
[3]
Edge- computing-driven internet of things: A survey.ACM Computing Surveys, 55(8):1–41, 2022
Linghe Kong, Jinlin Tan, Junqin Huang, Guihai Chen, Shuaitian Wang,XiJin,PengZeng,MuhammadKhan,andSajalKDas. Edge- computing-driven internet of things: A survey.ACM Computing Surveys, 55(8):1–41, 2022
2022
-
[4]
Neural combinatorial optimization for energy-efficient offloading in mobile edge comput- ing.IEEE Access, 8:35077–35089, 2020
Qingmiao Jiang, Yuan Zhang, and Jinyao Yan. Neural combinatorial optimization for energy-efficient offloading in mobile edge comput- ing.IEEE Access, 8:35077–35089, 2020
2020
-
[5]
An overview of service placement problem in fog and edge computing.ACM Computing Surveys (CSUR), 53(3):1–35, 2020
Farah Ait Salaht, Frédéric Desprez, and Adrien Lebre. An overview of service placement problem in fog and edge computing.ACM Computing Surveys (CSUR), 53(3):1–35, 2020
2020
-
[6]
Dynamic service placement in multi-access edge computing: A sys- tematic literature review.IEEE Access, 10:32639–32688, 2022
Hadi Tabatabaee Malazi, Saqib Rasool Chaudhry, Aqeel Kazmi, Andrei Palade, Christian Cabrera, Gary White, and Siobhán Clarke. Dynamic service placement in multi-access edge computing: A sys- tematic literature review.IEEE Access, 10:32639–32688, 2022
2022
-
[7]
Survey on placement methods in the edge and beyond
Balázs Sonkoly, János Czentye, Márk Szalay, Balázs Németh, and László Toka. Survey on placement methods in the edge and beyond. IEEECommunicationsSurveys&Tutorials,23(4):2590–2629,2021
2021
-
[8]
Ahmed, and Calin Curescu
Mohammadsadeq Garshasbi Herabad, Javid Taheri, Bestoun S. Ahmed, and Calin Curescu. A lightweight learning-based approach foronlineedge-to-cloudserviceplacement.Electronics,15(1),2026. ISSN 2079-9292
2026
-
[9]
Optimal placement of recurrent service chains on distributed edge-cloud infrastructures
Ayeh Mahjoubi, Javid Taheri, Karl-Johan Grinnemo, and Shuiguang Deng. Optimal placement of recurrent service chains on distributed edge-cloud infrastructures. In2021 IEEE 46th Conference on Local Computer Networks (LCN), pages 495–502. IEEE, 2021
2021
-
[10]
Code: Computation offloading in d2d-edge systemforvideostreaming.IEEESystemsJournal,17(3):4014–4025, 2022
MuhammadAsifKhan,EmnaBaccour,AimanErbad,RidhaHamila, and Mounir Hamdi. Code: Computation offloading in d2d-edge systemforvideostreaming.IEEESystemsJournal,17(3):4014–4025, 2022
2022
-
[11]
Online user allocation in mobile edge computing environments: A decentralized reactive approach.Journal of Systems Architecture, 113:101904, 2021
Chunrong Wu, Qinglan Peng, Yunni Xia, Yong Ma, Wangbo Zheng, Hong Xie, Shanchen Pang, Fan Li, Xiaodong Fu, Xiaobo Li, et al. Online user allocation in mobile edge computing environments: A decentralized reactive approach.Journal of Systems Architecture, 113:101904, 2021
2021
-
[12]
E-psoga: An enhanced hybrid metaheuristic for optimaledge-to-cloudplacementofserviceswithmulti-versioncom- ponents.IEEE Access, 2025
Mohammadsadeq G Herabad, Javid Taheri, Bestoun S Ahmed, and Calin Curescu. E-psoga: An enhanced hybrid metaheuristic for optimaledge-to-cloudplacementofserviceswithmulti-versioncom- ponents.IEEE Access, 2025
2025
-
[13]
Ahybridmeta-heuristicalgorithmformulti-objective iot service placement in fog computing environments.Decision Analytics Journal, 10:100379, 2024
Hemant Kumar Apat, Bibhudutta Sahoo, Veena Goswami, and Ra- bindraKBarik. Ahybridmeta-heuristicalgorithmformulti-objective iot service placement in fog computing environments.Decision Analytics Journal, 10:100379, 2024
2024
-
[14]
Quantum-inspired particle swarm optimization for efficient iot service placement in edge computing systems.Expert Systems with Applications, 236:121270, 2024
Marlom Bey, Pratyay Kuila, Banavath Balaji Naik, and Santanu Ghosh. Quantum-inspired particle swarm optimization for efficient iot service placement in edge computing systems.Expert Systems with Applications, 236:121270, 2024
2024
-
[15]
Anantcolonyoptimization-basedmultiobjectiveservice replicasplacementstrategyforfogcomputing.IEEETransactionson Cybernetics, 51(11):5595–5608, 2020
TianshengHuang,WeiweiLin,ChennianXiong,RuiPan,andJingx- uanHuang. Anantcolonyoptimization-basedmultiobjectiveservice replicasplacementstrategyforfogcomputing.IEEETransactionson Cybernetics, 51(11):5595–5608, 2020
2020
-
[16]
Machine learning-based solutions for resource management in fog computing.Multimedia Tools and Applications, 83(8):23019– 23045, 2024
Muhammad Fahimullah, Shohreh Ahvar, Mihir Agarwal, and Maria Trocan. Machine learning-based solutions for resource management in fog computing.Multimedia Tools and Applications, 83(8):23019– 23045, 2024
2024
-
[17]
Intelligent service placement algorithm based on ddqn and prioritized experience replay in iot-fog computing environment.Internet of Things, 25:101112, 2024
Ankur Sharma and Veni Thangaraj. Intelligent service placement algorithm based on ddqn and prioritized experience replay in iot-fog computing environment.Internet of Things, 25:101112, 2024
2024
-
[18]
Dynamic task offloading for resource allocation and privacy-preserving frame- work in kubeedge-based edge computing using machine learning
Sadananda Lingayya, Sathyendra Bhat Jodumutt, Sanjay Rangrao Pawar, Anoop Vylala, and Senthilkumar Chandrasekaran. Dynamic task offloading for resource allocation and privacy-preserving frame- work in kubeedge-based edge computing using machine learning. Cluster Computing, 27(7):9415–9431, 2024
2024
-
[19]
Q-learning: Theory and applications
Jesse Clifton and Eric Laber. Q-learning: Theory and applications. AnnualReviewofStatisticsandItsApplication,7:279–301,2020.doi: 10.1146/annurev-statistics-031219-041220
-
[20]
Byungjin Jang, Minho Kim, Gaspard Harerimana, and Jae Wook Kim. Q-learning algorithms: A comprehensive classification and applications.IEEE Access, 7:133653–133667, 2019. doi: 10.1109/ ACCESS.2019.2941229
arXiv 2019
-
[21]
Tao Liu, Shuai Ni, Xiang Li, Yan Zhu, and Linghe Kong. Deep reinforcement learning based approach for online service placement and computation resource allocation in edge computing.IEEE Transactions on Mobile Computing, 2022. doi: 10.1109/TMC.2022. 3141230
-
[22]
Yao Chen, Yang Sun, Bo Yang, and Tarik Taleb. Joint caching and computing service placement for edge-enabled iot based on deep reinforcement learning.IEEE Internet of Things Journal, 9(20): 20006–20017, 2022. doi: 10.1109/JIOT.2022.3159913
-
[23]
Machine learn- ingforcombinatorialoptimization:Amethodologicaltourd’horizon
Yoshua Bengio, Andrea Lodi, and Antoine Prouvost. Machine learn- ingforcombinatorialoptimization:Amethodologicaltourd’horizon. European Journal of Operational Research, 290(2):405–421, 2021. doi: 10.1016/j.ejor.2018.10.063
-
[24]
Pérez-Ramírez, and Magnus Boman
Natalia Vesselinova, Rebecca Steinert, Daniel F. Pérez-Ramírez, and Magnus Boman. Learning combinatorial optimization on graphs: A survey with applications to networking.IEEE Access, 8:120388– 120416, 2020. doi: 10.1109/ACCESS.2020.3005682
-
[25]
Ka Tai Chung, C. K. M. Lee, and Y. P. Tsang. Neural combinatorial optimization with reinforcement learning in industrial engineering: A survey.Artificial Intelligence Review, 2025. doi: 10.1007/ Abedpour et al.:Preprint submitted to ElsevierPage 17 of 18 Latency-Aware Neural Service Placement s10462-024-11045-1
2025
-
[26]
ACM Computing Surveys53(3) (2020) https://doi
FarahAïtSalaht,FrédéricDesprez,andAdrienLebre.Anoverviewof serviceplacementprobleminfogandedgecomputing.ACMComput. Surv., 53(3), June 2020. ISSN 0360-0300. doi: 10.1145/3391196. URLhttps://doi.org/10.1145/3391196
-
[27]
Hadi Tabatabaee Malazi, Saqib Rasool Chaudhry, Aqeel Kazmi, Andrei Palade, Christian Cabrera, Gary White, and Siobhán Clarke. Dynamic service placement in multi-access edge computing: A sys- tematic literature review.IEEE Access, 10:32639–32688, 2022. doi: 10.1109/ACCESS.2022.3160738
-
[28]
Animprovedgravitationalsearchalgorithmfortaskoffloading in a mobile edge computing network with task priority.Electronics, 13(3):540, 2024
LingXu,YunpengLiu,BingFan,XiaorongXu,YiguoMei,andWei Feng. Animprovedgravitationalsearchalgorithmfortaskoffloading in a mobile edge computing network with task priority.Electronics, 13(3):540, 2024
2024
-
[29]
Qos-aware deployment of iot applications through the fog.IEEE internet of Things Journal, 4(5): 1185–1192, 2017
Antonio Brogi and Stefano Forti. Qos-aware deployment of iot applications through the fog.IEEE internet of Things Journal, 4(5): 1185–1192, 2017
2017
-
[30]
Computing- assisted task offloading and resource allocation for wireless vr sys- tems
Shiqi Li, Peng Lin, Jing Song, and Qingyang Song. Computing- assisted task offloading and resource allocation for wireless vr sys- tems. In2020 IEEE 6th International Conference on Computer and Communications (ICCC), pages 368–372. IEEE, 2020
2020
-
[31]
Mohammadsadeq Garshasbi Herabad, Javid Taheri, Bestoun S Ahmed,andCalinCurescu. Optimizingserviceplacementinedge-to- cloud ar/vr systems using a multi-objective genetic algorithm.arXiv preprint arXiv:2403.12849, 2024
arXiv 2024
-
[32]
A bee colony-based algorithm for task offloading in vehicular edge computing.IEEE systems journal, 17(3):4165–4176, 2023
AlissonBarbosadeSouza,PauloAntonioLealRego,VinayChamola, Tiago Carneiro, Paulo Henrique Gonçalves Rocha, and José Neuman de Souza. A bee colony-based algorithm for task offloading in vehicular edge computing.IEEE systems journal, 17(3):4165–4176, 2023
2023
-
[33]
A cost-efficient iot service placement approach using whale optimization algorithm in fog computing environment.Expert Systems with Applications, 200: 117012, 2022
Mostafa Ghobaei-Arani and Ali Shahidinejad. A cost-efficient iot service placement approach using whale optimization algorithm in fog computing environment.Expert Systems with Applications, 200: 117012, 2022
2022
-
[34]
Graph-reinforcement- learning-based dependency-aware microservice deployment in edge computing.IEEEInternetofThingsJournal,11(1):1604–1615,2023
Wenkai Lv, Pengfei Yang, Tianyang Zheng, Chengmin Lin, Zhenyi Wang, Minwen Deng, and Quan Wang. Graph-reinforcement- learning-based dependency-aware microservice deployment in edge computing.IEEEInternetofThingsJournal,11(1):1604–1615,2023
2023
-
[35]
Graph neural network aided deep reinforcement learning for microservice deployment in cooperative edge computing.IEEE Transactions on Services Computing, 17(6): 3742–3757, 2024
Shuangwu Chen, Qifeng Yuan, Jiangming Li, Huasen He, Sen Li, Xiaofeng Jiang, and Jian Yang. Graph neural network aided deep reinforcement learning for microservice deployment in cooperative edge computing.IEEE Transactions on Services Computing, 17(6): 3742–3757, 2024
2024
-
[36]
Efficient microservice deployment in the edge-cloud networks with policy-gradient reinforcement learning.IEEE Access, 2024
Kevin Afachao, Adnan M Abu-Mahfouz, and Gerhard P Hanke. Efficient microservice deployment in the edge-cloud networks with policy-gradient reinforcement learning.IEEE Access, 2024
2024
-
[37]
On adaptive edge microservice placement: A reinforcement learning approachendowedwithgraphcomprehension.IEEETransactionson Mobile Computing, 23(12):11144–11158, 2024
Lixing Chen, Yang Bai, Pan Zhou, Youqi Li, Zhe Qu, and Jie Xu. On adaptive edge microservice placement: A reinforcement learning approachendowedwithgraphcomprehension.IEEETransactionson Mobile Computing, 23(12):11144–11158, 2024
2024
-
[38]
Shanchen Pang, Teng Wang, Haiyuan Gui, Xiao He, and Lili Hou. An intelligent task offloading method based on multi-agent deep reinforcement learning in ultra-dense heterogeneous network with mobile edge computing.Computer Networks, 250:110555, 2024
2024
-
[39]
A rein- forcementlearningapproachforonlineservicetreeplacementinedge computing
Yimeng Wang, Yongbo Li, Tian Lan, and Nakjung Choi. A rein- forcementlearningapproachforonlineservicetreeplacementinedge computing. In2019 IEEE 27th International Conference on Network Protocols (ICNP), pages 1–6. IEEE, 2019
2019
-
[40]
Xiang-JieXiao,YongWang,Pei-QiuHuang,andKezhiWang.Neural combinatorial optimization for multiobjective task offloading in mo- bile edge computing.IEEE Transactions on Vehicular Technology, 2025
2025
-
[41]
A survey on services placement algorithms in integrated cloud-fog / edge computing.ACM Comput
Imane Taleb, Jean-Loup Guillaume, and Benjamin Duthil. A survey on services placement algorithms in integrated cloud-fog / edge computing.ACM Comput. Surv., 57(11), June 2025. ISSN 0360-
2025
-
[42]
URLhttps://doi.org/10.1145/3729214
doi:10.1145/3729214. URLhttps://doi.org/10.1145/3729214
-
[43]
Heuris- tic and reinforcement learning algorithms for dynamic service place- mentonmobileedgecloud,2021
DhruvGarg,NanjangudC.Narendra,andSelomeTesfatsion. Heuris- tic and reinforcement learning algorithms for dynamic service place- mentonmobileedgecloud,2021. URLhttps://arxiv.org/abs/2111. 00240
2021
-
[44]
Kevin Afachao, Adnan M. Abu-Mahfouz, and Gerhard P. Hanke. Efficient microservice deployment in the edge-cloud networks with policy-gradient reinforcement learning.IEEE Access, 12:133110– 133124, 2024. doi: 10.1109/ACCESS.2024.3461149
-
[45]
Ep-nco: Latency-aware service placement us- ing neural combinatorial optimisation (code).https://github.com/ kimiaa45-ab/EP-NCO, 2026
Kimia Abedpour. Ep-nco: Latency-aware service placement us- ing neural combinatorial optimisation (code).https://github.com/ kimiaa45-ab/EP-NCO, 2026. Available online; accessed May 2026
2026
-
[46]
Kimia Abedpourreceived her B.Sc
JanezDemšar.Statisticalcomparisonsofclassifiersovermultipledata sets.Journal of Machine learning research, 7(Jan):1–30, 2006. Kimia Abedpourreceived her B.Sc. and M.Sc. degreesinComputerEngineering(Software)from Marlik Nowshahr Institute and Tabarestan Chalus Institute, Iran, in 2019 and 2021, respectively. She is currently pursuing her Ph.D. in Computer ...
2006
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.