A Conflict-Aware Resource Management Framework for the Computing Continuum

Ilir Murturi; Praveen Kumar Donta; Schahram Dustdar; Vlad Popescu-Vifor

arxiv: 2512.12299 · v2 · submitted 2025-12-13 · 💻 cs.DC

A Conflict-Aware Resource Management Framework for the Computing Continuum

Vlad Popescu-Vifor , Ilir Murturi , Praveen Kumar Donta , Schahram Dustdar This is my paper

Pith reviewed 2026-05-16 22:51 UTC · model grok-4.3

classification 💻 cs.DC

keywords resource managementconflict resolutiondeep reinforcement learningcomputing continuumorchestrationedge computingKubernetes

0 comments

The pith

A DRL framework mediates resource conflicts in the computing continuum using performance feedback for adaptive reallocation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a novel framework that uses deep reinforcement learning to resolve conflicts in resource orchestration across edge, fog, and cloud environments. Decentralized agent decisions often create persistent conflict loops that degrade performance, and the framework addresses this by training a DRL model on real-time metrics and historical states. The approach has been prototyped on a Kubernetes testbed, where it shows efficient resource reallocation and the ability to learn adaptively in changing conditions. A sympathetic reader would care because it offers a way to manage heterogeneity and decentralization without constant manual oversight in large-scale distributed systems.

Core claim

The framework enables handling of resource conflicts across deployments by integrating a DRL model trained to mediate conflicts based on real-time performance feedback and historical state information, achieving efficient reallocation and adaptive learning as demonstrated in preliminary Kubernetes-based validation.

What carries the argument

A Deep Reinforcement Learning model that mediates resource conflicts by learning from real-time performance feedback and historical state information.

Load-bearing premise

That a DRL model can reliably detect and resolve persistent conflict loops using only performance feedback and historical states, without built-in mechanisms for explicit conflict detection or safeguards against training instability.

What would settle it

Observing that the DRL agent fails to converge to stable resource allocations or repeatedly enters conflict loops during extended runs on the Kubernetes testbed with new service deployment patterns would disprove the effectiveness of the training approach.

Figures

Figures reproduced from arXiv: 2512.12299 by Ilir Murturi, Praveen Kumar Donta, Schahram Dustdar, Vlad Popescu-Vifor.

**Figure 2.** Figure 2: Flow diagram of DRL feedback process. The DRL method maintains a registry of specialized models using a dictionary structure within the system container. When an optimization request is received, the method first checks if a model specifically designed for the given deployment already exists. If such a model is available, it is loaded to generate a prediction. If not, a new model is created by cloning the … view at source ↗

**Figure 4.** Figure 4: Cluster architecture diagram. the Kubernetes cluster and resource management framework, while the worker nodes are equipped with 4 vCPUs and 8 GB of RAM to simulate edge environments. The cluster comprises three control plane nodes and four worker nodes, which allows for testing horizontal and vertical scaling and dynamic conflict resolution due to competing resource allocation policies. The cluster servic… view at source ↗

**Figure 5.** Figure 5: CPU usage on worker-1 after optimization. V. CONCLUSION This paper introduces a conflict-aware resource orchestration framework using DRL to manage agent-level resource conflicts in computing continuum environments. The implementation on a virtualized cluster showed that applying the framework reduced node-level CPU usage from 75% to 60.75% and optimized service-level resource quotas by 15- 20%. Our init… view at source ↗

read the original abstract

The increasing device heterogeneity and decentralization requirements in the computing continuum (i.e., spanning edge, fog, and cloud) introduce new challenges in resource orchestration. In such environments, agents are often responsible for optimizing resource usage across deployed services. However, agent decisions can lead to persistent conflict loops, inefficient resource utilization, and degraded service performance. To overcome such challenges, we propose a novel framework for adaptive conflict resolution in resource-oriented orchestration using a Deep Reinforcement Learning (DRL) approach. The framework enables handling resource conflicts across deployments and integrates a DRL model trained to mediate such conflicts based on real-time performance feedback and historical state information. The framework has been prototyped and validated on a Kubernetes-based testbed, illustrating its methodological feasibility and architectural resilience. Preliminary results show that the framework achieves efficient resource reallocation and adaptive learning in dynamic scenarios, thus providing a scalable and resilient solution for conflict-aware orchestration in the computing continuum.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper proposes a DRL-based conflict resolution framework for the computing continuum but fails to provide any quantitative results or technical specifics to substantiate its claims.

read the letter

This paper describes a framework that uses deep reinforcement learning to resolve conflicts in resource orchestration for the computing continuum. The main takeaway is that while the idea addresses a practical issue, the lack of any quantitative data or implementation details leaves the claims unsupported. It does a good job laying out how agent decisions can create persistent loops and inefficiency in heterogeneous setups spanning edge to cloud. The proposal to train a DRL model on performance feedback and historical information for mediation is a straightforward application of existing techniques to this domain. Prototyping on Kubernetes shows they're thinking about real-world feasibility. Where it falls short is in the evidence. The abstract talks about achieving efficient reallocation and adaptive learning in dynamic scenarios, but offers no metrics, comparisons, or specifics on the learning setup. This makes it impossible to judge if the DRL can reliably detect and break conflict loops or if it risks instability. The assumption that feedback alone suffices without explicit conflict detection or stability checks seems shaky based on what's presented. The work is for systems engineers working on orchestration in IoT and continuum environments who might want to explore DRL extensions. It could be worth discussing in a group for the problem framing, but the missing details limit its value. I don't think it should go to peer review in this state. The authors need to add the DRL architecture details, training process, and actual results before it merits referee attention.

Referee Report

3 major / 2 minor

Summary. The paper proposes a novel framework for adaptive conflict resolution in resource orchestration across the computing continuum (edge, fog, cloud) using Deep Reinforcement Learning (DRL). Agents optimize resources but can create persistent conflict loops; the framework integrates a DRL model trained on real-time performance feedback and historical states to mediate conflicts. It describes a Kubernetes-based prototype demonstrating methodological feasibility, architectural resilience, and preliminary results of efficient reallocation and adaptive learning in dynamic scenarios.

Significance. If the DRL component can be shown to produce stable policies that reliably detect and resolve conflicts without oscillation, the work would address a practically important gap in decentralized resource management for heterogeneous environments. A validated learning-based approach could improve utilization and resilience over static or heuristic orchestration methods, with potential impact on scalable continuum deployments.

major comments (3)

[Abstract] Abstract: the claim that 'preliminary results show that the framework achieves efficient resource reallocation and adaptive learning' is unsupported by any quantitative metrics, baselines, error bars, or statistical analysis, which is load-bearing for the validation of the central claim.
[Framework / DRL Integration] DRL model description: no state-space definition, reward formulation (including any loop-penalty terms), action space, training procedure, or convergence criteria are provided, leaving the weakest assumption—that feedback alone suffices for stable conflict-loop resolution—unexamined and unreproducible.
[Prototype and Validation] Prototype validation: the Kubernetes testbed results are described only qualitatively; without reported resource-utilization deltas, conflict-resolution latency, policy-stability metrics, or comparisons to non-DRL baselines, the assertions of 'efficient reallocation' and 'architectural resilience' cannot be assessed.

minor comments (2)

[Introduction / Problem Statement] Clarify the precise definition of 'conflict loop' and how it is detected or represented in the state; this would strengthen the link between the problem statement and the DRL design.
[Abstract / Introduction] Ensure the abstract and introduction explicitly distinguish the proposed DRL approach from prior RL-based orchestration work to better highlight novelty.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments correctly identify areas where the manuscript requires additional rigor, particularly in quantitative support and technical specifications. We will revise the paper to address each point.

read point-by-point responses

Referee: [Abstract] Abstract: the claim that 'preliminary results show that the framework achieves efficient resource reallocation and adaptive learning' is unsupported by any quantitative metrics, baselines, error bars, or statistical analysis, which is load-bearing for the validation of the central claim.

Authors: We agree that the abstract claim is not adequately supported. The manuscript presents only high-level preliminary observations without metrics. In the revision we will either remove the unsupported phrasing or qualify it strictly to match the evidence actually shown in the evaluation section, and we will add key quantitative indicators (utilization deltas, latency figures) to the abstract if space allows. revision: yes
Referee: [Framework / DRL Integration] DRL model description: no state-space definition, reward formulation (including any loop-penalty terms), action space, training procedure, or convergence criteria are provided, leaving the weakest assumption—that feedback alone suffices for stable conflict-loop resolution—unexamined and unreproducible.

Authors: This observation is accurate and points to a genuine omission. The current text does not supply the requested DRL specifications. We will add a dedicated subsection that explicitly defines the state space, the reward function (including explicit loop-penalty terms), the action space, the training algorithm and hyperparameters, and the convergence criteria used. This will make the stability claim testable and the work reproducible. revision: yes
Referee: [Prototype and Validation] Prototype validation: the Kubernetes testbed results are described only qualitatively; without reported resource-utilization deltas, conflict-resolution latency, policy-stability metrics, or comparisons to non-DRL baselines, the assertions of 'efficient reallocation' and 'architectural resilience' cannot be assessed.

Authors: We accept that the validation section is insufficiently quantitative. The prototype results are currently described at a high level only. In the revised manuscript we will report concrete metrics (resource-utilization deltas, conflict-resolution latency, policy-stability indicators) together with comparisons against non-DRL baselines. If additional runs are required to obtain statistically meaningful figures, we will perform them. revision: yes

Circularity Check

0 steps flagged

No mathematical derivations or self-referential reductions present

full rationale

The paper describes a high-level framework for conflict-aware resource orchestration in the computing continuum, integrating a DRL component trained on performance feedback. No equations, parameter fittings, predictions derived from inputs, or load-bearing self-citations appear in the provided text. Claims rest on architectural description and preliminary Kubernetes testbed validation rather than any derivation chain that reduces to its own inputs by construction. This is a standard non-circular framework paper.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 1 invented entities

The central claim rests on the unproven effectiveness of DRL for real-time conflict mediation in dynamic continuum environments; no free parameters are explicitly fitted in the abstract, but training relies on domain assumptions about feedback signals.

free parameters (1)

DRL training hyperparameters
Model-specific parameters such as learning rate and reward weights are required for the described training but not reported.

axioms (1)

domain assumption Real-time performance feedback and historical states suffice to train a stable mediator for resource conflicts
Invoked in the description of the DRL model training process.

invented entities (1)

Conflict-aware DRL orchestration framework no independent evidence
purpose: To mediate persistent resource conflicts across edge-fog-cloud deployments
New architectural construct introduced without external falsifiable predictions or independent validation data.

pith-pipeline@v0.9.0 · 5467 in / 1396 out tokens · 44533 ms · 2026-05-16T22:51:50.993186+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages

[1]

Distributed computing continuum systems–opportunities and research challenges,

V . C. Pujol, P. K. Donta, A. Morichetta, I. Murturi, and S. Dustdar, “Distributed computing continuum systems–opportunities and research challenges,” inICSOC Workshops, Springer, 2023

work page 2023
[2]

Edge intelligence—research opportunities for distributed computing continuum systems,

V . C. Pujol, P. K. Donta, A. Morichetta, I. Murturi, and S. Dustdar, “Edge intelligence—research opportunities for distributed computing continuum systems,”IEEE Internet Computing, vol. 27, no. 4, pp. 53– 74, 2023

work page 2023
[3]

A comprehensive feature comparison study of open-source container orchestration frameworks,

E. Truyen, D. Van Landuyt, D. Preuveneers, B. Lagaisse, and W. Joosen, “A comprehensive feature comparison study of open-source container orchestration frameworks,”Applied Sciences, vol. 9, no. 5, p. 931, 2019

work page 2019
[4]

Slo-aware dynamic self-adaptation of resources,

M. Awad, N. Kara, and C. Edstrom, “Slo-aware dynamic self-adaptation of resources,”Future Generation Computer Systems, vol. 133, pp. 266– 280, 2022

work page 2022
[5]

Performance optimization across the edge-cloud continuum: A multi-agent rollout approach for cloud-native application workload placement,

P. Soumplis, G. Kontos, P. Kokkinos, A. Kretsis, S. Barrachina-Mu ˜noz, R. Nikbakht, J. Baranda, M. Payar ´o, J. Mangues-Bafalluy, and E. Var- varigos, “Performance optimization across the edge-cloud continuum: A multi-agent rollout approach for cloud-native application workload placement,”SN Computer Science, vol. 5, no. 3, p. 318, 2024

work page 2024
[6]

Heteroedge: Taming the heterogeneity of edge computing system in social sensing,

D. Zhang, T. Rashid, X. Li, N. Vance, and D. Wang, “Heteroedge: Taming the heterogeneity of edge computing system in social sensing,” inProceedings of the International Conference on Internet of Things Design and Implementation, pp. 37–48, 2019

work page 2019
[7]

arXiv preprint arXiv:2505.07603 , year =

C. H. Chen and M. F. Shiu, “Agentflow: Resilient adaptive cloud-edge framework for multi-agent coordination,”arXiv preprint arXiv:2505.07603, 2025

work page arXiv 2025
[8]

Neurosurgeon: Collaborative agent systems for resource- aware edge-cloud orchestration,

S. Tuliet al., “Neurosurgeon: Collaborative agent systems for resource- aware edge-cloud orchestration,”IEEE Transactions on Cloud Comput- ing, 2023

work page 2023
[9]

Resource management in wireless networks via multi-agent deep reinforcement learning,

N. Naderializadeh, J. J. Sydir, M. Simsek, and H. Nikopour, “Resource management in wireless networks via multi-agent deep reinforcement learning,”IEEE Transactions on Wireless Communications, vol. 20, no. 6, pp. 3507–3523, 2021

work page 2021
[10]

Learning-driven zero trust in distributed computing continuum sys- tems,

I. Murturi, P. K. Donta, V . C. Pujol, A. Morichetta, and S. Dustdar, “Learning-driven zero trust in distributed computing continuum sys- tems,” in2023 Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), pp. 0044–0049, IEEE, 2023

work page 2023
[11]

A decentralized approach for resource dis- covery using metadata replication in edge networks,

I. Murturi and S. Dustdar, “A decentralized approach for resource dis- covery using metadata replication in edge networks,”IEEE Transactions on Services Computing, vol. 15, no. 5, pp. 2526–2537, 2022

work page 2022
[12]

Adaptive ai-based decentralized resource management in the cloud-edge continuum,

L. Li, J. Bell, M. Coppola, and V . Lomonaco, “Adaptive ai-based decentralized resource management in the cloud-edge continuum,” in 2025 33rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), pp. 329–332, IEEE, 2025

work page 2025
[13]

Resource management at the network edge: A deep reinforcement learning approach,

D. Zeng, L. Gu, S. Pan, J. Cai, and S. Guo, “Resource management at the network edge: A deep reinforcement learning approach,”IEEE Network, vol. 33, no. 3, pp. 26–33, 2019

work page 2019
[14]

A hierarchical framework of cloud resource allocation and power management using deep reinforcement learning,

N. Liu, Z. Li, J. Xu, Z. Xu, S. Lin, Q. Qiu, J. Tang, and Y . Wang, “A hierarchical framework of cloud resource allocation and power management using deep reinforcement learning,” in2017 IEEE 37th international conference on distributed computing systems (ICDCS), pp. 372–382, IEEE, 2017

work page 2017
[15]

Learning to branch for multi-task learning,

P. Guo, C.-Y . Lee, and D. Ulbricht, “Learning to branch for multi-task learning,” inInternational conference on machine learning, pp. 3854– 3863, PMLR, 2020

work page 2020

[1] [1]

Distributed computing continuum systems–opportunities and research challenges,

V . C. Pujol, P. K. Donta, A. Morichetta, I. Murturi, and S. Dustdar, “Distributed computing continuum systems–opportunities and research challenges,” inICSOC Workshops, Springer, 2023

work page 2023

[2] [2]

Edge intelligence—research opportunities for distributed computing continuum systems,

V . C. Pujol, P. K. Donta, A. Morichetta, I. Murturi, and S. Dustdar, “Edge intelligence—research opportunities for distributed computing continuum systems,”IEEE Internet Computing, vol. 27, no. 4, pp. 53– 74, 2023

work page 2023

[3] [3]

A comprehensive feature comparison study of open-source container orchestration frameworks,

E. Truyen, D. Van Landuyt, D. Preuveneers, B. Lagaisse, and W. Joosen, “A comprehensive feature comparison study of open-source container orchestration frameworks,”Applied Sciences, vol. 9, no. 5, p. 931, 2019

work page 2019

[4] [4]

Slo-aware dynamic self-adaptation of resources,

M. Awad, N. Kara, and C. Edstrom, “Slo-aware dynamic self-adaptation of resources,”Future Generation Computer Systems, vol. 133, pp. 266– 280, 2022

work page 2022

[5] [5]

Performance optimization across the edge-cloud continuum: A multi-agent rollout approach for cloud-native application workload placement,

P. Soumplis, G. Kontos, P. Kokkinos, A. Kretsis, S. Barrachina-Mu ˜noz, R. Nikbakht, J. Baranda, M. Payar ´o, J. Mangues-Bafalluy, and E. Var- varigos, “Performance optimization across the edge-cloud continuum: A multi-agent rollout approach for cloud-native application workload placement,”SN Computer Science, vol. 5, no. 3, p. 318, 2024

work page 2024

[6] [6]

Heteroedge: Taming the heterogeneity of edge computing system in social sensing,

D. Zhang, T. Rashid, X. Li, N. Vance, and D. Wang, “Heteroedge: Taming the heterogeneity of edge computing system in social sensing,” inProceedings of the International Conference on Internet of Things Design and Implementation, pp. 37–48, 2019

work page 2019

[7] [7]

arXiv preprint arXiv:2505.07603 , year =

C. H. Chen and M. F. Shiu, “Agentflow: Resilient adaptive cloud-edge framework for multi-agent coordination,”arXiv preprint arXiv:2505.07603, 2025

work page arXiv 2025

[8] [8]

Neurosurgeon: Collaborative agent systems for resource- aware edge-cloud orchestration,

S. Tuliet al., “Neurosurgeon: Collaborative agent systems for resource- aware edge-cloud orchestration,”IEEE Transactions on Cloud Comput- ing, 2023

work page 2023

[9] [9]

Resource management in wireless networks via multi-agent deep reinforcement learning,

N. Naderializadeh, J. J. Sydir, M. Simsek, and H. Nikopour, “Resource management in wireless networks via multi-agent deep reinforcement learning,”IEEE Transactions on Wireless Communications, vol. 20, no. 6, pp. 3507–3523, 2021

work page 2021

[10] [10]

Learning-driven zero trust in distributed computing continuum sys- tems,

I. Murturi, P. K. Donta, V . C. Pujol, A. Morichetta, and S. Dustdar, “Learning-driven zero trust in distributed computing continuum sys- tems,” in2023 Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), pp. 0044–0049, IEEE, 2023

work page 2023

[11] [11]

A decentralized approach for resource dis- covery using metadata replication in edge networks,

I. Murturi and S. Dustdar, “A decentralized approach for resource dis- covery using metadata replication in edge networks,”IEEE Transactions on Services Computing, vol. 15, no. 5, pp. 2526–2537, 2022

work page 2022

[12] [12]

Adaptive ai-based decentralized resource management in the cloud-edge continuum,

L. Li, J. Bell, M. Coppola, and V . Lomonaco, “Adaptive ai-based decentralized resource management in the cloud-edge continuum,” in 2025 33rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), pp. 329–332, IEEE, 2025

work page 2025

[13] [13]

Resource management at the network edge: A deep reinforcement learning approach,

D. Zeng, L. Gu, S. Pan, J. Cai, and S. Guo, “Resource management at the network edge: A deep reinforcement learning approach,”IEEE Network, vol. 33, no. 3, pp. 26–33, 2019

work page 2019

[14] [14]

A hierarchical framework of cloud resource allocation and power management using deep reinforcement learning,

N. Liu, Z. Li, J. Xu, Z. Xu, S. Lin, Q. Qiu, J. Tang, and Y . Wang, “A hierarchical framework of cloud resource allocation and power management using deep reinforcement learning,” in2017 IEEE 37th international conference on distributed computing systems (ICDCS), pp. 372–382, IEEE, 2017

work page 2017

[15] [15]

Learning to branch for multi-task learning,

P. Guo, C.-Y . Lee, and D. Ulbricht, “Learning to branch for multi-task learning,” inInternational conference on machine learning, pp. 3854– 3863, PMLR, 2020

work page 2020