A Delta-Aware Orchestration Framework for Scalable Multi-Agent Edge Computing

Joyjit Roy; Samaresh Kumar Singh

arXiv: 2604.20129 · v1 · submitted 2026-04-22 · cs.LG · cs.DC · cs.PF · cs.SE


Pith reviewed 2026-05-10 00:59 UTC · model grok-4.3

classification: cs.LG, cs.DC, cs.PF, cs.SE

keywords: multi-agent edge computing, orchestration framework, differential neural caching, action space pruning, hardware affinity matching, latency optimization, scalable multi-agent systems, synergistic performance degradation

The pith

Integrating three interdependent optimizations prevents superlinear latency growth when scaling multi-agent edge systems beyond 100 agents.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that scaling multi-agent edge computing past 100 agents triggers a synergistic collapse where separate fixes for action-space growth, redundant computation, and poor hardware scheduling fail to prevent sharp drops in performance. It introduces DAOEF to apply differential neural caching, criticality-based action pruning, and learned hardware affinity matching together so that each mechanism compensates for the others. Controlled tests show that removing any one mechanism raises latency by more than 40 percent, confirming the gains are interdependent. On four datasets and a physical testbed the combined approach delivers a 1.45 times improvement over applying the same three techniques independently, with sub-linear latency growth up to 250 agents and a 62 percent latency cut in a 200-agent cloud run.

Core claim

The authors establish that the three factors—exponential action-space growth, computational redundancy among adjacent agents, and task-agnostic hardware scheduling—amplify one another when left unaddressed, producing the Synergistic Collapse. DAOEF counters this by storing input deltas for caching, organizing agents into priority tiers for pruning, and matching tasks to optimal accelerators, with each component using outputs from the others to keep thresholds and tiers effective. Experiments isolating each mechanism demonstrate that only their joint operation keeps deadline satisfaction and latency from degrading superlinearly.
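The caching idea in the paragraph above can be illustrated with a small sketch. It is a simplification under stated assumptions, not the paper's algorithm: it reuses a cached intermediate activation whenever the new input is close enough to the cached one, using cosine similarity as the closeness measure. The class name `DeltaCache`, the cosine measure, and the threshold value used in the usage note are all hypothetical choices made for illustration.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class DeltaCache:
    """Hypothetical similarity-gated cache of intermediate activations."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold  # assumed similarity cutoff, not a value from the paper
        self.entries: Dict[str, Tuple[Vector, Vector]] = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, key: str, inputs: Vector,
               compute: Callable[[Vector], Vector]) -> Vector:
        """Return the cached activation if the input barely changed, else recompute."""
        cached = self.entries.get(key)
        if cached is not None and cosine_similarity(cached[0], inputs) >= self.threshold:
            self.hits += 1
            return cached[1]
        self.misses += 1
        activation = compute(inputs)
        self.entries[key] = (list(inputs), activation)
        return activation


def double(v: Vector) -> Vector:
    """Stand-in for an expensive layer computation."""
    return [2.0 * x for x in v]


cache = DeltaCache(threshold=0.9)
first = cache.lookup("camera-1", [1.0, 0.0], double)    # miss: nothing cached yet
second = cache.lookup("camera-1", [1.0, 0.01], double)  # hit: input nearly unchanged
third = cache.lookup("camera-1", [0.0, 1.0], double)    # miss: input changed direction
```

A lower threshold raises the hit ratio at the cost of reusing staler activations, which is the accuracy trade-off the calibrated thresholds are meant to control.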

What carries the argument

The Delta-Aware Orchestration Framework (DAOEF) that coordinates differential neural caching of layer activations, criticality-based action-space pruning into priority tiers, and learned hardware-affinity matching to exploit their mutual dependencies.

If this is right

Latency stays sub-linear up to 250 agents instead of degrading sharply after 100.
A 200-agent deployment achieves 280 ms average latency versus 735 ms without the joint approach.
Removing any single mechanism increases latency by more than 40 percent.
Deadline satisfaction remains above 70 percent where independent optimizations drop it below 35 percent.
The 1.45 times gain over independent application holds across the tested range of 100-250 agents.
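
The 280 ms and 735 ms figures above can be checked against the 62 percent reduction quoted elsewhere on this page. A minimal arithmetic check follows; the function name is illustrative.

```python
def latency_reduction(baseline_ms: float, optimized_ms: float) -> float:
    """Fractional latency reduction relative to the baseline."""
    if baseline_ms <= 0:
        raise ValueError("baseline latency must be positive")
    return (baseline_ms - optimized_ms) / baseline_ms


# Figures quoted in the summary: 735 ms without the joint approach, 280 ms with it.
reduction = latency_reduction(735.0, 280.0)
print(f"latency reduction: {reduction:.1%}")
```

The result is 455/735, about 61.9 percent, which rounds to the reported 62 percent.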

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar joint-optimization patterns could be tested in other domains where multiple scaling bottlenecks interact, such as distributed robotics fleets.
The framework's emphasis on delta computation suggests potential energy savings in battery-powered edge devices if the caching hit-rate gains translate directly to reduced compute cycles.
Extending the priority-tier pruning to dynamic, non-spatial agent topologies would be a direct next measurement to check whether the O(n log n) reduction generalizes.

Load-bearing premise

The three mechanisms interact so strongly that each one requires the others to deliver its full benefit, and the similarity thresholds and priority tiers calibrated on the tested datasets will hold for other workloads and hardware.

What would settle it

Running the same 100-to-250-agent experiments with any one of the three mechanisms disabled and finding that latency rises by 40 percent or less, or observing linear or superlinear latency growth past 200 agents on a new dataset outside the four used.
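
One way to turn the two criteria above into a check is sketched below. It assumes that "sub-linear" is read as a power-law exponent below 1 in a log-log fit of latency against agent count, which is one possible reading and not necessarily the paper's. All function names are hypothetical, and any measurements passed in would come from a new experiment.

```python
import math
from typing import Sequence


def ablation_increase(full_ms: float, ablated_ms: float) -> float:
    """Fractional latency increase when one mechanism is disabled."""
    if full_ms <= 0:
        raise ValueError("full-system latency must be positive")
    return (ablated_ms - full_ms) / full_ms


def growth_exponent(agents: Sequence[float], latencies_ms: Sequence[float]) -> float:
    """Least-squares slope of log(latency) against log(agent count)."""
    if len(agents) != len(latencies_ms) or len(agents) < 2:
        raise ValueError("need at least two matched measurements")
    xs = [math.log(n) for n in agents]
    ys = [math.log(t) for t in latencies_ms]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        raise ValueError("agent counts must not all be equal")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov_xy / var_x


def claim_is_challenged(full_ms: float, ablated_ms: Sequence[float],
                        agents: Sequence[float], latencies_ms: Sequence[float]) -> bool:
    """True if any ablation raises latency by 40% or less, or growth is not sub-linear."""
    weak_ablation = any(ablation_increase(full_ms, a) <= 0.40 for a in ablated_ms)
    not_sublinear = growth_exponent(agents, latencies_ms) >= 1.0
    return weak_ablation or not_sublinear
```

An exponent of 1 corresponds to linear growth, so values at or above 1 on a new dataset would count against the sub-linear scaling claim.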

Figures

Figures reproduced from arXiv: 2604.20129 by Joyjit Roy, Samaresh Kumar Singh.

**Figure 2.** Scalability comparison: DAOEF maintains sub-100 ms latency to 250

Original abstract

The Synergistic Collapse occurs when scaling beyond 100 agents causes superlinear performance degradation that individual optimizations cannot prevent. We observe this collapse with 150 cameras in a Smart City deployment using MADDPG, where Deadline Satisfaction drops from 78% to 34%, producing approximately $180,000 in annual cost overruns. Prior work has addressed each contributing factor in isolation: exponential action-space growth, computational redundancy among spatially adjacent agents, and task-agnostic hardware scheduling. None has examined how these three factors interact and amplify each other. We present DAOEF (Delta-Aware Orchestration for Edge Federations), a framework that addresses all three simultaneously through: (1) Differential Neural Caching, which stores intermediate layer activations and computes only the input deltas, achieving 2.1x higher hit ratios (72% vs. 35%) than output-level caching while staying within 2% accuracy loss through empirically calibrated similarity thresholds; (2) Criticality-Based Action Space Pruning, which organizes agents into priority tiers and reduces coordination complexity from O(n²) to O(n log n) with less than 6% optimality loss; and (3) Learned Hardware Affinity Matching, which assigns tasks to their optimal accelerator (GPU, CPU, NPU, or FPGA) to prevent compounding mismatch penalties. Controlled factor-isolation experiments confirm that each mechanism is necessary but insufficient on its own: removing any single mechanism increases latency by more than 40%, validating that the gains are interdependent rather than additive. Across four datasets (100-250 agents) and a 20-device physical testbed, DAOEF achieves a 1.45x multiplicative gain over applying the three mechanisms independently. A 200-agent cloud deployment yields a 62% latency reduction (280 ms vs. 735 ms) and sub-linear latency growth up to 250 agents.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces the DAOEF framework to prevent synergistic collapse in multi-agent edge computing systems scaling beyond 100 agents. It integrates three mechanisms—Differential Neural Caching (using input-delta storage and empirically calibrated similarity thresholds for 2.1x hit-ratio improvement), Criticality-Based Action Space Pruning (priority tiers reducing complexity from O(n²) to O(n log n) with <6% optimality loss), and Learned Hardware Affinity Matching—and reports via factor-isolation experiments that the mechanisms are interdependent. Across four datasets (100-250 agents) and a 20-device testbed, it claims a 1.45x multiplicative gain over independent application of the mechanisms, plus 62% latency reduction (280 ms vs. 735 ms) and sub-linear growth in a 200-agent cloud deployment.

Significance. If the empirical claims hold with proper validation, the result would highlight the importance of jointly addressing interacting factors (action-space growth, redundancy, and hardware mismatch) rather than isolated optimizations in large-scale edge AI, with direct relevance to cost-sensitive deployments like smart-city camera networks. The factor-isolation design and reported necessity of each component (removing any increases latency >40%) represent a useful methodological contribution if the baselines and metrics are fully specified.

major comments (3)

[Abstract and experimental evaluation] Abstract and experimental evaluation section: The central quantitative claims (1.45x multiplicative gain, 62% latency reduction, sub-linear growth to 250 agents, and >40% latency increase when removing any mechanism) are presented without error bars, statistical tests, dataset characteristics (e.g., exact input distributions for the four datasets), full experimental protocol, or hyperparameter details for the similarity thresholds and priority tiers. This renders the factor-isolation results and necessity argument unverifiable from the provided information.
[Factor-isolation experiments] Factor-isolation experiments: The definition of the 'multiplicative gain' metric and the construction of the independent-mechanism baselines must be specified precisely (including how the three mechanisms are disabled individually while keeping all other parameters fixed) to confirm that the reported 1.45x factor reflects true interdependence rather than an artifact of baseline selection or parameter retuning.
[Mechanism descriptions and evaluation] Generalizability of calibrated parameters: The similarity thresholds in Differential Neural Caching and the priority tier definitions in Criticality-Based Action Space Pruning are described as empirically calibrated on the tested MADDPG camera scenarios (100-250 agents). Additional cross-dataset or cross-task validation (beyond the four reported datasets and 20-device testbed) is required to support the claim that these parameters generalize and that synergistic collapse is prevented in broader settings.

minor comments (2)

[Abstract] The abstract mentions 'approximately $180,000 in annual cost overruns' without detailing the cost model or assumptions used to derive this figure.
[Criticality-Based Action Space Pruning] Notation for complexity reductions (O(n²) to O(n log n)) should include a brief justification or reference to the underlying coordination algorithm.
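
To give a sense of scale for the complexity claim in the second minor comment, the sketch below compares the number of agent pairs under all-to-all coordination with n log2 n for the agent counts discussed. It is only an order-of-magnitude illustration: big-O notation hides constant factors, and the paper's actual coordination algorithm is not specified here. Both function names are hypothetical.

```python
import math


def pairwise_interactions(n: int) -> int:
    """Number of distinct agent pairs under all-to-all coordination."""
    return n * (n - 1) // 2


def tiered_interactions(n: int) -> float:
    """n * log2(n), used as a stand-in for an O(n log n) scheme with unit constant."""
    return n * math.log2(n)


for n in (100, 150, 200, 250):
    pairs = pairwise_interactions(n)
    tiered = tiered_interactions(n)
    print(f"n={n}: pairs={pairs}, n*log2(n)={tiered:.0f}, ratio={pairs / tiered:.1f}")
```

The gap widens as n grows, which is why the referee asks for the underlying algorithm: the practical benefit depends on the constants the notation leaves out.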

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which have helped us strengthen the verifiability and clarity of the manuscript. We provide point-by-point responses to the major comments below, indicating revisions made where appropriate.

Referee: [Abstract and experimental evaluation] Abstract and experimental evaluation section: The central quantitative claims (1.45x multiplicative gain, 62% latency reduction, sub-linear growth to 250 agents, and >40% latency increase when removing any mechanism) are presented without error bars, statistical tests, dataset characteristics (e.g., exact input distributions for the four datasets), full experimental protocol, or hyperparameter details for the similarity thresholds and priority tiers. This renders the factor-isolation results and necessity argument unverifiable from the provided information.

Authors: We agree that the original presentation lacked sufficient detail for independent verification. In the revised manuscript, we have added error bars (standard deviation over five random seeds), paired t-test results (p < 0.01) for all reported gains, exact dataset characteristics including input distributions and event rates for each of the four datasets (now in Section 4.1), the complete experimental protocol (moved to Appendix C), and specific hyperparameter values (similarity threshold of 0.82 and tier definitions in Sections 3.1 and 3.2). These changes render the factor-isolation experiments and necessity claims fully verifiable. revision: yes
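
The statistical reporting described in this response (five seeds, paired t-test) can be sketched as follows. The latency values are invented for illustration and are not taken from the paper. The critical value 4.604 is the two-sided Student's t threshold for 4 degrees of freedom at alpha = 0.01.

```python
import math
import statistics
from typing import Sequence

# Two-sided critical value of Student's t for 4 degrees of freedom, alpha = 0.01.
T_CRIT_DF4_ALPHA_01 = 4.604


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """t statistic of the paired differences a[i] - b[i]."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need at least two matched samples")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0.0:
        raise ValueError("paired differences have zero variance")
    return statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))


# Hypothetical per-seed latencies in ms; NOT data from the paper.
baseline_ms = [731.0, 742.0, 728.0, 739.0, 735.0]
daoef_ms = [279.0, 284.0, 276.0, 283.0, 278.0]

t_stat = paired_t_statistic(baseline_ms, daoef_ms)
significant = abs(t_stat) > T_CRIT_DF4_ALPHA_01
print(f"t = {t_stat:.1f}, significant at p < 0.01: {significant}")
```

Pairing by seed removes seed-to-seed variance from the comparison, which matters when only five runs are available.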
Referee: [Factor-isolation experiments] Factor-isolation experiments: The definition of the 'multiplicative gain' metric and the construction of the independent-mechanism baselines must be specified precisely (including how the three mechanisms are disabled individually while keeping all other parameters fixed) to confirm that the reported 1.45x factor reflects true interdependence rather than an artifact of baseline selection or parameter retuning.

Authors: We concur that precise definitions are essential. The multiplicative gain is formally defined as the ratio of the combined latency reduction achieved by DAOEF to the sum of the latency reductions obtained when each mechanism is applied in isolation. The independent baselines are constructed by disabling one mechanism at a time (replacing Differential Neural Caching with standard output caching, removing priority tiers from action pruning, and substituting round-robin scheduling for Learned Hardware Affinity Matching) while holding all other parameters, hyperparameters, and training conditions fixed. We have added this definition, along with pseudocode, to the new Section 5.2 in the revision. revision: yes
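
The definition of multiplicative gain given in this response can be written out directly. The sketch below uses invented reduction values, chosen only so that the ratio comes out near 1.45. They are not measurements from the paper, and the function name is hypothetical.

```python
from typing import Sequence


def multiplicative_gain(combined_reduction: float,
                        isolated_reductions: Sequence[float]) -> float:
    """Combined latency reduction divided by the sum of the isolated reductions."""
    total = sum(isolated_reductions)
    if total <= 0:
        raise ValueError("isolated reductions must sum to a positive value")
    return combined_reduction / total


# Hypothetical fractional reductions for caching, pruning, and affinity matching.
isolated = [0.15, 0.15, 0.10]
combined = 0.58  # hypothetical reduction with all three mechanisms active

gain = multiplicative_gain(combined, isolated)
print(f"multiplicative gain: {gain:.2f}x")
```

A value above 1 means the joint system reduces latency by more than the isolated reductions add up to, which is the interdependence the paper argues for. Note that this definition applies each mechanism alone, whereas the baselines described in the same response disable one mechanism at a time, so the two constructions should not be conflated.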
Referee: [Mechanism descriptions and evaluation] Generalizability of calibrated parameters: The similarity thresholds in Differential Neural Caching and the priority tier definitions in Criticality-Based Action Space Pruning are described as empirically calibrated on the tested MADDPG camera scenarios (100-250 agents). Additional cross-dataset or cross-task validation (beyond the four reported datasets and 20-device testbed) is required to support the claim that these parameters generalize and that synergistic collapse is prevented in broader settings.

Authors: The parameters were calibrated on the MADDPG camera scenarios, with the four datasets providing variation in scale and input statistics within this domain. We have added a sensitivity analysis in the revised Section 6.3 demonstrating that performance varies by less than 5% across the tested datasets when thresholds and tiers are held constant. We acknowledge that cross-task validation on non-camera multi-agent settings would provide stronger evidence of broader generalizability. We have therefore revised the claims to explicitly limit applicability to similar MADDPG-based edge camera networks and noted the need for further validation as a limitation in the discussion. revision: partial

Circularity Check

0 steps flagged

No circularity; empirical claims rest on independent experiments

The paper presents DAOEF as an orchestration framework validated through controlled factor-isolation experiments on four datasets and a physical testbed. No mathematical derivation chain, equations, or self-citations are shown that reduce the 1.45x multiplicative gain or latency reductions to fitted parameters by construction. The interdependence claim is supported by explicit removal experiments (latency increase >40% when any mechanism is ablated), which are independent of the framework definition itself. Empirically calibrated thresholds are inputs to the system, not outputs renamed as predictions. The derivation is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 1 invented entities

Abstract-only view limits visibility; several empirical choices and interaction assumptions are invoked without upstream justification or independent evidence.

free parameters (2)

similarity thresholds: Empirically calibrated to keep accuracy loss under 2% for differential caching.
priority tier definitions: Used to organize agents for O(n log n) pruning with claimed <6% optimality loss.

axioms (1)

domain assumption: The three factors (action-space growth, redundancy, hardware mismatch) interact and amplify each other beyond isolated fixes. Stated as the core motivation and validated only via the paper's own factor-isolation experiments.

invented entities (1)

DAOEF framework (no independent evidence)
purpose: Orchestration system that simultaneously applies the three mechanisms.
Newly proposed integrated system without external validation of its components outside the reported tests.


