Carbon-Aware Intrusion Detection: A Comparative Study of Supervised and Unsupervised DRL for Sustainable IoT Edge Gateways
Pith reviewed 2026-05-21 19:10 UTC · model grok-4.3
The pith
Two DRL-based IDSs for IoT edge gateways report 94% detection accuracy (supervised AutoDRL-IDS) and 98% offline evaluation accuracy (label-free DeepEdgeIDS) under a carbon-aware multi-objective reward.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that a carbon-aware multi-objective reward formulation in deep reinforcement learning enables AutoDRL-IDS to reach 94% detection accuracy with labeled data and DeepEdgeIDS to reach 98% offline evaluation accuracy through label-free anomaly detection plus online mitigation feedback, supporting sustainable real-time IDS operation in dynamic IoT networks.
What carries the argument
The carbon-aware multi-objective reward formulation that supports supervised reward optimization for AutoDRL-IDS and label-free online reward learning for DeepEdgeIDS.
If this is right
- AutoDRL-IDS offers a path to 94% accurate detection wherever labeled attack data is available.
- DeepEdgeIDS shows that label-free anomaly detection plus online mitigation feedback can reach 98% offline evaluation accuracy on edge gateways.
- Both models support real-time IDS that accounts for energy efficiency and carbon impact in dynamic networks.
- Theoretical analysis combined with gateway experiments confirms the feasibility of the dual-objective approach.
Where Pith is reading between the lines
- The same reward structure could be tested on other edge security tasks such as malware detection or access control.
- Deployment in real IoT testbeds with live traffic would reveal whether the offline 98% accuracy holds under variable loads.
- Pairing the models with low-power hardware accelerators might yield further measurable drops in carbon cost.
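To make the last point concrete, the carbon cost of a gateway can be roughly estimated as energy consumed times the carbon intensity of the electricity supply. The sketch below is only an illustration of that arithmetic: the function name, the power figures, and the grid intensity are hypothetical placeholders, not values from the paper.

```python
def carbon_emission_g(avg_power_w: float, duration_s: float,
                      intensity_g_per_kwh: float) -> float:
    """Estimate emissions in grams CO2-eq as energy (kWh) x carbon intensity."""
    if avg_power_w < 0 or duration_s < 0 or intensity_g_per_kwh < 0:
        raise ValueError("inputs must be non-negative")
    # Watts x seconds gives joules; 3.6e6 joules equal one kWh.
    energy_kwh = avg_power_w * duration_s / 3_600_000.0
    return energy_kwh * intensity_g_per_kwh


# Placeholder numbers for illustration only (not measurements from the paper):
# one hour of IDS inference on a gateway with and without an accelerator.
baseline_g = carbon_emission_g(avg_power_w=5.0, duration_s=3600,
                               intensity_g_per_kwh=400.0)
accelerated_g = carbon_emission_g(avg_power_w=3.0, duration_s=3600,
                                  intensity_g_per_kwh=400.0)
saving_g = baseline_g - accelerated_g
print(f"baseline={baseline_g:.2f} g, accelerated={accelerated_g:.2f} g, "
      f"saving={saving_g:.2f} g")
```

A measured power trace and a region-specific intensity value would replace the placeholders in any real comparison.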
Load-bearing premise
The custom carbon-aware multi-objective reward function can be optimized to improve both detection performance and sustainability metrics simultaneously without one objective degrading the other in practice.
What would settle it
An experiment where lowering the carbon component of the reward consistently drops detection accuracy below 90% would falsify the claim that both objectives improve together.
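One way to read this test operationally is sketched below. It assumes that "lowering the carbon component" means comparing runs whose measured carbon cost is below a reference run, and that "consistently" means every such run. Both readings, along with the function name and the sample numbers, are assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence, Tuple


def premise_falsified(runs: Sequence[Tuple[float, float]],
                      baseline_carbon: float,
                      accuracy_floor: float = 0.90) -> bool:
    """Return True if every run with lower carbon cost than the baseline
    also has detection accuracy below the floor.

    runs: (carbon_cost, detection_accuracy) pairs, one per trained agent.
    """
    lower_carbon = [acc for carbon, acc in runs if carbon < baseline_carbon]
    # With no lower-carbon runs there is nothing to judge, so report False.
    return bool(lower_carbon) and all(acc < accuracy_floor
                                      for acc in lower_carbon)


# Synthetic placeholder outcomes, not results from the paper.
holds = [(0.8, 0.95), (0.6, 0.93)]
breaks = [(0.8, 0.88), (0.6, 0.85)]
print(premise_falsified(holds, baseline_carbon=1.0))   # False
print(premise_falsified(breaks, baseline_carbon=1.0))  # True
```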
Original abstract
The rapid expansion of the Internet of Things (IoT) has intensified cybersecurity challenges, particularly in mitigating Distributed Denial-of-Service (DDoS) attacks at the network edge. Traditional Intrusion Detection Systems (IDSs) face significant limitations, including poor adaptability to evolving and zero-day attacks, reliance on static signatures and labeled datasets, and inefficiency on resource-constrained edge gateways. Moreover, most existing DRL-based IDS studies overlook sustainability factors such as energy efficiency and carbon impact. To address these challenges, this paper proposes two novel Deep Reinforcement Learning (DRL)-based IDS: DeepEdgeIDS, a label-free Autoencoder-DRL hybrid, and AutoDRL-IDS, a supervised LSTM-DRL model. Both DRL-based IDS are validated through theoretical analysis and experimental evaluation on edge gateways. Results demonstrate that AutoDRL-IDS achieves 94% detection accuracy using labeled data, while DeepEdgeIDS attains 98% offline evaluation accuracy through label-free anomaly detection and online mitigation feedback. This study introduces a carbon-aware, multi-objective reward formulation that supports supervised reward optimization for AutoDRL-IDS and label-free online reward learning for DeepEdgeIDS, enabling sustainable real-time IDS operation in dynamic IoT networks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes two DRL-based intrusion detection systems for IoT edge gateways facing DDoS attacks: AutoDRL-IDS, a supervised LSTM-DRL model reporting 94% detection accuracy with labeled data, and DeepEdgeIDS, a label-free Autoencoder-DRL hybrid reporting 98% offline accuracy via anomaly detection and online mitigation. Both incorporate a custom carbon-aware multi-objective reward formulation intended to jointly optimize detection performance and sustainability metrics such as energy efficiency and carbon impact, with validation via theoretical analysis and experiments on edge gateways.
Significance. If the central performance claims and non-conflicting objective improvements hold under rigorous controls, the work would contribute to the emerging intersection of sustainable computing and adaptive cybersecurity for resource-constrained IoT. The comparative supervised versus label-free DRL framing, combined with explicit carbon awareness in the reward, could inform practical edge deployments where both security and environmental constraints matter. The emphasis on online mitigation feedback and real-time operation addresses acknowledged limitations of static signature-based IDS.
major comments (2)
- [Abstract and experimental evaluation] Abstract and experimental evaluation sections: the central claims of 94% and 98% detection accuracy are presented without any description of the underlying dataset(s), attack traffic generation method, baseline algorithms (e.g., standard supervised ML classifiers or other DRL-IDS), number of independent runs, or statistical measures such as standard deviation or confidence intervals. These omissions directly undermine assessment of whether the reported figures support the superiority or sustainability claims.
- [Reward formulation] Reward formulation section: the multi-objective carbon-aware reward is asserted to enable simultaneous gains in detection accuracy and sustainability without degradation, yet no explicit trade-off curves, Pareto analysis, or sensitivity results on the objective weights are supplied. This leaves the weakest assumption—that the custom reward can be optimized without one objective harming the other—unsupported by concrete evidence in the reported experiments.
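The Pareto analysis requested in the second major comment could be sketched as follows. The example assumes each training run is summarized by a detection rate (to maximize) and a carbon cost (to minimize) for a given carbon weight; the `Outcome` type and all numbers are hypothetical placeholders, not results reported in the paper.

```python
from typing import List, NamedTuple


class Outcome(NamedTuple):
    zeta: float            # carbon weight used in the reward (assumed knob)
    detection_rate: float  # higher is better
    carbon_g: float        # lower is better


def dominates(a: Outcome, b: Outcome) -> bool:
    """a dominates b if it is no worse on both objectives and better on one."""
    no_worse = (a.detection_rate >= b.detection_rate
                and a.carbon_g <= b.carbon_g)
    strictly_better = (a.detection_rate > b.detection_rate
                       or a.carbon_g < b.carbon_g)
    return no_worse and strictly_better


def pareto_front(outcomes: List[Outcome]) -> List[Outcome]:
    """Keep the outcomes that no other outcome dominates."""
    return [o for o in outcomes
            if not any(dominates(p, o) for p in outcomes)]


# Synthetic sweep over the carbon weight, for illustration only.
sweep = [
    Outcome(zeta=0.0, detection_rate=0.97, carbon_g=10.0),
    Outcome(zeta=0.1, detection_rate=0.96, carbon_g=7.0),
    Outcome(zeta=0.5, detection_rate=0.90, carbon_g=8.0),
    Outcome(zeta=1.0, detection_rate=0.85, carbon_g=4.0),
]
front = pareto_front(sweep)
print([o.zeta for o in front])
```

If the front contains several points, the two objectives trade off over that range of weights; a front that collapses to a single point would be evidence for the paper's claim that they improve together.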
minor comments (2)
- [Methodology] Notation for the carbon impact term and energy consumption metric should be defined consistently when first introduced and cross-referenced to the reward equation to avoid ambiguity for readers unfamiliar with carbon-aware RL.
- [Discussion] The manuscript would benefit from a dedicated limitations subsection discussing potential overfitting to the chosen edge-gateway hardware or sensitivity to hyperparameter choices in the DRL agents.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. The comments highlight important aspects for improving clarity and rigor. We address each major comment below and indicate the changes we will make in the revised version.
Point-by-point responses
- Referee: [Abstract and experimental evaluation] Abstract and experimental evaluation sections: the central claims of 94% and 98% detection accuracy are presented without any description of the underlying dataset(s), attack traffic generation method, baseline algorithms (e.g., standard supervised ML classifiers or other DRL-IDS), number of independent runs, or statistical measures such as standard deviation or confidence intervals. These omissions directly undermine assessment of whether the reported figures support the superiority or sustainability claims.
  Authors: We agree that these details are necessary for a complete evaluation of the reported accuracy figures. In the revised manuscript, we will expand both the abstract and the experimental evaluation section to explicitly describe the dataset(s) employed, the DDoS attack traffic generation approach, the baseline algorithms used for comparison (including standard supervised ML classifiers and other DRL-IDS methods), the number of independent runs, and statistical measures such as standard deviations and confidence intervals. This will provide the necessary context to assess the performance and sustainability claims.
  Revision: yes
- Referee: [Reward formulation] Reward formulation section: the multi-objective carbon-aware reward is asserted to enable simultaneous gains in detection accuracy and sustainability without degradation, yet no explicit trade-off curves, Pareto analysis, or sensitivity results on the objective weights are supplied. This leaves the weakest assumption, that the custom reward can be optimized without one objective harming the other, unsupported by concrete evidence in the reported experiments.
  Authors: We acknowledge that additional evidence is required to substantiate the joint optimization claim. We will add to the revised manuscript explicit trade-off curves, Pareto analysis, and sensitivity results with respect to the objective weights. These additions will demonstrate that the carbon-aware multi-objective reward supports simultaneous improvements in detection performance and sustainability metrics without one objective degrading the other.
  Revision: yes
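The statistical measures promised in the first response (standard deviation and confidence intervals over independent runs) could be reported along these lines. This is a minimal sketch using only the Python standard library; it assumes a normal approximation for the interval, whereas a Student-t quantile would be more appropriate for a small number of runs. The accuracy values are placeholders, not results from the paper.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple


def summarize_runs(accuracies: Sequence[float],
                   confidence: float = 0.95
                   ) -> Tuple[float, float, Tuple[float, float]]:
    """Return (mean, sample std, confidence interval) across runs."""
    if len(accuracies) < 2:
        raise ValueError("need at least two independent runs")
    m = mean(accuracies)
    s = stdev(accuracies)  # sample standard deviation (n - 1 denominator)
    # Two-sided normal quantile, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * s / sqrt(len(accuracies))
    return m, s, (m - half_width, m + half_width)


# Placeholder accuracies from three hypothetical seeds.
run_mean, run_std, (ci_low, ci_high) = summarize_runs([0.93, 0.94, 0.95])
print(f"mean={run_mean:.3f} std={run_std:.3f} "
      f"95% CI=({ci_low:.3f}, {ci_high:.3f})")
```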
Circularity Check
No significant circularity
The paper introduces a custom carbon-aware multi-objective reward formulation to support both supervised and label-free DRL training for IDS on IoT edge gateways. Reported accuracies (94% for AutoDRL-IDS, 98% for DeepEdgeIDS) are presented as outcomes of experimental evaluation rather than direct algebraic consequences of the reward definition itself. No equations or steps in the abstract reduce the performance metrics to parameters fitted on the identical evaluation data, nor do they rely on self-citation chains or imported uniqueness theorems for the central claims. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- objective weights in carbon-aware reward
axioms (1)
- domain assumption: Reinforcement learning agents can converge to effective policies for real-time intrusion mitigation through environment interaction and reward feedback.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  $R(s_t, a_t) = \alpha\,\mathrm{DR} - \beta\,\mathrm{FPR} - \lambda\,L_{\mathrm{resp}} - \delta\,E_{\mathrm{overhead}} - \epsilon\,M_{\mathrm{util}} - \zeta\,C_{\mathrm{emission}}$
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
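The reward quoted in the entry above, R(s_t, a_t) = α DR − β FPR − λ L_resp − δ E_overhead − ε M_util − ζ C_emission, is a weighted sum with one reward term and five penalty terms. The sketch below shows how such a scalar reward might be computed per step. The class and function names, the assumption that every input is normalized to [0, 1], and all weight values are hypothetical; the paper's actual normalization and weights are not given on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    alpha: float    # weight on detection rate (DR)
    beta: float     # weight on false positive rate (FPR)
    lam: float      # weight on response latency (L_resp)
    delta: float    # weight on energy overhead (E_overhead)
    epsilon: float  # weight on memory utilization (M_util)
    zeta: float     # weight on carbon emission (C_emission)


def carbon_aware_reward(w: RewardWeights, dr: float, fpr: float,
                        l_resp: float, e_overhead: float,
                        m_util: float, c_emission: float) -> float:
    """Weighted-sum reward; inputs are assumed pre-normalized to [0, 1]."""
    return (w.alpha * dr
            - w.beta * fpr
            - w.lam * l_resp
            - w.delta * e_overhead
            - w.epsilon * m_util
            - w.zeta * c_emission)


# Placeholder weights and metrics for illustration only.
weights = RewardWeights(alpha=1.0, beta=0.5, lam=0.1,
                        delta=0.1, epsilon=0.1, zeta=0.2)
low_carbon = carbon_aware_reward(weights, dr=0.95, fpr=0.02, l_resp=0.1,
                                 e_overhead=0.2, m_util=0.3, c_emission=0.1)
high_carbon = carbon_aware_reward(weights, dr=0.95, fpr=0.02, l_resp=0.1,
                                  e_overhead=0.2, m_util=0.3, c_emission=0.9)
print(low_carbon > high_carbon)  # True: more emission lowers the reward
```

Because the six weights are the ledger's free parameters, any reported accuracy or carbon figure is only interpretable alongside the weight setting that produced it.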
Reference graph
Works this paper leans on
- [1] P. K. Sadhu, V. P. Yanambaka, and A. Abdelgawad, “Internet of things: Security and solutions survey,” Sensors, vol. 22, no. 19, p. 7433, 2022.
- [2] H. Taherdoost, “Security and internet of things: benefits, challenges, and future perspectives,” Electronics, vol. 12, no. 8, p. 1901, 2023.
- [3] K. Wójcicki, M. Biegańska, B. Paliwoda, and J. Górna, “Internet of things in industry: Research profiling, application, challenges and opportunities—a review,” Energies, vol. 15, no. 5, p. 1806, 2022.
- [4] H. Tran-Dang, N. Krommenacker, P. Charpentier, and D.-S. Kim, “The internet of things for logistics: Perspectives, application review, and challenges,” IETE Technical Review, vol. 39, no. 1, pp. 93–121, 2022.
- [5] Z. Shah, I. Ullah, H. Li, A. Levula, and K. Khurshid, “Blockchain based solutions to mitigate distributed denial of service (ddos) attacks in the internet of things (iot): A survey,” Sensors, vol. 22, no. 3, p. 1094, 2022.
- [6] R. Gopi, V. Sathiyamoorthi, S. Selvakumar, R. Manikandan, P. Chatterjee, N. Jhanjhi, and A. K. Luhach, “Enhanced method of ann based model for detection of ddos attacks on multimedia internet of things,” Multimedia Tools and Applications, pp. 1–19, 2022.
- [7] H. Zhou, Y. Zheng, X. Jia, and J. Shu, “Collaborative prediction and detection of ddos attacks in edge computing: A deep learning-based approach with distributed sdn,” Computer Networks, vol. 225, p. 109642, 2023.
- [8] A. E. Omolara, A. Alabdulatif, O. I. Abiodun, M. Alawida, A. Alabdulatif, H. Arshad et al., “The internet of things security: A survey encompassing unexplored areas and new insights,” Computers & Security, vol. 112, p. 102494, 2022.
- [9] P. Kumari, A. K. Jain, Y. Pal, K. Singh, and A. Singh, “Towards detection of ddos attacks in iot with optimal features selection,” Wireless Personal Communications, vol. 137, no. 2, pp. 951–976, 2024.
- [10] A. Heidari and M. A. Jabraeil Jamali, “Internet of things intrusion detection systems: a comprehensive review and future directions,” Cluster Computing, vol. 26, no. 6, pp. 3753–3780, 2023.
- [11] N. Moustafa, N. Koroniotis, M. Keshk, A. Y. Zomaya, and Z. Tari, “Explainable intrusion detection for cyber defences in the internet of things: Opportunities and solutions,” IEEE Communications Surveys & Tutorials, vol. 25, no. 3, pp. 1775–1807, 2023.
- [12] S. Arisdakessian, O. A. Wahab, A. Mourad, H. Otrok, and M. Guizani, “A survey on iot intrusion detection: Federated learning, game theory, social psychology, and explainable ai as future directions,” IEEE Internet of Things Journal, vol. 10, no. 5, pp. 4059–4092, 2022.
- [13] S. Tharewal, M. W. Ashfaque, S. S. Banu, P. Uma, S. M. Hassen, and M. Shabaz, “Intrusion detection system for industrial internet of things based on deep reinforcement learning,” Wireless Communications and Mobile Computing, vol. 2022, no. 1, p. 9023719, 2022.
- [14] X. Feng, J. Han, R. Zhang, S. Xu, and H. Xia, “Security defense strategy algorithm for internet of things based on deep reinforcement learning,” High-Confidence Computing, vol. 4, no. 1, p. 100167, 2024.
- [15] A. Rizzardi, S. Sicari, A. C. Porisini et al., “Deep reinforcement learning for intrusion detection in internet of things: Best practices, lessons learnt, and open challenges,” Computer Networks, vol. 236, p. 110016, 2023.
- [16] M. A. Zormati, H. Lakhlef, and S. Ouni, “Review and analysis of recent advances in intelligent network softwarization for the internet of things,” Computer Networks, p. 110215, 2024.
- [17] Z. Zhao, L. Alzubaidi, J. Zhang, Y. Duan, and Y. Gu, “A comparison review of transfer learning and self-supervised learning: Definitions, applications, advantages and limitations,” Expert Systems with Applications, vol. 242, p. 122807, 2024.
- [18] M. J. Neuer, “Unsupervised learning,” in Machine Learning for Engineers: Introduction to Physics-Informed, Explainable Learning Methods for AI in Engineering Applications. Springer, 2024, pp. 141–172.
- [19] J. Bian, A. Al Arafat, H. Xiong, J. Li, L. Li, H. Chen, J. Wang, D. Dou, and Z. Guo, “Machine learning in real-time internet of things (iot) systems: A survey,” IEEE Internet of Things Journal, vol. 9, no. 11, pp. 8364–8386, 2022.
- [20] Y. Liu, Y. Zhou, K. Yang, and X. Wang, “Unsupervised deep learning for iot time series,” IEEE Internet of Things Journal, vol. 10, no. 16, pp. 14 285–14 306, 2023.
- [21] R. Priyadarshi, “Exploring machine learning solutions for overcoming challenges in iot-based wireless sensor network routing: a comprehensive review,” Wireless Networks, pp. 1–27, 2024.
- [22] N. S. Kharbanda, “Comparative review of supervised vs. unsupervised learning in cloud security applications,” 2024.
- [23] Y. Feng, W. Zhang, S. Yin, H. Tang, Y. Xiang, and Y. Zhang, “A collaborative stealthy ddos detection method based on reinforcement learning at the edge of internet of things,” IEEE Internet of Things Journal, vol. 10, no. 20, pp. 17 934–17 948, 2023.
- [24] M. Aljebreen, H. A. Mengash, M. A. Arasi, S. S. Aljameel, A. S. Salama, and M. A. Hamza, “Enhancing ddos attack detection using snake optimizer with ensemble learning on internet of things environment,” IEEE Access, 2023.
- [25] A. Pashaei, M. E. Akbari, M. Z. Lighvan, and A. Charmin, “Early intrusion detection system using honeypot for industrial control networks,” Results in Engineering, vol. 16, p. 100576, 2022.
- [26] H. Karthikeyan and G. Usha, “Real-time ddos flooding attack detection in intelligent transportation systems,” Computers and Electrical Engineering, vol. 101, p. 107995, 2022.
- [27] D. K. Kadurha, D. J. L. Moutouo, and Y. U. Gaba, “Bellman operator convergence enhancements in reinforcement learning algorithms,” arXiv preprint arXiv:2505.14564, 2025.
- [28] L. St, S. Wold et al., “Analysis of variance (anova),” Chemometrics and Intelligent Laboratory Systems, vol. 6, no. 4, pp. 259–272, 1989.
- [29] O. Yousuf and R. N. Mir, “Ddos attack detection in internet of things using recurrent neural network,” Computers and Electrical Engineering, vol. 101, p. 108034, 2022.
- [30] N. Pandey and P. K. Mishra, “Performance analysis of entropy variation-based detection of ddos attacks in iot,” Internet of Things, vol. 23, p. 100812, 2023.
- [31] I. Ahmad, Z. Wan, and A. Ahmad, “A big data analytics for ddos attack detection using optimized ensemble framework in internet of things,” Internet of Things, vol. 23, p. 100825, 2023.
- [32] X.-H. Nguyen and K.-H. Le, “Robust detection of unknown dos/ddos attacks in iot networks using a hybrid learning model,” Internet of Things, vol. 23, p. 100851, 2023.
- [33] B. B. Gupta, A. Gaurav, V. Arya, and P. Kim, “A deep cnn-based framework for distributed denial of services (ddos) attack detection in internet of things (iot),” in Proceedings of the 2023 International Conference on Research in Adaptive and Convergent Systems, 2023, pp. 1–6.
- [34] Y. Liu, K.-F. Tsang, C. K. Wu, Y. Wei, H. Wang, and H. Zhu, “Ieee p2668-compliant multi-layer iot-ddos defense system using deep reinforcement learning,” IEEE Transactions on Consumer Electronics, vol. 69, no. 1, pp. 49–64, 2022.
- [35] S. Vadigi, K. Sethi, D. Mohanty, S. P. Das, and P. Bera, “Federated reinforcement learning based intrusion detection system using dynamic attention mechanism,” Journal of Information Security and Applications, vol. 78, p. 103608, 2023.
- [36] T. Ramana, M. Thirunavukkarasan, A. S. Mohammed, G. G. Devarajan, and S. M. Nagarajan, “Ambient intelligence approach: Internet of things based decision performance analysis for intrusion detection,” Computer Communications, vol. 195, pp. 315–322, 2022.
- [37] X. Ma, Y. Li, and Y. Gao, “Decision model of intrusion response based on markov game in fog computing environment,” Wireless Networks, vol. 29, no. 8, pp. 3383–3392, 2023.
- [38] R. Zhang, H. Xia, C. Liu, R.-b. Jiang, and X.-g. Cheng, “Anti-attack scheme for edge devices based on deep reinforcement learning,” Wireless Communications and Mobile Computing, vol. 2021, no. 1, p. 6619715, 2021.
- [39] M. Al-Fawa’reh, J. Abu-Khalaf, P. Szewczyk, and J. J. Kang, “Malbot-drl: Malware botnet detection using deep reinforcement learning in iot networks,” IEEE Internet of Things Journal, 2023.
- [40] W. Sharpless, D. Hirsch, S. Tonkens, N. Shinde, and S. Herbert, “Dual-objective reinforcement learning with novel hamilton-jacobi-bellman formulations,” arXiv preprint arXiv:2506.16016, 2025.
- [41] M. A. Vasfi and B. S. Ghahfarokhi, “Channel-hopping sequence generation for blind rendezvous in cognitive radio-enabled internet of vehicles: A multi-agent twin delayed deep deterministic policy gradient-based method,” Computer Communications, p. 108318, 2025.
- [42] S. Bi, L. Huang, H. Wang, and Y.-J. A. Zhang, “Lyapunov-guided deep reinforcement learning for stable online computation offloading in mobile-edge computing networks,” IEEE Transactions on Wireless Communications, vol. 20, no. 11, pp. 7519–7537, 2021.
- [43] S. Wang and Z. Luo, “Real-time data collection and trajectory scheduling using a drl–lagrangian framework in multiple uavs collaborative communication systems,” Remote Sensing, vol. 16, no. 23, p. 4378, 2024.
- [44] Y. Liu, A. Halev, and X. Liu, “Policy learning with constraints in model-free reinforcement learning: A survey,” in The 30th International Joint Conference on Artificial Intelligence (IJCAI), 2021.
- [45] J. Dou, X. Wang, Z. Liu, Q. Sun, X. Wang, and J. He, “Towards pareto-optimal energy management in integrated energy systems: A multi-agent and multi-objective deep reinforcement learning approach,” International Journal of Electrical Power & Energy Systems, vol. 159, p. 110022, 2024.
- [46] B. Nie, J. Ji, Y. Fu, and Y. Gao, “Improve robustness of reinforcement learning against observation perturbations via l∞ Lipschitz policy networks,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 13, 2024, pp. 14 457–14 465.
- [47] W. Xu and A. Kamsin, “Multi-agent reinforcement learning-based distributed cooperative voltage control,” in 2025 6th International Conference on Electrical Technology and Automatic Control (ICETAC). IEEE, 2025, pp. 474–477.
- [48] K. Ryu and W. Kim, “Multi-objective optimization of energy saving and throughput in heterogeneous networks using deep reinforcement learning,” Sensors, vol. 21, no. 23, p. 7925, 2021.
- [49] N. Koroniotis, N. Moustafa, E. Sitnikova, and B. Turnbull, “Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-iot dataset,” Future Generation Computer Systems, vol. 100, pp. 779–796, 2019.
- [50] G. Seo, S. Yoon, J. Song, E. Srivastava, and E. Hwang, “Label-free fault detection scheme for inverters of pv systems: Deep reinforcement learning-based dynamic threshold,” Applied Sciences, vol. 13, no. 4, p. 2470, 2023.
- [51] Z. Long, H. Yan, G. Shen, X. Zhang, H. He, and L. Cheng, “A transformer-based network intrusion detection approach for cloud security,” Journal of Cloud Computing, vol. 13, no. 1, p. 5, 2024.
- [52] Z. Jiang, J. Li, Q. Hu, W. Meng, W. Pedrycz, and Z. Su, “Scalable graph-aware edge representation learning for wireless iot intrusion detection,” IEEE Internet of Things Journal, vol. 11, no. 16, pp. 26 955–26 969, 2024.
- [53] F. Ullah, S. Ullah, G. Srivastava, and J. Lin, “Ids-int: Intrusion detection system using transformer-based transfer learning for imbalanced network traffic,” Digital Communications and Networks, vol. 10, no. 1, pp. 190–204, 2024.
- [54] V. Yarlagadda, S. Kumar, R. Anumandla, S. Charan, R. Vennapusa, and C. Wholesale, “Harnessing kali linux for advanced penetration testing and cybersecurity threat mitigation,” J. Comput. Digit. Technol., no. April, 2024.
- [55] G. O. Anyanwu, C. I. Nwakanma, J.-M. Lee, and D.-S. Kim, “Optimization of rbf-svm kernel using grid search algorithm for ddos attack detection in sdn-based vanet,” IEEE Internet of Things Journal, vol. 10, no. 10, pp. 8477–8490, 2022.
- [56] L. Zou, W. Ren, W. Zhang, L. Ding, and S. Li, “Sampling complexity of td and ppo in rkhs,” arXiv preprint arXiv:2509.24991, 2025.
- [57] Y. Peng, L. Zhang, and Z. Zhang, “Statistical efficiency of distributional temporal difference learning and freedman’s inequality in hilbert spaces,” arXiv preprint arXiv:2403.05811, 2024.
  X. APPENDIX OVERVIEW
  This appendix provides the mathematical derivations underpinning the proposed dual-solution DRL-based IDS, DeepEdgeIDS, and AutoDRL-IDS. We f...
- [58] Sample Bound. For a reproducing kernel Hilbert space [57] $\mathcal{H}_k$:
  $|Q_T - Q^\star| \le \tilde{O}\left(\sqrt{\frac{\log\det\left(I + \frac{1}{\sigma^2} K_T\right)}{T}}\right), \quad R^{(D)}_{\mathrm{dyn}}(T) \le \tilde{O}\left(\sqrt{ST}\right).$
  DeepEdgeIDS exhibits provable stability, diffusion-regularized contraction, convex sustainability coupling, and sublinear regret with bounded carbon dynamics.
  XII. AUTODRL-IDS
  A. LSTM Encoding and Supervised-DRL Coupling...