
arxiv: 2606.30701 · v1 · pith:YJGAP533new · submitted 2026-06-29 · 💻 cs.CR · cs.AI

An AI-Based Solution for Secure Service Provisioning in IoT

Pith reviewed 2026-07-01 02:00 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords IoT security · service provisioning · deep reinforcement learning · federated learning · behavioral fingerprinting · reliability scoring · device selection

The pith

Deep reinforcement learning selects IoT service providers while incorporating reliability scores from federated behavioral fingerprinting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that a framework pairing deep reinforcement learning for dynamic provider selection with federated learning for behavioral fingerprinting can secure service provisioning in IoT networks. The DRL agent learns to pick suitable objects while the FL model supplies reliability scores that reflect compliance with security constraints. Experiments indicate the combined system runs on devices with limited resources. A sympathetic reader would care because expanding IoT ecosystems need ways to choose trustworthy providers without central data collection or heavy computation.

Core claim

The authors claim that an intelligent DRL agent learns through interaction to select providers while a fully distributed FL behavioral fingerprinting model computes reliability scores reflecting each provider's compliance, and that integrating these scores into the selection process yields a solution deployable on resource-constrained IoT devices.

What carries the argument

The DRL agent for provider selection that receives and acts on reliability scores produced by the federated learning behavioral fingerprinting model.
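To make the coupling between selection and reliability concrete, here is a minimal sketch under stated assumptions. The abstract does not give the state space, action space, or reward, so everything here is hypothetical: `blended_reward` and its `weight` parameter are invented for illustration, and a tabular epsilon-greedy value estimate stands in for the deep network the paper describes.

```python
import random


def blended_reward(service_quality: float, reliability: float, weight: float = 0.5) -> float:
    """Hypothetical reward: convex blend of two values in [0, 1] (not the paper's formula)."""
    return (1.0 - weight) * service_quality + weight * reliability


def select_provider(q_values: list[float], epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy choice: explore with probability epsilon, else pick the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def update_estimate(q_values: list[float], provider: int, reward: float, lr: float = 0.1) -> None:
    """Move the chosen provider's value estimate toward the observed reward."""
    q_values[provider] += lr * (reward - q_values[provider])


if __name__ == "__main__":
    rng = random.Random(0)
    # Made-up providers: (service quality, reliability score), both in [0, 1].
    providers = [(0.9, 0.2), (0.7, 0.9), (0.5, 0.5)]
    q = [0.0] * len(providers)
    for _ in range(500):
        choice = select_provider(q, epsilon=0.1, rng=rng)
        quality, reliability = providers[choice]
        update_estimate(q, choice, blended_reward(quality, reliability))
    print(q)
```

With these made-up numbers and equal weighting, the blended reward is 0.55, 0.8, and 0.5 for the three providers, so the reliable second provider is preferred over the one with the best raw service quality.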

Load-bearing premise

The federated learning behavioral fingerprinting model must generate reliability scores that accurately measure provider compliance with security constraints, and these scores must improve the quality of DRL selections.

What would settle it

A controlled deployment in which providers assigned low reliability scores violate security constraints at the same rate as those with high scores, or the full system exceeds the computational limits of typical constrained IoT devices.
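One way to read the first half of this test is as a comparison of two violation rates. The sketch below is an assumption about how such a check could be run, not something the paper specifies: it computes a standard two-proportion z statistic, and the function name and the split of providers into a low-score and a high-score group are hypothetical.

```python
import math


def violation_rate_z(low_violations: int, low_total: int,
                     high_violations: int, high_total: int) -> float:
    """Two-proportion z statistic for the gap in violation rates (low-score minus high-score)."""
    if low_total <= 0 or high_total <= 0:
        raise ValueError("each group needs at least one observed provider")
    p_low = low_violations / low_total
    p_high = high_violations / high_total
    # Pooled rate under the null hypothesis that both groups violate equally often.
    pooled = (low_violations + high_violations) / (low_total + high_total)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / low_total + 1.0 / high_total))
    if std_err == 0.0:
        return 0.0  # every provider behaves the same, so the rates are identical
    return (p_low - p_high) / std_err
```

A statistic near zero would mean the scores do not separate compliant from non-compliant providers, which is the outcome that would undercut the premise; a large positive value would support it.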

Figures

Figures reproduced from arXiv: 2606.30701 by Antonino Nocera, Marco Arazzi, Mert Cihangiroglu, Serena Nicolazzo, Vinod P.

Figure 1: SecSLA life cycle
The SLA and SecSLA contracts have been mostly written using natural expressions, and compliance examination was carried out manually [21]. The current most prominent industrial approaches for SLA language specification are: WSLA, WS-Agreement, SLA*, CSLA, SLAC, RBSLA, and SLA-IoT [22]. In our approach, we refer to the WSLA framework proposed by IBM [23, 24]. Primarily, the WSLA allows the…
Figure 2: Federated Learning basic workflow
  • Source Port Type refers to the category of the source port (user, system, or dynamic) from which an IoT device transmits a packet. Typically, IoT devices use predefined ports designated by the manufacturer for communication.
  • TCP Flag is the flag embedded in each packet that indicates its purpose, such as SYN, SYN-ACK, PUSH, or FIN.
  • Encapsulated Protocol Types specif…
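The first two features in this excerpt can be derived directly from packet headers. The sketch below is illustrative only: the function names are hypothetical, the port boundaries are the IANA ranges from RFC 6335, the flag bits are the standard TCP header bits, and nothing here is claimed to match the paper's feature extraction code.

```python
def source_port_type(port: int) -> str:
    """Map a source port to its IANA category (RFC 6335 ranges)."""
    if not 0 <= port <= 65535:
        raise ValueError("port must be in 0..65535")
    if port <= 1023:
        return "system"
    if port <= 49151:
        return "user"
    return "dynamic"


# Standard TCP header flag bits, lowest bit first.
TCP_FLAG_BITS = (
    ("FIN", 0x01),
    ("SYN", 0x02),
    ("RST", 0x04),
    ("PSH", 0x08),
    ("ACK", 0x10),
    ("URG", 0x20),
)


def tcp_flag_names(flags: int) -> list[str]:
    """Decode the TCP flags byte into the names of the bits that are set."""
    return [name for name, bit in TCP_FLAG_BITS if flags & bit]
```

For example, a SYN-ACK packet carries the flags byte 0x12, which decodes to the SYN and ACK bits, and a device sending from port 50000 falls in the dynamic category.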
Figure 3: Service Provisioning workflow
As stated in Section 3, FL is a distributed and collaborative ML approach that enables model training across multiple decentralized devices while keeping local data private, eliminating the need to share raw datasets [27]. In recent years, this paradigm has gained attention for its potential to develop intelligent and privacy-preserving IoT applications [28, 29]. Thanks to FL…
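The idea of training across devices without sharing raw data can be illustrated with a single aggregation round. This is a minimal sketch assuming the common FedAvg rule (a sample-weighted mean of client parameters); the excerpt does not say which aggregation rule the paper uses, and `federated_average` is a hypothetical name.

```python
def federated_average(client_weights: list[list[float]], client_sizes: list[int]) -> list[float]:
    """Sample-weighted mean of client parameter vectors (FedAvg-style, assumed here)."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("clients must hold at least one sample in total")
    n_params = len(client_weights[0])
    if any(len(w) != n_params for w in client_weights):
        raise ValueError("all clients must share the same parameter layout")
    # Only parameters and sample counts leave the device; raw traffic data never does.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]
```

A client holding three times as much data pulls the global parameters three times as hard: averaging [1.0, 2.0] (1 sample) with [3.0, 4.0] (3 samples) gives [2.5, 3.5].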
Figure 4: Behavioral Fingerprint Training phase
$T_s = 1 - \frac{1}{N} \sum_{t=1}^{N} \tilde{d}_s(t)$
where $N$ is the number of observations in the considered time window. This formulation ensures that service providers whose behavior closely matches the expected profile achieve higher trustworthiness scores, while deviations from expected behavior lead to lower values of $T_s$.
5. Experiments
This section describes the experiments carried out…
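The trustworthiness score in this excerpt is one minus the mean deviation over the N observations in the window. The sketch below computes it under one assumption that the excerpt does not state: that each deviation is already normalized to [0, 1], so the score also stays in [0, 1]. The function name is hypothetical.

```python
def trust_score(deviations: list[float]) -> float:
    """T_s = 1 - (1/N) * sum of deviations over the window, assuming each lies in [0, 1]."""
    if not deviations:
        raise ValueError("the time window must contain at least one observation")
    if any(not 0.0 <= d <= 1.0 for d in deviations):
        raise ValueError("deviations are assumed to be normalized to [0, 1]")
    return 1.0 - sum(deviations) / len(deviations)
```

A provider that never deviates scores 1.0, and one whose observations deviate by 0.5 on average scores 0.5.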
Figure 5: Behavioral Fingerprint Inference phase
5.2. Experimental Environment
We simulate a realistic IoT service marketplace in which multiple service providers are simultaneously available to client agents. Unlike previous approaches that present a single provider per timestep [7], our simulation models the competitive nature of real-world IoT environments where clients must actively select among competing provid…
Figure 6: DQN marketplace results over the 15-day simulation. The dashed vertical line marks the attack onset at day 6.
Figure 7: Scalability analysis across network sizes
Original abstract

As the Internet of Things (IoT) continues its rapid expansion, the attack surface grows accordingly, with emerging threats targeting smart objects and their interactions. In this evolving landscape, securing service provisioning is crucial to ensure the proper functioning, security, and reliability of the IoT ecosystem. Service provisioning encompasses key tasks such as device registration, configuration, authentication, authorization, and software deployment, all of which are essential for seamless and secure IoT operations. In this paper, we present a comprehensive framework designed to select the most suitable smart objects to deliver a target service within a given IoT environment while also monitoring the behavior of the entities involved during the service provisioning phase. To achieve this, we employ a Deep Reinforcement Learning (DRL) approach in which an intelligent agent learns, through interaction with a complex, dynamic environment, how to adapt to changes while adhering to predefined security constraints. For behavioral monitoring, we leverage Federated Learning (FL) to develop a global Behavioral Fingerprinting (BF) model that is fully distributed and can analyze how IoT devices interact within the network. In addition, the BF is used to compute a reliability score for each service provider, reflecting its degree of compliance with the defined security constraints. This score is then incorporated into the service provisioning process, allowing smart objects to select providers not only according to functional suitability but also to their reliability level. Finally, we conduct an extensive experimental evaluation to assess the robustness and scalability of our approach. The results demonstrate that our solution can be effectively deployed even on resource-constrained IoT devices, making it a viable and scalable security-enhancing mechanism for modern IoT ecosystems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper proposes a framework for secure IoT service provisioning that combines Deep Reinforcement Learning (DRL) for selecting suitable smart objects under security constraints with Federated Learning (FL) to build a distributed Behavioral Fingerprinting (BF) model. The BF model generates reliability scores reflecting provider compliance, which are fed into the DRL selection process. The authors state that extensive experimental evaluation demonstrates robustness, scalability, and effective deployment on resource-constrained devices.

Significance. If the experimental claims were substantiated with data, the integration of DRL and FL for both selection and behavioral monitoring could represent a practical contribution to IoT security, particularly for dynamic environments where functional and security criteria must be balanced. The distributed nature of the FL component aligns with IoT constraints, but the absence of any supporting evidence prevents assessment of whether this potential is realized.

major comments (2)
  1. [Abstract] The claim of 'extensive experimental evaluation' supporting 'robustness and scalability' and 'effective deployment even on resource-constrained IoT devices' is presented without any data, tables, figures, baselines, error bars, or method details. This is load-bearing for the central claim that the solution is viable and scalable.
  2. [Abstract] No formulation is given for the DRL state/action space, reward function, or how the FL-derived reliability score is incorporated into the DRL policy; likewise, no description of the BF model architecture, feature extraction, or aggregation process appears. Without these, the weakest assumption (that FL reliability scores accurately reflect compliance and meaningfully improve DRL selection) cannot be evaluated.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the major comments point by point below and agree that revisions are needed to substantiate the claims made in the abstract.

Point-by-point responses
  1. Referee: [Abstract] The claim of 'extensive experimental evaluation' supporting 'robustness and scalability' and 'effective deployment even on resource-constrained IoT devices' is presented without any data, tables, figures, baselines, error bars, or method details. This is load-bearing for the central claim that the solution is viable and scalable.

    Authors: We agree that the abstract makes these claims without including or referencing any supporting data, tables, figures, baselines, error bars, or method details. The manuscript as provided does not contain an experimental section with such evidence. We will revise the manuscript to either remove the unsubstantiated claims from the abstract or add a concise summary of results along with the required experimental details, tables, and figures in a new or expanded section. revision: yes

  2. Referee: [Abstract] No formulation is given for the DRL state/action space, reward function, or how the FL-derived reliability score is incorporated into the DRL policy; likewise, no description of the BF model architecture, feature extraction, or aggregation process appears. Without these, the weakest assumption (that FL reliability scores accurately reflect compliance and meaningfully improve DRL selection) cannot be evaluated.

    Authors: We agree that the abstract (and the provided manuscript text) contains no formulations for the DRL state/action space, reward function, incorporation of the FL reliability score, or descriptions of the BF model architecture, feature extraction, and aggregation process. We will revise the manuscript to add these technical details in dedicated sections so that the approach and assumptions can be properly evaluated. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

Full rationale

The paper presents a descriptive framework combining DRL for adaptive service selection under security constraints and FL for distributed behavioral fingerprinting to generate reliability scores, followed by experimental validation on resource-constrained devices. No equations, derivations, fitted parameters renamed as predictions, or self-citation chains appear in the abstract or framework description. The central claims rest on empirical results rather than analytical steps that reduce to inputs by construction, making the derivation self-contained against external benchmarks with no load-bearing reductions identified.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no identifiable free parameters, axioms, or invented entities; all technical details are absent.

pith-pipeline@v0.9.1-grok · 5839 in / 1041 out tokens · 47499 ms · 2026-07-01T02:00:42.967984+00:00 · methodology


Reference graph

Works this paper leans on

31 extracted references · 2 canonical work pages · 1 internal anchor

  [1] V. Adat, B. B. Gupta, Security in internet of things: issues, challenges, taxonomy, and architecture, Telecommunication Systems 67 (3) (2018) 423–441

  [2] M. Arazzi, M. Cihangiroglu, S. Nicolazzo, A. Nocera, A privacy-preserving and biometric-aware tasks reallocation strategy in industry 5.0, in: 2025 IEEE 30th International Conference on Emerging Technologies and Factory Automation (ETFA), IEEE, 2025, pp. 1–8

  [3] T. A. Alghamdi, I. Ali, N. Javaid, M. Shafiq, Secure service provisioning scheme for lightweight iot devices with a fair payment system and an incentive mechanism based on blockchain, IEEE Access 8 (2019) 1048–1061

  [4] E. Rios, M. Higuero, X. Larrucea, M. Rak, V. Casola, E. Iturbe, Security and privacy service level agreement composition for internet of things systems on top of standard controls, Computers & Electrical Engineering 98 (2022) 107690

  [5] V. Casola, A. De Benedictis, M. Rak, U. Villano, A security metric catalogue for cloud applications, in: Complex, Intelligent, and Software Intensive Systems: Proceedings of the 11th International Conference on Complex, Intelligent, and Software Intensive Systems (CISIS-2017), Springer, 2018, pp. 854–863

  [6] S. Nicolazzo, A. Nocera, W. Pedrycz, Service level agreement (sla) and security sla (secsla): A comprehensive survey, Journal of Network and Systems Management 34 (3) (2026) 74

  [7] M. Arazzi, S. Nicolazzo, A. Nocera, A deep reinforcement learning approach for security-aware service acquisition in iot, Journal of Information Security and Applications (2024)

  [8] A. Aramini, M. Arazzi, T. Facchinetti, L. S. Ngankem, A. Nocera, An enhanced behavioral fingerprinting approach for the internet of things, in: 2022 IEEE 18th International Conference on Factory Communication Systems (WFCS), IEEE, 2022, pp. 1–8

  [9] M. Ferretti, S. Nicolazzo, A. Nocera, H2O: Secure Interactions in IoT via Behavioral Fingerprinting, Future Internet 13 (5) (2021) 117

  [10] M. Arazzi, S. Nicolazzo, A. Nocera, A fully privacy-preserving solution for anomaly detection in iot using federated learning and homomorphic encryption, Information Systems Frontiers (2023) 1–24

  [11] S. Deng, Z. Xiang, J. Yin, J. Taheri, A. Y. Zomaya, Composition-driven iot service provisioning in distributed edges, IEEE Access 6 (2018) 54258–54269

  [12] T. Niemirepo, M. Sihvonen, V. Jordan, J. Heinilä, Service platform for automated iot service provisioning, in: 2015 9th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing, IEEE, 2015, pp. 325–329

  [13] S. Zhao, L. Yu, B. Cheng, An event-driven service provisioning mechanism for iot (internet of things) system interaction, IEEE Access 4 (2016) 5038–5051

  [14] Z. Khan, Z. Pervez, A. G. Abbasi, Towards a secure service provisioning framework in a smart city environment, Future Generation Computer Systems 77 (2017) 112–135

  [15] A. Shahidinejad, J. Abawajy, Blockchain-based self-certified key exchange protocol for hybrid electric vehicles, IEEE Transactions on Consumer Electronics (2023)

  [16] M. Kazim, L. Liu, S. Y. Zhu, A framework for orchestrating secure and dynamic access of iot services in multi-cloud environments, IEEE Access 6 (2018) 58619–58633

  [17] R. R. Henning, Security service level agreements: quantifiable security for the enterprise?, in: Proceedings of the 1999 workshop on New security paradigms, ACM, Ontario, Canada, 1999, pp. 54–60

  [18] S. ENISA, Enisa (2009)

  [19] V. Casola, A. De Benedictis, M. Eraşcu, J. Modic, M. Rak, Automatically enforcing security slas in the cloud, IEEE Transactions on Services Computing 10 (5) (2016) 741–755

  [20] C. K. Chan, U. Chandrashekhar, S. H. Richman, S. R. Vasireddy, The role of slas in reducing vulnerabilities and recovering from disasters, Bell Labs Technical Journal 9 (2) (2004) 189–203

  [21] A. Keller, H. Ludwig, The wsla framework: Specifying and monitoring service level agreements for web services, Journal of Network and Systems Management 11 (2003) 57–81

  [22] A. Maarouf, A. Marzouk, A. Haqiq, A review of sla specification languages in the cloud computing, in: 2015 10th International Conference on Intelligent Systems: Theories and Applications (SITA), IEEE, Rabat Morocco, 2015, pp. 1–6

  [23] H. Ludwig, A. Keller, A. Dan, R. P. King, R. Franck, Web service level agreement (wsla) language specification, IBM Corporation (2003) 815–824

  [24] P. Bianco, G. A. Lewis, P. Merson, Service level agreements in service-oriented architecture environments, Carnegie Mellon University, Software Engineering Institute, 2008

  [25] C. Zhang, Y. Xie, H. Bai, B. Yu, W. Li, Y. Gao, A survey on federated learning, Knowledge-Based Systems 216 (2021) 106775

  [26] K. Arulkumaran, M. P. Deisenroth, M. Brundage, A. A. Bharath, Deep reinforcement learning: A brief survey, IEEE Signal Processing Magazine 34 (6) (2017) 26–38

  [27] J. Konečný, B. McMahan, D. Ramage, Federated optimization: Distributed optimization beyond the datacenter, arXiv preprint arXiv:1511.03575 (2015)

  [28] D. C. Nguyen, M. Ding, P. N. Pathirana, A. Seneviratne, J. Li, H. V. Poor, Federated learning for internet of things: A comprehensive survey, IEEE Communications Surveys & Tutorials 23 (3) (2021) 1622–1658

  [29] P. M. S. Sánchez, J. M. J. Valero, A. H. Celdrán, G. Bovet, M. G. Pérez, G. M. Pérez, A survey on device behavior fingerprinting: Data sources, techniques, application scenarios, and datasets, IEEE Communications Surveys & Tutorials 23 (2) (2021) 1048–1077

  [30] A. Hamza, H. Habibi Gharakheili, T. A. Benson, V. Sivaraman, Detecting Volumetric Attacks on IoT Devices via SDN-Based Monitoring of MUD Activity, in: Proc. ACM SOSR, San Jose, CA, USA, 2019. doi:10.1145/3314148.3314352. URL https://www2.ee.unsw.edu.au/~hhabibi/publications.html#19sosrIoT

  [31] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, et al., Human-level control through deep reinforcement learning, Nature 518 (7540) (2015) 529–533