Systems-Level Attack Surface of Edge Agent Deployments on IoT
EuroMLSys '26, April 27–30, 2026, Edinburgh, Scotland, UK
Recognition: 2 Lean theorem links
Pith reviewed 2026-05-15 19:30 UTC · model grok-4.3
The pith
IoT agent security hinges on deployment architecture, not model choice
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Edge deployment of LLM agents on IoT hardware introduces attack surfaces absent from cloud-hosted orchestration. The empirical analysis of cloud-hosted, edge-local swarm, and hybrid architectures on a multi-device home-automation testbed identifies five systems-level attack surfaces, including coordination-state divergence and induced trust erosion. Edge-local deployments eliminate routine cloud data exposure but silently degrade sovereignty when fallback mechanisms trigger, with boundary crossings invisible at the application layer. Provenance chains remain complete under cooperative operation yet are trivially bypassed without cryptographic enforcement. Failover windows create transient blind spots exploitable for unauthorised actuation.
What carries the argument
Three deployment architectures (cloud-hosted, edge-local swarm, hybrid) tested on a multi-device home-automation testbed with local MQTT messaging and Android edge node, tracked via metrics of data egress volume, failover window exposure, sovereignty boundary integrity, and provenance chain completeness.
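The four systems metrics lend themselves to a small per-run harness. A minimal sketch of how they might be recorded and aggregated; the field names, example values, and derived properties here are illustrative, not taken from the paper:

```python
from dataclasses import dataclass, field

@dataclass
class RunMetrics:
    """Security metrics tracked for one architecture run (illustrative names)."""
    data_egress_bytes: int = 0          # bytes leaving the local network
    failover_windows_s: list = field(default_factory=list)  # fallback-gap durations
    boundary_crossings: int = 0         # sovereignty boundary transitions observed
    provenance_links: int = 0           # provenance chain links actually recorded
    provenance_expected: int = 0        # actions that should carry a link

    @property
    def failover_exposure_s(self) -> float:
        """Total time spent inside a failover blind spot."""
        return sum(self.failover_windows_s)

    @property
    def provenance_completeness(self) -> float:
        """Fraction of actions with a recorded provenance link."""
        if self.provenance_expected == 0:
            return 1.0
        return self.provenance_links / self.provenance_expected

# Example: a hypothetical edge-local run with one 4.2 s fallback gap
run = RunMetrics(data_egress_bytes=0, failover_windows_s=[4.2],
                 boundary_crossings=1, provenance_links=18, provenance_expected=20)
print(run.failover_exposure_s, run.provenance_completeness)  # 4.2 0.9
```

Framing each property as a derived quantity over raw event logs keeps the comparison between architectures mechanical: the same harness runs against all three deployments.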
If this is right
- Edge-local swarms avoid routine cloud data exposure compared to cloud-hosted setups.
- Fallback triggers in edge and hybrid setups silently reduce sovereignty with invisible boundary crossings.
- Provenance chains stay reliable only under cooperative operation and need cryptographic enforcement.
- Failover windows create transient blind spots that allow unauthorized device actuation.
- Security risk in agent-controlled IoT depends primarily on deployment architecture.
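The failover blind-spot implication above can be made concrete with a toy check: commands that arrive while the coordinator is down fall inside a window where policy vetting is suspended. The timestamps, topics, and event shape are illustrative, not the paper's testbed format:

```python
# Flag actuation events that land inside a failover window, i.e. while the
# primary coordinator is down and policy checks are not yet re-established.

def in_blind_spot(event_t: float, windows: list[tuple[float, float]]) -> bool:
    """True if the event timestamp falls inside any (start, end) failover window."""
    return any(start <= event_t <= end for start, end in windows)

failover_windows = [(100.0, 104.2)]   # coordinator lost at t=100, restored at t=104.2
events = [
    (99.5,  "lock/front_door", "authorised"),
    (101.3, "lock/front_door", "unknown"),   # arrives during the window
    (110.0, "light/kitchen",   "authorised"),
]

unvetted = [(t, topic) for t, topic, src in events
            if in_blind_spot(t, failover_windows) and src != "authorised"]
print(unvetted)  # [(101.3, 'lock/front_door')]
```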
Where Pith is reading between the lines
- Similar architecture risks are likely present in other edge AI systems such as autonomous vehicles or industrial controls.
- Designers could reduce hidden problems by adding visible boundary monitoring at the application layer.
- Standards for IoT agents may need to specify deployment architecture requirements separately from model robustness.
- Routine security evaluations that test only prompts and models will miss major practical vulnerabilities.
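The "visible boundary monitoring" inference above could look something like the following sketch: every outbound send passes through a wrapper that logs sovereignty boundary crossings at the application layer instead of letting them happen silently during fallback. All names and the zone model are assumptions for illustration:

```python
# Sketch: make boundary crossings visible at the application layer by wrapping
# the send path. Zone names and payload format are illustrative.
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("boundary")

LOCAL_ZONES = {"lan", "edge"}

def send(payload: bytes, destination_zone: str, transport) -> None:
    """Route a payload, recording any crossing out of the local sovereignty zone."""
    if destination_zone not in LOCAL_ZONES:
        log.warning("boundary crossing: %d bytes -> %s", len(payload), destination_zone)
    transport(payload)

sent = []
send(b'{"temp": 21.5}', "cloud", sent.append)  # logged as a crossing
send(b'{"temp": 21.5}', "lan", sent.append)    # stays local, no warning
```

The design choice is that the monitor sits above the transport, so a fallback that reroutes traffic to the cloud still passes through it.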
Load-bearing premise
The home-automation testbed with local MQTT and Android smartphone as edge node accurately represents real-world edge agent deployments on IoT hardware.
What would settle it
A follow-up study on a different IoT setup (e.g., industrial sensors without MQTT) showing that model or prompt changes reduce security risks more than architecture changes would disprove the primary-determinant claim.
Original abstract
Edge deployment of LLM agents on IoT hardware introduces attack surfaces absent from cloud-hosted orchestration. We present an empirical security analysis of three architectures (cloud-hosted, edge-local swarm, and hybrid) using a multi-device home-automation testbed with local MQTT messaging and an Android smartphone as an edge inference node. We identify five systems-level attack surfaces, including two emergent failures observed during live testbed operation: coordination-state divergence and induced trust erosion. We frame core security properties as measurable systems metrics: data egress volume, failover window exposure, sovereignty boundary integrity, and provenance chain completeness. Our measurements show that edge-local deployments eliminate routine cloud data exposure but silently degrade sovereignty when fallback mechanisms trigger, with boundary crossings invisible at the application layer. Provenance chains remain complete under cooperative operation yet are trivially bypassed without cryptographic enforcement. Failover windows create transient blind spots exploitable for unauthorised actuation. These results demonstrate that deployment architecture, not just model or prompt design, is a primary determinant of security risk in agent-controlled IoT systems.
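The abstract's claim that provenance chains are "trivially bypassed without cryptographic enforcement" suggests a hash-chained, keyed construction as the fix. A generic HMAC chain sketch, not the authors' implementation (key management is omitted and all names are illustrative):

```python
# Cryptographically enforced provenance: each link's MAC binds the action to
# the previous link, so rewriting history invalidates the chain.
import hmac, hashlib

KEY = b"demo-key"  # in practice: a per-device key from secure storage

def append(chain: list[dict], action: str) -> None:
    prev = chain[-1]["mac"] if chain else b""
    mac = hmac.new(KEY, prev + action.encode(), hashlib.sha256).digest()
    chain.append({"action": action, "mac": mac})

def verify(chain: list[dict]) -> bool:
    prev = b""
    for link in chain:
        expect = hmac.new(KEY, prev + link["action"].encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(expect, link["mac"]):
            return False
        prev = link["mac"]
    return True

chain: list[dict] = []
for act in ["unlock:front_door", "set:thermostat=21"]:
    append(chain, act)
print(verify(chain))                   # True
chain[0]["action"] = "unlock:garage"   # tamper with recorded history
print(verify(chain))                   # False
```

Without the MAC, a cooperative-only chain is exactly what the paper observes: complete while everyone plays along, and rewritable by anyone who does not.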
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents an empirical security analysis of LLM agent deployments on IoT hardware, comparing three architectures (cloud-hosted, edge-local swarm, and hybrid) via a multi-device home-automation testbed using local MQTT messaging and an Android smartphone as the edge inference node. It identifies five systems-level attack surfaces, including two emergent failures (coordination-state divergence and induced trust erosion) observed during live operation, and frames security properties as measurable metrics including data egress volume, failover window exposure, sovereignty boundary integrity, and provenance chain completeness. Measurements indicate that edge-local deployments eliminate routine cloud data exposure but introduce silent sovereignty degradation on fallback, with provenance chains trivially bypassable absent cryptographic enforcement and failover windows creating exploitable transient blind spots. The central claim is that deployment architecture, rather than model or prompt design alone, is a primary determinant of security risk in agent-controlled IoT systems.
Significance. If the testbed observations prove reproducible and generalizable, the work would usefully highlight architecture-driven risks (such as invisible boundary crossings and coordination failures) that are distinct from prompt-level vulnerabilities, providing concrete metrics for evaluating edge agent deployments. The empirical framing and identification of emergent failures represent a strength in shifting focus from isolated model attacks to systems-level properties.
major comments (3)
- [Abstract] Abstract and testbed description: the central claim that architecture is the primary determinant rests on observations from a single multi-device home-automation setup with MQTT and Android edge node; without explicit controls or comparisons isolating architectural effects from MQTT broker semantics, smartphone resource limits, or home-automation device interactions, the measurements (e.g., silent sovereignty degradation on fallback) may be configuration-specific rather than architecture-driven.
- [Abstract] Measurements section (implied by abstract): no detailed quantitative data, error bars, statistical analysis, or verification steps are described for the reported metrics such as data egress volume, failover window exposure, or sovereignty boundary integrity, leaving open whether the differences between cloud, edge-local, and hybrid architectures are significant or reproducible.
- [Abstract] Emergent failures (coordination-state divergence and induced trust erosion): these are presented as architecture-induced, yet the manuscript provides no ablation or alternative configuration (e.g., with cryptographic enforcement or different messaging) to demonstrate that they arise independently of the specific testbed choices rather than from the absence of enforcement mechanisms.
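The ablation the referee requests amounts to crossing the architectural factor with the testbed-specific factors so their effects can be separated. A trivial sketch of that grid; the factor values are illustrative placeholders, not configurations the paper reports:

```python
# Ablation grid: vary messaging layer and enforcement independently of
# architecture, so configuration-specific effects can be told apart from
# architecture-driven ones.
from itertools import product

architectures = ["cloud", "edge_swarm", "hybrid"]
messaging = ["mqtt", "http_polling"]
enforcement = ["none", "hmac_provenance"]

configs = list(product(architectures, messaging, enforcement))
print(len(configs))  # 12
```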
minor comments (1)
- [Abstract] The abstract would benefit from a brief enumeration of the five attack surfaces to allow readers to map them directly to the reported metrics.
Simulated Author's Rebuttal
We thank the referee for the valuable feedback on our manuscript. We have addressed each of the major comments by clarifying our experimental controls, enhancing the quantitative presentation, and providing additional context on the emergent failures. The revised manuscript incorporates these improvements.
Point-by-point responses
-
Referee: [Abstract] Abstract and testbed description: the central claim that architecture is the primary determinant rests on observations from a single multi-device home-automation setup with MQTT and Android edge node; without explicit controls or comparisons isolating architectural effects from MQTT broker semantics, smartphone resource limits, or home-automation device interactions, the measurements (e.g., silent sovereignty degradation on fallback) may be configuration-specific rather than architecture-driven.
Authors: We acknowledge that the analysis uses a single testbed configuration. However, all three architectures were evaluated using identical hardware, the same MQTT broker, and the same home-automation devices, providing an internal control that isolates architectural effects from the underlying messaging semantics, resource limits, and device interactions. Differences in metrics such as data egress volume and sovereignty degradation are therefore attributable to the deployment architecture. We have revised the testbed description to explicitly state these controls and added a limitations paragraph on generalizability. revision: partial
-
Referee: [Abstract] Measurements section (implied by abstract): no detailed quantitative data, error bars, statistical analysis, or verification steps are described for the reported metrics such as data egress volume, failover window exposure, or sovereignty boundary integrity, leaving open whether the differences between cloud, edge-local, and hybrid architectures are significant or reproducible.
Authors: The manuscript body contains the quantitative measurements, but we agree that error bars, statistical analysis, and explicit verification steps were insufficiently detailed. In the revised version we have expanded the Measurements section with tables reporting means and standard deviations across repeated runs, p-values for architecture comparisons, and a step-by-step verification protocol for each metric to establish reproducibility and significance. revision: yes
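The protocol the authors describe (means, standard deviations, p-values across repeated runs) can be sketched with stdlib tools alone. The numbers below are invented for illustration, and the p-value uses a Welch t statistic with a normal-tail approximation rather than a full t distribution:

```python
# Significance check across repeated runs, per architecture. Data is made up;
# the normal approximation stands in for a proper t-test to stay stdlib-only.
from statistics import mean, stdev, NormalDist

def welch_p(a: list[float], b: list[float]) -> float:
    """Two-sided p-value for a difference in means (normal approximation)."""
    va, vb = stdev(a) ** 2 / len(a), stdev(b) ** 2 / len(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(t)))

# Hypothetical failover exposure (seconds) across five repeated runs
edge = [4.1, 4.4, 3.9, 4.3, 4.2]
hybrid = [6.8, 7.1, 6.5, 7.0, 6.9]
print(round(welch_p(edge, hybrid), 4))
```

For small run counts a t distribution (or scipy.stats.ttest_ind with equal_var=False) would be the stricter choice; the structure of the comparison is the same.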
-
Referee: [Abstract] Emergent failures (coordination-state divergence and induced trust erosion): these are presented as architecture-induced, yet the manuscript provides no ablation or alternative configuration (e.g., with cryptographic enforcement or different messaging) to demonstrate that they arise independently of the specific testbed choices rather than from the absence of enforcement mechanisms.
Authors: The failures were observed exclusively in the edge-local and hybrid architectures during live operation and were absent from the cloud-hosted baseline, indicating they stem from architectural features such as distributed coordination and fallback logic. We agree that explicit ablations would strengthen the claim. We have added a discussion clarifying the architectural origin of each failure and a note on an alternative messaging configuration in which similar divergence was observed; a full ablation study with cryptographic enforcement is identified as future work. revision: partial
Circularity Check
No circularity: purely empirical testbed study with direct measurements
full rationale
The paper conducts an empirical security analysis of three deployment architectures using a home-automation testbed with MQTT and Android edge node. It reports observed attack surfaces and failures (coordination-state divergence, induced trust erosion) and measures concrete metrics (data egress volume, failover window exposure, sovereignty boundary integrity, provenance chain completeness) from live operation. No equations, derivations, fitted parameters, or self-referential definitions appear; claims follow from direct testbed observations rather than any reduction to inputs by construction. No self-citation chains or uniqueness theorems are invoked as load-bearing steps.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: the multi-device home-automation testbed with MQTT and an Android smartphone as edge node represents typical real-world edge agent deployments.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
unclear: the relation between the paper passage and the cited Recognition theorem is ambiguous.
"We identify five systems-level attack surfaces... data egress volume, failover window exposure, sovereignty boundary integrity, and provenance chain completeness."
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
unclear: the relation between the paper passage and the cited Recognition theorem is ambiguous.
"Edge-local deployments eliminate routine cloud data exposure but silently degrade sovereignty when fallback mechanisms trigger"
What do these tags mean?
- matches: the paper's claim is directly supported by a theorem in the formal canon.
- supports: the theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: the paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: the paper appears to rely on the theorem as machinery.
- contradicts: the paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Omar Alrawi, Chaz Lever, Manos Antonakakis, and Fabian Monrose. 2019. SoK: Security evaluation of home-based IoT deployments. 1362–1380.
- [2] Syaiful Andy, Budi Rahardjo, and Bagus Hanindhito. 2017. Attack scenarios and security analysis of MQTT communication protocol in IoT system. In 2017 4th International Conference on Electrical Engineering, Computer Science and Informatics (EECSI). IEEE, 1–6.
- [3] Davide Calvaresi, Kevin Appoggetti, Luca Lustrissimini, Mauro Marinoni, Paolo Sernani, Aldo Franco Dragoni, Michael Schumacher, et al. Multi-Agent Systems' Negotiation Protocols for Cyber-Physical Systems: Results from a Systematic Literature Review. ICAART (1), 224–235.
- [4] Edoardo Debenedetti, Jie Zhang, Mislav Balunovic, Luca Beurer-Kellner, Marc Fischer, and Florian Tramèr. 2024. AgentDojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents. 82895–82920.
- [5] Jason A. Donenfeld. 2017. WireGuard: Next generation kernel network tunnel. In NDSS. 1–12.
- [6] Ali Dorri, Salil S. Kanhere, Raja Jurdak, and Praveen Gauravaram. 2017. Blockchain for IoT security and privacy: The case study of a smart home. In 2017 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops). IEEE, 618–623.
- [7] Syed Naeem Firdous, Zubair Baig, Craig Valli, and Ahmed Ibrahim. Modelling and evaluation of malicious attacks against the IoT MQTT protocol. In 2017 IEEE International Conference on Internet of Things (iThings), IEEE Green Computing and Communications (GreenCom), IEEE Cyber, Physical and Social Computing (CPSCom), and IEEE Smart Data (SmartData). IEEE, 748–755.
- [8] Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz. 2023. Not what you've signed up for: Compromising real-world LLM-integrated applications with indirect prompt injection. In Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security. 79–90.
- [9] Taicheng Guo, Xiuying Chen, Yaqi Wang, Ruidi Chang, Shichao Pei, Nitesh V. Chawla, Olaf Wiest, and Xiangliang Zhang. 2024. Large language model based multi-agents: A survey of progress and challenges.
- [10] M. S. Harsha, B. M. Bhavani, and K. R. Kundhavai. 2018. Analysis of vulnerabilities in MQTT security using Shodan API and implementation of its countermeasures via authentication and ACLs. In 2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI). IEEE, 2244–2250.
- [11] Sean Hollister. 2026. The DJI Romo robovac had security so poor, this man remotely accessed thousands of them. https://www.theverge.com/tech/879088/dji-romo-hack-vulnerability-remote-control-camera-access-mqtt
- [12] Home Assistant. 2026. Model Context Protocol. Home Assistant documentation. https://www.home-assistant.io/integrations/mcp/ (introduced in Home Assistant 2025.2).
- [13] Francesca Meneghello, Matteo Calore, Daniel Zucchetto, Michele Polese, and Andrea Zanella. 2019. IoT: Internet of threats? A survey of practical security vulnerabilities in real IoT devices. IEEE Internet of Things Journal 6, 5 (2019), 8182–8201.
- [14] Biswajeeban Mishra and Attila Kertesz. 2020. The use of MQTT in M2M and IoT systems: A survey. IEEE Access 8 (2020), 201071–201086.
- [15] OpenClaw Project. 2026. OpenClaw. GitHub repository. https://github.com/openclaw/openclaw
- [16] OpenClaw Project. 2026. Security. OpenClaw documentation. https://docs.openclaw.ai/gateway/security
- [17] OWASP Foundation. 2025. OWASP Top 10 for Agentic Applications. https://owasp.org/www-project-top-10-for-large-language-model-applications/
- [18] Rodrigo Roman, Javier Lopez, and Masahiro Mambo. 2018. Mobile edge computing, fog et al.: A survey and analysis of security threats and challenges. Future Generation Computer Systems 78 (2018), 680–698.
- [19] Yangjun Ruan, Honghua Dong, Andrew Wang, Silviu Pitis, Yongchao Zhou, Jimmy Ba, Yann Dubois, Chris J. Maddison, and Tatsunori Hashimoto. 2023. Identifying the risks of LM agents with an LM-emulated sandbox. arXiv preprint arXiv:2309.15817 (2023).
- [20] Weisong Shi, Jie Cao, Quan Zhang, Youhuizi Li, and Lanyu Xu. 2016. Edge computing: Vision and challenges. IEEE Internet of Things Journal 3, 5 (2016), 637–646.
- [21] Noah Shinn, Federico Cassano, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. 2023. Reflexion: Language agents with verbal reinforcement learning. 8634–8652.
- [22] Keith Stouffer, Joe Falco, Karen Scarfone, et al. 2011. Guide to industrial control systems (ICS) security. NIST Special Publication 800, 82 (2011).
- [23] Milijana Surbatovich, Jassim Aljuraidan, Lujo Bauer, Anupam Das, and Limin Jia. 2017. Some recipes can do more than spoil your appetite: Analyzing the security and privacy risks of IFTTT recipes. 1501–1510.
- [24] SwitchBot. 2026. SwitchBot AI Hub | Matter-Compatible Smart Home Hub with Local AI. Official product page. https://us.switch-bot.com/products/switchbot-ai-hub
- [25] Yashar Talebirad and Amirhossein Nadiri. 2023. Multi-agent collaboration: Harnessing the power of intelligent LLM agents.
- [26] Yuhang Wang, Feiming Xu, Zheng Lin, Guangyu He, Yuzhe Huang, Haichang Gao, Zhenxing Niu, Shiguo Lian, and Zhaoxiang Liu. 2026. From Assistant to Double Agent: Formalizing and Benchmarking Attacks on OpenClaw for Personalized Local AI Agent. arXiv preprint arXiv:2602.08412 (2026).
- [27] Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, et al. 2025. The rise and potential of large language model based agents: A survey. Science China Information Sciences 68, 2 (2025), 121101.
- [28] Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik R. Narasimhan, and Yuan Cao. 2022. ReAct: Synergizing reasoning and acting in language models.
- [29] Qiusi Zhan, Zhixiang Liang, Zifan Ying, and Daniel Kang. 2024. InjecAgent: Benchmarking indirect prompt injections in tool-integrated large language model agents. In Findings of the Association for Computational Linguistics: ACL 2024. 10471–10506.