arxiv: 2605.23330 · v1 · pith:QKOY7RXMnew · submitted 2026-05-22 · 💻 cs.CR

Security, Privacy, and Ethical Risks in OpenClaw

Pith reviewed 2026-05-25 04:24 UTC · model grok-4.3

classification 💻 cs.CR
keywords security risks · privacy risks · ethical risks · AI agent · OpenClaw · traceability · persistent storage · tool invocation

The pith

OpenClaw's architecture introduces security, privacy, and ethical risks that form major barriers to its trustworthy deployment and adoption.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper investigates the risks in OpenClaw, a locally executable AI agent for natural language tasks and real-world actions. It analyzes the system's architecture, functionalities, and scenarios to highlight issues with persistent local storage, tool invocation, cross-context aggregation, multi-user interactions, and plugin integration. The authors argue these create significant concerns for security, privacy, ethics, and traceability. A sympathetic reader would care because OpenClaw aims at personal assistance and automation but could undermine trust in digital environments if these risks go unaddressed.

Core claim

The central claim is that OpenClaw is a highly privileged agent whose integration into personal and organizational environments raises risks from five features: persistent local storage, tool invocation, cross-context information aggregation, multi-user interaction, and the integration of plugins and external services. The paper argues that these risks constitute major barriers to the trustworthy deployment and widespread adoption of this technology.

What carries the argument

The system architecture and representative application scenarios of OpenClaw, which enable persistent local storage, tool invocation, cross-context aggregation, multi-user interaction, and plugin integration.

If this is right

  • Persistent local storage may expose sensitive user data to unauthorized access.
  • Tool invocation could enable unintended or malicious actions on connected systems.
  • Cross-context information aggregation risks linking private data across domains.
  • Multi-user interactions may lead to access conflicts or unauthorized sharing.
  • Plugin and external service integration opens additional attack surfaces.
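
To make the tool-invocation bullet concrete, here is a minimal sketch of a deny-by-default invocation gate. It is not OpenClaw's API and is not taken from the paper: `ToolGate`, `ToolCall`, the tool names, and the consent callback are hypothetical names introduced for illustration, assuming the agent routes every tool call through a single checkpoint.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical record of one requested tool invocation."""
    tool: str
    argument: str


@dataclass
class ToolGate:
    """Deny-by-default gate (illustrative, not an OpenClaw component).

    A call runs only if the tool is allowlisted; tools marked as
    sensitive additionally require explicit user consent.
    """
    allowed: Set[str]
    needs_consent: Set[str]
    ask_user: Callable[[ToolCall], bool]
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def invoke(self, call: ToolCall) -> str:
        if call.tool not in self.allowed or call.tool not in self.tools:
            raise PermissionError(f"tool not permitted: {call.tool}")
        if call.tool in self.needs_consent and not self.ask_user(call):
            raise PermissionError(f"user declined: {call.tool}")
        return self.tools[call.tool](call.argument)


def demo() -> list:
    """Run three calls through the gate and collect the outcomes."""
    gate = ToolGate(
        allowed={"read_note", "send_mail"},
        needs_consent={"send_mail"},
        ask_user=lambda call: False,  # simulated user who always declines
        tools={
            "read_note": lambda arg: f"note:{arg}",
            "send_mail": lambda arg: f"sent:{arg}",
            "run_shell": lambda arg: f"ran:{arg}",  # registered but never allowlisted
        },
    )
    results = []
    for call in (ToolCall("read_note", "todo"),
                 ToolCall("send_mail", "hi"),
                 ToolCall("run_shell", "ls")):
        try:
            results.append(gate.invoke(call))
        except PermissionError as exc:
            results.append(f"blocked: {exc}")
    return results
```

The design choice worth noting is that the allowlist and the consent set are separate: a tool can be permitted in principle yet still require a per-call confirmation. Whether such a gate preserves enough utility for an agent like OpenClaw is exactly the kind of question the paper leaves open.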

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Comparable local AI agent systems would likely require similar risk analyses to ensure safe use.
  • Standardized traceability requirements could emerge as a practical response to the identified gaps.
  • Developers of agent platforms might prioritize built-in controls for storage and invocation as a direct follow-on.
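
As an illustration of what a traceability requirement could look like in practice, the sketch below implements a hash-chained, append-only event log. This is an assumption about one possible mechanism, not something the paper or OpenClaw specifies; the function names and the event fields (`actor`, `action`) are hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev_hash: str, event: Dict[str, str]) -> str:
    """Hash the event together with the previous entry's hash."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(log: List[dict], event: Dict[str, str]) -> None:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash,
                "hash": _digest(prev_hash, event)})


def verify(log: List[dict]) -> bool:
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A short usage example: append two events with `append_event(log, {"actor": "agent", "action": "read_file"})`, confirm `verify(log)` is true, then modify an earlier event in place and observe that `verify(log)` becomes false. A chain like this only detects tampering if the latest hash is also recorded somewhere the agent cannot rewrite; on its own it does not prevent an attacker from rebuilding the whole log.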

Load-bearing premise

That the listed risks are inherent to the system and remain unmitigated, derived solely from architectural analysis without evidence of implemented defenses or assessments of their actual likelihood and impact.

What would settle it

Empirical testing of OpenClaw in real deployments that shows effective mitigation of the risks through existing or added security measures, or quantitative risk assessments indicating negligible impact.
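
One way such a quantitative assessment could be structured is a simple likelihood-times-impact score with a mitigation discount. The sketch below is a generic risk-matrix illustration under stated assumptions: the 1 to 5 ordinal scales, the `mitigation` fraction, and the `Risk` class are hypothetical and do not come from the paper, and any numbers fed into it would have to come from actual measurement.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Risk:
    """Hypothetical risk record using assumed ordinal scales."""
    name: str
    likelihood: int    # 1 (rare) to 5 (almost certain)
    impact: int        # 1 (negligible) to 5 (severe)
    mitigation: float  # fraction of inherent risk removed by controls, 0.0 to 1.0

    def inherent(self) -> int:
        """Risk score before any controls are applied."""
        return self.likelihood * self.impact

    def residual(self) -> float:
        """Risk score remaining after the assumed mitigation discount."""
        return self.inherent() * (1.0 - self.mitigation)


def rank_by_residual(risks: Iterable[Risk]) -> List[Risk]:
    """Order risks from highest to lowest residual score."""
    return sorted(risks, key=lambda r: r.residual(), reverse=True)
```

For example, `Risk("tool invocation", 4, 5, 0.5)` uses placeholder values, not findings: it has an inherent score of 20 and a residual score of 10.0. The "major barriers" claim would be weakened if measured residual scores for the five features turned out to be low, and strengthened if they stayed high under standard controls.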

Figures

Figures reproduced from arXiv: 2605.23330 by Jianbing Ni, Yutong Jin, Zelin Zhang, Zhijin Lyu.

Figure 1: Analytical framework of the paper for OpenClaw.
Figure 2: System architecture and main functions of OpenClaw. Redrawn based on OpenClaw’s official architecture
Figure 3: Representative application scenarios of OpenClaw: communication, retrieval, and technical assistance
Figure 4: Privacy risks in OpenClaw task execution: unintended expansion of data access across user-connected
Figure 5: Reliability and Traceability of the OpenClaw Execution Chain.

Original abstract

This paper systematically investigates the security, privacy, and ethical risks, as well as the traceability challenges of OpenClaw, a locally executable AI agent system for natural language interaction and real-world task completion. While OpenClaw shows strong potential for personal assistance, office automation, cross-platform task management, and information integration, it also raises serious security, privacy, and ethical concerns. By analyzing its system architecture, core functionalities, deployment model, and representative application scenarios, this paper aims to reveal the risks that may arise when such a highly privileged agent is integrated into personal and organizational digital environments. We focus in particular on the challenges associated with persistent local storage, tool invocation, cross-context information aggregation, multi-user interaction, and the integration of plugins and external services. We argue that these issues constitute major barriers to the trustworthy deployment and widespread adoption of this technology. Finally, we summarize the open challenges in security defenses, privacy protection, ethical governance, and traceability in agent use, and call for joint efforts from researchers, developers, deployers, and regulators to build AI agent systems that are safer, more reliable, and more trustworthy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript systematically investigates security, privacy, and ethical risks, as well as traceability challenges, in OpenClaw, a locally executable AI agent system. Through analysis of its system architecture, core functionalities, deployment model, and representative application scenarios, it identifies risks associated with persistent local storage, tool invocation, cross-context information aggregation, multi-user interaction, and plugin integration with external services. The paper argues that these issues constitute major barriers to trustworthy deployment and widespread adoption, summarizes open challenges in defenses, protection, governance, and traceability, and calls for joint efforts by researchers, developers, deployers, and regulators.

Significance. If the qualitative analysis holds and the enumerated risks are shown to be both inherent and resistant to standard controls, the work would usefully contribute to the literature on AI agent security by providing a structured enumeration of concerns specific to locally executable, highly privileged agents. It could inform design choices and regulatory discussions around personal assistance and office automation tools. The paper's strength lies in its focus on a concrete system and its explicit call for multi-stakeholder collaboration, though its impact is constrained by the lack of quantitative severity assessment or mitigation evaluation.

major comments (1)
  1. [Abstract] Abstract and § on risk analysis (architecture and scenarios): the central claim that the five listed features 'constitute major barriers to the trustworthy deployment and widespread adoption' is load-bearing but unsupported. The text enumerates potential risks from persistent local storage, tool invocation, cross-context aggregation, multi-user interaction, and plugin integration, yet provides no demonstration that standard mechanisms (sandboxing, capability-based access control, audit logging, or consent flows) would be inadequate, nor any attack traces, probability/impact estimates, or residual-risk evaluation after mitigations. This leaves the 'major barriers' conclusion dependent on the untested premise that the risks are both inherent and effectively unmitigable.
minor comments (1)
  1. [Abstract] The abstract and conclusion sections could more explicitly distinguish between risks that are unique to OpenClaw versus those common to other agent frameworks, to strengthen the novelty of the contribution.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive critique, which highlights an important gap in supporting our central claim. We address the comment below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract] Abstract and § on risk analysis (architecture and scenarios): the central claim that the five listed features 'constitute major barriers to the trustworthy deployment and widespread adoption' is load-bearing but unsupported. The text enumerates potential risks from persistent local storage, tool invocation, cross-context aggregation, multi-user interaction, and plugin integration, yet provides no demonstration that standard mechanisms (sandboxing, capability-based access control, audit logging, or consent flows) would be inadequate, nor any attack traces, probability/impact estimates, or residual-risk evaluation after mitigations. This leaves the 'major barriers' conclusion dependent on the untested premise that the risks are both inherent and effectively unmitigable.

    Authors: We agree that the manuscript's qualitative enumeration of risks does not include quantitative severity assessments, explicit attack traces, or systematic evaluation of residual risk after applying standard controls such as sandboxing or capability-based access. The paper's analysis is grounded in the architectural requirements of OpenClaw (local execution with persistent storage and broad tool privileges needed for its intended functionality), which we argue create tensions that standard mechanisms cannot fully resolve without reducing utility. However, we acknowledge this reasoning is presented at a high level without detailed mitigation analysis. We will revise the abstract and the risk-analysis sections to (1) explicitly state the qualitative basis of the 'major barriers' claim, (2) add a short discussion of why certain standard controls are likely to be insufficient given the agent's design goals, and (3) temper the language to reflect the absence of quantitative evidence while preserving the call for further multi-stakeholder work. revision: yes

Circularity Check

0 steps flagged

No circularity: qualitative risk enumeration from architecture is self-contained

The paper performs a direct architectural analysis to enumerate risks (persistent storage, tool invocation, cross-context aggregation, etc.) and concludes they form major barriers. No equations, fitted parameters, predictions, or self-citations appear in the provided text. The central claim follows from describing the system features without any reduction of outputs to inputs by construction, self-definition, or load-bearing self-reference. This is a standard qualitative review with no derivation chain that collapses into its own premises.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper is a qualitative risk assessment relying on standard assumptions about system behavior in AI agents; no free parameters, mathematical axioms, or new entities are introduced.

pith-pipeline@v0.9.0 · 5731 in / 1053 out tokens · 29608 ms · 2026-05-25T04:24:04.620472+00:00 · methodology
