
arxiv: 2605.26298 · v1 · pith:5P4GF6OQnew · submitted 2026-05-25 · 💻 cs.CR · cs.OS

Sandlock: Confining AI Agent Code with Unprivileged Linux Primitives

Pith reviewed 2026-06-29 21:12 UTC · model grok-4.3

classification 💻 cs.CR cs.OS
keywords: sandbox · AI agents · unprivileged isolation · Linux security · process confinement · syscall policy · runtime supervision · policy compilation

The pith

Sandlock confines untrusted AI agent code on Linux using only unprivileged primitives by compiling static policies into kernel rules while routing runtime decisions to a narrow supervisor.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that splitting policy in two, with static, input-independent rules compiled into kernel-enforced rules and a narrow supervisor handling dynamic decisions, is enough to enforce filesystem, network, IPC, and syscall policies without root, cgroups, images, or mandatory namespaces. This matters for AI agent workloads because developers run model-generated shell commands, third-party scripts, and tool plugins of unknown origin on their own machines, where containers and microVMs add privilege, image-management, and startup costs while ad-hoc controls such as chroot and ulimit give weak guarantees. If the approach holds, it delivers TOCTOU-safe execve inspection, dynamic network and HTTP access control, reversible filesystem effects, and per-stage pipeline confinement, at roughly 5 ms of startup overhead and with Redis running at bare-metal throughput (within measurement noise) on the authors' workstation.

Core claim

Sandlock is a lightweight Linux process sandbox organized around one split: static, input-independent policy is compiled into kernel-enforced rules, and a narrow supervisor handles runtime-dependent decisions and virtualized effects. This lets it enforce filesystem, network, IPC, and syscall policies without root, cgroups, images, or mandatory namespaces, while also supporting dynamic network decisions, HTTP-level access control, TOCTOU-safe inspection of execve arguments, reversible filesystem effects, and a pipeline operator for per-stage confinement.

What carries the argument

The split between static policy compiled into kernel-enforced rules and a narrow supervisor for runtime-dependent decisions and virtualized effects.
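
One way to picture the split is as a partition over policy rules: a rule whose outcome is fixed before the process starts goes to the kernel side, and a rule that needs runtime input goes to the supervisor. The sketch below is illustrative only. The `Rule` schema, `split_policy`, and the example rule names are hypothetical and are not taken from Sandlock's code or policy language.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    """A single policy rule (hypothetical schema, not Sandlock's)."""
    name: str
    # None: the outcome is fixed before the process starts (static).
    # Callable: the outcome depends on a runtime event (dynamic).
    decide: Optional[Callable[[dict], bool]] = None


def split_policy(rules):
    """Partition rules into (static, dynamic) lists, preserving order."""
    static = [r for r in rules if r.decide is None]
    dynamic = [r for r in rules if r.decide is not None]
    return static, dynamic


# Illustrative rules only; the names are made up for this sketch.
rules = [
    Rule("fs: read-only /usr"),
    Rule("syscall: deny ptrace"),
    Rule("net: allow one host by name",
         decide=lambda event: event.get("host") == "example.org"),
]
static_rules, dynamic_rules = split_policy(rules)
```

In this framing, the narrower the `dynamic_rules` list, the less logic the supervisor has to carry, which is the property the load-bearing premise depends on.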

If this is right

  • Filesystem, network, IPC, and syscall policies are enforced without root or mandatory namespaces.
  • Dynamic network decisions and HTTP-level access control become possible inside the sandbox.
  • TOCTOU-safe inspection of execve arguments and reversible filesystem effects are supported.
  • Startup overhead is roughly 5 ms, and Redis runs at bare-metal throughput within measurement noise, both as measured on the authors' workstation.
  • A pipeline operator enables per-stage confinement that separates data, network, and untrusted-content capabilities.
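
The last point can be read as an invariant over pipeline stages: no single stage should hold private data, network access, and untrusted content at once. The following is a minimal sketch of such a check, assuming a hypothetical `Stage` record and capability labels. The paper's actual pipeline operator syntax is not shown in this summary, so none of these names should be read as Sandlock's API.

```python
from dataclasses import dataclass

# Hypothetical capability labels for the three things to keep apart.
CAPABILITIES = frozenset({"data", "network", "untrusted"})


@dataclass(frozen=True)
class Stage:
    name: str
    caps: frozenset


def over_privileged(stages):
    """Return the names of stages that hold all three capabilities at once."""
    return [s.name for s in stages if CAPABILITIES <= s.caps]


# Illustrative three-stage pipeline; each stage gets a strict subset.
pipeline = [
    Stage("fetch", frozenset({"network", "untrusted"})),
    Stage("summarize", frozenset({"untrusted"})),
    Stage("store", frozenset({"data"})),
]
```

A pipeline passes the check when `over_privileged(pipeline)` is empty. A stage that combines all three labels is the configuration per-stage confinement is meant to rule out.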

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same static-plus-narrow-supervisor split could be applied to other untrusted code sources beyond AI agents, such as downloaded build scripts.
  • Policy compilation might be automated from high-level descriptions of expected agent behavior.
  • If the supervisor stays narrow, the technique could be composed with existing process tracing tools without privilege escalation.

Load-bearing premise

The narrow supervisor can be implemented without introducing new attack surfaces or requiring privileged operations, and static policy compilation suffices for all input-independent rules in the target workload.

What would settle it

A concrete test that launches an untrusted process attempting a TOCTOU race on execve arguments or an unauthorized network connection and checks whether the sandbox blocks it while the supervisor remains unprivileged.
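
To make the TOCTOU half of that test concrete, the sketch below simulates the race in a single thread: a checker approves an argument vector, and a simulated attacker rewrites it before it is used. This is a toy model of the bug class only. It does not show how Sandlock inspects execve arguments, which this summary does not describe, and all function names are hypothetical.

```python
def check_then_use(argv, is_allowed, run):
    """Racy: the checker and the runner both read the same mutable list."""
    if not is_allowed(argv):
        return "denied"
    return run(argv)


def snapshot_then_use(argv, is_allowed, run):
    """Safer: copy once, then check and run that private copy."""
    frozen = tuple(argv)
    if not is_allowed(frozen):
        return "denied"
    return run(frozen)


def demo(strategy):
    """Run one strategy against a simulated attacker; return what was executed."""
    shared = ["/bin/ls", "-l"]  # memory the untrusted process can still write

    def is_allowed(args):
        ok = args[0] == "/bin/ls"
        shared[0] = "/bin/sh"  # attacker rewrites argv[0] right after the check
        return ok

    def run(args):
        return args[0]  # stand-in for the real execve

    return strategy(shared, is_allowed, run)
```

With the racy strategy, the check passes on `/bin/ls` but the rewritten value is what runs. With the snapshot strategy, the checked value and the executed value are the same object, so the rewrite has no effect.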

Figures

Figures reproduced from arXiv: 2605.26298 by Cong Wang, Yusheng Zheng.

Figure 1: Confinement pipeline. Orange boxes execute in the …
Figure 3: Pipeline of heterogeneous sandboxes.
… blocks ptrace), the syscall is denied with EPERM rather than silently relaxed. The held syscalls, those that block until the callback returns, are execve, connect, sendto, sendmsg, sendmmsg, bind, and openat. Verdicts are: 0/False = allow, True/-1 = deny (EPERM), positive integer = deny with that errno, "audit" = allow + flag. 5.2 Copy-on-Write Workspace Backends Sandlo…
Figure 4: Preliminary benchmark results. Sandlock preserves low startup latency and near-bare-metal Redis throughput while avoiding Docker’s per-invocation and tail-latency costs on this workstation.
… as birdcage [26] provide embeddable static sandboxing APIs; these enforce static policy but expose neither a runtime callback nor host/HTTP-level network rules. BPF-LSM sandboxes such as bpfbox [11] provide programmabl…
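
The verdict encoding quoted in the text next to Figure 3 (0/False = allow, True/-1 = deny with EPERM, positive integer = deny with that errno, "audit" = allow and flag) can be normalized as in the sketch below. The function name and the returned tuple shape are assumptions made for illustration. Raising on values the excerpt does not list, such as other negative integers, is also an assumption, since the excerpt does not say how those are treated.

```python
import errno


def normalize_verdict(v):
    """Map a callback return value to (allow, errno_or_None, audit_flag)."""
    if v == "audit":
        return (True, None, True)
    # bool must be tested before int: in Python, True == 1 and False == 0,
    # so True would otherwise be read as "deny with errno 1".
    if isinstance(v, bool):
        return (False, errno.EPERM, False) if v else (True, None, False)
    if isinstance(v, int):
        if v == 0:
            return (True, None, False)
        if v == -1:
            return (False, errno.EPERM, False)
        if v > 0:
            return (False, v, False)
    # Anything else is not covered by the quoted encoding (assumption: reject).
    raise ValueError(f"unrecognized verdict: {v!r}")
```

For example, a callback returning `errno.EACCES` denies the held syscall with that errno, while returning `True` denies it with EPERM.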
Original abstract

AI agents increasingly run untrusted code on developer machines: shell commands generated by language models, third-party scripts retrieved at runtime, and tool plugins of unknown provenance. Existing isolation mechanisms impose tradeoffs that fit this workload poorly: containers and microVMs add privilege, image-management, and startup costs, while ad-hoc process controls and wrappers (e.g. chroot, ulimit) provide weak guarantees and little syscall-level control. Sandlock is a lightweight Linux process sandbox organized around a simple split: static, input-independent policy is compiled into kernel-enforced rules, while a narrow supervisor handles runtime-dependent decisions and virtualized effects. This split lets Sandlock enforce filesystem, network, IPC, and syscall policies without root, cgroups, images, or mandatory namespaces. It also supports dynamic network decisions, HTTP-level access control, TOCTOU-safe inspection of execve arguments, and reversible filesystem effects. On our workstation, Sandlock adds roughly 5 ms of startup overhead and runs Redis at bare-metal throughput (within measurement noise); its pipeline operator further supports per-stage confinement for separating data, network, and untrusted-content capabilities. Sandlock is available at https://github.com/multikernel/sandlock

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper presents Sandlock, a lightweight Linux process sandbox for AI agent workloads that splits static, input-independent policies (compiled to kernel-enforced rules via Landlock, seccomp, and similar unprivileged primitives) from a narrow supervisor process that handles only runtime-dependent decisions such as dynamic network access, TOCTOU-safe execve inspection, and reversible filesystem effects. It claims to enforce filesystem, network, IPC, and syscall policies without root, cgroups, container images, or mandatory namespaces, while adding ~5 ms startup overhead and achieving bare-metal throughput for Redis; a pipeline operator is also described for per-stage confinement.

Significance. If the central split holds and the supervisor can be confined without introducing new privileged surfaces or bypasses, Sandlock would offer a practical middle ground between heavy container/VM mechanisms and weak ad-hoc controls for untrusted AI-generated code. The emphasis on unprivileged primitives, open-source release, and reported performance numbers (if supported by reproducible measurements) would be notable strengths for the systems-security community.

major comments (3)
  1. [Abstract / Design overview] The central claim that the narrow supervisor can be implemented and confined without root or new attack surfaces (abstract and design overview) is load-bearing; the skeptic note correctly flags that mediation paths (e.g., ptrace, socket filters, or user-space policy evaluation) must themselves be TOCTOU-safe and confinable by the same static rules, yet no concrete threat model, capability analysis, or proof sketch is supplied to show this holds for a malicious child process.
  2. [Evaluation section (implied by abstract numbers)] Performance claims (roughly 5 ms startup overhead; Redis at bare-metal throughput within measurement noise) are presented without methodology, error bars, workload details, or comparison baselines; this directly affects the practicality argument and must be supported by explicit experimental setup, hardware description, and raw data.
  3. [Policy compilation and supervisor design] The static/dynamic policy split assumes all input-independent rules can be compiled ahead of time, but the paper does not address how this interacts with workloads that have any runtime-dependent but still policy-relevant behavior (e.g., reversible FS effects or HTTP-level decisions) without enlarging the supervisor's attack surface.
minor comments (2)
  1. [Abstract] The abstract states security properties and performance numbers but supplies no threat model, evaluation methodology, or citation to the full experimental section; this should be clarified even if the body contains the details.
  2. [Abstract] Link to https://github.com/multikernel/sandlock is given; the paper should state the commit hash or release tag used for all reported results to support reproducibility.
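
Major comment 2 asks for methodology and error bars behind the startup figure. A minimal sketch of what such a measurement could look like follows: time repeated process launches and report a median and spread rather than a single number. This is not the authors' benchmark harness. The function names are hypothetical, and the sandboxed command line is deliberately left out because the Sandlock CLI is not shown in this summary.

```python
import statistics
import subprocess
import sys
import time


def startup_latencies_ms(cmd, runs=30):
    """Wall-clock time in milliseconds for `runs` sequential launches of cmd."""
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        samples.append((time.perf_counter() - t0) * 1000.0)
    return samples


def summarize(samples):
    """Median, sample standard deviation, and maximum of the samples."""
    return {
        "n": len(samples),
        "median_ms": statistics.median(samples),
        "stdev_ms": statistics.stdev(samples),
        "max_ms": max(samples),
    }


def overhead_ms(baseline, sandboxed):
    """Difference of medians between two sample lists."""
    return statistics.median(sandboxed) - statistics.median(baseline)
```

A baseline could be collected with `startup_latencies_ms([sys.executable, "-c", "pass"])`, then compared against the same command launched through the sandbox. Reporting `summarize` for both runs alongside `overhead_ms` would give the spread the referee asks for.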

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and detailed comments. We believe Sandlock offers a practical approach for AI agent isolation and address each major comment below, proposing revisions to strengthen the presentation of the threat model, evaluation, and policy design.

  1. Referee: [Abstract / Design overview] The central claim that the narrow supervisor can be implemented and confined without root or new attack surfaces (abstract and design overview) is load-bearing; the skeptic note correctly flags that mediation paths (e.g., ptrace, socket filters, or user-space policy evaluation) must themselves be TOCTOU-safe and confinable by the same static rules, yet no concrete threat model, capability analysis, or proof sketch is supplied to show this holds for a malicious child process.

    Authors: We agree this is a load-bearing claim and that the manuscript would benefit from an explicit threat model. In the revised manuscript, we will add a new subsection under Design that presents a concrete threat model for the supervisor, including a capability analysis of mediation paths such as ptrace and user-space evaluation. We will sketch why these paths remain TOCTOU-safe and confinable under the static rules, addressing the concern for malicious child processes. revision: yes

  2. Referee: [Evaluation section (implied by abstract numbers)] Performance claims (roughly 5 ms startup overhead; Redis at bare-metal throughput within measurement noise) are presented without methodology, error bars, workload details, or comparison baselines; this directly affects the practicality argument and must be supported by explicit experimental setup, hardware description, and raw data.

    Authors: The full paper includes an Evaluation section, but we acknowledge the abstract's performance numbers require more supporting detail. We will revise the Evaluation section to include explicit methodology, error bars, workload details, hardware description, comparison baselines, and availability of raw data or scripts for reproducibility. revision: yes

  3. Referee: [Policy compilation and supervisor design] The static/dynamic policy split assumes all input-independent rules can be compiled ahead of time, but the paper does not address how this interacts with workloads that have any runtime-dependent but still policy-relevant behavior (e.g., reversible FS effects or HTTP-level decisions) without enlarging the supervisor's attack surface.

    Authors: We will expand the Policy Compilation and Supervisor Design sections to address runtime-dependent behaviors. Specifically, we will explain how reversible FS effects and HTTP-level decisions are delegated to the supervisor in a way that keeps its attack surface narrow and confined by the static rules, using the pipeline operator as an example. revision: yes

Circularity Check

0 steps flagged

No circularity; claims rest on implementation and benchmarks


The paper is a systems implementation describing Sandlock's use of unprivileged Linux primitives (Landlock, seccomp) for sandboxing. No equations, fitted parameters, predictions, or mathematical derivations appear in the provided text. The static/dynamic policy split is an engineering design choice justified by the workload, not reduced to self-definition or self-citation. Performance claims (5 ms overhead, Redis throughput) are presented as empirical measurements. No load-bearing steps reduce to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The design rests on the existence of suitable unprivileged Linux kernel primitives for policy enforcement and the feasibility of a narrow supervisor without privilege escalation.
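
Whether a given kernel exposes one of these primitives can be checked without privilege. The sketch below queries the Landlock ABI version through the raw `landlock_create_ruleset` system call with the version flag, as documented in landlock(7). It is a probe only and is not part of Sandlock. The syscall number 444 is the value used on x86_64 and aarch64, and other architectures are an untested assumption.

```python
import ctypes
import sys

SYS_landlock_create_ruleset = 444    # x86_64 and aarch64
LANDLOCK_CREATE_RULESET_VERSION = 1  # flag: return the ABI version only


def landlock_abi_version():
    """Return the Landlock ABI version (>= 1), or 0 if it is unavailable."""
    if not sys.platform.startswith("linux"):
        return 0
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    ret = libc.syscall(
        ctypes.c_long(SYS_landlock_create_ruleset),
        ctypes.c_void_p(None),  # attr must be NULL for a version query
        ctypes.c_size_t(0),     # size must be 0 for a version query
        ctypes.c_uint(LANDLOCK_CREATE_RULESET_VERSION),
    )
    if ret < 0:
        # Typical causes: ENOSYS (not built in), EOPNOTSUPP (disabled at
        # boot), or an outer seccomp filter rejecting the call.
        return 0
    return int(ret)
```

A result of 0 means static filesystem rules cannot be enforced through Landlock on that host, which is the kind of case the axiom above silently excludes.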

axioms (1)
  • domain assumption: The Linux kernel exposes unprivileged mechanisms sufficient to enforce the required filesystem, network, IPC, and syscall policies at the stated granularity.
    Invoked throughout the abstract description of enforcement without root or namespaces.
invented entities (1)
  • Sandlock sandbox with static/dynamic policy split (no independent evidence)
    purpose: To provide lightweight confinement for AI agent code
    New system introduced by the paper; no independent evidence supplied in abstract.

pith-pipeline@v0.9.1-grok · 5741 in / 1184 out tokens · 24185 ms · 2026-06-29T21:12:11.375539+00:00 · methodology


Reference graph

Works this paper leans on

43 extracted references · 8 canonical work pages · 1 internal anchor

  1. [1]

    Alexandru Agache, Marc Brooker, Alexandra Iordache, Anthony Liguori, Rolf Neugebauer, Phil Piwonka, and Diana-Maria Popa. 2020. Firecracker: Lightweight Virtualization for Serverless Applications. In Proceedings of the 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI ’20). USENIX Association, Santa Clara, CA, USA, 419–434.

  2. [2]

    Anthropic. 2025. Claude Code: An Agentic Coding Tool. https://code.claude.com/docs/en/overview

  3. [3]

    Adam Belay, Andrea Bittau, Ali Mashtizadeh, David Terei, David Mazières, and Christos Kozyrakis. 2012. Dune: Safe User-level Access to Privileged CPU Features. In Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’12). USENIX Association, Hollywood, CA, USA, 335–348.

  4. [4]

    Christian Brauner. 2024. pidfd_getfd(2) — Linux Manual Page. https://man7.org/linux/man-pages/man2/pidfd_getfd.2.html

  5. [5]

    Neil Brown. 2016. Overlay Filesystem. https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt

  6. [6]

    Bubblewrap Project. 2026. Bubblewrap: Low-Level Unprivileged Sandboxing Tool. https://github.com/containers/bubblewrap. Accessed: 2026-05-23

  7. [7]

    Edoardo Debenedetti, Ilia Shumailov, Tianqi Fan, Jamie Hayes, Nicholas Carlini, Daniel Fabian, Christoph Kern, Chongyang Shi, Andreas Terzis, and Florian Tramèr. 2025. Defeating Prompt Injections by Design. arXiv preprint arXiv:2503.18813. doi:10.48550/arXiv.2503.18813

  8. [8]

    Docker Inc. 2026. Rootless Mode. https://docs.docker.com/engine/security/rootless/. Accessed: 2026-05-23

  9. [9]

    Will Drewry and Jake Edge. 2024. seccomp(2) — Linux Manual Page. https://man7.org/linux/man-pages/man2/seccomp.2.html

  10. [10]

    Dong Du, Tianyi Yu, Yubin Xia, Binyu Zang, Guanglu Yan, Chenggang Qin, Qixuan Wu, and Haibo Chen. 2020. Catalyzer: Sub-millisecond Startup for Serverless Computing with Initialization-less Booting. In Proceedings of the 25th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’20). ACM, Lausanne...

  11. [11]

    William Findlay, Anil Somayaji, and David Barrera. 2020. bpfbox: Simple Precise Process Confinement with eBPF. In Proceedings of the 2020 ACM SIGSAC Conference on Cloud Computing Security Workshop (CCSW ’20). ACM, Virtual Event, USA, 91–103. doi:10.1145/3411495.3421358

  12. [12]

    Firejail Project. 2026. Firejail: Linux Namespaces and Seccomp-bpf Sandbox. https://github.com/netblue30/firejail. Accessed: 2026-05-23

  13. [13]

    Tal Garfinkel. 2003. Traps and Pitfalls: Practical Problems in System Call Interposition Based Security Tools. In Proceedings of the Network and Distributed System Security Symposium (NDSS). Internet Society, San Diego, CA, USA, 15 pages. https://www.ndss-symposium.org/ndss2003/traps-and-pitfalls-practical-problems-system-call-interposition-based-securi...

  14. [14]

    Google. 2024. gVisor: Application Kernel for Containers. https://gvisor.dev/

  15. [15]

    Tejun Heo. 2015. Control Groups v2. https://www.kernel.org/doc/Documentation/cgroup-v2.txt

  16. [16]

    Michael Kerrisk. 2013. Namespaces in Operation. LWN.net. https://lwn.net/Articles/531114/

  17. [17]

    Michael Kerrisk. 2024. seccomp_unotify(2) — Linux Manual Page. https://man7.org/linux/man-pages/man2/seccomp_unotify.2.html

  18. [18]

    Taesoo Kim and Nickolai Zeldovich. 2013. Practical and Effective Sandboxing for Non-root Users. In Proceedings of the 2013 USENIX Annual Technical Conference (USENIX ATC ’13). USENIX Association, San Jose, CA, USA, 139–144. https://www.usenix.org/conference/atc13/technical-sessions/presentation/kim

  19. [19]

    Linux man-pages Project. 2025. capabilities(7) — Linux Manual Page. https://man7.org/linux/man-pages/man7/capabilities.7.html

  20. [20]

    Linux man-pages Project. 2025. landlock(7) — Linux Manual Page. https://man7.org/linux/man-pages/man7/landlock.7.html

  21. [21]

    James Litton, Anjo Vahldiek-Oberwagner, Eslam Elnikety, Deepak Garg, Bobby Bhattacharjee, and Peter Druschel. 2016. Light-Weight Contexts: An OS Abstraction for Safety and Performance. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI ’16). USENIX Association, Savannah, GA, USA, 49–64.

  22. [22]

    Dirk Merkel. 2014. Docker: Lightweight Linux Containers for Consistent Development and Deployment. Linux Journal. https://www.linuxjournal.com/content/docker-lightweight-linux-containers-consistent-development-and-deployment

  23. [23]

    Model Context Protocol. 2026. Tools. https://modelcontextprotocol.io/specification/draft/server/tools. Accessed: 2026-05-23

  24. [24]

    Edmund B. Nightingale, Peter M. Chen, and Jason Flinn. 2005. Speculative Execution in a Distributed File System. In Proceedings of the Twentieth ACM Symposium on Operating Systems Principles (SOSP ’05). ACM, Brighton, United Kingdom, 191–205.

  25. [25]

    Open Container Initiative. 2023. Open Container Initiative Runtime Specification, Version 1.1.0. https://oci-playground.github.io/specs-latest/specs/runtime/v1.1.0/oci-runtime-spec.html

  26. [26]

    Phylum. 2023. Birdcage: Cross-platform Embeddable Sandboxing. https://github.com/phylum-dev/birdcage

  27. [27]

    Donald E. Porter, Owen S. Hofmann, Christopher J. Rossbach, Alexander Benn, and Emmett Witchel. 2009. Operating System Transactions. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles (SOSP ’09). ACM, Big Sky, MT, USA, 161–176.

  28. [28]

    Mickaël Salaün. 2017. Landlock LSM: Toward Unprivileged Sandboxing. Linux Security Summit Europe. https://landlock.io/talks/2017-09-14_landlock-lss.pdf

  29. [29]

    Robert Święcki. 2024. nsjail: A Light-Weight Process Isolation Tool. https://github.com/google/nsjail

  30. [30]

    The ChromiumOS Authors. 2024. Minijail. https://google.github.io/minijail/

  31. [31]

    Cong Wang. 2025. Agentry: Orchestration and Memory for Multi-Agent Systems. https://github.com/amtp-protocol/agentry

  32. [32]

    Cong Wang and Yusheng Zheng. 2026. Fork, Explore, Commit: OS Primitives for Agentic Exploration. arXiv preprint arXiv:2602.08199. doi:10.48550/arXiv.2602.08199

  33. [33]

    Xingyao Wang, Boxuan Li, Yufan Song, Frank F. Xu, Xiangru Tang, Mingchen Zhuge, Jiayi Pan, Yueqi Song, Bowen Li, Jaskirat Singh, Hoang H. Tran, Fuqiang Li, Ren Ma, Mingzhang Zheng, Bill Qian, Yanjun Shao, Niklas Muennighoff, Yizhe Zhang, Binyuan Hui, Junyang Lin, Robert Brennan, Hao Peng, Heng Ji, and Graham Neubig. 2025. OpenHands: An Open Platform for A...

  34. [34]

    Robert N. M. Watson, Jonathan Anderson, Ben Laurie, and Kris Kennaway. 2010. Capsicum: Practical Capabilities for UNIX. In Proceedings of the 19th USENIX Security Symposium (USENIX Security ’10). USENIX Association, Washington, DC, USA, 29–46.

  35. [35]

    Simon Willison. 2023. The Dual LLM Pattern for Building AI Assistants That Can Resist Prompt Injection. https://simonwillison.net/2023/Apr/25/dual-llm-pattern/

  36. [36]

    Simon Willison. 2025. The Lethal Trifecta for AI Agents: Private Data, Untrusted Content, and External Communication. https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/

  37. [37]

    Charles P. Wright, Jay Dave, Puja Gupta, Harikesavan Krishnan, David P. Quigley, Erez Zadok, and Mohammad Nayyer Zubair. 2006. Versatility and Unix Semantics in Namespace Unification. ACM Transactions on Storage 2, 1 (2006), 74–105.

  38. [38]

    John Yang, Carlos E. Jimenez, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, and Ofir Press. 2024. SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering. In Advances in Neural Information Processing Systems 37 (NeurIPS). Curran Associates, Inc., Vancouver, BC, Canada, 50528–50652. doi:10.52202/079017-1601

  39. [39]

    Yiwei Yang, Aibo Hu, Yusheng Zheng, Brian Zhao, Xinqi Zhang, Dawei Xiang, Kexin Chu, Wei Zhang, and Andi Quinn. 2024. MVVM: Deploy Your AI Agents—Securely, Efficiently, Everywhere. arXiv preprint arXiv:2410.15894. doi:10.48550/arXiv.2410.15894

  40. [40]

    Yusheng Zheng, Jiakun Fan, Quanzhi Fu, Yiwei Yang, Wei Zhang, and Andi Quinn

  41. [41]

    AgentCgroup: Understanding and Controlling OS Resources of AI Agents. arXiv preprint arXiv:2602.09345. doi:10.48550/arXiv.2602.09345

  42. [42]

    Yusheng Zheng, Yiwei Yang, Maolin Chen, and Andrew Quinn. 2024. Kgent: Kernel Extensions Large Language Model Agent. In Proceedings of the ACM SIGCOMM 2024 Workshop on eBPF and Kernel Extensions (eBPF ’24). ACM, Sydney, NSW, Australia, ...

    Sandlock: Confining AI Agent Code with Unprivileged Linux Primitives. Agentic OS Workshop, March 23, 2026, Pittsburgh, USA.

  43. [43]

    Yusheng Zheng, Yiwei Yang, Wei Zhang, and Andi Quinn. 2026. ACRFence: Preventing Semantic Rollback Attacks in Agent Checkpoint-Restore. arXiv preprint arXiv:2603.20625. doi:10.48550/arXiv.2603.20625