Sandlock: Confining AI Agent Code with Unprivileged Linux Primitives
Agentic OS Workshop, March 23, 2026, Pittsburgh, USA
Pith reviewed 2026-06-29 21:12 UTC · model grok-4.3
The pith
Sandlock confines untrusted AI agent code on Linux using only unprivileged primitives by compiling static policies into kernel rules while routing runtime decisions to a narrow supervisor.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Sandlock is a lightweight Linux process sandbox organized around a single split: static, input-independent policy is compiled into kernel-enforced rules, and a narrow supervisor handles runtime-dependent decisions and virtualized effects. With this split it enforces filesystem, network, IPC, and syscall policies without root, cgroups, images, or mandatory namespaces. It also supports dynamic network decisions, HTTP-level access control, TOCTOU-safe inspection of execve arguments, reversible filesystem effects, and a pipeline operator for per-stage confinement.
What carries the argument
The split between static policy compiled into kernel-enforced rules and a narrow supervisor for runtime-dependent decisions and virtualized effects.
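To make the split concrete, here is a minimal Python sketch of how a policy might be partitioned before launch. The `Rule` fields, the rule kinds, and the example targets are all hypothetical; nothing here reflects Sandlock's actual policy schema or API, which this summary does not describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One policy rule. Field names are hypothetical, not Sandlock's schema."""
    kind: str                  # e.g. "fs_read", "net_connect", "http_allow"
    target: str                # a path, a host, or a URL pattern
    needs_runtime_input: bool  # True if the decision depends on run-time data


def split_policy(rules):
    """Partition rules into a static set (compiled once into kernel-enforced
    rules) and a dynamic set (deferred to the supervisor at run time)."""
    static = [r for r in rules if not r.needs_runtime_input]
    dynamic = [r for r in rules if r.needs_runtime_input]
    return static, dynamic


# Illustrative policy only; the targets are made up.
policy = [
    Rule("fs_read", "/usr", False),
    Rule("fs_write", "/tmp/agent-work", False),
    Rule("net_connect", "host-resolved-at-runtime", True),
    Rule("http_allow", "GET https://example.com/api/*", True),
]

static_rules, dynamic_rules = split_policy(policy)
print(len(static_rules), "static;", len(dynamic_rules), "dynamic")
```

The point of the partition is that the supervisor only ever sees the second list, which is what would keep it narrow if the premise holds.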
If this is right
- Filesystem, network, IPC, and syscall policies are enforced without root or mandatory namespaces.
- Dynamic network decisions and HTTP-level access control become possible inside the sandbox.
- TOCTOU-safe inspection of execve arguments and reversible filesystem effects are supported.
- Startup overhead is roughly 5 ms on the authors' workstation, and Redis runs at bare-metal throughput within measurement noise.
- A pipeline operator enables per-stage confinement that separates data, network, and untrusted-content capabilities.
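As an illustration of per-stage confinement, the sketch below models a pipeline whose stages each carry their own capability set and flags any stage that would hold data, network, and untrusted-content access at once. The `Stage` and `Pipeline` classes and the use of Python's `|` operator are assumptions made for illustration; Sandlock's real pipeline operator syntax is not given in this summary.

```python
from dataclasses import dataclass

# The three capabilities the abstract says the pipeline operator separates.
PRIVATE_DATA = "private_data"
NETWORK = "network"
UNTRUSTED_CONTENT = "untrusted_content"
ALL_THREE = frozenset({PRIVATE_DATA, NETWORK, UNTRUSTED_CONTENT})


@dataclass(frozen=True)
class Stage:
    """A hypothetical pipeline stage with its own capability set."""
    name: str
    caps: frozenset

    def __or__(self, other):
        return Pipeline((self,)) | other


@dataclass(frozen=True)
class Pipeline:
    """An ordered sequence of stages, each confined separately."""
    stages: tuple

    def __or__(self, other):
        if isinstance(other, Stage):
            return Pipeline(self.stages + (other,))
        return Pipeline(self.stages + other.stages)

    def violations(self):
        """Names of stages that hold all three capabilities at once."""
        return [s.name for s in self.stages if ALL_THREE <= s.caps]


fetch = Stage("fetch", frozenset({NETWORK, UNTRUSTED_CONTENT}))
summarize = Stage("summarize", frozenset({UNTRUSTED_CONTENT}))
store = Stage("store", frozenset({PRIVATE_DATA}))

pipeline = fetch | summarize | store
monolith = Stage("monolith", ALL_THREE)

print(pipeline.violations())            # no stage holds all three
print((fetch | monolith).violations())  # the monolith stage is flagged
```

Splitting the work this way means that compromising any one stage yields at most the capabilities granted to that stage.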
Where Pith is reading between the lines
- The same static-plus-narrow-supervisor split could be applied to other untrusted code sources beyond AI agents, such as downloaded build scripts.
- Policy compilation might be automated from high-level descriptions of expected agent behavior.
- If the supervisor stays narrow, the technique could be composed with existing process tracing tools without privilege escalation.
Load-bearing premise
The narrow supervisor can be implemented without introducing new attack surfaces or requiring privileged operations, and static policy compilation suffices for all input-independent rules in the target workload.
What would settle it
A concrete test that launches an untrusted process attempting a TOCTOU race on execve arguments or an unauthorized network connection and checks whether the sandbox blocks it while the supervisor remains unprivileged.
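The race at issue can be illustrated without a kernel. The following Python simulation, which is not Sandlock's mechanism, contrasts a mediator that checks a child's argument buffer and then lets it be re-read with one that decides and acts on a private snapshot. The allowlist, the function names, and the in-process `race` callback are hypothetical stand-ins for a second thread rewriting memory between check and use.

```python
# Hypothetical allowlist of executables the policy permits.
ALLOWED = {b"/usr/bin/git"}


def unsafe_mediate(shared: bytearray, race) -> bytes:
    """Check the child's buffer, then re-read it at time of use."""
    allowed = bytes(shared) in ALLOWED        # time of check
    race(shared)                              # child rewrites its own memory
    return bytes(shared) if allowed else b""  # time of use: second read


def safe_mediate(shared: bytearray, race) -> bytes:
    """Copy once, then check and use only the private copy."""
    snapshot = bytes(shared)                  # single read into private memory
    allowed = snapshot in ALLOWED
    race(shared)                              # the rewrite no longer matters
    return snapshot if allowed else b""


def swap_to_shell(buf: bytearray) -> None:
    """Simulated attacker: replace the path after the check has passed."""
    buf[:] = b"/bin/sh"


print(unsafe_mediate(bytearray(b"/usr/bin/git"), swap_to_shell))
print(safe_mediate(bytearray(b"/usr/bin/git"), swap_to_shell))
```

A real test along the lines described above would run the same attack with actual threads against the sandbox and confirm that the executed path is the one that was checked.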
Original abstract
AI agents increasingly run untrusted code on developer machines: shell commands generated by language models, third-party scripts retrieved at runtime, and tool plugins of unknown provenance. Existing isolation mechanisms impose tradeoffs that fit this workload poorly: containers and microVMs add privilege, image-management, and startup costs, while ad-hoc process controls and wrappers (e.g. chroot, ulimit) provide weak guarantees and little syscall-level control. Sandlock is a lightweight Linux process sandbox organized around a simple split: static, input-independent policy is compiled into kernel-enforced rules, while a narrow supervisor handles runtime-dependent decisions and virtualized effects. This split lets Sandlock enforce filesystem, network, IPC, and syscall policies without root, cgroups, images, or mandatory namespaces. It also supports dynamic network decisions, HTTP-level access control, TOCTOU-safe inspection of execve arguments, and reversible filesystem effects. On our workstation, Sandlock adds roughly 5 ms of startup overhead and runs Redis at bare-metal throughput (within measurement noise); its pipeline operator further supports per-stage confinement for separating data, network, and untrusted-content capabilities. Sandlock is available at https://github.com/multikernel/sandlock
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents Sandlock, a lightweight Linux process sandbox for AI agent workloads that splits static, input-independent policies (compiled to kernel-enforced rules via Landlock, seccomp, and similar unprivileged primitives) from a narrow supervisor process that handles only runtime-dependent decisions such as dynamic network access, TOCTOU-safe execve inspection, and reversible filesystem effects. It claims to enforce filesystem, network, IPC, and syscall policies without root, cgroups, container images, or mandatory namespaces, while adding ~5 ms startup overhead and achieving bare-metal throughput for Redis; a pipeline operator is also described for per-stage confinement.
Significance. If the central split holds and the supervisor can be confined without introducing new privileged surfaces or bypasses, Sandlock would offer a practical middle ground between heavy container/VM mechanisms and weak ad-hoc controls for untrusted AI-generated code. The emphasis on unprivileged primitives, open-source release, and reported performance numbers (if supported by reproducible measurements) would be notable strengths for the systems-security community.
major comments (3)
- [Abstract / Design overview] The central claim that the narrow supervisor can be implemented and confined without root or new attack surfaces (abstract and design overview) is load-bearing; the skeptic note correctly flags that mediation paths (e.g., ptrace, socket filters, or user-space policy evaluation) must themselves be TOCTOU-safe and confinable by the same static rules, yet no concrete threat model, capability analysis, or proof sketch is supplied to show this holds for a malicious child process.
- [Evaluation section (implied by abstract numbers)] Performance claims (roughly 5 ms startup overhead; Redis at bare-metal throughput within measurement noise) are presented without methodology, error bars, workload details, or comparison baselines; this directly affects the practicality argument and must be supported by explicit experimental setup, hardware description, and raw data.
- [Policy compilation and supervisor design] The static/dynamic policy split assumes all input-independent rules can be compiled ahead of time, but the paper does not address how this interacts with workloads that have any runtime-dependent but still policy-relevant behavior (e.g., reversible FS effects or HTTP-level decisions) without enlarging the supervisor's attack surface.
minor comments (2)
- [Abstract] The abstract states security properties and performance numbers but supplies no threat model, evaluation methodology, or citation to the full experimental section; this should be clarified even if the body contains the details.
- [Abstract] The abstract links to https://github.com/multikernel/sandlock; the paper should state the commit hash or release tag used for all reported results to support reproducibility.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and detailed comments. We believe Sandlock offers a practical approach for AI agent isolation and address each major comment below, proposing revisions to strengthen the presentation of the threat model, evaluation, and policy design.
Point-by-point responses
- Referee: [Abstract / Design overview] The central claim that the narrow supervisor can be implemented and confined without root or new attack surfaces (abstract and design overview) is load-bearing; the skeptic note correctly flags that mediation paths (e.g., ptrace, socket filters, or user-space policy evaluation) must themselves be TOCTOU-safe and confinable by the same static rules, yet no concrete threat model, capability analysis, or proof sketch is supplied to show this holds for a malicious child process.
  Authors: We agree this is a load-bearing claim and that the manuscript would benefit from an explicit threat model. In the revised manuscript, we will add a new subsection under Design that presents a concrete threat model for the supervisor, including a capability analysis of mediation paths such as ptrace and user-space evaluation. We will sketch why these paths remain TOCTOU-safe and confinable under the static rules, addressing the concern for malicious child processes.
  Revision: yes
- Referee: [Evaluation section (implied by abstract numbers)] Performance claims (roughly 5 ms startup overhead; Redis at bare-metal throughput within measurement noise) are presented without methodology, error bars, workload details, or comparison baselines; this directly affects the practicality argument and must be supported by explicit experimental setup, hardware description, and raw data.
  Authors: The full paper includes an Evaluation section, but we acknowledge the abstract's performance numbers require more supporting detail. We will revise the Evaluation section to include explicit methodology, error bars, workload details, hardware description, comparison baselines, and availability of raw data or scripts for reproducibility.
  Revision: yes
- Referee: [Policy compilation and supervisor design] The static/dynamic policy split assumes all input-independent rules can be compiled ahead of time, but the paper does not address how this interacts with workloads that have any runtime-dependent but still policy-relevant behavior (e.g., reversible FS effects or HTTP-level decisions) without enlarging the supervisor's attack surface.
  Authors: We will expand the Policy Compilation and Supervisor Design sections to address runtime-dependent behaviors. Specifically, we will explain how reversible FS effects and HTTP-level decisions are delegated to the supervisor in a way that keeps its attack surface narrow and confined by the static rules, using the pipeline operator as an example.
  Revision: yes
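The methodology promised in the second response could start from something like the following Python sketch, which times repeated launches of a command and reports a mean with a standard deviation. It measures only an unsandboxed baseline; the sandboxed command line is left as a commented placeholder because the launcher's CLI is not given in this summary, and the helper names are hypothetical.

```python
import statistics
import subprocess
import sys
import time


def startup_samples(argv, runs=20):
    """Wall-clock launch-to-exit time in milliseconds for each of `runs` runs."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Return (mean, sample standard deviation) in milliseconds."""
    return statistics.mean(samples), statistics.stdev(samples)


def overhead_ms(baseline, sandboxed):
    """Difference of means; report it alongside both standard deviations."""
    return statistics.mean(sandboxed) - statistics.mean(baseline)


# Baseline: a trivial process, so launch cost dominates the measurement.
baseline_cmd = [sys.executable, "-c", "pass"]
# Placeholder (assumption): prefix baseline_cmd with the sandbox launcher.
# sandboxed_cmd = ["<sandbox-launcher>", "--"] + baseline_cmd

baseline = startup_samples(baseline_cmd, runs=5)
mean_ms, stdev_ms = summarize(baseline)
print(f"baseline: {mean_ms:.2f} ms +/- {stdev_ms:.2f} ms over {len(baseline)} runs")
```

Publishing the raw samples next to the summary would let readers judge whether a figure such as "roughly 5 ms" is distinguishable from run-to-run variation.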
Circularity Check
No circularity; claims rest on implementation and benchmarks
The paper is a systems implementation describing Sandlock's use of unprivileged Linux primitives (Landlock, seccomp) for sandboxing. No equations, fitted parameters, predictions, or mathematical derivations appear in the provided text. The static/dynamic policy split is an engineering design choice justified by the workload, not reduced to self-definition or self-citation. Performance claims (5 ms overhead, Redis throughput) are presented as empirical measurements. No load-bearing steps reduce to inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: the Linux kernel exposes unprivileged mechanisms sufficient to enforce the required filesystem, network, IPC, and syscall policies at the stated granularity.
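Part of this assumption can be checked on a given machine. The sketch below asks the running kernel for its Landlock ABI version through the documented version query of `landlock_create_ruleset`, which requires no privilege. It assumes syscall number 444, as used on x86-64 and arm64, and returns `None` on non-Linux platforms or when Landlock is unavailable. It probes only Landlock, not seccomp or any other primitive the paper relies on.

```python
import ctypes
import sys

# Assumption: 444 is the landlock_create_ruleset number on x86-64 and arm64.
SYS_LANDLOCK_CREATE_RULESET = 444
LANDLOCK_CREATE_RULESET_VERSION = 1 << 0


def landlock_abi_version():
    """Return the kernel's Landlock ABI version, or None if unavailable."""
    if not sys.platform.startswith("linux"):
        return None
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        libc.syscall.restype = ctypes.c_long
        # Version query: attr must be NULL and size 0 when the flag is set.
        result = libc.syscall(
            ctypes.c_long(SYS_LANDLOCK_CREATE_RULESET),
            ctypes.c_void_p(None),
            ctypes.c_size_t(0),
            ctypes.c_uint32(LANDLOCK_CREATE_RULESET_VERSION),
        )
    except (OSError, AttributeError):
        return None
    if result < 0:
        # Typically ENOSYS (not built in) or EOPNOTSUPP (disabled at boot).
        return None
    return int(result)


version = landlock_abi_version()
print("Landlock ABI version:", version)
```

The ABI version matters because Landlock's coverage has grown over successive kernel releases, so the granularity available to a sandbox depends on the kernel it runs on.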
invented entities (1)
- Sandlock sandbox with static/dynamic policy split (no independent evidence)