Language-Based Agent Control

Loris D'Antoni; Nadia Polikarpova; Timothy Zhou

arXiv: 2605.12863 · v1 · pith:NGAEA3IJnew · submitted 2026-05-13 · 💻 cs.PL · cs.AI · cs.CR

Pith reviewed 2026-06-30 21:39 UTC · model grok-4.3

classification 💻 cs.PL · cs.AI · cs.CR

keywords language-based agent control, type systems, agent safety, information flow control, access control, programming models, data provenance, AI agents

The pith

Requiring agents to generate well-typed programs allows safety policies to apply uniformly to both agent and developer code.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes language-based agent control as a model in which agents must produce programs that type-check in the context of the surrounding scaffolding code. Unsafe outputs are rejected by the type checker before any execution occurs. A sympathetic reader would care because this extends established language-based security techniques to agentic applications while still permitting side-effect-free computation and recursive subagent calls under the same or stricter policies. The method is shown through case studies covering I/O sandboxing, data provenance, and information-flow control.
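To make the reject-before-execute step concrete, here is a minimal Python sketch. It is not the paper's type system: a simple name-scoping check over the AST stands in for a real type checker, and `check_program`, `run_if_accepted`, and the entries of `env` are hypothetical names invented for illustration. The point is only the control flow: the agent's program is analysed statically against what the scaffolding exposes, and nothing runs unless the check passes.

```python
import ast


def check_program(source, allowed):
    """Statically list violations; an empty list means the program is accepted.

    Stand-in for a type checker (assumption): it only checks that every name
    the program reads is either exposed by the scaffolding or bound locally.
    """
    tree = ast.parse(source)
    violations = []
    bound = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            violations.append("imports are not allowed")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("_"):
            violations.append(f"private attribute access: {node.attr}")
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.FunctionDef):
            bound.add(node.name)
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            if node.id not in allowed and node.id not in bound:
                violations.append(f"unknown name: {node.id}")
    return violations


def run_if_accepted(source, env):
    """Execute the agent program only if the static check passes."""
    violations = check_program(source, set(env))
    if violations:
        raise PermissionError("; ".join(violations))
    scope = {"__builtins__": {}, **env}  # expose only the scaffolding API
    exec(source, scope)
    return scope.get("result")


# Hypothetical scaffolding API handed to the agent.
env = {"read_notes": lambda: ["a", "b"], "count": len}

accepted = "notes = read_notes()\nresult = count(notes)"
rejected = "import os\nresult = os.listdir('.')"

print(run_if_accepted(accepted, env))
try:
    run_if_accepted(rejected, env)
except PermissionError as err:
    print("rejected:", err)
```

A check this coarse rejects many harmless programs and proves nothing about side effects; a real LBAC implementation would presumably rely on the host language's type checker instead.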

Core claim

LBAC requires agents to generate programs that are well typed in the context of the surrounding scaffolding code. Unsafe programs are rejected by the type-checker before execution, allowing policies to apply uniformly across the entire application, including both agent-generated behavior and developer-written scaffolding. Agents may still perform arbitrary side-effect-free computation and recursively invoke subagents, which retain full tool access subject to the same or potentially more restrictive policies.

What carries the argument

Language-based agent control (LBAC), the requirement that agent-generated programs must themselves be well typed within the scaffolding code so that static and runtime enforcement can cover the whole application.

If this is right

Policies for access control, information flow, and data provenance apply uniformly to both scaffolding and agent-generated code.
Unsafe agent outputs are rejected before execution by the existing type checker.
Agents retain the ability to perform arbitrary side-effect-free computation.
Subagents can be invoked recursively with full or more restrictive tool access under the same policy framework.
The same mechanisms demonstrated for filesystem capabilities, provenance tracking, and information-flow control become available for other policies.
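The filesystem-capability and subagent points above can be illustrated with an object-capability sketch. The class `DirCap` and its methods are hypothetical and are not taken from the paper. Python also cannot stop code from reaching `open` or the private `_root` field by other means, so this shows only the shape of the idea: the agent program receives a capability value, all file access goes through it, and a subagent can be handed an attenuated (more restrictive) capability.

```python
from pathlib import Path


class DirCap:
    """Hypothetical read-only capability for one directory subtree."""

    def __init__(self, root):
        self._root = Path(root).resolve()

    def _resolve(self, relative):
        # Resolve symlinks and ".." first, then check containment.
        target = (self._root / relative).resolve()
        if not target.is_relative_to(self._root):  # Python 3.9+
            raise PermissionError(f"{relative!r} escapes the capability root")
        return target

    def read_text(self, relative):
        return self._resolve(relative).read_text()

    def restrict(self, subdir):
        """Attenuate: return a capability limited to a subdirectory."""
        return DirCap(self._resolve(subdir))


def subagent_program(cap):
    # The subagent sees only the attenuated capability it was given.
    return cap.read_text("inner.txt")


def agent_program(cap):
    # The agent reads within its own root and delegates a narrower view.
    notes = cap.read_text("notes.txt")
    return notes + "/" + subagent_program(cap.restrict("sub"))
```

In a statically checked setting, the guarantee that agent code touches the filesystem only through such a value would come from the type checker rather than from convention.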

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Existing typed languages could host agent scaffolding directly, letting developers reuse familiar type systems for agent safety.
Runtime monitoring might become less necessary if type enforcement covers the agent portion of the program.
The model could be tested by porting an existing agent framework into a statically typed language and measuring how often agents succeed in producing well-typed solutions.

Load-bearing premise

Agents can generate programs that satisfy the type system of the scaffolding code while retaining substantial expressiveness including arbitrary side-effect-free computation and recursive subagent invocation.

What would settle it

An experiment in which every agent attempt to solve a useful task either produces an ill-typed program that the checker rejects or loses necessary functionality when restricted to well-typed programs only.
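The bookkeeping for such an experiment could look roughly like the sketch below. Everything here is an assumption for illustration: `acceptance_summary` and `syntax_only_checker` are hypothetical names, and the stand-in checker accepts any program that merely parses, where a real study would call the scaffolding language's type checker.

```python
import ast
from collections import defaultdict


def syntax_only_checker(source):
    """Stand-in checker (assumption): accept any program that parses."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def acceptance_summary(attempts, checker):
    """Summarise checker outcomes over (task_id, source) attempts.

    Returns (fraction of attempts accepted,
             fraction of tasks with at least one accepted attempt).
    """
    accepted = 0
    total = 0
    solved = defaultdict(bool)
    for task_id, source in attempts:
        ok = checker(source)
        total += 1
        accepted += ok
        solved[task_id] = solved[task_id] or ok
    if total == 0:
        return 0.0, 0.0
    return accepted / total, sum(solved.values()) / len(solved)


attempts = [("t1", "x = 1"), ("t1", "x = ("), ("t2", "def f(:")]
print(acceptance_summary(attempts, syntax_only_checker))
```

Acceptance by the checker measures only well-typedness. Whether an accepted program still solves the task has to be judged separately, which is the second half of the condition stated above.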

Figures

Figures reproduced from arXiv: 2605.12863 by Loris D'Antoni, Nadia Polikarpova, Timothy Zhou.

**Figure 2.** Overview of the Language-Based Agent Control (LBAC) model. The entire agentic system ...

Original abstract

This paper introduces language-based agent control (LBAC), a new programming model for agentic applications that brings techniques from programming languages and language-based security to the problem of agent control. In conventional programming, combinations of static typing and runtime enforcement have long been used to guarantee that well-typed programs satisfy user-specified policies, including policies for access control, information flow, data provenance, and more. The key idea behind LBAC is to extend these guarantees to agentic applications by requiring agents to generate programs that are themselves well typed in the context of the surrounding scaffolding code. Unsafe programs are rejected by the type-checker before execution, allowing policies to apply uniformly across the entire application, including both agent-generated behavior and developer-written scaffolding. At the same time, LBAC preserves substantial expressiveness: agents may perform arbitrary side-effect-free computation and recursively invoke subagents, which retain full tool access subject to the same -- or potentially more restrictive -- policies. We demonstrate LBAC with three case studies: I/O sandboxing via filesystem capabilities, data provenance, and information-flow control.
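As a rough illustration of the information-flow case study, the following sketch tracks security labels on values. It is a dynamic check in the spirit of label-based IFC, not the paper's mechanism, which the abstract describes as combining static typing with runtime enforcement. The two-level lattice and the names `Labeled`, `lift`, and `send` are assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical two-point lattice: public flows to secret, not the reverse.
LEVELS = {"public": 0, "secret": 1}


@dataclass(frozen=True)
class Labeled:
    value: object
    label: str = "public"


def join(a, b):
    """Least upper bound of two labels."""
    return a if LEVELS[a] >= LEVELS[b] else b


def lift(fn, *args):
    """Apply a pure function to labeled inputs.

    The result carries the join of the input labels, so computing on
    secret data yields secret data.
    """
    label = "public"
    for arg in args:
        label = join(label, arg.label)
    return Labeled(fn(*(arg.value for arg in args)), label)


def send(clearance, message, outbox):
    """Sink: refuse messages labeled above the channel's clearance."""
    if LEVELS[message.label] > LEVELS[clearance]:
        raise PermissionError(
            f"{message.label} data cannot flow to a {clearance} channel"
        )
    outbox.append(message.value)


greeting = Labeled("hi ")
token = Labeled("tok", "secret")
message = lift(lambda a, b: a + b, greeting, token)  # labeled "secret"
```

Here the agent is free to run arbitrary pure functions through `lift`; only the sink decides whether the result may leave. A static IFC type system would reject the offending `send` call before execution instead of raising at run time.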

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

LBAC frames agent control as a typing problem for uniform policy enforcement, but the abstract supplies no type system, generation method, or results, so the feasibility claim stays untested.

The paper's main move is to treat agent-generated behavior as code that must type-check in the context of the developer's scaffolding. This lets existing language-based security tools—static typing plus runtime checks—apply the same policies to both parts of the system. The three case studies (filesystem capabilities, provenance, information flow) are presented as direct applications of that idea, and the text is clear that subagents can inherit the same or tighter restrictions while still allowing side-effect-free computation and recursion.

What stands out is the clean transfer of PL ideas to the agent setting without inventing new primitives from scratch. The authors correctly identify that the type checker becomes the gate before any execution happens, which avoids some of the monitoring overhead that current agent frameworks rely on.

The central weakness is the missing evidence on whether the approach can actually be used. The abstract asserts that agents can still produce expressive programs while satisfying the type discipline, yet it gives no type rules, no synthesis procedure, no success metrics, and no discussion of how an agent would reliably produce well-typed output instead of getting rejected. If the required types turn out to be too restrictive for practical generation, the security guarantee holds only for trivial cases. That assumption is load-bearing and currently unsupported.

This is aimed at programming-language researchers who work on security and at people building agent frameworks who want enforceable static guarantees. It is worth sending to peer review because the framing is distinct from existing runtime or prompt-based controls, but any serious review will need the authors to show a concrete type system and at least preliminary generation results before the claims can be evaluated.

Referee Report

2 major / 1 minor

Summary. The paper introduces Language-Based Agent Control (LBAC), a programming model for agentic applications that extends language-based security techniques from conventional programming. The core idea requires agents to generate programs that are well-typed in the context of surrounding scaffolding code, so that type checkers can reject unsafe programs before execution and enforce policies (access control, information flow, provenance) uniformly across both developer-written and agent-generated code. The approach claims to preserve substantial expressiveness, allowing arbitrary side-effect-free computation and recursive invocation of subagents that retain full or more restrictive tool access. Three case studies are presented to demonstrate the idea: I/O sandboxing via filesystem capabilities, data provenance, and information-flow control.

Significance. If the central claims are substantiated with concrete type-system design and evidence that agents can reliably synthesize the required programs, LBAC would represent a meaningful application of established PL techniques to the emerging problem of controlling LLM-based agents. It could provide static, uniform security guarantees without requiring separate runtime monitors for agent behavior, which would be of interest to both the programming-languages and AI-safety communities.

major comments (2)

[Abstract] The manuscript asserts that a single type discipline can simultaneously (a) enforce uniform policies across scaffolding and agent code and (b) still permit arbitrary side-effect-free computation plus recursive subagent calls with equal or stricter tool access. No formal type-system definition, no generation algorithm, and no success-rate or expressiveness measurements are supplied, making it impossible to evaluate whether the two requirements are compatible. This balance is load-bearing for the central claim.
[Case studies] The three demonstrations (I/O sandboxing, data provenance, information-flow control) are cited as evidence that the approach works, yet the text contains no implementation details, no description of how agents are prompted or constrained to produce well-typed programs, and no analysis of type-checking outcomes or policy violations. Without these, the case studies cannot support the feasibility or expressiveness claims.

minor comments (1)

[Abstract] The abstract would benefit from a short related-work paragraph situating LBAC with respect to prior language-based security systems (e.g., those using capabilities or IFC) and existing agent-control frameworks.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript introducing Language-Based Agent Control (LBAC). We agree that the central claims require more formal and concrete substantiation through a defined type system and expanded case study details. We will revise the manuscript accordingly. We respond point-by-point to the major comments below.

Referee: [Abstract] The manuscript asserts that a single type discipline can simultaneously (a) enforce uniform policies across scaffolding and agent code and (b) still permit arbitrary side-effect-free computation plus recursive subagent calls with equal or stricter tool access. No formal type-system definition, no generation algorithm, and no success-rate or expressiveness measurements are supplied, making it impossible to evaluate whether the two requirements are compatible. This balance is load-bearing for the central claim.

Authors: We acknowledge that the abstract presents the key claims at a high level without the supporting formalisms. The manuscript focuses on introducing the LBAC model and illustrating it through case studies, but does not include a complete formal type system or quantitative measurements. In the revised manuscript, we will add a formal definition of the type system that demonstrates how policies are enforced uniformly while allowing the specified expressiveness. We will also describe the program generation approach and include any expressiveness analysis possible from the case studies.

Revision: yes

Referee: [Case studies] The three demonstrations (I/O sandboxing, data provenance, information-flow control) are cited as evidence that the approach works, yet the text contains no implementation details, no description of how agents are prompted or constrained to produce well-typed programs, and no analysis of type-checking outcomes or policy violations. Without these, the case studies cannot support the feasibility or expressiveness claims.

Authors: The case studies are currently described conceptually to show how LBAC can be applied to different policies. We agree that they lack the necessary implementation details to fully support the claims. In the revision, we will provide detailed descriptions of the implementations, including agent prompting strategies to ensure well-typed outputs, the type checking process, and analysis of any policy violations or type-checking results. This will strengthen the evidence for feasibility and expressiveness.

Revision: yes

Circularity Check

0 steps flagged

No circularity: conceptual model with case studies only

The paper introduces LBAC as a new programming model extending type systems and language-based security to agents, with claims supported by three case studies (I/O sandboxing, data provenance, IFC). No equations, derivations, fitted parameters, or load-bearing self-citations appear in the abstract or described structure. The central idea—that agents generate well-typed programs in scaffolding context—is presented as a design choice demonstrated empirically, not derived from prior self-referential results or by construction. The derivation chain is self-contained as a proposal without reduction to inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract contains no mathematical derivations, fitted parameters, or new postulated entities; the contribution is described at a conceptual level only.

pith-pipeline@v0.9.1-grok · 5712 in / 1163 out tokens · 31327 ms · 2026-06-30T21:39:15.588331+00:00 · methodology

Reference graph

Works this paper leans on

36 extracted references · 17 canonical work pages

[1] Nada Amin and contributors. Guardians: A static verifier for AI agent workflows. https://github.com/metareflection/guardians, 2026. Open-source implementation of the proposal in Meijer [15].

[2] Anthropic. Build with Claude: Agents. https://docs.anthropic.com/en/docs/build-with-claude/agents, 2025.

[3] Jean-Philippe Bernardy, Mathieu Boespflug, Ryan R. Newton, Simon Peyton Jones, and Arnaud Spiwack. Linear Haskell: practical linearity in a higher-order polymorphic language. Proc. ACM Program. Lang., 2(POPL):5:1–5:29, 2018. doi: 10.1145/3158093. URL https://doi.org/10.1145/3158093

[4] Manuel Costa, Boris Köpf, Aashish Kolluri, Andrew Paverd, Mark Russinovich, Ahmed Salem, Shruti Tople, Lukas Wutschitz, and Santiago Zanella-Béguelin. Securing AI agents with information-flow control, 2025. URL https://arxiv.org/abs/2505.23643

[5] CrewAI Inc. CrewAI. https://github.com/crewAIInc/crewAI, 2024.

[6] Edoardo Debenedetti, Jie Zhang, Mislav Balunović, Luca Beurer-Kellner, Marc Fischer, and Florian Tramèr. AgentDojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents. In The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2024. URL https://openreview.net/forum?id=m1YYAQjO3w

[7] Edoardo Debenedetti, Jie Zhang, Mislav Balunović, Luca Beurer-Kellner, Marc Fischer, and Florian Tramèr. AgentDojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents, 2024. URL https://arxiv.org/abs/2406.13352

[8] Edoardo Debenedetti, Ilia Shumailov, Tianqi Fan, Jamie Hayes, Nicholas Carlini, Daniel Fabian, Christoph Kern, Chongyang Shi, Andreas Terzis, and Florian Tramèr. Defeating prompt injections by design, 2025. URL https://arxiv.org/abs/2503.18813

[9] Jan Hoffmann, Klaus Aehlig, and Martin Hofmann. Multivariate amortized resource analysis. In Thomas Ball and Mooly Sagiv, editors, Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2011, Austin, TX, USA, January 26-28, 2011, pages 357–370. ACM, 2011. doi: 10.1145/1926385.1926427. URL https://doi.org/10.1145...

[10] Tristan Knoth, Di Wang, Nadia Polikarpova, and Jan Hoffmann. Resource-guided program synthesis. In Kathryn S. McKinley and Kathleen Fisher, editors, Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2019, Phoenix, AZ, USA, June 22-26, 2019, pages 253–268. ACM, 2019. doi: 10.1145/3314221.3314602. URL h...

[11] Aashish Kolluri, Rishi Sharma, Manuel Costa, Boris Köpf, Tobias Nießen, Mark Russinovich, Shruti Tople, and Santiago Zanella-Béguelin. Optimizing agent planning for security and autonomy, 2026. URL https://arxiv.org/abs/2602.11416

[12] LangChain AI. LangGraph. https://github.com/langchain-ai/langgraph, 2024.

[13] Nico Lehmann, Adam T. Geller, Niki Vazou, and Ranjit Jhala. Flux: Liquid types for Rust. Proc. ACM Program. Lang., 7(PLDI):1533–1557, 2023. doi: 10.1145/3591283. URL https://doi.org/10.1145/3591283

[14] Daan Leijen. Type directed compilation of row-typed algebraic effects. In Giuseppe Castagna and Andrew D. Gordon, editors, Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, Paris, France, January 18-20, 2017, pages 486–499. ACM, 2017. doi: 10.1145/3009837.3009872. URL https://doi.org/10.1145/3009837.3009872

[15] Erik Meijer. Guardians of the agents. Commun. ACM, 69(1):46–52, 2026. doi: 10.1145/3777544. URL https://doi.org/10.1145/3777544

[16] Mark Samuel Miller. Robust Composition: Towards a Unified Approach to Access Control and Concurrency Control. PhD thesis, Johns Hopkins University, Baltimore, Maryland, USA, 2006. URL http://erights.org/talks/thesis/markm-thesis.pdf

[17] Andrew C. Myers. JFlow: Practical mostly-static information flow control. In Andrew W. Appel and Alex Aiken, editors, POPL '99, Proceedings of the 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, San Antonio, TX, USA, January 20-22, 1999, pages 228–241. ACM, 1999. doi: 10.1145/292540.292561. URL https://doi.org/10.1145/292540.292561

[18] OpenAI. OpenAI agents SDK. https://github.com/openai/openai-agents-python,

[19] URL https://github.com/openai/openai-agents-python

[20] Dominic Orchard, Vilem-Benjamin Liepelt, and Harley Eades III. Quantitative program reasoning with graded modal types. Proc. ACM Program. Lang., 3(ICFP), July 2019. doi: 10.1145/3341714. URL https://doi.org/10.1145/3341714

[21] Gordon D. Plotkin and Matija Pretnar. Handlers of algebraic effects. In Giuseppe Castagna, editor, Programming Languages and Systems, 18th European Symposium on Programming, ESOP 2009, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009, York, UK, March 22-29, 2009. Proceedings, Lecture Notes in Computer Science, ...

[22] Nadia Polikarpova, Deian Stefan, Jean Yang, Shachar Itzhaky, Travis Hance, and Armando Solar-Lezama. Liquid information flow control. Proc. ACM Program. Lang., 4(ICFP):105:1–105:30, 2020. doi: 10.1145/3408987. URL https://doi.org/10.1145/3408987

[23] Pro777. Chained bash commands silently bypass the tool permission allowlist. GitHub issue #36637, anthropics/claude-code, March 2026. URL https://github.com/anthropics/claude-code/issues/36637. Accessed: 2026-05-06

[24] Andrei Sabelfeld and Andrew C. Myers. Language-based information-flow security. IEEE J. Sel. Areas Commun., 21(1):5–19, 2003. doi: 10.1109/JSAC.2002.806121. URL https://doi.org/10.1109/JSAC.2002.806121

[25] Fred B. Schneider, J. Gregory Morrisett, and Robert Harper. A language-based approach to security. In Reinhard Wilhelm, editor, Informatics - 10 Years Back. 10 Years Ahead, Lecture Notes in Computer Science, pages 86–101. Springer, 2001. doi: 10.1007/3-540-44577-3_6. URL https://doi.org/10.1007/3-540-44577-3_6

[26] Robert Stanley, Avi Verma, Lillian Tsai, Konstantinos Kallas, and Sam Kumar. An AI agent execution environment to safeguard user data, 2026. URL https://arxiv.org/abs/2604.19657

[27] Deian Stefan, Alejandro Russo, David Mazières, and John C. Mitchell. Disjunction category labels. In Peeter Laud, editor, Information Security Technology for Applications - 16th Nordic Conference on Secure IT Systems, NordSec 2011, Tallinn, Estonia, October 26-28, 2011, Revised Selected Papers, Lecture Notes in Computer Science, pages 223–239. Springer, 2...

[28] Deian Stefan, Alejandro Russo, John C. Mitchell, and David Mazières. Flexible dynamic information flow control in Haskell. In Koen Claessen, editor, Proceedings of the 4th ACM SIGPLAN Symposium on Haskell, Haskell 2011, Tokyo, Japan, 22 September 2011, pages 95–106. ACM, 2011. doi: 10.1145/2034675.2034688. URL https://doi.org/10.1145/2034675.2034688

[29] Deian Stefan, David Mazières, John C. Mitchell, and Alejandro Russo. Flexible dynamic information flow control in the presence of exceptions. J. Funct. Program., 27:e5, 2017. doi: 10.1017/S0956796816000241. URL https://doi.org/10.1017/S0956796816000241

[30] David Terei, Simon Marlow, Simon L. Peyton Jones, and David Mazières. Safe Haskell. In Janis Voigtländer, editor, Proceedings of the 5th ACM SIGPLAN Symposium on Haskell, Haskell 2012, Copenhagen, Denmark, 13 September 2012, pages 137–148. ACM, 2012. doi: 10.1145/2364506.2364524. URL https://doi.org/10.1145/2364506.2364524

[31] Lillian Tsai and Eugene Bagdasarian. Contextual agent security: A policy for every purpose,

[32] URL https://arxiv.org/abs/2501.17070

[33] Niki Vazou, Eric L. Seidel, Ranjit Jhala, Dimitrios Vytiniotis, and Simon L. Peyton Jones. Refinement types for Haskell. In Johan Jeuring and Manuel M. T. Chakravarty, editors, Proceedings of the 19th ACM SIGPLAN international conference on Functional programming, Gothenburg, Sweden, September 1-3, 2014, pages 269–282. ACM, 2014. doi: 10.1145/2628136.262...

[34] Simon Willison. The dual LLM pattern for building AI assistants that can resist prompt injection. https://simonwillison.net/2023/Apr/25/dual-llm-pattern/, April 2023. Accessed: 2026-05-06

[35] Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, and Chi Wang. AutoGen: Enabling next-gen LLM applications via multi-agent conversation,

[36] URL https://arxiv.org/abs/2308.08155

Appendix A: CaMeL Comparison Data

Table 1: Number of AgentDojo Slack tasks completed (out of 21) by TypeGuard and CaMeL when not under attack, with and without IFC policies.

| Metric | TypeGuard (no policies) | CaMeL (no policies) | TypeGuard (with policies) | CaMeL (with policies) |
|---|---|---|---|---|
| Utility (/21) | 15 | 15 | 8 | 7 |

Table 2: Utility (benign tasks completed) and security (in...

[1] [1]

Guardians: A static verifier for AI agent workflows

Nada Amin and contributors. Guardians: A static verifier for AI agent workflows. https: //github.com/metareflection/guardians, 2026. Open-source implementation of the proposal in Meijer [15]

2026

[2] [2]

Build with Claude: Agents

Anthropic. Build with Claude: Agents. https://docs.anthropic.com/en/docs/ build-with-claude/agents, 2025. URL https://docs.anthropic.com/en/docs/ build-with-claude/agents

2025

[3] [3]

Newton, Simon Peyton Jones, and Arnaud Spiwack

Jean-Philippe Bernardy, Mathieu Boespflug, Ryan R. Newton, Simon Peyton Jones, and Arnaud Spiwack. Linear haskell: practical linearity in a higher-order polymorphic language. Proc. ACM Program. Lang., 2(POPL):5:1–5:29, 2018. doi: 10.1145/3158093. URL https://doi. org/10.1145/3158093

work page doi:10.1145/3158093 2018

[4] [4]

Securing ai agents with information-flow control, 2025

Manuel Costa, Boris Köpf, Aashish Kolluri, Andrew Paverd, Mark Russinovich, Ahmed Salem, Shruti Tople, Lukas Wutschitz, and Santiago Zanella-Béguelin. Securing ai agents with information-flow control, 2025. URL https://arxiv.org/abs/2505.23643

Pith/arXiv arXiv 2025

[5] [5]

CrewAI Inc. CrewAI. https://github.com/crewAIInc/crewAI, 2024. URL https: //github.com/crewAIInc/crewAI

2024

[6] [6]

Agentdojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents

Edoardo Debenedetti, Jie Zhang, Mislav Balunovic, Luca Beurer-Kellner, Marc Fischer, and Florian Tramèr. Agentdojo: A dynamic environment to evaluate prompt injection attacks and defenses for LLM agents. In The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2024. URL https://openreview.net/forum? id=m1YYAQjO3w

2024

[7] [7]

Agentdojo: A dynamic environment to evaluate prompt injection attacks and defenses for llm agents, 2024

Edoardo Debenedetti, Jie Zhang, Mislav Balunovi´c, Luca Beurer-Kellner, Marc Fischer, and Florian Tramèr. Agentdojo: A dynamic environment to evaluate prompt injection attacks and defenses for llm agents, 2024. URL https://arxiv.org/abs/2406.13352

Pith/arXiv arXiv 2024

[8] [8]

Defeating prompt injections by design, 2025

Edoardo Debenedetti, Ilia Shumailov, Tianqi Fan, Jamie Hayes, Nicholas Carlini, Daniel Fabian, Christoph Kern, Chongyang Shi, Andreas Terzis, and Florian Tramèr. Defeating prompt injections by design, 2025. URL https://arxiv.org/abs/2503.18813

Pith/arXiv arXiv 2025

[9] [9]

Multivariate amortized resource analysis

Jan Hoffmann, Klaus Aehlig, and Martin Hofmann. Multivariate amortized resource analysis. In Thomas Ball and Mooly Sagiv, editors, Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2011, Austin, TX, USA, January 26-28, 2011, pages 357–370. ACM, 2011. doi: 10.1145/1926385.1926427. URL https: //doi.org/10.1145...

work page doi:10.1145/1926385.1926427 2011

[10] [10]

Resource-guided program synthesis

Tristan Knoth, Di Wang, Nadia Polikarpova, and Jan Hoffmann. Resource-guided program synthesis. In Kathryn S. McKinley and Kathleen Fisher, editors, Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2019, Phoenix, AZ, USA, June 22-26, 2019 , pages 253–268. ACM, 2019. doi: 10.1145/3314221. 3314602. URL h...

work page doi:10.1145/3314221 2019

[11] [11]

Optimizing agent planning for security and autonomy, 2026

Aashish Kolluri, Rishi Sharma, Manuel Costa, Boris Köpf, Tobias Nießen, Mark Russinovich, Shruti Tople, and Santiago Zanella-Béguelin. Optimizing agent planning for security and autonomy, 2026. URL https://arxiv.org/abs/2602.11416

arXiv 2026

[12] [12]

LangGraph

LangChain AI. LangGraph. https://github.com/langchain-ai/langgraph, 2024. URL https://github.com/langchain-ai/langgraph

2024

[13] [13]

Geller, Niki Vazou, and Ranjit Jhala

Nico Lehmann, Adam T. Geller, Niki Vazou, and Ranjit Jhala. Flux: Liquid types for rust. Proc. ACM Program. Lang., 7(PLDI):1533–1557, 2023. doi: 10.1145/3591283. URL https: //doi.org/10.1145/3591283

work page doi:10.1145/3591283 2023

[14] [14]

Xavier Leroy

Daan Leijen. Type directed compilation of row-typed algebraic effects. In Giuseppe Castagna and Andrew D. Gordon, editors, Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, Paris, France, January 18-20, 2017, pages 486–499. ACM, 2017. doi: 10.1145/3009837.3009872. URL https://doi.org/10.1145/ 3009837.3009872. 10

work page doi:10.1145/3009837.3009872 2017

[15] [15]

Guardians of the agents.Commun

Erik Meijer. Guardians of the agents.Commun. ACM, 69(1):46–52, 2026. doi: 10.1145/3777544. URL https://doi.org/10.1145/3777544

work page doi:10.1145/3777544 2026

[16] [16]

Robust Composition: Towards a Unified Approach to Access Control and Concurrency Control

Mark Samuel Miller. Robust Composition: Towards a Unified Approach to Access Control and Concurrency Control. PhD thesis, Johns Hopkins University, Baltimore, Maryland, USA, 2006. URL http://erights.org/talks/thesis/markm-thesis.pdf

2006

[17] [17]

Andrew C. Myers. Jflow: Practical mostly-static information flow control. In Andrew W. Appel and Alex Aiken, editors, POPL ’99, Proceedings of the 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, San Antonio, TX, USA, January 20-22, 1999, pages 228–241. ACM, 1999. doi: 10.1145/292540.292561. URL https://doi.org/ 10.1145/292540.292561

work page doi:10.1145/292540.292561 1999

[18] [18]

OpenAI agents SDK

OpenAI. OpenAI agents SDK. https://github.com/openai/openai-agents-python,

[19] [19]

URL https://github.com/openai/openai-agents-python

[20] [20]

Quantitative program reasoning with graded modal types

Dominic Orchard, Vilem-Benjamin Liepelt, and Harley Eades III. Quantitative program reasoning with graded modal types. Proc. ACM Program. Lang., 3(ICFP), July 2019. doi: 10.1145/3341714. URL https://doi.org/10.1145/3341714

work page doi:10.1145/3341714 2019

[21] [21]

Plotkin and Matija Pretnar

Gordon D. Plotkin and Matija Pretnar. Handlers of algebraic effects. In Giuseppe Castagna, editor, Programming Languages and Systems, 18th European Symposium on Programming, ESOP 2009, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2009, York, UK, March 22-29, 2009. Proceedings, Lecture Notes in Computer Science, ...

work page doi:10.1007/978-3-642-00590-9 2009

[22] [22]

Liquid information flow control

Nadia Polikarpova, Deian Stefan, Jean Yang, Shachar Itzhaky, Travis Hance, and Armando Solar-Lezama. Liquid information flow control. Proc. ACM Program. Lang., 4(ICFP):105:1– 105:30, 2020. doi: 10.1145/3408987. URL https://doi.org/10.1145/3408987

work page doi:10.1145/3408987 2020

[23]

Pro777. Chained bash commands silently bypass the tool permission allowlist. GitHub issue #36637, anthropics/claude-code, March 2026. URL https://github.com/anthropics/claude-code/issues/36637. Accessed: 2026-05-06

[24]

Andrei Sabelfeld and Andrew C. Myers. Language-based information-flow security. IEEE J. Sel. Areas Commun., 21(1):5–19, 2003. doi: 10.1109/JSAC.2002.806121. URL https://doi.org/10.1109/JSAC.2002.806121

[25]

Fred B. Schneider, J. Gregory Morrisett, and Robert Harper. A language-based approach to security. In Reinhard Wilhelm, editor, Informatics - 10 Years Back. 10 Years Ahead, Lecture Notes in Computer Science, pages 86–101. Springer, 2001. doi: 10.1007/3-540-44577-3_6. URL https://doi.org/10.1007/3-540-44577-3_6

[26]

Robert Stanley, Avi Verma, Lillian Tsai, Konstantinos Kallas, and Sam Kumar. An AI agent execution environment to safeguard user data, 2026. URL https://arxiv.org/abs/2604.19657

[27]

Deian Stefan, Alejandro Russo, David Mazières, and John C. Mitchell. Disjunction category labels. In Peeter Laud, editor, Information Security Technology for Applications - 16th Nordic Conference on Secure IT Systems, NordSec 2011, Tallinn, Estonia, October 26-28, 2011, Revised Selected Papers, Lecture Notes in Computer Science, pages 223–239. Springer, 2...

work page doi:10.1007/978-3-642-29615-4_ 2011

[28]

Deian Stefan, Alejandro Russo, John C. Mitchell, and David Mazières. Flexible dynamic information flow control in Haskell. In Koen Claessen, editor, Proceedings of the 4th ACM SIGPLAN Symposium on Haskell, Haskell 2011, Tokyo, Japan, 22 September 2011, pages 95–106. ACM, 2011. doi: 10.1145/2034675.2034688. URL https://doi.org/10.1145/2034675.2034688

[29]

Deian Stefan, David Mazières, John C. Mitchell, and Alejandro Russo. Flexible dynamic information flow control in the presence of exceptions. J. Funct. Program., 27:e5, 2017. doi: 10.1017/S0956796816000241. URL https://doi.org/10.1017/S0956796816000241

[30]

David Terei, Simon Marlow, Simon L. Peyton Jones, and David Mazières. Safe Haskell. In Janis Voigtländer, editor, Proceedings of the 5th ACM SIGPLAN Symposium on Haskell, Haskell 2012, Copenhagen, Denmark, 13 September 2012, pages 137–148. ACM, 2012. doi: 10.1145/2364506.2364524. URL https://doi.org/10.1145/2364506.2364524

[31]

Lillian Tsai and Eugene Bagdasarian. Contextual agent security: A policy for every purpose. URL https://arxiv.org/abs/2501.17070

[33]

Niki Vazou, Eric L. Seidel, Ranjit Jhala, Dimitrios Vytiniotis, and Simon L. Peyton Jones. Refinement types for Haskell. In Johan Jeuring and Manuel M. T. Chakravarty, editors, Proceedings of the 19th ACM SIGPLAN international conference on Functional programming, Gothenburg, Sweden, September 1-3, 2014, pages 269–282. ACM, 2014. doi: 10.1145/2628136.262...

work page doi:10.1145/2628136.2628161 2014

[34]

Simon Willison. The dual LLM pattern for building AI assistants that can resist prompt injection. https://simonwillison.net/2023/Apr/25/dual-llm-pattern/, April 2023. Accessed: 2026-05-06

[35]

Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, and Chi Wang. AutoGen: Enabling next-gen LLM applications via multi-agent conversation. URL https://arxiv.org/abs/2308.08155

A CaMeL Comparison Data

Table 1: Number of AgentDojo Slack tasks completed (out of 21) by TypeGuard and CaMeL when not under attack, with and without IFC policies.

                 No Policies             With Policies
Metric           TypeGuard    CaMeL      TypeGuard    CaMeL
Utility (/21)    15           15         8            7

Table 2: Utility (benign tasks completed) and security (in...
