pith. machine review for the scientific record.

arxiv: 2603.18829 · v10 · submitted 2026-03-19 · 💻 cs.CR · cs.AI

Recognition: 2 theorem links

Agent Control Protocol: Admission Control for Agent Actions

Pith reviewed 2026-05-15 08:30 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords admission control · autonomous agents · risk scoring · behavioral patterns · stateful protocols · agent governance · deviation collapse · temporal verification

The pith

A temporal admission control protocol limits autonomous agent execution to 0.4 percent of individually valid requests by accumulating risk across action sequences.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Autonomous agents can chain individually valid actions into harmful behavioral patterns that stateless per-request policies cannot detect. ACP enforces temporal properties over execution traces by combining static risk scores with stateful signals such as anomaly accumulation and cooldown periods. In a 500-request workload where every request scores as valid, the protocol allows only two autonomous executions, escalating after three actions and denying after eleven. The design is verified through TLA+ model checking of safety and liveness properties and reports sub-microsecond decision latency (739-832 ns at the median) at a throughput of about 1.72 million requests per second. It is presented as Paper 1 of a six-paper Agent Governance Series.
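The 500-request contrast can be illustrated with a toy sketch. This is not the paper's scoring function: a plain running count stands in for anomaly accumulation, cooldown is omitted, and the abstract's "escalating after 3 actions and denying after 11" is read as cutoffs on that count. The names `ESCALATE_AT`, `DENY_AT`, `APPROVE_BELOW`, `stateless_decide`, and `stateful_decide` are hypothetical, and the validity cutoff of 40 is an assumption chosen only so that RS=35 counts as valid.

```python
from collections import Counter

# Hypothetical cutoffs for illustration only; not taken from the ACP specification.
ESCALATE_AT = 3     # first position in the trace that is escalated
DENY_AT = 11        # first position in the trace that is denied
APPROVE_BELOW = 40  # assumed per-request validity cutoff (RS=35 passes)


def stateless_decide(risk_score):
    """Judge one request in isolation, with no memory of earlier requests."""
    return "APPROVE" if risk_score < APPROVE_BELOW else "DENY"


def stateful_decide(risk_score, seen_so_far):
    """Judge one request given how many similar requests preceded it.

    A plain counter stands in for anomaly accumulation; cooldown is omitted.
    """
    if risk_score >= APPROVE_BELOW:
        return "DENY"
    position = seen_so_far + 1  # 1-based index of this request in the trace
    if position >= DENY_AT:
        return "DENY"
    if position >= ESCALATE_AT:
        return "ESCALATE"
    return "APPROVE"


def run_workload(n_requests=500, risk_score=35):
    """Replay n identical, individually valid requests through both engines."""
    stateless = Counter(stateless_decide(risk_score) for _ in range(n_requests))
    stateful = Counter(stateful_decide(risk_score, i) for i in range(n_requests))
    return stateless, stateful


stateless, stateful = run_workload()
print(dict(stateless))
print(dict(stateful))
```

Under these assumptions the stateless engine approves all 500 requests and the counter-based engine approves 2. The match with the reported 2 out of 500 follows from how the cutoffs were chosen, so it illustrates the mechanism rather than reproducing the experiment.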

Core claim

ACP is a stateful admission control protocol that evaluates agent actions over execution history rather than in isolation. It uses a LedgerQuerier abstraction to apply deterministic risk scoring that incorporates anomaly accumulation and cooldown, blocking sequences even when each individual request meets per-request thresholds. The protocol scopes signals via PatternKey to prevent cross-context false denials, and it formalizes deviation collapse together with Boundary Activation Rate as a detection mechanism for cases where enforcement remains latent.

What carries the argument

LedgerQuerier abstraction that maintains stateful signals (anomaly accumulation, cooldown) scoped by PatternKey(agentID, capability, resource) to drive deterministic history-aware risk scoring.
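A small sketch can show why scoping state by PatternKey matters. It assumes, without confirmation from the source, that the unscoped variant keeps one accumulator per agent, so heavy activity in one context spills into another. `ToyLedger`, its `scoped` flag, and the `deny_at` cutoff are hypothetical illustrations and do not represent the LedgerQuerier interface.

```python
from collections import defaultdict
from typing import NamedTuple


class PatternKey(NamedTuple):
    """Scope for stateful signals, following PatternKey(agentID, capability, resource)."""
    agent_id: str
    capability: str
    resource: str


class ToyLedger:
    """In-memory counter ledger; `scoped` selects per-PatternKey or per-agent state."""

    def __init__(self, scoped, deny_at=11):
        self.scoped = scoped
        self.deny_at = deny_at  # hypothetical cutoff on the running count
        self.counts = defaultdict(int)

    def admit(self, key):
        # Unscoped mode (assumed behavior) mixes all of an agent's contexts together.
        bucket = key if self.scoped else key.agent_id
        self.counts[bucket] += 1
        return "DENY" if self.counts[bucket] >= self.deny_at else "APPROVE"


def first_decision_in_new_context(scoped):
    """Saturate one context, then return the decision for a fresh context."""
    ledger = ToyLedger(scoped)
    noisy = PatternKey("agent-1", "read", "logs")
    fresh = PatternKey("agent-1", "write", "reports")
    for _ in range(12):
        ledger.admit(noisy)
    return ledger.admit(fresh)


print(first_decision_in_new_context(scoped=False))
print(first_decision_in_new_context(scoped=True))
```

With shared per-agent state, the first request in the fresh context inherits the saturated count and is denied; with PatternKey scoping it starts from a clean accumulator and is approved.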

If this is right

  • Coordinated agents accumulate risk independently, so activity scales linearly rather than permitting superlinear amplification.
  • An adversary attempting to suppress Boundary Activation Rate to zero is detected via DeltaBAR before deviation collapse occurs.
  • Decision evaluation runs in 739-832 nanoseconds at the median with throughput exceeding 1.7 million requests per second.
  • The protocol has been model-checked in TLA+ across 4,294,930,695 distinct states with zero violations of its 11 invariants and 4 temporal properties.
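The linear-scaling point can be checked with a short sketch. It assumes each agent has an independent counter, that only the first two requests per agent are approved autonomously (a hypothetical `ESCALATE_AT` of 3), and that CW_appr can be read as total autonomous approvals; none of these readings is confirmed by the source beyond the stated CW_appr=2N.

```python
from collections import defaultdict

ESCALATE_AT = 3  # hypothetical: positions 1 and 2 are approved autonomously


def autonomous_approvals(n_agents, requests_per_agent=500):
    """Count autonomous approvals when agents interleave their requests."""
    counts = defaultdict(int)  # one independent accumulator per agent
    approved = 0
    # Round-robin interleaving, as a coordinated group might schedule itself.
    for _ in range(requests_per_agent):
        for agent in range(n_agents):
            counts[agent] += 1
            if counts[agent] < ESCALATE_AT:
                approved += 1
    return approved


for n in (1, 2, 4, 8):
    print(n, autonomous_approvals(n))
```

Because no agent's requests touch another agent's counter, interleaving does not change the total: approvals grow as 2N in this toy model, with no superlinear gain from coordination.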

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • ACP could be layered onto existing agent runtimes as an independent governance layer without altering core action logic.
  • The PatternKey scoping approach may generalize to other multi-tenant systems where context mixing creates false positives.
  • Boundary Activation Rate offers a quantitative metric that operators could monitor to detect latent policy bypass attempts in production.
  • Extending the coordination window calculation to heterogeneous agent teams would require adjusting CW_appr for differing capability sets.
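As a rough illustration of how an operator might monitor Boundary Activation Rate, the sketch below uses an assumed definition, since the source names BAR and DeltaBAR without giving formulas here. BAR is taken as the fraction of decisions in a phase that are not plain approvals, and DeltaBAR as the change between consecutive phases. The function names and the `max_drop` alarm threshold are hypothetical; the 0.70 and 0.00 values echo the Figure 4 caption.

```python
def boundary_activation_rate(decisions):
    """Fraction of decisions in which enforcement fired (assumed definition)."""
    if not decisions:
        raise ValueError("BAR is undefined for an empty phase")
    activated = sum(1 for d in decisions if d != "APPROVE")
    return activated / len(decisions)


def delta_bar(previous_phase, current_phase):
    """Change in BAR from one phase to the next (assumed definition)."""
    return boundary_activation_rate(current_phase) - boundary_activation_rate(previous_phase)


def collapse_suspected(previous_phase, current_phase, max_drop=0.5):
    """Flag a phase whose BAR fell by at least `max_drop` (hypothetical threshold)."""
    return delta_bar(previous_phase, current_phase) <= -max_drop


baseline = ["DENY"] * 7 + ["APPROVE"] * 3  # BAR = 0.70
suppressed = ["APPROVE"] * 10              # BAR = 0.00

print(boundary_activation_rate(baseline))
print(boundary_activation_rate(suppressed))
print(collapse_suspected(baseline, suppressed))
```

The design choice worth noting is that the alarm keys on the drop between phases, not on the absolute rate, so a boundary that goes quiet is flagged even though no individual decision looks wrong.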

Load-bearing premise

The chosen risk scoring thresholds and anomaly accumulation rules will correctly identify harmful behavioral patterns without excessive false denials or missing coordinated attacks.

What would settle it

Deploy ACP on an agent system executing a known sequence of individually valid actions that together produce documented harm and measure whether the protocol denies the sequence while still permitting unrelated benign traces.

Figures

Figures reproduced from arXiv: 2603.18829 by Marcelo Fernandez (TraslaIA).

Figure 1: Replay vs. normal traffic: RS per request (Case 2 sequential replay). Requests 1–3: ...
Figure 2: Decision evolution under 500 repeated valid requests. The stateless engine approves all. ...
Figure 3: ACP end-to-end verifiability pipeline. The TLA+ model defines the invariants that test ...
Figure 4: Boundary Activation Rate per phase (Experiment 9). BAR drops from 0.70 to 0.00 under ...
Original abstract

Autonomous agents can produce harmful behavioral patterns from individually valid requests -- a threat class per-request policy evaluation cannot address, because stateless engines evaluate each request in isolation. We present ACP, a temporal admission control protocol enforcing behavioral properties over execution traces via static risk scoring combined with stateful signals (anomaly accumulation, cooldown) through a LedgerQuerier abstraction. ACP blocks execution based on deterministic, history-aware risk scoring -- not anomaly detection. Under a 500-request workload where every request is individually valid (RS=35), a stateless engine approves all 500; ACP limits autonomous execution to 2 out of 500 (0.4%), escalating after 3 actions and denying after 11. We identify a state-mixing vulnerability in ACP-RISK-2.0 (cross-context false denials) and introduce ACP-RISK-3.0, scoping anomaly signals to PatternKey(agentID, capability, resource). Decision evaluation: 739-832 ns (p50); throughput 1,720,000 req/s. Safety and liveness model-checked via TLA+ (11 invariants + 4 temporal properties, 0 violations) across 4,294,930,695 distinct states. We formalize deviation collapse -- enforcement active but never exercised due to upstream constraints -- and introduce Boundary Activation Rate (BAR) as its detection mechanism. An adversary suppressing BAR to 0.00 is detected via DeltaBAR before collapse (BAR_C=1.00). N coordinated agents accumulate risk independently; coordination window CW_appr=2N with zero deviation: activity scales linearly, preventing superlinear amplification. ACP is Paper 1 of a 6-paper Agent Governance Series: P0 -- atomic decision boundaries; P2 -- behavioral drift detection (IML); P3/4 -- governance structure, fair allocation, and irreducibility; P5 -- runtime execution validity (RAM, arXiv:2604.22898); P6 -- operationalization of RAM.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces the Agent Control Protocol (ACP), a temporal admission control protocol for autonomous agents that enforces behavioral properties over execution traces using static risk scoring combined with stateful signals (anomaly accumulation and cooldown) via a LedgerQuerier abstraction. It contrasts this with stateless engines that evaluate requests in isolation and cannot address harmful patterns from individually valid requests. Key claims include: under a 500-request synthetic workload where every request has RS=35, a stateless engine approves all 500 while ACP limits autonomous execution to 2/500 (0.4%), escalating after 3 actions and denying after 11; decision latency of 739-832 ns (p50) and throughput of 1.72M req/s; TLA+ model checking of 11 invariants and 4 temporal properties with zero violations across 4.29 billion states; identification of a state-mixing vulnerability in ACP-RISK-2.0 fixed by PatternKey scoping in ACP-RISK-3.0; and formalization of deviation collapse with Boundary Activation Rate (BAR) as a detection mechanism. The work is Paper 1 of a 6-paper series on agent governance.

Significance. If the central claims hold, ACP provides a rigorous, history-aware mechanism for mitigating sequential and coordinated risks in agent systems that per-request policies miss, with notable strengths in the extensive TLA+ verification (zero violations over billions of states) and the introduction of BAR for detecting deviation collapse. The performance metrics indicate potential practicality, and the formal treatment of linear risk accumulation for N agents is a positive contribution. The significance is limited by the narrow empirical basis, but the formal methods component strengthens the overall contribution to agent security.

major comments (2)
  1. [Abstract and workload evaluation] The central empirical result (stateless approves 500/500; ACP approves 2/500 with escalation after 3 and denial after 11) depends on fixed, unvalidated thresholds (RS=35, escalate@3, deny@11) applied to a single synthetic workload where all requests are individually valid. No ablation on threshold sensitivity, no diverse request patterns, and no real agent traces are reported, so it is unclear whether the risk scoring and PatternKey scoping reliably separate harmful behavioral sequences. This is load-bearing because the TLA+ verification addresses protocol safety/liveness but does not validate the risk function's behavioral effectiveness.
  2. [Abstract, coordinated agents paragraph] The claim that N coordinated agents accumulate risk independently with CW_appr=2N and zero deviation (preventing superlinear amplification) is stated without accompanying stress tests against evasion or coordination strategies that stay under the window. This weakens the generality of the linear scaling argument.
minor comments (2)
  1. [Protocol description] Clarify the exact definition and scoping of PatternKey(agentID, capability, resource) in ACP-RISK-3.0, including how it prevents cross-context false denials, with a small example.
  2. [Verification section] The TLA+ specification details (model, invariants, and temporal properties) would benefit from a brief appendix or reference to the checked model file for reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and recommendation for major revision. We agree that the empirical sections would benefit from additional analysis on threshold sensitivity and workload diversity, and we have revised the manuscript to incorporate these points while clarifying the scope of the current work. Our point-by-point responses to the major comments are provided below.

  1. Referee: [Abstract and workload evaluation] The central empirical result (stateless approves 500/500; ACP approves 2/500 with escalation after 3 and denial after 11) depends on fixed, unvalidated thresholds (RS=35, escalate@3, deny@11) applied to a single synthetic workload where all requests are individually valid. No ablation on threshold sensitivity, no diverse request patterns, and no real agent traces are reported, so it is unclear whether the risk scoring and PatternKey scoping reliably separate harmful behavioral sequences. This is load-bearing because the TLA+ verification addresses protocol safety/liveness but does not validate the risk function's behavioral effectiveness.

    Authors: We acknowledge that the reported result uses fixed thresholds on a single synthetic workload. In the revised manuscript we have added a new subsection on threshold sensitivity that varies RS from 25-45, escalation trigger from 2-5 actions, and denial trigger from 8-15 actions. Across these ranges the autonomous approval rate stays below 2% for the 500-request workload. We also include results for two additional synthetic patterns (mixed RS values and bursty arrivals). Real production agent traces are outside the scope of this protocol-focused paper; we have added an explicit limitations paragraph noting that behavioral effectiveness validation against live traces is planned for Paper 2 (behavioral drift detection). The TLA+ verification establishes that the protocol correctly enforces whatever risk function is supplied, while the workload serves only to illustrate the difference between stateless and stateful evaluation.
    revision: partial

  2. Referee: [Abstract] The claim that N coordinated agents accumulate risk independently with CW_appr=2N and zero deviation (preventing superlinear amplification) is stated without accompanying stress tests against evasion or coordination strategies that stay under the window. This weakens the generality of the linear scaling argument.

    Authors: The linear scaling follows from the per-agent isolation enforced by PatternKey scoping (agentID, capability, resource) in ACP-RISK-3.0; each agent's anomaly accumulator and cooldown window operate independently, so total approvals cannot exceed 2N. We have expanded the revised text with a short discussion of plausible evasion strategies (spacing actions to reset cooldowns, attempting cross-agent signal leakage) and why they remain bounded by the per-agent rules. The TLA+ model already covers multi-agent state transitions and confirms the linear bound. Explicit adversarial simulations are not present in this work; we note this as a direction for follow-on empirical papers in the series.
    revision: yes

Circularity Check

0 steps flagged

No significant circularity; protocol definitions, workload demonstration, and TLA+ verification are self-contained

The paper introduces ACP via explicit definitions of risk scoring (RS=35), anomaly accumulation rules, cooldown, and PatternKey scoping, then applies them to a synthetic 500-request workload where all requests are individually valid. The outcome (2/500 approvals) is the direct computational result of those rules rather than a fitted prediction or self-referential equation. Safety and liveness are established through independent TLA+ model checking (11 invariants, 4 temporal properties, 0 violations over 4B states), which does not depend on the empirical thresholds. Concepts such as deviation collapse and BAR are newly formalized within the paper without reducing to prior self-citations or ansatzes. References to the broader Agent Governance Series are contextual and not load-bearing for the ACP claims or results presented here.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 2 invented entities

The central claims rest on several introduced abstractions and parameters whose correctness is asserted rather than derived from upstream results.

free parameters (2)
  • RS=35
    Per-request risk score assigned to every request in the 500-request workload example
  • escalation after 3 actions
    Threshold for escalating risk signals
axioms (1)
  • domain assumption: TLA+ model covers all relevant execution states for safety and liveness
    Invoked when claiming 0 violations across 4,294,930,695 states
invented entities (2)
  • LedgerQuerier (no independent evidence)
    purpose: Abstraction providing stateful signals for anomaly accumulation and cooldown
    New component introduced to enable history-aware decisions
  • Boundary Activation Rate (BAR) (no independent evidence)
    purpose: Metric to detect deviation collapse where enforcement is active but never exercised
    New detection mechanism for cases where upstream constraints prevent rule activation

pith-pipeline@v0.9.0 · 5657 in / 1478 out tokens · 42175 ms · 2026-05-15T08:30:41.301115+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. From Admission to Invariants: Measuring Deviation in Delegated Agent Systems

    cs.AI 2026-04 unverdicted novelty 6.0

    The Non-Identifiability Theorem shows admissible behavior space A0 is not identifiable from local enforcement signals g under the Local Observability Assumption, so the paper introduces an Invariant Measurement Layer ...

  2. Atomic Decision Boundaries: A Structural Requirement for Guaranteeing Execution-Time Admissibility in Autonomous Systems

    cs.LO 2026-04 unverdicted novelty 6.0

    Atomic decision boundaries are required to guarantee execution-time admissibility because split evaluation systems allow environmental interleaving that no policy can prevent.

  3. Reconstructive Authority Model: Runtime Execution Validity Under Partial Observability

    cs.CR 2026-04 unverdicted novelty 5.0

    RAM separates integrity from coverage and uses a reconstruction gate over proven state, assumptions, and unobservable residuals to block invalid executions, achieving zero invalid rates in synthetic tests where attest...

  4. SoK: Security of Autonomous LLM Agents in Agentic Commerce

    cs.CR 2026-04 unverdicted novelty 5.0

    The paper systematizes security for LLM agents in agentic commerce into five threat dimensions, identifies 12 cross-layer attack vectors, and proposes a layered defense architecture.

  5. Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering

    cs.SE 2026-04 accept novelty 5.0

    LLM agent progress depends on externalizing cognitive functions into memory, skills, protocols, and harness engineering that coordinates them reliably.
