From Governance Norms to Enforceable Controls: A Layered Translation Method for Runtime Guardrails in Agentic AI
Pith reviewed 2026-05-10 18:43 UTC · model grok-4.3
The pith
Governance standards translate into runtime guardrails for agentic AI only when controls are observable, determinate, and time-sensitive.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
This paper proposes a layered translation method that connects standards-derived governance objectives to four control layers: governance objectives, design-time constraints, runtime mediation, and assurance feedback. It distinguishes governance objectives, technical controls, runtime guardrails, and assurance evidence; introduces a control tuple and runtime-enforceability rubric for layer assignment; and demonstrates the method in a procurement-agent case study. The central claim is that standards should guide control placement across architecture, runtime policy, human escalation, and audit, while runtime guardrails are reserved for controls that are observable, determinate, and time-sensitive enough to justify execution-time intervention.
What carries the argument
The layered translation method, with its control tuple and a runtime-enforceability rubric that assesses observability, determinacy, and time-sensitivity to assign each control to the appropriate layer.
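The control tuple and rubric are described here only at the level of their three properties. As a minimal sketch of how such a tuple and rubric might be rendered in code, assuming a boolean reading of each property (the class, field, and fallback-routing choices below are this review's assumptions, not the paper's notation):

```python
from dataclasses import dataclass
from enum import Enum

class Layer(Enum):
    GOVERNANCE_OBJECTIVE = "governance objective"
    DESIGN_TIME_CONSTRAINT = "design-time constraint"
    RUNTIME_MEDIATION = "runtime mediation"
    ASSURANCE_FEEDBACK = "assurance feedback"

@dataclass(frozen=True)
class ControlTuple:
    """Hypothetical rendering of the paper's control tuple."""
    objective: str        # standards-derived objective (e.g., from ISO/IEC 42001)
    mechanism: str        # the concrete technical control implementing it
    observable: bool      # can the relevant state be inspected during execution?
    determinate: bool     # does the check yield an unambiguous pass/fail?
    time_sensitive: bool  # must intervention happen mid-execution to matter?

def assign_layer(control: ControlTuple) -> Layer:
    """Runtime-enforceability rubric: reserve runtime mediation for controls
    satisfying all three properties; route the rest to other layers."""
    if control.observable and control.determinate and control.time_sensitive:
        return Layer.RUNTIME_MEDIATION
    if not control.observable:
        return Layer.ASSURANCE_FEEDBACK      # verify after the fact from logs
    if not control.time_sensitive:
        return Layer.DESIGN_TIME_CONSTRAINT  # prevent by construction instead
    return Layer.GOVERNANCE_OBJECTIVE        # track as an objective, audit later
```

The fallback routing is deliberately illustrative: the paper distributes non-runtime controls across architecture, runtime policy, human escalation, and audit rather than prescribing a single mapping of failed properties to layers.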
If this is right
- Standards such as ISO/IEC 42001 and the NIST AI Risk Management Framework inform control placement but do not directly yield runtime code.
- Controls are distributed across architecture, runtime policy, human escalation, and audit according to the rubric rather than defaulting to execution-time checks.
- Runtime guardrails apply only to controls that are observable, determinate, and time-sensitive enough to justify intervention during execution (a worked sketch follows this list).
- Assurance feedback loops execution outcomes back to refine governance objectives and design constraints.
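To make the distribution described above concrete, here is an illustrative pass of three hypothetical procurement-agent controls through such a rubric; the controls and their property ratings are invented for this review, not taken from the paper's case study:

```python
# Hypothetical procurement-agent controls rated against the three rubric
# properties (observable, determinate, time-sensitive). Ratings are invented.
controls = [
    ("block purchase orders above the approved budget", True,  True,  True),
    ("prefer vendors that meet sustainability policy",  True,  False, False),
    ("keep long-run spend aligned with strategy",       False, False, False),
]

for name, observable, determinate, time_sensitive in controls:
    if observable and determinate and time_sensitive:
        layer = "runtime mediation"
    else:
        layer = "another layer (architecture, policy, escalation, or audit)"
    print(f"{name} -> {layer}")
```

Only the budget check, which a mediation layer can observe and decide unambiguously before the order is placed, would qualify as a runtime guardrail under this reading.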
Where Pith is reading between the lines
- The method could reduce unnecessary runtime overhead in multi-step agents by clarifying when design changes or human escalation suffice instead of automated guards.
- It offers a template for adapting the same standards to other execution-heavy domains such as robotic planning without requiring full re-derivation of controls.
- If widely adopted, the rubric might support consistent auditing of agent trajectories against evolving governance norms.
Load-bearing premise
Governance standards can be translated into technical controls and a runtime-enforceability rubric without substantial loss of intent or introduction of new ambiguities that undermine the original objectives.
What would settle it
A demonstration, in the procurement-agent case study or a comparable setting, in which the rubric assigns a control to runtime mediation but the control later proves non-observable or introduces an ambiguity absent from the source standard (e.g., the NIST AI RMF).
read the original abstract
Agentic AI systems plan, use tools, maintain state, and produce multi-step trajectories with external effects. Those properties create a governance problem that differs materially from single-turn generative AI: important risks emerge during execution, not only at model development or deployment time. Governance standards such as ISO/IEC 42001, ISO/IEC 23894, ISO/IEC 42005, ISO/IEC 5338, ISO/IEC 38507, and the NIST AI Risk Management Framework are therefore highly relevant to agentic AI, but they do not by themselves yield implementable runtime guardrails. This paper proposes a layered translation method that connects standards-derived governance objectives to four control layers: governance objectives, design-time constraints, runtime mediation, and assurance feedback. It distinguishes governance objectives, technical controls, runtime guardrails, and assurance evidence; introduces a control tuple and runtime-enforceability rubric for layer assignment; and demonstrates the method in a procurement-agent case study. The central claim is modest: standards should guide control placement across architecture, runtime policy, human escalation, and audit, while runtime guardrails are reserved for controls that are observable, determinate, and time-sensitive enough to justify execution-time intervention.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a layered translation method that maps governance standards (ISO/IEC 42001, ISO/IEC 23894, NIST AI RMF, etc.) to four control layers—governance objectives, design-time constraints, runtime mediation, and assurance feedback—for agentic AI systems. It introduces a control tuple and a runtime-enforceability rubric that assigns controls to runtime guardrails only when they are observable, determinate, and time-sensitive, while routing others to architecture, policy, human escalation, or audit. The approach is illustrated in a procurement-agent case study, with the modest central claim that this layering prevents inappropriate runtime interventions while ensuring standards inform overall system design.
Significance. If the translation method can be made operational without loss of intent, the framework would help close the gap between high-level governance norms and concrete runtime mechanisms in stateful, multi-step agentic systems. This is potentially significant for compliance, risk management, and auditability in deployed agents, as it explicitly reserves execution-time controls for a narrow, justifiable subset of requirements.
major comments (1)
- [description of the runtime-enforceability rubric and control tuple] The runtime-enforceability rubric is defined only in terms of three high-level properties (observable, determinate, time-sensitive) with no formal decision procedure, threshold values, or explicit handling of edge cases such as partial observability, delayed effects, or nondeterministic tool outcomes. This directly threatens the central claim that the layered method avoids introducing new ambiguities in control placement and layer assignment.
Simulated Author's Rebuttal
We thank the referee for their constructive review and for recognizing the potential value of the layered translation method in bridging governance standards with runtime mechanisms in agentic AI. We address the major comment below, agreeing where clarification is needed and outlining specific revisions to strengthen the operational aspects of the framework.
read point-by-point responses
- Referee: The runtime-enforceability rubric is defined only in terms of three high-level properties (observable, determinate, time-sensitive) with no formal decision procedure, threshold values, or explicit handling of edge cases such as partial observability, delayed effects, or nondeterministic tool outcomes. This directly threatens the central claim that the layered method avoids introducing new ambiguities in control placement and layer assignment.
Authors: We agree that the rubric, as currently presented, relies on three high-level properties without a formal decision procedure or detailed edge-case guidance, which could introduce application inconsistencies. The properties were selected to provide a minimal, domain-agnostic filter that directly supports the modest central claim by restricting runtime guardrails to controls feasible for execution-time enforcement, as demonstrated in the procurement-agent case study. However, to mitigate the identified risk of ambiguity, we will revise the manuscript (primarily Section 3 and the associated figures) to include: (1) a step-by-step decision procedure expressed as pseudocode for applying the three properties in sequence; (2) illustrative threshold guidance (e.g., 'observable' requires sufficient state logging to verify the property at runtime); and (3) concrete examples addressing partial observability (routed to assurance feedback), delayed effects (design-time constraints), and nondeterministic tool outcomes (human escalation or audit). These additions will make layer assignment more transparent and reproducible while preserving the framework's generality and modest scope. We believe this directly addresses the concern without overstating the method's current formality.
Revision: yes
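The revised pseudocode itself is not reproduced in this review. A minimal sketch of the sequential decision procedure and edge-case routing the rebuttal describes, with function and parameter names that are this review's assumptions, might look like:

```python
from enum import Enum, auto

class Route(Enum):
    RUNTIME_MEDIATION = auto()
    DESIGN_TIME_CONSTRAINT = auto()
    HUMAN_ESCALATION_OR_AUDIT = auto()
    ASSURANCE_FEEDBACK = auto()

def route_control(observable: bool, partially_observable: bool,
                  determinate: bool, nondeterministic_tools: bool,
                  time_sensitive: bool, delayed_effects: bool) -> Route:
    """Apply the three rubric properties in sequence, with the edge-case
    routing proposed in the rebuttal (our reading, not the paper's text)."""
    # Step 1: observability. Partially observable state cannot support a
    # reliable execution-time check, so verify after the fact instead.
    if not observable or partially_observable:
        return Route.ASSURANCE_FEEDBACK
    # Step 2: determinacy. Nondeterministic tool outcomes defeat a crisp
    # pass/fail check, so hand the decision to a human or defer to audit.
    if not determinate or nondeterministic_tools:
        return Route.HUMAN_ESCALATION_OR_AUDIT
    # Step 3: time-sensitivity. Effects that materialize only later are
    # better prevented by constraining the design than by a runtime guard.
    if not time_sensitive or delayed_effects:
        return Route.DESIGN_TIME_CONSTRAINT
    # All three properties hold: an execution-time guardrail is justified.
    return Route.RUNTIME_MEDIATION
```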
Circularity Check
No circularity: methodological framework rests on external standards
full rationale
The paper presents a conceptual layered translation method connecting external governance standards (ISO/IEC 42001, NIST AI RMF, etc.) to four control layers and a runtime-enforceability rubric. It contains no equations, fitted parameters, or quantitative derivations. The central claim, that runtime guardrails apply only to observable, determinate, and time-sensitive controls, is introduced as a definitional distinction rather than derived from prior results within the paper. No self-citations, self-definitional loops, or renamings of known results are load-bearing; the method is grounded in the cited external standards and does not reduce to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: governance standards contain objectives that are sufficiently precise to be mapped to technical control layers without introducing new ambiguities.
invented entities (2)
- Control tuple: no independent evidence
- Runtime-enforceability rubric: no independent evidence
Forward citations
Cited by 1 Pith paper
- Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation: A TEE-backed architecture isolates security-critical decisions in self-hosted AI agents to prevent host-level abuse from malicious inputs while maintaining allowed functionality.
Reference graph
Works this paper leans on
- [1] ISO, “ISO/IEC 42001:2023 — Information technology — Artificial intelligence — Management system,” Dec. 2023. [Online]. Available: https://www.iso.org/standard/42001
- [2] ISO, “ISO/IEC 23894:2023 — Information technology — Artificial intelligence — Guidance on risk management,” 2023. [Online]. Available: https://www.iso.org/standard/77304.html
- [3] ISO, “ISO/IEC 42005:2025 — Information technology — Artificial intelligence (AI) — AI system impact assessment,” May 2025. [Online]. Available: https://www.iso.org/standard/42005
- [4] ISO, “ISO/IEC 5338:2023 — Information technology — Artificial intelligence — AI system life cycle processes,” 2023. [Online]. Available: https://www.iso.org/standard/81118.html
- [5] ISO, “ISO/IEC 38507:2022 — Information technology — Governance of IT — Governance implications of the use of artificial intelligence by organizations,” 2022. [Online]. Available: https://www.iso.org/standard/56641.html
- [6] NIST, “Artificial Intelligence Risk Management Framework (AI RMF 1.0),” NIST AI 100-1, Jan. 2023. doi: 10.6028/NIST.AI.100-1
- [7] NIST, “Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile,” NIST AI 600-1, Jul. 2024. doi: 10.6028/NIST.AI.600-1
- [8] NIST, “AI Agent Standards Initiative,” Center for AI Standards and Innovation (CAISI), Feb. 2026. [Online]. Available: https://www.nist.gov/caisi/ai-agent-standards-initiative
- [9] H. Booth, W. Fisher, R. Galluzzo, and J. Roberts, “Accelerating the Adoption of Software and Artificial Intelligence Agent Identity and Authorization,” Initial Public Draft, National Cybersecurity Center of Excellence, NIST, Feb. 2026. [Online]. Available: https://csrc.nist.gov/pubs/other/2026/02/05/accelerating-the-adoption-of-software-and-ai-agent/ipd
- [10] NASA, “NPR 7150.2D — NASA Software Engineering Requirements,” Office of the Chief Engineer, Mar. 2022. [Online]. Available: https://nodis3.gsfc.nasa.gov/displayDir.cfm?t=NPR&c=7150&s=2D
- [11] NASA, “NASA-STD-8739.8B — Software Assurance and Software Safety Standard,” Office of Safety and Mission Assurance, Sep. 2022. [Online]. Available: https://standards.nasa.gov/standard/nasa/nasa-std-87398
- [12] NASA, “NASA-HDBK-2203 — NASA Software Engineering and Assurance Handbook,” Office of the Chief Engineer, current public handbook and standards entry. [Online]. Available: https://swehb.nasa.gov/; https://standards.nasa.gov/standard/nasa/nasa-hdbk-2203
- [13] NIST, “SP 800-160 Vol. 2 Rev. 1 — Developing Cyber-Resilient Systems: A Systems Security Engineering Approach,” Dec. 2021. doi: 10.6028/NIST.SP.800-160v2r1
- [14] DARPA, “Assured Autonomy,” Information Innovation Office program summary. [Online]. Available: https://www.darpa.mil/research/programs/assured-autonomy
- [15] C. L. Wang, T. Singhal, A. Kelkar, and J. Tuo, “MI9: An Integrated Runtime Governance Framework for Agentic AI,” arXiv preprint arXiv:2508.03858, Nov. 2025.
- [16] G. Kholkar and R. Ahuja, “Policy-as-Prompt: Turning AI Governance Rules into Guardrails for AI Agents,” in Proc. 3rd Regulatable ML Workshop, NeurIPS 2025, arXiv preprint arXiv:2509.23994, Nov. 2025.
- [17] J. Mavracic, “Policy Cards: Machine-Readable Runtime Governance for Autonomous AI Agents,” arXiv preprint arXiv:2510.24383, Oct. 2025.
- [18] M. Kaptein, V.-J. Khan, and A. Podstavnychy, “Runtime Governance for AI Agents: Policies on Paths,” arXiv preprint arXiv:2603.16586, Mar. 2026.
- [19] Y. Dong, R. Mu, Y. Zhang, S. Sun, T. Zhang, C. Wu, G. Jin, Y. Qi, J. Hu, J. Meng, S. Bensalem, and X. Huang, “Safeguarding Large Language Models: A Survey,” arXiv preprint arXiv:2406.02622, Jun. 2024.
- [20] Y. Huang, H. Hua, Y. Zhou, P. Jing, M. Nagireddy, I. Padhi, G. Dolcetti, Z. Xu, S. Chaudhury, A. Rawat, L. Nedoshivina, P.-Y. Chen, P. Sattigeri, and X. Zhang, “Building a Foundational Guardrail for General Agentic Systems via Synthetic Data,” arXiv preprint arXiv:2510.09781, Oct. 2025.
- [21] Z. Zhang, S. Cui, Y. Lu, J. Zhou, J. Yang, H. Wang, and M. Huang, “Agent-SafetyBench: Evaluating the Safety of LLM Agents,” arXiv preprint arXiv:2412.14470, May 2025.
- [22] B. Zheng, Z. Liao, S. Salisbury, Z. Liu, M. Lin, Q. Zheng, Z. Wang, X. Deng, D. Song, H. Sun, and Y. Su, “WebGuard: Building a Generalizable Guardrail for Web Agents,” arXiv preprint arXiv:2507.14293, Jul. 2025.
- [23] A. Cartagena and A. Teixeira, “Mind the GAP: Text Safety Does Not Transfer to Tool-Call Safety in LLM Agents,” arXiv preprint arXiv:2602.16943, Feb. 2026.
- [24] Y. Mou, Z. Xue, L. Li, P. Liu, S. Zhang, W. Ye, and J. Shao, “ToolSafe: Enhancing Tool Invocation Safety of LLM-based Agents via Proactive Step-level Guardrail and Feedback,” arXiv preprint arXiv:2601.10156, Jan. 2026.
- [25] D. Liu et al., “AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security,” arXiv preprint arXiv:2601.18491, Jan. 2026.
- [26] X. Jin, M. Duan, Q. Lin, A. Chan, Z. Chen, J. Du, and X. Ren, “Proof-of-Guardrail in AI Agents and What (Not) to Trust from It,” arXiv preprint arXiv:2603.05786, Mar. 2026.
- [27] Y. Zhu et al., “Establishing Best Practices for Building Rigorous Agentic Benchmarks,” arXiv preprint arXiv:2507.02825, Aug. 2025.
- [28] F. B. Schneider, “Enforceable Security Policies,” ACM Transactions on Information and System Security, vol. 3, no. 1, pp. 30–50, Feb. 2000. doi: 10.1145/353323.353382
- [29] J. H. Saltzer and M. D. Schroeder, “The Protection of Information in Computer Systems,” Proceedings of the IEEE, vol. 63, no. 9, pp. 1278–1308, Sep. 1975. doi: 10.1109/PROC.1975.9939
discussion (0)