From Governance Norms to Enforceable Controls: A Layered Translation Method for Runtime Guardrails in Agentic AI
Pith reviewed 2026-05-10 18:43 UTC · model grok-4.3
The pith
Governance standards translate into runtime guardrails for agentic AI only when controls are observable, determinate, and time-sensitive.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
This paper proposes a layered translation method that connects standards-derived governance objectives to four control layers: governance objectives, design-time constraints, runtime mediation, and assurance feedback. It distinguishes governance objectives, technical controls, runtime guardrails, and assurance evidence; introduces a control tuple and runtime-enforceability rubric for layer assignment; and demonstrates the method in a procurement-agent case study. The central claim is that standards should guide control placement across architecture, runtime policy, human escalation, and audit, while runtime guardrails are reserved for controls that are observable, determinate, and time-sensitive enough to justify execution-time intervention.
What carries the argument
The layered translation method, with its control tuple and a runtime-enforceability rubric that assesses observability, determinacy, and time-sensitivity to assign each control to the appropriate layer.
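The control tuple and rubric are described here only at the level of their three properties. As a minimal sketch of how such a tuple and rubric might be rendered in code, assuming a boolean reading of each property (the class, field, and fallback-routing choices below are this review's assumptions, not the paper's notation):

```python
from dataclasses import dataclass
from enum import Enum

class Layer(Enum):
    GOVERNANCE_OBJECTIVE = "governance objective"
    DESIGN_TIME_CONSTRAINT = "design-time constraint"
    RUNTIME_MEDIATION = "runtime mediation"
    ASSURANCE_FEEDBACK = "assurance feedback"

@dataclass(frozen=True)
class ControlTuple:
    """Hypothetical rendering of the paper's control tuple."""
    objective: str        # standards-derived objective (e.g., from ISO/IEC 42001)
    mechanism: str        # the concrete technical control implementing it
    observable: bool      # can the relevant state be inspected during execution?
    determinate: bool     # does the check yield an unambiguous pass/fail?
    time_sensitive: bool  # must intervention happen mid-execution to matter?

def assign_layer(control: ControlTuple) -> Layer:
    """Runtime-enforceability rubric: reserve runtime mediation for controls
    satisfying all three properties; route the rest to other layers."""
    if control.observable and control.determinate and control.time_sensitive:
        return Layer.RUNTIME_MEDIATION
    if not control.observable:
        return Layer.ASSURANCE_FEEDBACK      # verify after the fact from logs
    if not control.time_sensitive:
        return Layer.DESIGN_TIME_CONSTRAINT  # prevent by construction instead
    return Layer.GOVERNANCE_OBJECTIVE        # track as an objective, audit later
```

The fallback routing is deliberately illustrative: the paper distributes non-runtime controls across architecture, runtime policy, human escalation, and audit rather than prescribing a single mapping of failed properties to layers.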
If this is right
- Standards such as ISO/IEC 42001 and the NIST AI Risk Management Framework inform control placement but do not directly yield runtime code.
- Controls are distributed across architecture, runtime policy, human escalation, and audit according to the rubric rather than defaulting to execution-time checks.
- Runtime guardrails apply only to controls that are observable, determinate, and time-sensitive enough to justify intervention during execution (a worked sketch follows this list).
- Assurance feedback loops execution outcomes back to refine governance objectives and design constraints.
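To make the distribution described above concrete, here is an illustrative pass of three hypothetical procurement-agent controls through such a rubric; the controls and their property ratings are invented for this review, not taken from the paper's case study:

```python
# Hypothetical procurement-agent controls rated against the three rubric
# properties (observable, determinate, time-sensitive). Ratings are invented.
controls = [
    ("block purchase orders above the approved budget", True,  True,  True),
    ("prefer vendors that meet sustainability policy",  True,  False, False),
    ("keep long-run spend aligned with strategy",       False, False, False),
]

for name, observable, determinate, time_sensitive in controls:
    if observable and determinate and time_sensitive:
        layer = "runtime mediation"
    else:
        layer = "another layer (architecture, policy, escalation, or audit)"
    print(f"{name} -> {layer}")
```

Only the budget check, which a mediation layer can observe and decide unambiguously before the order is placed, would qualify as a runtime guardrail under this reading.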
Where Pith is reading between the lines
- The method could reduce unnecessary runtime overhead in multi-step agents by clarifying when design changes or human escalation suffice instead of automated guards.
- It offers a template for adapting the same standards to other execution-heavy domains such as robotic planning without requiring full re-derivation of controls.
- If widely adopted, the rubric might support consistent auditing of agent trajectories against evolving governance norms.
Load-bearing premise
Governance standards can be translated into technical controls and a runtime-enforceability rubric without substantial loss of intent or introduction of new ambiguities that undermine the original objectives.
What would settle it
A demonstration, in the procurement-agent case study or a comparable setting, in which the rubric assigns a control to runtime mediation but the control later proves non-observable or introduces an ambiguity absent from the source standard (e.g., the NIST AI RMF).
read the original abstract
Agentic AI systems plan, use tools, maintain state, and produce multi-step trajectories with external effects. Those properties create a governance problem that differs materially from single-turn generative AI: important risks emerge during execution, not only at model development or deployment time. Governance standards such as ISO/IEC 42001, ISO/IEC 23894, ISO/IEC 42005, ISO/IEC 5338, ISO/IEC 38507, and the NIST AI Risk Management Framework are therefore highly relevant to agentic AI, but they do not by themselves yield implementable runtime guardrails. This paper proposes a layered translation method that connects standards-derived governance objectives to four control layers: governance objectives, design-time constraints, runtime mediation, and assurance feedback. It distinguishes governance objectives, technical controls, runtime guardrails, and assurance evidence; introduces a control tuple and runtime-enforceability rubric for layer assignment; and demonstrates the method in a procurement-agent case study. The central claim is modest: standards should guide control placement across architecture, runtime policy, human escalation, and audit, while runtime guardrails are reserved for controls that are observable, determinate, and time-sensitive enough to justify execution-time intervention.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a layered translation method that maps governance standards (ISO/IEC 42001, ISO/IEC 23894, NIST AI RMF, etc.) to four control layers—governance objectives, design-time constraints, runtime mediation, and assurance feedback—for agentic AI systems. It introduces a control tuple and a runtime-enforceability rubric that assigns controls to runtime guardrails only when they are observable, determinate, and time-sensitive, while routing others to architecture, policy, human escalation, or audit. The approach is illustrated in a procurement-agent case study, with the modest central claim that this layering prevents inappropriate runtime interventions while ensuring standards inform overall system design.
Significance. If the translation method can be made operational without loss of intent, the framework would help close the gap between high-level governance norms and concrete runtime mechanisms in stateful, multi-step agentic systems. This is potentially significant for compliance, risk management, and auditability in deployed agents, as it explicitly reserves execution-time controls for a narrow, justifiable subset of requirements.
major comments (1)
- [description of the runtime-enforceability rubric and control tuple] The runtime-enforceability rubric is defined only in terms of three high-level properties (observable, determinate, time-sensitive) with no formal decision procedure, threshold values, or explicit handling of edge cases such as partial observability, delayed effects, or nondeterministic tool outcomes. This directly threatens the central claim that the layered method avoids introducing new ambiguities in control placement and layer assignment.
Simulated Author's Rebuttal
We thank the referee for their constructive review and for recognizing the potential value of the layered translation method in bridging governance standards with runtime mechanisms in agentic AI. We address the major comment below, agreeing where clarification is needed and outlining specific revisions to strengthen the operational aspects of the framework.
read point-by-point responses
- Referee: The runtime-enforceability rubric is defined only in terms of three high-level properties (observable, determinate, time-sensitive) with no formal decision procedure, threshold values, or explicit handling of edge cases such as partial observability, delayed effects, or nondeterministic tool outcomes. This directly threatens the central claim that the layered method avoids introducing new ambiguities in control placement and layer assignment.
Authors: We agree that the rubric, as currently presented, relies on three high-level properties without a formal decision procedure or detailed edge-case guidance, which could introduce application inconsistencies. The properties were selected to provide a minimal, domain-agnostic filter that directly supports the modest central claim by restricting runtime guardrails to controls feasible for execution-time enforcement, as demonstrated in the procurement-agent case study. However, to mitigate the identified risk of ambiguity, we will revise the manuscript (primarily Section 3 and the associated figures) to include: (1) a step-by-step decision procedure expressed as pseudocode for applying the three properties in sequence; (2) illustrative threshold guidance (e.g., 'observable' requires sufficient state logging to verify the property at runtime); and (3) concrete examples addressing partial observability (routed to assurance feedback), delayed effects (design-time constraints), and nondeterministic tool outcomes (human escalation or audit). These additions will make layer assignment more transparent and reproducible while preserving the framework's generality and modest scope. We believe this directly addresses the concern without overstating the method's current formality.
Revision: yes
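The revised pseudocode itself is not reproduced in this review. A minimal sketch of the sequential decision procedure and edge-case routing the rebuttal describes, with function and parameter names that are this review's assumptions, might look like:

```python
from enum import Enum, auto

class Route(Enum):
    RUNTIME_MEDIATION = auto()
    DESIGN_TIME_CONSTRAINT = auto()
    HUMAN_ESCALATION_OR_AUDIT = auto()
    ASSURANCE_FEEDBACK = auto()

def route_control(observable: bool, partially_observable: bool,
                  determinate: bool, nondeterministic_tools: bool,
                  time_sensitive: bool, delayed_effects: bool) -> Route:
    """Apply the three rubric properties in sequence, with the edge-case
    routing proposed in the rebuttal (our reading, not the paper's text)."""
    # Step 1: observability. Partially observable state cannot support a
    # reliable execution-time check, so verify after the fact instead.
    if not observable or partially_observable:
        return Route.ASSURANCE_FEEDBACK
    # Step 2: determinacy. Nondeterministic tool outcomes defeat a crisp
    # pass/fail check, so hand the decision to a human or defer to audit.
    if not determinate or nondeterministic_tools:
        return Route.HUMAN_ESCALATION_OR_AUDIT
    # Step 3: time-sensitivity. Effects that materialize only later are
    # better prevented by constraining the design than by a runtime guard.
    if not time_sensitive or delayed_effects:
        return Route.DESIGN_TIME_CONSTRAINT
    # All three properties hold: an execution-time guardrail is justified.
    return Route.RUNTIME_MEDIATION
```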
Circularity Check
No circularity: methodological framework rests on external standards
full rationale
The paper presents a conceptual layered translation method connecting external governance standards (ISO/IEC 42001, NIST AI RMF, etc.) to four control layers and a runtime-enforceability rubric. It contains no equations, fitted parameters, or quantitative derivations. The central claim, that runtime guardrails apply only to observable, determinate, and time-sensitive controls, is introduced as a definitional distinction rather than derived from prior results within the paper. No self-citations, self-definitional loops, or renamings of known results are load-bearing; the method is grounded in the cited external standards and does not reduce to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: governance standards contain objectives that are sufficiently precise to be mapped to technical control layers without introducing new ambiguities.
invented entities (2)
- Control tuple: no independent evidence
- Runtime-enforceability rubric: no independent evidence
Forward citations
Cited by 1 Pith paper
- Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation: A TEE-backed architecture isolates security-critical decisions in self-hosted AI agents to prevent host-level abuse from malicious inputs while maintaining allowed functionality.
Reference graph
Works this paper leans on
- [1] ISO, “ISO/IEC 42001:2023 — Information technology — Artificial intelligence — Management system,” Dec. 2023. [Online]. Available: https://www.iso.org/standard/42001
- [2] ISO, “ISO/IEC 23894:2023 — Information technology — Artificial intelligence — Guidance on risk management,” 2023. [Online]. Available: https://www.iso.org/standard/77304.html
- [3] ISO, “ISO/IEC 42005:2025 — Information technology — Artificial intelligence (AI) — AI system impact assessment,” May 2025. [Online]. Available: https://www.iso.org/standard/42005
- [4] ISO, “ISO/IEC 5338:2023 — Information technology — Artificial intelligence — AI system life cycle processes,” 2023. [Online]. Available: https://www.iso.org/standard/81118.html
- [5] ISO, “ISO/IEC 38507:2022 — Information technology — Governance of IT — Governance implications of the use of artificial intelligence by organizations,” 2022. [Online]. Available: https://www.iso.org/standard/56641.html
- [6] NIST, “Artificial Intelligence Risk Management Framework (AI RMF 1.0),” NIST AI 100-1, Jan. 2023. doi: 10.6028/NIST.AI.100-1
- [7] NIST, “Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile,” NIST AI 600-1, Jul. 2024. doi: 10.6028/NIST.AI.600-1
- [8] NIST, “AI Agent Standards Initiative,” Center for AI Standards and Innovation (CAISI), Feb. 2026. [Online]. Available: https://www.nist.gov/caisi/ai-agent-standards-initiative
- [9] H. Booth, W. Fisher, R. Galluzzo, and J. Roberts, “Accelerating the Adoption of Software and Artificial Intelligence Agent Identity and Authorization,” Initial Public Draft, National Cybersecurity Center of Excellence, NIST, Feb. 2026. [Online]. Available: https://csrc.nist.gov/pubs/other/2026/02/05/accelerating-the-adoption-of-software-and-ai-agent/ipd
- [10] NASA, “NPR 7150.2D — NASA Software Engineering Requirements,” Office of the Chief Engineer, Mar. 2022. [Online]. Available: https://nodis3.gsfc.nasa.gov/displayDir.cfm?t=NPR&c=7150&s=2D
- [11] NASA, “NASA-STD-8739.8B — Software Assurance and Software Safety Standard,” Office of Safety and Mission Assurance, Sep. 2022. [Online]. Available: https://standards.nasa.gov/standard/nasa/nasa-std-87398
- [12] NASA, “NASA-HDBK-2203 — NASA Software Engineering and Assurance Handbook,” Office of the Chief Engineer, current public handbook and standards entry. [Online]. Available: https://swehb.nasa.gov/; https://standards.nasa.gov/standard/nasa/nasa-hdbk-2203
- [13] NIST, “SP 800-160 Vol. 2 Rev. 1 — Developing Cyber-Resilient Systems: A Systems Security Engineering Approach,” Dec. 2021. doi: 10.6028/NIST.SP.800-160v2r1
- [14] DARPA, “Assured Autonomy,” Information Innovation Office program summary. [Online]. Available: https://www.darpa.mil/research/programs/assured-autonomy
- [15] C. L. Wang, T. Singhal, A. Kelkar, and J. Tuo, “MI9: An Integrated Runtime Governance Framework for Agentic AI,” arXiv preprint arXiv:2508.03858, Nov. 2025.
- [16] G. Kholkar and R. Ahuja, “Policy-as-Prompt: Turning AI Governance Rules into Guardrails for AI Agents,” in Proc. 3rd Regulatable ML Workshop, NeurIPS 2025, arXiv preprint arXiv:2509.23994, Nov. 2025.
- [17] J. Mavracic, “Policy Cards: Machine-Readable Runtime Governance for Autonomous AI Agents,” arXiv preprint arXiv:2510.24383, Oct. 2025.
- [18] M. Kaptein, V.-J. Khan, and A. Podstavnychy, “Runtime Governance for AI Agents: Policies on Paths,” arXiv preprint arXiv:2603.16586, Mar. 2026.
- [19] Y. Dong, R. Mu, Y. Zhang, S. Sun, T. Zhang, C. Wu, G. Jin, Y. Qi, J. Hu, J. Meng, S. Bensalem, and X. Huang, “Safeguarding Large Language Models: A Survey,” arXiv preprint arXiv:2406.02622, Jun. 2024.
- [20] Y. Huang, H. Hua, Y. Zhou, P. Jing, M. Nagireddy, I. Padhi, G. Dolcetti, Z. Xu, S. Chaudhury, A. Rawat, L. Nedoshivina, P.-Y. Chen, P. Sattigeri, and X. Zhang, “Building a Foundational Guardrail for General Agentic Systems via Synthetic Data,” arXiv preprint arXiv:2510.09781, Oct. 2025.
- [21] Z. Zhang, S. Cui, Y. Lu, J. Zhou, J. Yang, H. Wang, and M. Huang, “Agent-SafetyBench: Evaluating the Safety of LLM Agents,” arXiv preprint arXiv:2412.14470, May 2025.
- [22] B. Zheng, Z. Liao, S. Salisbury, Z. Liu, M. Lin, Q. Zheng, Z. Wang, X. Deng, D. Song, H. Sun, and Y. Su, “WebGuard: Building a Generalizable Guardrail for Web Agents,” arXiv preprint arXiv:2507.14293, Jul. 2025.
- [23] A. Cartagena and A. Teixeira, “Mind the GAP: Text Safety Does Not Transfer to Tool-Call Safety in LLM Agents,” arXiv preprint arXiv:2602.16943, Feb. 2026.
- [24] Y. Mou, Z. Xue, L. Li, P. Liu, S. Zhang, W. Ye, and J. Shao, “ToolSafe: Enhancing Tool Invocation Safety of LLM-based Agents via Proactive Step-level Guardrail and Feedback,” arXiv preprint arXiv:2601.10156, Jan. 2026.
- [25] D. Liu et al., “AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security,” arXiv preprint arXiv:2601.18491, Jan. 2026.
- [26] X. Jin, M. Duan, Q. Lin, A. Chan, Z. Chen, J. Du, and X. Ren, “Proof-of-Guardrail in AI Agents and What (Not) to Trust from It,” arXiv preprint arXiv:2603.05786, Mar. 2026.
- [27] Y. Zhu et al., “Establishing Best Practices for Building Rigorous Agentic Benchmarks,” arXiv preprint arXiv:2507.02825, Aug. 2025.
- [28] F. B. Schneider, “Enforceable Security Policies,” ACM Transactions on Information and System Security, vol. 3, no. 1, pp. 30–50, Feb. 2000. doi: 10.1145/353323.353382
- [29] J. H. Saltzer and M. D. Schroeder, “The Protection of Information in Computer Systems,” Proceedings of the IEEE, vol. 63, no. 9, pp. 1278–1308, Sep. 1975. doi: 10.1109/PROC.1975.9939
discussion (0)