arxiv: 2604.18652 · v2 · pith:FXWA44DAnew · submitted 2026-04-20 · 💻 cs.CR · cs.AI

From Craft to Kernel: A Governance-First Execution Architecture and Semantic ISA for Agentic Computers

Pith reviewed 2026-05-21 01:10 UTC · model grok-4.3

classification 💻 cs.CR cs.AI
keywords agentic AI · semantic ISA · taint propagation · execution kernel · microarchitectural security · governance architecture · probabilistic processing unit · dependency graph

The pith

Arbiter-K wraps probabilistic AI models in a deterministic kernel that uses a Semantic ISA for taint-based security enforcement.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper identifies the root cause of fragility in agentic AI as the practice of letting large language models run the control loop while adding only heuristic guardrails afterward. It proposes Arbiter-K as a governance-first architecture that treats the model as a Probabilistic Processing Unit inside a neuro-symbolic kernel. The kernel applies a Semantic ISA to convert probabilistic outputs into discrete instructions, builds an Instruction Dependency Graph at runtime, and performs active taint propagation from each reasoning node. This setup lets the system block unsafe actions at exact deterministic points such as high-risk tool calls or unauthorized network egress and supports automatic correction or rollback when policies trigger.

Core claim

Arbiter-K reconceptualizes the underlying model as a Probabilistic Processing Unit encapsulated by a deterministic neuro-symbolic kernel. It implements a Semantic ISA to reify probabilistic messages into discrete instructions, maintains a Security Context Registry, and constructs an Instruction Dependency Graph at runtime. Active taint propagation based on data-flow pedigree then enables precise interdiction of unsafe trajectories at deterministic sinks along with autonomous execution correction and architectural rollback when security policies activate.
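
The taint mechanism described above can be sketched in a few lines. This is a minimal illustration, not Arbiter-K's implementation: the class names, the `UNTRUSTED` label, the opcode strings, and the sink set are all assumptions made for the example. Each node inherits the union of its parents' taint labels, and a sink is refused when it carries untrusted taint.

```python
from dataclasses import dataclass, field

# Hypothetical label and sink opcodes; the paper's real label set is not given in this summary.
UNTRUSTED = "untrusted"
SINK_OPCODES = {"TOOL_CALL", "NET_EGRESS"}


@dataclass
class Instruction:
    """One discrete instruction node in the dependency graph."""
    ident: str
    opcode: str
    deps: list = field(default_factory=list)   # ids of the instructions this one reads from
    taint: set = field(default_factory=set)    # labels attached directly or inherited


class DependencyGraph:
    def __init__(self):
        self.nodes = {}

    def add(self, ident, opcode, deps=(), taint=()):
        """Insert a node; it inherits the union of its parents' taint labels."""
        inherited = set(taint)
        for dep in deps:
            inherited |= self.nodes[dep].taint
        node = Instruction(ident, opcode, list(deps), inherited)
        self.nodes[ident] = node
        return node

    def allowed(self, ident):
        """A sink instruction is blocked when it carries untrusted taint."""
        node = self.nodes[ident]
        return not (node.opcode in SINK_OPCODES and UNTRUSTED in node.taint)


graph = DependencyGraph()
graph.add("i1", "READ_WEB", taint={UNTRUSTED})   # content fetched from an untrusted page
graph.add("i2", "READ_USER")                     # the user's own request
graph.add("i3", "REASON", deps=["i1", "i2"])     # reasoning step that mixes both inputs
graph.add("i4", "TOOL_CALL", deps=["i3"])        # derived from tainted reasoning
graph.add("i5", "TOOL_CALL", deps=["i2"])        # derived from the user request only

print(graph.allowed("i4"))  # False
print(graph.allowed("i5"))  # True
```

Because nodes are added in execution order, propagation happens at insertion time and no separate graph traversal is needed in this sketch.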

What carries the argument

Semantic ISA that converts probabilistic model outputs into discrete instructions to support runtime construction of an Instruction Dependency Graph and taint propagation inside the governance kernel.
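
As a rough picture of what "converting outputs into discrete instructions" could look like, the sketch below decodes a model message into a typed instruction and fails closed on anything outside a fixed opcode set. The opcode names, the JSON message shape, and the `reify` function are hypothetical; the summary does not list the paper's actual instruction set or decoding rules.

```python
import json
from dataclasses import dataclass
from enum import Enum


class Opcode(Enum):
    # Hypothetical opcodes chosen for illustration only.
    REASON = "reason"
    TOOL_CALL = "tool_call"
    NET_EGRESS = "net_egress"
    RESPOND = "respond"


@dataclass(frozen=True)
class Instruction:
    opcode: Opcode
    operands: dict


class DecodeError(ValueError):
    """Raised when a model message does not map onto the instruction set."""


def reify(message: str) -> Instruction:
    """Turn a model message into one discrete instruction, or fail closed."""
    try:
        payload = json.loads(message)
    except json.JSONDecodeError as exc:
        raise DecodeError("message is not valid JSON") from exc
    if not isinstance(payload, dict):
        raise DecodeError("message must be a JSON object")
    try:
        opcode = Opcode(payload.get("op"))
    except ValueError as exc:
        raise DecodeError(f"unknown opcode: {payload.get('op')!r}") from exc
    operands = payload.get("args", {})
    if not isinstance(operands, dict):
        raise DecodeError("args must be a JSON object")
    return Instruction(opcode, operands)


inst = reify('{"op": "tool_call", "args": {"name": "send_email"}}')
print(inst.opcode)  # Opcode.TOOL_CALL
```

Rejecting undecodable messages rather than guessing keeps the downstream graph free of nodes whose meaning is unknown, at the cost of more refused steps.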

If this is right

  • Security enforcement shifts from heuristic patches to a built-in microarchitectural property of the execution environment.
  • Unsafe trajectories are intercepted at rates from 76% to 95%, for a 92.79% absolute improvement over native policies on OpenClaw and NanoBot.
  • Interdiction occurs at precise deterministic sinks such as high-risk tool calls or unauthorized network egress.
  • The kernel triggers autonomous execution correction and architectural rollback when security policies activate.
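
The rollback behavior in the last bullet can be illustrated with a snapshot-and-restore loop. This is a sketch under simplifying assumptions: the `Kernel` class, the `PolicyViolation` exception, and the dict-based state are invented for the example, and only in-memory state is restored (side effects already sent to the outside world would need separate handling).

```python
import copy


class PolicyViolation(Exception):
    """Raised by a step when a security policy fires."""


class Kernel:
    # Hypothetical kernel wrapper: state is a plain dict, steps are callables that mutate it.
    def __init__(self, state):
        self.state = state
        self.log = []

    def run_step(self, name, step):
        """Run one step; on a policy violation restore the pre-step snapshot."""
        snapshot = copy.deepcopy(self.state)
        try:
            step(self.state)
            self.log.append((name, "committed"))
            return True
        except PolicyViolation:
            self.state = snapshot
            self.log.append((name, "rolled back"))
            return False


def write_note(state):
    state["notes"].append("draft summary")


def exfiltrate(state):
    state["notes"].append("secrets staged for upload")  # partial effect before the policy fires
    raise PolicyViolation("unauthorized network egress")


kernel = Kernel({"notes": []})
kernel.run_step("write_note", write_note)
kernel.run_step("exfiltrate", exfiltrate)
print(kernel.state)  # {'notes': ['draft summary']}
```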

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same kernel structure could be applied to other probabilistic control systems that must interact with external tools or networks.
  • The discrete-instruction layer may eventually support formal verification of agent safety properties beyond runtime taint checks.
  • Adoption could allow existing agent platforms to gain deterministic guards without retraining the underlying models for each new policy.

Load-bearing premise

A Semantic ISA can translate probabilistic model outputs into discrete instructions while retaining enough information for accurate taint propagation and dependency graph construction without introducing new failure modes or excessive false positives.

What would settle it

A test on additional agent tasks that measures whether taint propagation either misses novel unsafe patterns or blocks safe operations at rates higher than the reported native-policy baseline.
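
Such a test needs two numbers per configuration: the share of unsafe trajectories that were blocked and the share of benign trajectories that were blocked by mistake. A minimal sketch of that bookkeeping follows; the function name and the toy labels are made up for illustration and are not data from the paper.

```python
def interception_metrics(results):
    """results: iterable of (is_unsafe, was_blocked) boolean pairs, one per trajectory."""
    results = list(results)  # allow generators, since the data is scanned twice
    unsafe = [blocked for is_unsafe, blocked in results if is_unsafe]
    benign = [blocked for is_unsafe, blocked in results if not is_unsafe]
    return {
        "unsafe_interception": sum(unsafe) / len(unsafe) if unsafe else None,
        "benign_block_rate": sum(benign) / len(benign) if benign else None,
    }


# Toy labels, invented for the example: 4 unsafe runs (3 blocked), 4 benign runs (1 blocked).
sample = [
    (True, True), (True, True), (True, True), (True, False),
    (False, False), (False, False), (False, False), (False, True),
]
metrics = interception_metrics(sample)
print(metrics)  # {'unsafe_interception': 0.75, 'benign_block_rate': 0.25}
```

Reporting both rates side by side makes it visible when a high interception figure is bought with a high rate of blocked safe operations.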

Figures

Figures reproduced from arXiv: 2604.18652 by Changran Xu, Fangxin Liu, Haomin Li, Jianrong Ding, Li Jiang, Lingjun Chen, Qiang Xu, Shu Chi, XiangYu Wen, Xiaoyu Xu, Yuang Zhao, Zeju Li.

Figure 1: An example of a “Semantic Deviation” where a subtle prompt injection in a multi-step task leads to an unauthorized tool call.
… apply IAM principles to gate externally visible actions. However, internal probabilistic transitions remain unverified because policies do not constrain the generation process. Integrated monitors including AgentSafe [10] and related frameworks [20] utilize risk taxonomies to m…
Figure 2: Cost profile under increasing task length: cumulative waste from repeated abort-and-retry behavior.
These results expose a limitation of orchestration-only safety mechanisms: they lack visibility into execution semantics. When model outputs remain opaque text, the system cannot attribute how earlier untrusted inputs shape later high-impact actions. Governance therefore remains reactive and typically act…
Figure 4: Architecture of Arbiter-K.
From Section 5.1 (Discrete Instruction Set Architecture): A neuro-symbolic architecture predicated on the analogy of a PPU necessitates a well-defined Instruction Set Architecture (ISA). The ISA serves as the formal contract that drives the PPU and supports the kernel runtime. As illustrated in …
Figure 5
… Figure 5, we arrange the ISA into five logical cores, where each governs a distinct functional domain of the agent runtime. These cores provide a structured framework for managing everything from probabilistic reasoning to deterministic safety enforcement. As summarized in …
Figure 6: Example and procedures of taint analysis.
… layer reifies abstract instructions into executable units by explicitly mapping implementation logic to specific instruction types and enforcing structural constraints, as presented in the following code snippet. By establishing these mappings through a dedicated binding interface, the architecture ensures that every instruction operates within a predefined fun…
Figure 7: Performance of Arbiter-K
Figure 9
Figure 9 shows that Arbiter-K’s security gains primarily come from its semantic policy layers rather than from host-specific rules alone. OpenClawPolicy by itself (O) preserves most benign executions but intercepts only 6.2% of unsafe cases, indicating that handcrafted host rules are insufficient as the main line of defense. In contrast, RelationalPolicy (R) and UnaryGatePolicy (U) substantially improve unsafe in…
Figure 8: Instruction coverage of Arbiter-K
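
The Figure 9 excerpt contrasts host-specific rules with semantic policy layers. One way to picture that layering is a list of independent checks where any single check can block an instruction. Everything in the sketch below is an assumption made for illustration: the dict-shaped instruction, the three stand-in functions, and their rules are hypothetical and are not the paper's OpenClawPolicy, UnaryGatePolicy, or RelationalPolicy.

```python
def host_rule(inst):
    """Stand-in for a handcrafted host rule: block one known-bad target."""
    return inst["target"] == "/etc/shadow"


def unary_gate(inst):
    """Stand-in for a unary check: judge the instruction by its opcode alone."""
    return inst["op"] == "net_egress"


def relational(inst):
    """Stand-in for a relational check: judge the opcode together with its provenance."""
    return inst["op"] == "tool_call" and "untrusted" in inst["taint"]


def blocked(inst, policies):
    """An instruction is blocked as soon as any policy in the stack objects."""
    return any(policy(inst) for policy in policies)


# Hypothetical instruction: an email tool call whose arguments derive from untrusted content.
inst = {"op": "tool_call", "target": "send_email", "taint": {"untrusted"}}

print(blocked(inst, [host_rule]))                          # False
print(blocked(inst, [host_rule, unary_gate, relational]))  # True
```

In this toy case the host rule alone misses the call because the target is not on its list, while the provenance-aware check catches it.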
Original abstract

The transition of agentic AI from brittle prototypes to production systems is stalled by a pervasive crisis of craft. We suggest that the prevailing orchestration paradigm (delegating the system control loop to large language models and merely patching with heuristic guardrails) is the root cause of this fragility. Instead, we propose Arbiter-K, a Governance-First execution architecture that reconceptualizes the underlying model as a Probabilistic Processing Unit encapsulated by a deterministic, neuro-symbolic kernel. Arbiter-K implements a Semantic Instruction Set Architecture (ISA) to reify probabilistic messages into discrete instructions. This allows the kernel to maintain a Security Context Registry and construct an Instruction Dependency Graph at runtime, enabling active taint propagation based on the data-flow pedigree of each reasoning node. By leveraging this mechanism, Arbiter-K precisely interdicts unsafe trajectories at deterministic sinks (e.g., high-risk tool calls or unauthorized network egress) and enables autonomous execution correction and architectural rollback when security policies are triggered. Evaluations on OpenClaw and NanoBot demonstrate that Arbiter-K enforces security as a microarchitectural property, achieving 76% to 95% unsafe interception for a 92.79% absolute gain over native policies. The code is publicly available at https://github.com/cure-lab/ArbiterOS.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes Arbiter-K, a governance-first execution architecture for agentic AI that treats the underlying LLM as a Probabilistic Processing Unit encapsulated by a deterministic neuro-symbolic kernel. It introduces a Semantic Instruction Set Architecture (ISA) to reify probabilistic outputs into discrete instructions, enabling runtime maintenance of a Security Context Registry and construction of an Instruction Dependency Graph for active taint propagation. This mechanism is claimed to interdict unsafe trajectories at deterministic sinks (e.g., high-risk tool calls) with autonomous correction and rollback. Evaluations on OpenClaw and NanoBot report 76%–95% unsafe interception rates and a 92.79% absolute gain over native policies; the code is released publicly.

Significance. If the central claims hold after proper validation, the work could meaningfully advance secure agentic systems by shifting from heuristic guardrails to microarchitectural enforcement of security properties via taint tracking and dependency graphs. Public code availability is a clear strength that enables reproducibility and community scrutiny.

major comments (2)
  1. [Evaluation] Evaluation section (results on OpenClaw and NanoBot): The abstract and results report concrete performance figures (76% to 95% interception, 92.79% absolute gain) yet supply no description of experimental setup, number of trials or trajectories tested, precise definition of “unsafe trajectories,” choice of baselines, or statistical analysis. This information is load-bearing for attributing the reported gains to the proposed architecture rather than test-specific conditions.
  2. [Architecture / Semantic ISA] Semantic ISA and Instruction Dependency Graph construction (core mechanism description): The central security claim requires that probabilistic LLM outputs are reliably discretized into instructions such that taint propagation and dependency-graph construction remain accurate. No concrete mapping rules, uncertainty-handling procedure, or analysis of information loss/false-positive rates during discretization are provided; without these, the microarchitectural security property does not demonstrably follow.
minor comments (1)
  1. [Abstract] Abstract: The term “native policies” is used for the baseline comparison but is not defined; a brief parenthetical clarification would improve readability.
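
Major comment 1 asks for trial counts and statistical analysis behind the interception rates. As one example of what such reporting could include, the sketch below computes a Wilson score interval for a proportion. The function name and the counts (45 of 50) are invented for illustration; the paper's trial counts are not stated in this summary.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half


# Made-up counts: 45 unsafe trajectories intercepted out of 50 attempted.
low, high = wilson_interval(45, 50)
print(f"{low:.3f} to {high:.3f}")
```

With small trial counts the interval is wide, which is why a bare percentage is hard to interpret without the number of trajectories behind it.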

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which identify key areas where additional detail will strengthen the manuscript. We respond to each major comment below and will make the indicated revisions.

Point-by-point responses
  1. Referee: [Evaluation] Evaluation section (results on OpenClaw and NanoBot): The abstract and results report concrete performance figures (76% to 95% interception, 92.79% absolute gain) yet supply no description of experimental setup, number of trials or trajectories tested, precise definition of “unsafe trajectories,” choice of baselines, or statistical analysis. This information is load-bearing for attributing the reported gains to the proposed architecture rather than test-specific conditions.

    Authors: We agree that the Evaluation section requires substantially more detail to support the reported figures. In the revised manuscript we will add a complete description of the experimental protocol, including the total number of trajectories and trials executed on OpenClaw and NanoBot, the precise definition of unsafe trajectories (those that violate any policy encoded in the Security Context Registry), the baseline configurations (native LLM policies without the Arbiter-K kernel), and the statistical analysis performed (including variance across runs and any confidence intervals). These additions will allow readers to evaluate whether the observed gains are attributable to the architecture. revision: yes

  2. Referee: [Architecture / Semantic ISA] Semantic ISA and Instruction Dependency Graph construction (core mechanism description): The central security claim requires that probabilistic LLM outputs are reliably discretized into instructions such that taint propagation and dependency-graph construction remain accurate. No concrete mapping rules, uncertainty-handling procedure, or analysis of information loss/false-positive rates during discretization are provided; without these, the microarchitectural security property does not demonstrably follow.

    Authors: We acknowledge that the current description of the Semantic ISA would benefit from greater concreteness. In the revision we will insert an explicit subsection that defines the mapping rules used to convert probabilistic LLM outputs into discrete instructions, describes the uncertainty-handling mechanisms (for example, confidence-threshold gating and fallback to conservative taint labels), and reports measured rates of information loss and false-positive discretization errors on the evaluation workloads. These additions will make the accuracy of subsequent taint propagation and Instruction Dependency Graph construction more transparent. revision: yes
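
The second response mentions confidence-threshold gating with fallback to conservative taint labels. A minimal sketch of that idea follows. The threshold value, the label names, and the `Decoded` record are assumptions for the example; the rebuttal names the approach but gives no parameters.

```python
from dataclasses import dataclass

# Hypothetical threshold and labels, chosen only for illustration.
CONFIDENCE_THRESHOLD = 0.8
CLEAN = "clean"
TAINTED = "tainted"


@dataclass
class Decoded:
    opcode: str
    confidence: float   # decoder's confidence that the opcode is right
    source_taint: str   # label inferred from the instruction's inputs


def gate(decoded: Decoded) -> str:
    """Return the taint label to record, falling back to TAINTED when unsure."""
    if decoded.confidence < CONFIDENCE_THRESHOLD:
        return TAINTED  # conservative fallback: a low-confidence decode is treated as tainted
    return decoded.source_taint


print(gate(Decoded("tool_call", 0.95, CLEAN)))  # clean
print(gate(Decoded("tool_call", 0.40, CLEAN)))  # tainted
```

The threshold is the trade-off knob in this sketch: raising it marks more instructions as tainted, which blocks more safe operations, while lowering it lets more uncertain decodes through with their inferred label.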

Circularity Check

0 steps flagged

No significant circularity in the proposed architecture and evaluation

Full rationale

The paper presents Arbiter-K as a new system design: a governance-first execution architecture that encapsulates a probabilistic model inside a deterministic neuro-symbolic kernel using a Semantic ISA. It describes runtime mechanisms for Security Context Registry, Instruction Dependency Graph, and taint propagation, then reports empirical interception rates (76-95%, 92.79% gain) from evaluations on OpenClaw and NanoBot. No equations, fitted parameters, or self-citation chains are shown that reduce the central security claims to inputs by construction. The results are positioned as outcomes of the architecture rather than self-referential derivations, making the work self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 4 invented entities

The proposal rests on several new architectural concepts introduced without upstream evidence in the abstract; no explicit free parameters are mentioned.

axioms (1)
  • Domain assumption: Probabilistic model outputs can be faithfully reified into discrete instructions via a Semantic ISA without critical loss of intent or safety-relevant information.
    This premise is required for the kernel to construct accurate dependency graphs and perform taint propagation.
invented entities (4)
  • Probabilistic Processing Unit (no independent evidence)
    purpose: Encapsulates the underlying LLM as a probabilistic component inside a deterministic kernel.
    New framing introduced to separate probabilistic reasoning from deterministic control.
  • Semantic Instruction Set Architecture (ISA) (no independent evidence)
    purpose: Reifies probabilistic messages into discrete instructions for the kernel.
    Core new mechanism enabling dependency graphs and taint tracking.
  • Security Context Registry (no independent evidence)
    purpose: Maintains context for security decisions across reasoning nodes.
    Supports active taint propagation.
  • Instruction Dependency Graph (no independent evidence)
    purpose: Tracks data-flow pedigree of each reasoning node at runtime.
    Enables precise interdiction of unsafe trajectories.



Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages · 6 internal anchors

  [1] Amazon. 2025. Amazon Bedrock AgentCore Policy: Control Agent-to-Tool Interactions. https://docs.aws.amazon.com/bedrock-agentcore/latest/devguide/policy.html
  [2] Anthropic. 2025. Equipping agents for the real world with Agent Skills. https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills
  [3] Xiaohe Bo, Zeyu Zhang, Quanyu Dai, Xueyang Feng, Lei Wang, Rui Li, Xu Chen, and Ji-Rong Wen. 2024. Reflective Multi-Agent Collaboration based on Large Language Models. In Proceedings of NeurIPS
  [4] Edoardo Debenedetti, Jie Zhang, Mislav Balunović, Luca Beurer-Kellner, Marc Fischer, and Florian Tramèr. 2024. AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ArXiv abs/2406.13352 (2024). https://api.semanticscholar.org/CorpusID:270619628
  [5] Hassen Dhrif. 2025. Reasoning-Aware Prompt Orchestration: A Foundation Model for Multi-Agent Language Model Coordination. ArXiv abs/2510.00326 (2025)
  [6] Invariant Labs. 2025. Invariant Guardrails. https://github.com/invariantlabs-ai/invariant
  [7] IronClaw Contributors. 2026. IronClaw: Your secure personal AI assistant, always on your side. https://github.com/nearai/ironclaw
  [8] Yang JingYi, Shuai Shao, Dongrui Liu, and Jing Shao. 2025. RiOSWorld: Benchmarking the Risk of Multimodal Computer-Use Agents. In NeurIPS
  [9] Edward Junprung. 2023. Exploring the Intersection of Large Language Models and Agent-Based Modeling via Prompt Engineering. ArXiv abs/2308.07411 (2023)
  [10] Rafflesia Khan, Declan Joyce, and Mansura Habiba. 2025. AGENTSAFE: A Unified Framework for Ethical Assurance and Governance in Agentic AI. ArXiv abs/2512.03180 (2025)
  [11] Thomas Kuntz, Agatha Duzan, Hao Zhao, Francesco Croce, J Zico Kolter, Nicolas Flammarion, and Maksym Andriushchenko. 2025. OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents. In NeurIPS
  [12] Puzhuo Liu, Chengnian Sun, Yaowen Zheng, Xuan Feng, Chuan Qin, Yuncheng Wang, Zhi Li, and Limin Sun. 2023. Harnessing the Power of LLM to Support Binary Taint Analysis. ArXiv abs/2310.08275 (2023)
  [13] Songyang Liu, Chaozhuo Li, Chenxu Wang, Jinyu Hou, Zejian Chen, Litian Zhang, Zheng Liu, Qiwei Ye, Yiming Hei, Xi Zhang, and Zhongyuan Wang. 2026. ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers. ArXiv abs/2603.24414 (2026)
  [14] Lingrui Mei, Jiayu Yao, Yuyao Ge, Yiwei Wang, Baolong Bi, Yujun Cai, Jiazhi Liu, Mingyu Li, Zhong-Zhi Li, Duzhen Zhang, Chenlin Zhou, Jiayi Mao, Tianze Xia, Jiafeng Guo, and Shenghua Liu. 2025. A Survey of Context Engineering for Large Language Models. ArXiv abs/2507.13334 (2025)
  [15] NanoBot Contributors. 2026. NanoBot: Ultra-Lightweight Personal AI Agent. https://github.com/HKUDS/nanobot
  [16] OpenClaw Contributors. 2026. OpenClaw: Open-Source AI Agent Runtime. https://github.com/openclaw/openclaw
  [17] Pranab Sahoo, Ayush Kumar Singh, Sriparna Saha, Vinija Jain, Samrat Sohel Mondal, and Aman Chadha. 2024. A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications. ArXiv abs/2402.07927 (2024)
  [18] The Prompt Report: A Systematic Survey of Prompt Engineering Techniques. Sander Schulhoff, Michael Ilie, Nishant Balepur, Konstantine Kahadze, Amanda Liu, Chenglei Si, Yinheng Li, Aayush Gupta, Hyo-Jung Han, Sevien Schulhoff, Pranav Sandeep Dulepet, Saurav Vidyadhara, Dayeon Ki, Sweta Agrawal, Chau Minh Pham, Gerson C. Kroiz, Feileen Li, Hudson Tao, Ashay Srivastava, Hevander Da Costa, Saloni Gupta, Megan L. Rogers, Inn...
  [19] Ava Spataru, Eric Hambro, Elena Voita, and Nicola Cancedda. 2024. Know When To Stop: A Study of Semantic Drift in Text Generation. In Proceedings of NAACL. 3656–3671
  [20] Charles L. Wang, Trisha Singhal, Ameya Kelkar, and Jason Tuo. 2025. MI9: An Integrated Runtime Governance Framework for Agentic AI. ArXiv abs/2508.03858 (2025)
  [21] Shuyue Wang, Runxin Xu, Zirui Zhu, Zekun Wu, Chen Zhang, Weize Liu, Zheyuan Liu, Yushi Qin, Yiran Yang, Yuan Zhang, et al. 2024. TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks. ArXiv abs/2405.06451 (2024)
  [22] Bin Xu. 2026. AI Agent Systems: Architectures, Applications, and Evaluation. ArXiv abs/2601.01743 (2026)
  [23] Hanrong Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, and Yongfeng Zhang. 2025. Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents. In ICLR
  [24] Zhexin Zhang, Shiyao Cui, Yida Lu, Jingzhuo Zhou, Junxiao Yang, Hongning Wang, and Minlie Huang. 2024. Agent-SafetyBench: Evaluating the Safety of LLM Agents. ArXiv abs/2412.14470 (2024). https://api.semanticscholar.org/CorpusID:274859514
  [25] Wei Zhao, Zhe Li, Peixin Zhang, and Jun Sun. 2026. ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection. ArXiv abs/2604.11790v1 (2026)