The Two Boundaries: Why Behavioral AI Governance Fails Structurally
Pith reviewed 2026-05-07 09:36 UTC · model grok-4.3
The pith
AI systems that perform effects must make their capability boundary identical to their governance boundary; otherwise risk and theater are inevitable.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that behavioral governance of effects in Turing-complete AI systems is undecidable in general by Rice's theorem, because no algorithm can determine whether an arbitrary program satisfies a non-trivial semantic property such as policy compliance. Coterminous governance is therefore required: the expressiveness boundary must equal the governance boundary. This equality is achieved only by an architectural separation of computation from effect, after which governance checks become part of the execution pipeline and subsume any separate governance infrastructure. The testable criterion follows directly: if the two boundaries are not provably identical, then ungoverned risk,
What carries the argument
Coterminous governance, the requirement that an AI system's expressiveness boundary (what effects it can produce) exactly equals its governance boundary, enforced by separating computation from effects so that policy checks are structural rather than behavioral.
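One way to picture the separation of computation from effect is sketched below. This is a minimal illustration, not the paper's construction: the names `Effect`, `EffectRuntime`, `UngovernedEffect`, and `plan` are hypothetical. The planning code returns effect descriptions as plain data, and a single runtime is the only place where anything is performed. Because each handler is registered together with its policy, the set of performable effects and the set of governed effects are the same dictionary keys by construction. Python itself cannot enforce that `plan` is pure; the sketch assumes that discipline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Effect:
    """A description of an action; constructing one performs nothing."""
    name: str
    payload: str


class UngovernedEffect(Exception):
    """Raised when a requested effect has no registered handler and policy."""


class EffectRuntime:
    """The only component that acts. Handlers and policies are registered
    as a pair, so nothing can be performed without a policy check."""

    def __init__(self) -> None:
        self._table: Dict[str, Tuple[Callable[[Effect], bool],
                                     Callable[[Effect], None]]] = {}
        self.log: List[str] = []

    def register(self, name: str,
                 policy: Callable[[Effect], bool],
                 handler: Callable[[Effect], None]) -> None:
        self._table[name] = (policy, handler)

    def run(self, effects: List[Effect]) -> None:
        for eff in effects:
            if eff.name not in self._table:
                raise UngovernedEffect(eff.name)
            policy, handler = self._table[eff.name]
            if policy(eff):          # the check is a step of execution itself
                handler(eff)
                self.log.append(f"ok:{eff.name}")
            else:
                self.log.append(f"denied:{eff.name}")


def plan(user_request: str) -> List[Effect]:
    """Computation side (assumed pure): it can only return Effect values."""
    return [Effect("db_write", user_request), Effect("send_email", "done")]
```

In this sketch an effect without a registered policy cannot be expressed at run time at all, which is the structural (rather than behavioral) character the summary attributes to coterminous governance.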
If this is right
- Any behavioral governance layer added after the fact on unrestricted programs will leave either ungoverned capabilities or policies that cover nothing.
- Governance checks must be moved inside the execution pipeline rather than run as a parallel system.
- Structural governance under separated computation and effect renders separate governance infrastructure redundant.
- The undecidability result applies to any attempt to decide non-trivial properties of effects in Turing-complete systems.
- Coterminous boundaries become the single measurable test for whether a governance approach avoids structural failure.
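The undecidability point in the list above can be made concrete with the standard reduction from the halting problem. The sketch below is illustrative only; `FORBIDDEN`, `make_agent`, and `halts_via_compliance_oracle` are hypothetical names, and the policy "never emit `FORBIDDEN`" stands in for any non-trivial compliance property. The agent emits the forbidden effect exactly when the wrapped program halts, so a total, always-correct compliance checker would also decide halting, which is impossible.

```python
from typing import Callable, List

FORBIDDEN = "delete_all_records"  # hypothetical forbidden effect name


def make_agent(program: Callable[[], None]) -> Callable[[], List[str]]:
    """Build an agent that emits FORBIDDEN exactly when `program` halts."""
    def agent() -> List[str]:
        program()            # may run forever
        return [FORBIDDEN]   # reached only if program() returned
    return agent


def halts_via_compliance_oracle(
    program: Callable[[], None],
    is_compliant: Callable[[Callable[[], List[str]]], bool],
) -> bool:
    """If `is_compliant` correctly decided 'this agent never emits FORBIDDEN'
    for every agent, this function would decide the halting problem."""
    return not is_compliant(make_agent(program))
```

A checker that simply runs the agent and inspects its output gives the right answer for halting programs but never returns on a looping one. That gap is what the reduction exploits: any checker that always terminates must be wrong on some agent.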
Where Pith is reading between the lines
- Restricting the effect-generating component to a non-Turing-complete language would remove the undecidability barrier and allow effective behavioral governance.
- System designers could verify coterminous boundaries by enumerating every possible effect and confirming that each is explicitly covered and that no policy addresses an impossible action.
- The same boundary-coincidence requirement may apply to other domains where programs produce external effects, such as operating-system access control or robotic action planning.
- In practice this would favor agent architectures whose action sets are declared and finite rather than generated on the fly by general computation.
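When the action set is declared and finite, the enumeration check suggested in the list above reduces to set comparison. The following is a minimal sketch under that assumption; the `Action` members, the policy names, and `boundary_report` are hypothetical and chosen only for illustration.

```python
from enum import Enum
from typing import Dict, Iterable, Set, Union


class Action(Enum):
    """A declared, finite action set (hypothetical members)."""
    API_CALL = "api_call"
    DB_WRITE = "db_write"
    TOOL_INVOKE = "tool_invoke"


def boundary_report(capabilities: Iterable[Action],
                    policies: Iterable[str]) -> Dict[str, Union[Set[str], bool]]:
    """Compare what the system can do with what the policies address."""
    caps = {a.value for a in capabilities}
    pols = set(policies)
    return {
        "governed": caps & pols,     # capabilities covered by a policy
        "risk": caps - pols,         # capabilities with no policy
        "theater": pols - caps,      # policies for actions that cannot occur
        "coterminous": caps == pols,
    }
```

For example, `boundary_report(Action, ["api_call", "db_write", "send_fax"])` would flag `tool_invoke` as ungoverned and `send_fax` as a policy without a matching capability. The check is only meaningful if the declared set really is the complete set of effects the system can produce.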
Load-bearing premise
The claim depends on modeling deployed AI effect-governance systems as arbitrary Turing-complete programs whose semantic compliance properties cannot be decided algorithmically after the fact.
What would settle it
A working deployed system that governs effects behaviorally on a Turing-complete architecture yet produces neither ungoverned risky effects nor policies that address impossible actions would falsify the claim.
Original abstract
Every system that performs effects has two boundaries: what it can do (expressiveness) and what governance covers (governance). In nearly all deployed AI systems, these boundaries are defined independently, creating three regions: governed capabilities (the only useful region), ungoverned capabilities (risk), and governance policies that address non-existent capabilities (theater). Two of the three regions are failure modes. We focus on the governance of effects: actions that AI systems perform in the world (API calls, database writes, tool invocations). This is distinct from the governance of model outputs (content quality, bias, fairness), which operates at a different level and requires different mechanisms. We present a formal framework for analyzing this structural gap. Rice's theorem (1953) proves the gap is undecidable in the general case for any Turing-complete architecture that attempts to govern effects behaviorally: no algorithm can decide non-trivial semantic properties of arbitrary programs, including the property "this program's effects comply with the governance policy." We define coterminous governance: a system property where the expressiveness boundary equals the governance boundary. We show that coterminous governance requires an architectural decision (separating computation from effect) rather than a governance layer added after the fact. We show that structural governance under this separation subsumes separate governance infrastructure: governance checks become part of the execution pipeline rather than a second system running alongside it. We propose coterminous governance as the testable criterion for any AI governance system: either the two boundaries are provably identical, or risk and theater are structurally inevitable. Proofs are mechanized in Coq (454 theorems, 36 modules, 0 admitted).
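The three regions described in the abstract can be written compactly in set notation. The symbols here are assumptions introduced for illustration, not the paper's own notation: $E$ is the set of effects the system can perform (the expressiveness boundary) and $G$ is the set of effects the governance policies address (the governance boundary).

```latex
\begin{align*}
\text{governed} &= E \cap G, &
\text{risk} &= E \setminus G, &
\text{theater} &= G \setminus E,
\end{align*}
\[
E = G \iff (E \setminus G = \emptyset) \,\wedge\, (G \setminus E = \emptyset).
\]
```

Read this way, coterminous governance is the statement that both failure regions are empty at once, which is why a single equality can serve as the proposed test.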
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that every AI system performing effects has two independently defined boundaries (expressiveness and governance), creating three regions of which two are structural failure modes (ungoverned risk and governance theater). It invokes Rice's theorem to prove that deciding non-trivial semantic properties such as policy-compliant effects is undecidable for any Turing-complete architecture attempting behavioral governance, defines coterminous governance as the property that the two boundaries coincide, shows this requires an architectural separation of computation from effect rather than a post-hoc layer, and mechanizes the framework in Coq (454 theorems, 36 modules, 0 admits).
Significance. If the reduction from deployed AI effect mechanisms to arbitrary Turing-complete programs holds, the result supplies a formal, testable criterion that subsumes many existing post-hoc governance proposals and explains why behavioral approaches are prone to either residual risk or ineffective theater. The Coq mechanization of 454 theorems with zero admits is a clear strength, providing machine-checked support for the undecidability argument and the derived architectural requirements.
major comments (2)
- §3.2 (Mapping to AI architectures): the claim that current tool-calling and API-effect mechanisms in deployed systems are sufficiently expressive to inherit the full undecidability of Rice's theorem is asserted via informal reduction; a concrete lemma or example showing how an arbitrary program is simulated by an LLM-plus-tool loop would make the application load-bearing rather than illustrative.
- Definition 4.1 (coterminous governance): the requirement that governance checks become part of the execution pipeline is derived from the undecidability result, yet the paper does not exhibit a formal statement showing that any post-hoc governance layer is necessarily non-coterminous; adding such a lemma would tighten the subsumption claim.
minor comments (2)
- Abstract: the three-region diagram is described in text but not referenced by figure number; adding '(see Figure 1)' would improve readability.
- §5.3: the statement that 'structural governance subsumes separate infrastructure' uses the term 'subsumes' without a precise set-theoretic or simulation relation; a short clarifying sentence would remove ambiguity.
Simulated Author's Rebuttal
We thank the referee for the insightful comments and the recommendation for minor revision. The suggestions to formalize the reduction and the non-coterminous property of post-hoc layers will improve the clarity and rigor of the paper. We outline our responses below and confirm that revisions will be made accordingly.
Point-by-point responses
- Referee: §3.2 (Mapping to AI architectures): the claim that current tool-calling and API-effect mechanisms in deployed systems are sufficiently expressive to inherit the full undecidability of Rice's theorem is asserted via informal reduction; a concrete lemma or example showing how an arbitrary program is simulated by an LLM-plus-tool loop would make the application load-bearing rather than illustrative.
  Authors: We concur that the mapping in §3.2 relies on an informal argument. In the revised version, we will provide a concrete example illustrating the simulation of an arbitrary Turing-complete program using an LLM with tool-calling capabilities, assuming tools that support persistent state and control flow. Furthermore, we will add a lemma in the Coq formalization that captures this simulation, building on the existing 454 theorems to make the inheritance of undecidability explicit and machine-checked.
  revision: yes
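The kind of simulation the authors promise can be sketched with a two-instruction counter (Minsky) machine, a model known to be Turing-complete when counters and step counts are unbounded. Everything below is an assumption made for illustration and is not taken from the paper: `CounterTool` stands in for a tool with persistent state, `agent_loop` stands in for the model's control flow (a deterministic controller replaces the LLM), and `ADD_PROGRAM` is a toy program that adds register `b` into register `a`.

```python
from typing import Dict, Tuple

# state -> ("inc", reg, next) or ("dec", reg, next_if_positive, next_if_zero)
ADD_PROGRAM: Dict[str, Tuple[str, ...]] = {
    "s0": ("dec", "b", "s1", "halt"),
    "s1": ("inc", "a", "s0"),
}


class CounterTool:
    """Hypothetical tool whose registers persist across calls."""

    def __init__(self, **initial: int) -> None:
        self.regs: Dict[str, int] = dict(initial)

    def inc(self, reg: str) -> str:
        self.regs[reg] += 1
        return "ok"

    def dec(self, reg: str) -> str:
        if self.regs[reg] == 0:
            return "zero"        # tool result the controller branches on
        self.regs[reg] -= 1
        return "ok"


def agent_loop(program: Dict[str, Tuple[str, ...]], tool: CounterTool,
               state: str = "s0", max_steps: int = 10_000) -> Dict[str, int]:
    """Pick the next tool call from the current state and the last result."""
    for _ in range(max_steps):
        if state == "halt":
            return tool.regs
        instr = program[state]
        if instr[0] == "inc":
            tool.inc(instr[1])
            state = instr[2]
        else:
            result = tool.dec(instr[1])
            state = instr[2] if result == "ok" else instr[3]
    raise TimeoutError("step budget exhausted")
```

The `max_steps` budget keeps the sketch safe to run, but it also makes this particular loop bounded. The undecidability argument applies to the idealized loop with no such budget, which is one place where the referee's request for an explicit lemma matters.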
- Referee: Definition 4.1 (coterminous governance): the requirement that governance checks become part of the execution pipeline is derived from the undecidability result, yet the paper does not exhibit a formal statement showing that any post-hoc governance layer is necessarily non-coterminous; adding such a lemma would tighten the subsumption claim.
  Authors: The referee correctly identifies that the derivation of coterminous governance from undecidability would benefit from an explicit lemma. We will add a new lemma stating that for any Turing-complete system, a post-hoc governance layer (operating externally on effects) cannot be coterminous with the expressiveness boundary, because it would necessitate an algorithm to decide non-trivial semantic properties of programs, contradicting Rice's theorem. This lemma will be mechanized in Coq and integrated into the definition of coterminous governance in the revised manuscript.
  revision: yes
Circularity Check
No significant circularity; derivation relies on external Rice's theorem and Coq mechanization
The paper grounds its core claim in Rice's theorem (1953), an independent external result on undecidability of non-trivial semantic properties for arbitrary programs, and mechanizes the mapping to behavioral AI governance effects in Coq (454 theorems, 36 modules, 0 admitted). Coterminous governance is defined directly from the two-boundary distinction and shown to require separation of computation from effect as a logical consequence of the undecidability result rather than by redefinition or fitting. No load-bearing step reduces to self-citation, ansatz smuggling, renaming of known results, or any input-output equivalence by construction within the paper itself. The derivation remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- Rice's theorem (standard math): non-trivial semantic properties of programs are undecidable for Turing-complete systems
invented entities (2)
- coterminous governance: no independent evidence
- three regions (governed capabilities, ungoverned capabilities, theater): no independent evidence