Incompleteness of AI Safety Verification via Kolmogorov Complexity
Pith reviewed 2026-05-10 20:10 UTC · model grok-4.3
The pith
No fixed formal verifier can certify all policy-compliant AI behaviors of arbitrary complexity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We formalize policy compliance as a verification problem over encoded system behaviors and prove an incompleteness result: for any fixed sound computably enumerable verifier, there exists a threshold beyond which true policy-compliant instances cannot be certified once their complexity exceeds that threshold. Consequently, no finite formal verifier can certify all policy-compliant instances of arbitrarily high complexity.
What carries the argument
Kolmogorov complexity of encoded behaviors, which bounds the shortest program that produces a given behavior and transfers its incompleteness property to any fixed sound verifier of compliance.
If this is right
- Verification of AI safety is incomplete for all sufficiently complex compliant instances.
- No amount of additional computational resources overcomes the certification gap for a fixed verifier.
- Every finite verifier leaves some policy-compliant behaviors uncertified.
- Proof-carrying approaches become necessary to supply instance-level guarantees instead of relying on a universal verifier.
Where Pith is reading between the lines
- Runtime or dynamic checks may be required alongside static verification to handle high-complexity cases.
- Preferring policies whose compliant behaviors admit short descriptions could improve the reach of verification.
- The same complexity-based incompleteness may limit verification of other AI properties such as fairness or robustness.
Load-bearing premise
Policy compliance can be formalized as a property of encoded behaviors such that the incompleteness of Kolmogorov complexity directly transfers to the inability of any fixed sound verifier to certify high-complexity compliant instances.
What would settle it
Exhibit one fixed sound computably enumerable verifier together with a family of policy-compliant behaviors whose Kolmogorov complexities increase without bound yet each is successfully certified by that verifier.
Figures
read the original abstract
Ensuring that artificial intelligence (AI) systems satisfy formal safety and policy constraints is a central challenge in safety-critical domains. While limitations of verification are often attributed to combinatorial complexity and model expressiveness, we show that they arise from intrinsic information-theoretic limits. We formalize policy compliance as a verification problem over encoded system behaviors and analyze it using Kolmogorov complexity. We prove an incompleteness result: for any fixed sound computably enumerable verifier, there exists a threshold beyond which true policy-compliant instances cannot be certified once their complexity exceeds that threshold. Consequently, no finite formal verifier can certify all policy-compliant instances of arbitrarily high complexity. This reveals a fundamental limitation of AI safety verification independent of computational resources, and motivates proof-carrying approaches that provide instance-level correctness guarantees.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims to prove an incompleteness theorem for AI safety verification using Kolmogorov complexity. It formalizes policy compliance as a verification problem over encoded system behaviors and asserts that for any fixed sound computably enumerable verifier there exists a threshold beyond which true policy-compliant instances cannot be certified once their Kolmogorov complexity exceeds that threshold. Consequently, no finite formal verifier can certify all policy-compliant instances of arbitrarily high complexity. The result is presented as a fundamental, resource-independent limitation that motivates proof-carrying approaches.
Significance. If correct, the result would establish an intrinsic information-theoretic barrier to exhaustive formal verification of AI policies, independent of computational power, and would strengthen the case for proof-carrying or instance-specific certification methods. However, the central claim does not follow from the stated assumptions: standard properties of Kolmogorov complexity and c.e. sets show that infinite c.e. sets necessarily contain members of arbitrarily high complexity, so the incompleteness does not transfer to an inability to certify high-K compliant instances.
major comments (2)
- [Abstract] Abstract: The abstract states that a proof exists but supplies no definitions of the verifier, no formalization of how policy compliance is encoded as a property over behaviors, and no derivation steps showing how Kolmogorov complexity properties yield a certification threshold. Without these elements the central claim cannot be checked for correctness or gaps.
- [The claimed incompleteness result] The claimed incompleteness result: The assertion that a fixed sound c.e. verifier must fail to certify compliant instances above some K-threshold is inconsistent with the fact that every infinite c.e. set contains strings of arbitrarily high Kolmogorov complexity. Only finitely many strings have K(x) < n for any n, and the standard bound K(x) ≤ K(V) + K(index of x in the enumeration) + O(1) permits K(x) to grow unbounded as the index increases; no additional structure in the paper's encoding of certification is shown that would impose a hard threshold.
Simulated Author's Rebuttal
We thank the referee for the careful reading and valuable comments on our manuscript. We address the major comments point by point below, providing clarifications and indicating revisions where appropriate to strengthen the presentation.
read point-by-point responses
-
Referee: [Abstract] Abstract: The abstract states that a proof exists but supplies no definitions of the verifier, no formalization of how policy compliance is encoded as a property over behaviors, and no derivation steps showing how Kolmogorov complexity properties yield a certification threshold. Without these elements the central claim cannot be checked for correctness or gaps.
Authors: We agree that the abstract is highly condensed for length reasons and omits explicit definitions. In the full manuscript, the verifier is defined as a sound computably enumerable set of encoded behaviors (soundness meaning all certified instances are policy-compliant), policy compliance is formalized by mapping AI system behaviors to strings in a language, and the threshold is derived from the fixed Kolmogorov complexity of the verifier's description limiting its ability to cover all high-complexity compliant instances. We will revise the abstract to include brief definitions of the verifier and compliance encoding, along with a pointer to the formal theorem and proof. revision: yes
-
Referee: [The claimed incompleteness result] The claimed incompleteness result: The assertion that a fixed sound c.e. verifier must fail to certify compliant instances above some K-threshold is inconsistent with the fact that every infinite c.e. set contains strings of arbitrarily high Kolmogorov complexity. Only finitely many strings have K(x) < n for any n, and the standard bound K(x) ≤ K(V) + K(index of x in the enumeration) + O(1) permits K(x) to grow unbounded as the index increases; no additional structure in the paper's encoding of certification is shown that would impose a hard threshold.
Authors: The referee correctly recalls that any infinite c.e. set contains elements of arbitrarily high Kolmogorov complexity via the standard bound K(x) ≤ K(V) + K(index) + O(1). Our result does not claim that a fixed verifier certifies zero instances above a K-threshold; rather, it establishes that for any fixed sound c.e. verifier V, there exist policy-compliant instances whose Kolmogorov complexity exceeds any bound derived from K(V) and that V fails to certify. This holds because the full set of policy-compliant instances is not assumed to be c.e. and can contain members outside the enumeration produced by V's fixed finite description. The 'threshold' is the point at which the verifier's coverage necessarily becomes incomplete for the non-c.e. compliant set. We will add a clarifying paragraph distinguishing coverage of some high-K instances from exhaustive coverage of all high-K compliant instances and reference the relevant Kolmogorov complexity lemmas. revision: partial
Circularity Check
No significant circularity; derivation applies standard Kolmogorov complexity to verifier formalization
full rationale
The paper's central incompleteness claim is derived by formalizing policy compliance as a verification problem over encoded behaviors and then invoking the standard fact that Kolmogorov complexity is unbounded over any infinite computably enumerable set. No self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations appear in the abstract or described derivation chain; the result is presented as a direct transfer of an external mathematical property rather than a reduction to the paper's own inputs by construction. The formalization step is explicitly stated as an assumption and does not collapse into the conclusion.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math Kolmogorov complexity is well-defined for strings and satisfies the standard incompressibility and invariance properties used in incompleteness arguments
- domain assumption A sound computably enumerable verifier exists as a fixed object whose own complexity bounds what it can certify
Reference graph
Works this paper leans on
-
[1]
In: 2018 IEEE Symposium on Security and Privacy (SP)
URLhttps://arxiv.org/abs/1606.06565. Limits of AI Safety Verification 17 E. Ben-Sasson et al. Scalable, transparent, and post-quantum secure computational integrity.IACR ePrint, 2018. G. J. Chaitin. Information-Theoretic Limitations of Formal Systems.Journal of the ACM (JACM), 21(3):403–424, 1974. J. M. Cohen, E. Rosenfeld, and J. Z. Kolter. Certified Adv...
-
[2]
Springer Berlin Heidelberg. ISBN 978-3-642-14623-7. K. G¨ odel. On formally undecidable propositions of principia mathematica and related systems i 1 (1931). Monatshefte f¨ ur Mathematik und Physik, 38:173–198, 1931. J. Groth. On the Size of Pairing-based Non-interactive Arguments. InEUROCRYPT, 2016. P. D. Gr¨ unwald.The Minimum Description Length Princip...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.