Incompleteness of AI Safety Verification via Kolmogorov Complexity

Munawar Hasan

arxiv: 2604.04876 · v1 · submitted 2026-04-06 · 💻 cs.AI

Incompleteness of AI Safety Verification via Kolmogorov Complexity

Munawar Hasan This is my paper

Pith reviewed 2026-05-10 20:10 UTC · model grok-4.3

classification 💻 cs.AI

keywords AI safetyverificationKolmogorov complexityincompletenesspolicy compliancecomputable verifiersformal methods

0 comments

The pith

No fixed formal verifier can certify all policy-compliant AI behaviors of arbitrary complexity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that verification of AI policy compliance runs into a hard limit from Kolmogorov complexity. For any fixed sound computably enumerable verifier, compliant behaviors whose shortest description exceeds a certain length cannot be certified as safe. This limit does not depend on available computing power or time. A reader would care because it means formal methods alone cannot guarantee safety across every possible compliant system, no matter how elaborate the checker.

Core claim

We formalize policy compliance as a verification problem over encoded system behaviors and prove an incompleteness result: for any fixed sound computably enumerable verifier, there exists a threshold beyond which true policy-compliant instances cannot be certified once their complexity exceeds that threshold. Consequently, no finite formal verifier can certify all policy-compliant instances of arbitrarily high complexity.

What carries the argument

Kolmogorov complexity of encoded behaviors, which bounds the shortest program that produces a given behavior and transfers its incompleteness property to any fixed sound verifier of compliance.

If this is right

Verification of AI safety is incomplete for all sufficiently complex compliant instances.
No amount of additional computational resources overcomes the certification gap for a fixed verifier.
Every finite verifier leaves some policy-compliant behaviors uncertified.
Proof-carrying approaches become necessary to supply instance-level guarantees instead of relying on a universal verifier.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Runtime or dynamic checks may be required alongside static verification to handle high-complexity cases.
Preferring policies whose compliant behaviors admit short descriptions could improve the reach of verification.
The same complexity-based incompleteness may limit verification of other AI properties such as fairness or robustness.

Load-bearing premise

Policy compliance can be formalized as a property of encoded behaviors such that the incompleteness of Kolmogorov complexity directly transfers to the inability of any fixed sound verifier to certify high-complexity compliant instances.

What would settle it

Exhibit one fixed sound computably enumerable verifier together with a family of policy-compliant behaviors whose Kolmogorov complexities increase without bound yet each is successfully certified by that verifier.

Figures

Figures reproduced from arXiv: 2604.04876 by Munawar Hasan.

**Figure 1.** Figure 1: Verification Paradigm: A fixed verifier T may fail to certify policy-compliant instances due to fundamental incompleteness, while proof-carrying approaches attach an instance-specific proof π enabling verifiable acceptance. correctness cannot be certified by any fixed formal verifier due to their intrinsic informational complexity [PITH_FULL_IMAGE:figures/full_fig_p012_1.png] view at source ↗

read the original abstract

Ensuring that artificial intelligence (AI) systems satisfy formal safety and policy constraints is a central challenge in safety-critical domains. While limitations of verification are often attributed to combinatorial complexity and model expressiveness, we show that they arise from intrinsic information-theoretic limits. We formalize policy compliance as a verification problem over encoded system behaviors and analyze it using Kolmogorov complexity. We prove an incompleteness result: for any fixed sound computably enumerable verifier, there exists a threshold beyond which true policy-compliant instances cannot be certified once their complexity exceeds that threshold. Consequently, no finite formal verifier can certify all policy-compliant instances of arbitrarily high complexity. This reveals a fundamental limitation of AI safety verification independent of computational resources, and motivates proof-carrying approaches that provide instance-level correctness guarantees.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper claims a Kolmogorov complexity threshold that stops fixed sound c.e. AI safety verifiers from certifying high-complexity compliant policies, but that threshold does not follow from the premises.

read the letter

The paper claims that for any fixed sound computably enumerable verifier of AI policy compliance, there is a complexity threshold past which it cannot certify truly compliant instances. If this were right, it would mean no finite verifier can cover all high-complexity safe behaviors, pushing the field toward proof-carrying methods. What the paper does well is to connect Kolmogorov complexity to the verification problem in AI safety. The abstract lays out the motivation clearly and draws a parallel to Gödel-style incompleteness, which is a reasonable way to frame the issue even if the details don't land. The main soft spot is that the claimed result does not appear to follow from the premises. Any infinite c.e. set must contain members of arbitrarily high Kolmogorov complexity, because there are only finitely many strings below any given complexity bound. A sound verifier that accepts an infinite collection of compliant policies should therefore be able to certify high-complexity ones as its enumeration proceeds. The abstract gives no indication that the certification process supplies short reconstructible descriptions or other structure that would cap the complexity. Without that, the incompleteness of Kolmogorov complexity does not force a threshold on what the verifier can handle. The full proof is not in the abstract, so perhaps the paper introduces a specific encoding of policies or verifiers that creates the bound. But based on the stated claim, it looks like the argument skips a necessary step. This paper is aimed at researchers thinking about foundational limits in AI safety and formal methods. A reader already familiar with computability theory might spot the issue quickly, but others could find the high-level idea useful for discussion. It deserves a serious referee because the topic is important and the claim, if repaired, could be worth publishing. I would send it out for peer review with the expectation that the authors clarify or fix the gap in the reasoning.

Referee Report

2 major / 0 minor

Summary. The manuscript claims to prove an incompleteness theorem for AI safety verification using Kolmogorov complexity. It formalizes policy compliance as a verification problem over encoded system behaviors and asserts that for any fixed sound computably enumerable verifier there exists a threshold beyond which true policy-compliant instances cannot be certified once their Kolmogorov complexity exceeds that threshold. Consequently, no finite formal verifier can certify all policy-compliant instances of arbitrarily high complexity. The result is presented as a fundamental, resource-independent limitation that motivates proof-carrying approaches.

Significance. If correct, the result would establish an intrinsic information-theoretic barrier to exhaustive formal verification of AI policies, independent of computational power, and would strengthen the case for proof-carrying or instance-specific certification methods. However, the central claim does not follow from the stated assumptions: standard properties of Kolmogorov complexity and c.e. sets show that infinite c.e. sets necessarily contain members of arbitrarily high complexity, so the incompleteness does not transfer to an inability to certify high-K compliant instances.

major comments (2)

[Abstract] Abstract: The abstract states that a proof exists but supplies no definitions of the verifier, no formalization of how policy compliance is encoded as a property over behaviors, and no derivation steps showing how Kolmogorov complexity properties yield a certification threshold. Without these elements the central claim cannot be checked for correctness or gaps.
[The claimed incompleteness result] The claimed incompleteness result: The assertion that a fixed sound c.e. verifier must fail to certify compliant instances above some K-threshold is inconsistent with the fact that every infinite c.e. set contains strings of arbitrarily high Kolmogorov complexity. Only finitely many strings have K(x) < n for any n, and the standard bound K(x) ≤ K(V) + K(index of x in the enumeration) + O(1) permits K(x) to grow unbounded as the index increases; no additional structure in the paper's encoding of certification is shown that would impose a hard threshold.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and valuable comments on our manuscript. We address the major comments point by point below, providing clarifications and indicating revisions where appropriate to strengthen the presentation.

read point-by-point responses

Referee: [Abstract] Abstract: The abstract states that a proof exists but supplies no definitions of the verifier, no formalization of how policy compliance is encoded as a property over behaviors, and no derivation steps showing how Kolmogorov complexity properties yield a certification threshold. Without these elements the central claim cannot be checked for correctness or gaps.

Authors: We agree that the abstract is highly condensed for length reasons and omits explicit definitions. In the full manuscript, the verifier is defined as a sound computably enumerable set of encoded behaviors (soundness meaning all certified instances are policy-compliant), policy compliance is formalized by mapping AI system behaviors to strings in a language, and the threshold is derived from the fixed Kolmogorov complexity of the verifier's description limiting its ability to cover all high-complexity compliant instances. We will revise the abstract to include brief definitions of the verifier and compliance encoding, along with a pointer to the formal theorem and proof. revision: yes
Referee: [The claimed incompleteness result] The claimed incompleteness result: The assertion that a fixed sound c.e. verifier must fail to certify compliant instances above some K-threshold is inconsistent with the fact that every infinite c.e. set contains strings of arbitrarily high Kolmogorov complexity. Only finitely many strings have K(x) < n for any n, and the standard bound K(x) ≤ K(V) + K(index of x in the enumeration) + O(1) permits K(x) to grow unbounded as the index increases; no additional structure in the paper's encoding of certification is shown that would impose a hard threshold.

Authors: The referee correctly recalls that any infinite c.e. set contains elements of arbitrarily high Kolmogorov complexity via the standard bound K(x) ≤ K(V) + K(index) + O(1). Our result does not claim that a fixed verifier certifies zero instances above a K-threshold; rather, it establishes that for any fixed sound c.e. verifier V, there exist policy-compliant instances whose Kolmogorov complexity exceeds any bound derived from K(V) and that V fails to certify. This holds because the full set of policy-compliant instances is not assumed to be c.e. and can contain members outside the enumeration produced by V's fixed finite description. The 'threshold' is the point at which the verifier's coverage necessarily becomes incomplete for the non-c.e. compliant set. We will add a clarifying paragraph distinguishing coverage of some high-K instances from exhaustive coverage of all high-K compliant instances and reference the relevant Kolmogorov complexity lemmas. revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation applies standard Kolmogorov complexity to verifier formalization

full rationale

The paper's central incompleteness claim is derived by formalizing policy compliance as a verification problem over encoded behaviors and then invoking the standard fact that Kolmogorov complexity is unbounded over any infinite computably enumerable set. No self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations appear in the abstract or described derivation chain; the result is presented as a direct transfer of an external mathematical property rather than a reduction to the paper's own inputs by construction. The formalization step is explicitly stated as an assumption and does not collapse into the conclusion.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper rests on standard properties of Kolmogorov complexity and computability theory; no free parameters, new entities, or ad-hoc axioms are introduced in the abstract.

axioms (2)

standard math Kolmogorov complexity is well-defined for strings and satisfies the standard incompressibility and invariance properties used in incompleteness arguments
Invoked to establish the existence of a complexity threshold beyond which certification fails
domain assumption A sound computably enumerable verifier exists as a fixed object whose own complexity bounds what it can certify
Central to transferring the complexity limit to the verification setting

pith-pipeline@v0.9.0 · 5410 in / 1413 out tokens · 57563 ms · 2026-05-10T20:10:17.919798+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

2 extracted references · 2 canonical work pages

[1]

In: 2018 IEEE Symposium on Security and Privacy (SP)

URLhttps://arxiv.org/abs/1606.06565. Limits of AI Safety Verification 17 E. Ben-Sasson et al. Scalable, transparent, and post-quantum secure computational integrity.IACR ePrint, 2018. G. J. Chaitin. Information-Theoretic Limitations of Formal Systems.Journal of the ACM (JACM), 21(3):403–424, 1974. J. M. Cohen, E. Rosenfeld, and J. Z. Kolter. Certified Adv...

work page doi:10.1109/sp.2018.00058 2018
[2]

ISBN 978-3-642-14623-7

Springer Berlin Heidelberg. ISBN 978-3-642-14623-7. K. G¨ odel. On formally undecidable propositions of principia mathematica and related systems i 1 (1931). Monatshefte f¨ ur Mathematik und Physik, 38:173–198, 1931. J. Groth. On the Size of Pairing-based Non-interactive Arguments. InEUROCRYPT, 2016. P. D. Gr¨ unwald.The Minimum Description Length Princip...

work page arXiv 1931

[1] [1]

In: 2018 IEEE Symposium on Security and Privacy (SP)

URLhttps://arxiv.org/abs/1606.06565. Limits of AI Safety Verification 17 E. Ben-Sasson et al. Scalable, transparent, and post-quantum secure computational integrity.IACR ePrint, 2018. G. J. Chaitin. Information-Theoretic Limitations of Formal Systems.Journal of the ACM (JACM), 21(3):403–424, 1974. J. M. Cohen, E. Rosenfeld, and J. Z. Kolter. Certified Adv...

work page doi:10.1109/sp.2018.00058 2018

[2] [2]

ISBN 978-3-642-14623-7

Springer Berlin Heidelberg. ISBN 978-3-642-14623-7. K. G¨ odel. On formally undecidable propositions of principia mathematica and related systems i 1 (1931). Monatshefte f¨ ur Mathematik und Physik, 38:173–198, 1931. J. Groth. On the Size of Pairing-based Non-interactive Arguments. InEUROCRYPT, 2016. P. D. Gr¨ unwald.The Minimum Description Length Princip...

work page arXiv 1931