arxiv: 2605.02175 · v2 · submitted 2026-05-04 · 💻 cs.AI

Recognition: 2 theorem links

· Lean Theorem

Intervention Complexity as a Canonical Reward and a Measure of Intelligence

Brendan McCane

Authors on Pith no claims yet

Pith reviewed 2026-05-12 00:46 UTC · model grok-4.3

classification 💻 cs.AI

keywords intervention complexitycanonical rewarduniversal intelligenceresource biascomputabilitylearning efficiencyagent competenceintelligence measure

0 comments

The pith

Intervention complexity provides a canonical reward that completes universal intelligence measures without external normative input.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that intervention complexity satisfies five key properties—environment-derivedness, universality, minimality, sensitivity, and achievement preference—making it a suitable universal reward. This approach allows for a family of such rewards depending on the chosen resource bias, such as program length or time. A sympathetic reader would care because it offers a principled way to assess general intelligence across all possible environments. The work also separates intelligence into how competently an agent performs and how efficiently it learns from experience. Finally, it shows that the choice of bias affects whether the measure can be computed at all.

Core claim

Intervention complexity is a universal reward with the five properties of environment-derivedness, universality, minimality, sensitivity, and achievement preference. This yields a family of canonical rewards indexed by resource bias, completing the universal intelligence measure without external normative input. Intelligence is characterised in two dimensions as agent competence relative to the optimum and learning efficiency in improving that competence. A separation theorem shows that resource bias determines computability, with action-count versions being polynomial-time computable and program-length versions uncomputable without an oracle, the gap quantifying learning's information-theo

What carries the argument

Intervention complexity, which quantifies the minimal complexity of actions or interventions to achieve goals in an environment and functions as the internal reward signal derived from the environment itself.

If this is right

A family of canonical rewards can be defined for any chosen resource bias such as action counts or program lengths.
Universal intelligence can be assessed in two dimensions of competence relative to optimum and efficiency of learning improvement.
The computability of the resulting intelligence measure depends directly on the resource bias selected.
The gap between oracle-assisted and bare versions quantifies the information-theoretic content of learning.
This supports analysis of superintelligence development and pre-training of universal agents.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Different resource biases could produce intelligence measures suited to distinct computational or physical constraints.
Agents might be designed or trained to directly optimize these intrinsic intervention-based rewards in place of hand-specified ones.
Empirical verification in controlled environments could test whether the five properties produce intuitive rankings of agent performance.
The approach might connect to other areas of complexity theory by treating resource functions as variable parameters.

Load-bearing premise

The five properties of environment-derivedness, universality, minimality, sensitivity, and achievement preference are sufficient to establish intervention complexity as the canonical reward and the separation theorem follows directly from the definitions.

What would settle it

A specific computation or proof showing that program-length intervention complexity without oracle access is computable, or the identification of another reward function satisfying the five properties but distinct from intervention complexity.

read the original abstract

The Legg--Hutter universal intelligence measure provides a rigorous scalar assessment of general intelligence as expected reward across all computable environments, weighted by simplicity. However, the measure presupposes an externally specified reward function, raising the question of whether the reward primitive is inherently arbitrary or whether a canonical choice exists. We propose a new measure, called intervention complexity, that has five natural properties: environment-derivedness, universality, minimality, sensitivity, and achievement preference. Given a resource function rho encoding an inductive bias (such as program length, execution time, or energy), rho-intervention complexity is a universal reward. The result yields a family of canonical rewards indexed by resource bias, providing a principled completion of the Legg--Hutter framework that does not require external normative input. We further propose a two-dimensional characterisation of intelligence: agent competence (how well the agent performs relative to the oracle optimum) and learning efficiency (how quickly this competence improves with experience). A separation theorem establishes that the choice of resource bias determines the computability of the resulting measure: action-count IC is computable in polynomial time, while program-length IC without oracle access is uncomputable, with the gap between oracle and bare IC precisely quantifying the information-theoretic content of learning. We discuss implications for superintelligence and for pre-training universal agents.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper defines intervention complexity with five properties to supply canonical rewards for the Legg-Hutter framework and adds a computability separation plus a two-dimensional intelligence split.

read the letter

The core new material is the intervention complexity measure, its five listed properties, the competence-versus-learning-efficiency split, and the theorem that ties the choice of resource function rho to whether the resulting intelligence measure is computable in polynomial time or requires an oracle. These pieces are not in the Legg-Hutter work the paper cites, so the contribution is genuinely additive on the formal side. The move to index rewards by different inductive biases (program length, time, energy) is a clean way to generate a family rather than a single arbitrary choice. The separation result, if it holds, gives a concrete way to quantify how much of intelligence is about learning versus raw search. That part is worth having on the table for anyone working on universal measures. The main soft spot is that the abstract only states the properties and the theorem without derivations, so it is not yet clear whether the five conditions are jointly sufficient to rule out other candidates or whether the computability gap follows directly once intervention and rho are defined. If the full text supplies the missing steps and checks that no obvious alternative satisfies the same list, the argument strengthens; if it does not, the canonicity claim stays more aspirational than demonstrated. The paper is written for people already inside the universal-intelligence literature who want to tighten the reward primitive. A reader who already knows Legg-Hutter or AIXI-style agents will see the point immediately and can judge the formal gaps for themselves. I would send it to a serious referee because the question it attacks is real and the proposed constructs are specific enough to be checked.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes rho-intervention complexity as a canonical reward that completes the Legg-Hutter universal intelligence measure. It defines this measure via a resource function rho encoding an inductive bias and claims that the construction satisfies five properties (environment-derivedness, universality, minimality, sensitivity, and achievement preference), thereby supplying a family of rewards without external normative input. The paper further introduces a two-dimensional intelligence characterization (competence relative to oracle optimum and learning efficiency) and states a separation theorem asserting that the choice of rho determines computability: action-count intervention complexity is polynomial-time computable, program-length intervention complexity is uncomputable without an oracle, and the oracle-to-bare gap quantifies the information-theoretic content of learning. Implications for superintelligence and pre-training universal agents are discussed.

Significance. If the formal derivations hold, the work would supply a principled internal completion of the Legg-Hutter framework by deriving the reward primitive from environment interactions and a chosen resource bias, thereby reducing arbitrariness in universal intelligence assessment. The resulting family of measures indexed by rho and the explicit computability separation could inform both theoretical analysis of agent learning and practical questions about scalable pre-training. The manuscript correctly credits its foundation in Legg-Hutter while adding independent structure.

major comments (3)

[Abstract] Abstract: The central claim that the five listed properties suffice to single out rho-intervention complexity as the canonical reward is asserted without a demonstration that the properties are jointly necessary and sufficient or that plausible alternatives (e.g., state-complexity or pure description-length variants) fail at least one property. A uniqueness or exclusion argument is load-bearing for the canonicity conclusion.
[Separation theorem] Separation theorem (stated in abstract and presumably proved in §4 or §5): The manuscript asserts that action-count IC is polynomial-time computable while program-length IC is uncomputable without oracle access and that the gap quantifies learning content, yet the abstract and visible outline provide no derivation showing that this follows directly from the definitions of intervention and rho without additional assumptions on the environment class or oracle model.
[§3] §3 (definitions of intervention complexity): The weakest assumption identified—that the five properties are sufficient for canonicity—remains unverified in the absence of an explicit proof that the measure is the unique (or minimal) construction satisfying all five simultaneously; without this, the claim that the framework requires no external normative input is not fully established.

minor comments (1)

[Abstract] Abstract: The term 'intervention complexity' is introduced without a one-sentence operational definition or toy example, which would improve immediate readability for readers unfamiliar with the Legg-Hutter setting.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback on our manuscript. We appreciate the recognition of its potential to complete the Legg-Hutter framework. We address each major comment below and indicate the revisions we will undertake.

read point-by-point responses

Referee: [Abstract] The central claim that the five listed properties suffice to single out rho-intervention complexity as the canonical reward is asserted without a demonstration that the properties are jointly necessary and sufficient or that plausible alternatives (e.g., state-complexity or pure description-length variants) fail at least one property. A uniqueness or exclusion argument is load-bearing for the canonicity conclusion.

Authors: We agree that an explicit uniqueness argument would strengthen the canonicity claim. In the revised manuscript we will insert a new subsection immediately after the statement of the five properties in §3. This subsection will contain a formal proof that rho-intervention complexity is the unique (up to equivalence) reward function satisfying all five properties simultaneously. We will also show that plausible alternatives, such as state-complexity rewards and pure description-length variants, each violate at least one property (minimality for the former, sensitivity for the latter). This addition will make rigorous the assertion that the construction requires no external normative input. revision: yes
Referee: [Separation theorem] The manuscript asserts that action-count IC is polynomial-time computable while program-length IC is uncomputable without oracle access and that the gap quantifies learning content, yet the abstract and visible outline provide no derivation showing that this follows directly from the definitions of intervention and rho without additional assumptions on the environment class or oracle model.

Authors: The separation theorem and its derivation appear in full in Section 5. The proof proceeds directly from the definitions: for action-count rho the intervention search space is finite and bounded by a polynomial in the size of the environment description, yielding polynomial-time computability; for program-length rho the problem is Turing-equivalent to the halting problem and therefore requires an oracle. The oracle-to-bare gap is shown to equal the Kolmogorov complexity of the optimal policy relative to the chosen rho, thereby quantifying the information-theoretic content of learning. We will revise the abstract to include a one-sentence outline of this derivation and add a clarifying paragraph in §5 that explicitly states the environment class (universal computable environments) and oracle model, both of which are the standard Legg-Hutter setting. revision: partial
Referee: [§3] The weakest assumption identified—that the five properties are sufficient for canonicity—remains unverified in the absence of an explicit proof that the measure is the unique (or minimal) construction satisfying all five simultaneously; without this, the claim that the framework requires no external normative input is not fully established.

Authors: We concur that an explicit uniqueness proof is needed to fully substantiate the claim. As described in our response to the abstract comment, the revised §3 will contain a dedicated proof establishing that the five properties jointly characterize rho-intervention complexity as the unique minimal construction satisfying them all. This will directly support the assertion that the reward is derived without external normative input. revision: yes

Circularity Check

0 steps flagged

No significant circularity: derivation self-contained from stated properties and definitions

full rationale

The paper defines rho-intervention complexity directly from the five listed properties (environment-derivedness, universality, minimality, sensitivity, achievement preference) and the resource function rho, then derives the family of canonical rewards and the separation theorem on computability from those definitions. No step reduces a claimed result to a fitted parameter, self-citation chain, or input by construction; the Legg-Hutter extension adds independent structure rather than renaming or presupposing the target measure. The framework is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 1 invented entities

The central claim rests on the definition of intervention complexity via five properties, the existence of a separation theorem, and the assumption that computable environments weighted by simplicity form a valid base as in Legg-Hutter; rho is treated as an indexing choice rather than a fitted constant.

free parameters (1)

resource function rho
The family of measures is indexed by choice of rho encoding inductive bias such as program length or execution time; no specific numerical value is fitted.

axioms (1)

domain assumption Computable environments exist and can be weighted by simplicity as in the Legg-Hutter measure
The proposal builds directly on the Legg-Hutter framework of expected reward over computable environments.

invented entities (1)

Intervention complexity no independent evidence
purpose: To act as an environment-derived canonical universal reward
Newly defined measure characterized by five properties and used to complete the prior framework.

pith-pipeline@v0.9.0 · 5522 in / 1423 out tokens · 64583 ms · 2026-05-12T00:46:25.829302+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel; dAlembert_to_ODE_general_theorem echoes

?

echoes
ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

ρ-intervention complexity is the minimum ρ-cost of a program achieving a given state transition... five natural properties... minimality: ICρμ depends on Iμ(s,s′) only through the minimum, discarding all suboptimal interventions.
IndisputableMonolith/Foundation/ArithmeticFromLogic.lean embed_injective; logicNat_initial refines

?

refines
Relation between the paper passage and the cited Recognition theorem.

The gap between bare and oracle IC quantifies the information content of learning the environment... separation theorem establishes that the choice of resource bias determines the computability of the resulting measure.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages · 1 internal anchor

[1]

C. H. Bennett. Logical depth and physical complexity. In R. Herken, editor,The Universal Turing Machine: A Half-Century Survey, pages 227–257. Oxford University Press, 1988

work page 1988
[2]

Bostrom.Superintelligence: Paths, Dangers, Strategies

N. Bostrom.Superintelligence: Paths, Dangers, Strategies. Oxford University Press, 2014

work page 2014
[3]

Bylander

T. Bylander. The computational complexity of propositional STRIPS planning.Artificial Intelligence, 69(1–2):165–204, 1994

work page 1994
[4]

F. Chollet. On the measure of intelligence.arXiv preprint arXiv:1911.01547, 2019

work page internal anchor Pith review arXiv 1911
[5]

Christiano, J

P. Christiano, J. Leike, T. B. Brown, M. Martic, S. Legg, and D. Amodei. Deep reinforcement learning from human preferences. InAdvances in Neural Information Processing Systems, 2017

work page 2017
[6]

M. K. Cohen, M. Hutter, and M. A. Osborne. Advanced artificial agents intervene in the provision of reward.AI Magazine, 43(3):282–293, 2022

work page 2022
[7]

K. Friston. The free-energy principle: a unified brain theory?Nature Reviews Neuroscience, 11(2):127–138, 2010

work page 2010
[8]

Hernández-Orallo.The Measure of All Minds: Evaluating Natural and Artificial Intelli- gence

J. Hernández-Orallo.The Measure of All Minds: Evaluating Natural and Artificial Intelli- gence. Cambridge University Press, 2017

work page 2017
[9]

Hernández-Orallo and D

J. Hernández-Orallo and D. L. Dowe. Measuring universal intelligence: Towards an anytime intelligence test.Artificial Intelligence, 174(18):1508–1539, 2010

work page 2010
[10]

Hutter.Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability

M. Hutter.Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Springer, 2005

work page 2005
[11]

A theory of universal artificial intelligence based on algorithmic complexity, 2000

Marcus Hutter. A theory of universal artificial intelligence based on algorithmic complexity, 2000

work page 2000
[12]

Chapman and Hall/CRC, 2024

Marcus Hutter, David Quarel, and Elliot Catt.An introduction to universal artificial intelligence. Chapman and Hall/CRC, 2024

work page 2024
[13]

Empowerment: A universal agent-centric measure of control

Alexander S Klyubin, Daniel Polani, and Chrystopher L Nehaniv. Empowerment: A universal agent-centric measure of control. In2005 ieee congress on evolutionary computation, volume 1, pages 128–135. IEEE, 2005

work page 2005
[14]

Legg and M

S. Legg and M. Hutter. Universal intelligence: A definition of machine intelligence.Minds and Machines, 17(4):391–444, 2007

work page 2007
[15]

Universal sequential search problems.Problems of information transmission, 9(3):265–266, 1973

Leonid A Levin. Universal sequential search problems.Problems of information transmission, 9(3):265–266, 1973

work page 1973
[16]

Springer, 2008

Ming Li, Paul Vitányi, et al.An introduction to Kolmogorov complexity and its applications, volume 3. Springer, 2008. 11

work page 2008
[17]

Natural emergent misalignment from reward hacking in production rl, 2025

Monte MacDiarmid, Benjamin Wright, Jonathan Uesato, Joe Benton, Jon Kutasov, Sara Price, Naia Bouscal, Sam Bowman, Trenton Bricken, Alex Cloud, et al. Natural emergent misalignment from reward hacking in production rl.arXiv preprint arXiv:2511.18397, 2025

work page arXiv 2025
[18]

A. Y. Ng, D. Harada, and S. Russell. Policy invariance under reward transformations: Theory and application to reward shaping. InInternational Conference on Machine Learning, pages 278–287, 1999

work page 1999
[19]

C. H. Papadimitriou and M. Yannakakis. On complexity as bounded rationality. In Proceedings of the 26th Annual ACM Symposium on Theory of Computing (STOC), pages 726–733, 1994

work page 1994
[20]

Pearl.Causality: Models, Reasoning, and Inference

J. Pearl.Causality: Models, Reasoning, and Inference. Cambridge University Press, 2nd edition, 2009

work page 2009
[21]

Rafailov, A

R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, and C. Finn. Direct preference optimization: Your language model is secretly a reward model. InAdvances in Neural Information Processing Systems, 2023

work page 2023
[22]

S. J. Russell. Rationality and intelligence.Artificial Intelligence, 94(1–2):57–77, 1997

work page 1997
[23]

Provably bounded-optimal agents.Journal of Artificial Intelligence Research, 2:575–609, 1994

Stuart J Russell and Devika Subramanian. Provably bounded-optimal agents.Journal of Artificial Intelligence Research, 2:575–609, 1994

work page 1994
[24]

Schmidhuber

J. Schmidhuber. Formal theory of creativity, fun, and intrinsic motivation (1990–2010). IEEE Transactions on Autonomous Mental Development, 2(3):230–247, 2010

work page 1990
[25]

H. A. Simon. A behavioral model of rational choice.The Quarterly Journal of Economics, 69(1):99–118, 1955

work page 1955
[26]

R. J. Solomonoff. A formal theory of inductive inference, parts 1 and 2.Information and Control, 7:1–22 and 224–254, 1964

work page 1964
[27]

D. H. Wolpert and W. G. Macready. No free lunch theorems for optimization.IEEE Transactions on Evolutionary Computation, 1(1):67–82, 1997. A Independence of the properties A.1 Properties of IC I claim that intervention complexity is an appropriate canonical, or universal, reward function. Following is a list of properties of IC, why the property is useful...

work page 1997
[28]

ρ(p, µ1, s1) = minp∈Iµ2 (s2,s′

work page
[29]

ρ(p, µ2, s2), we haveICρ µ1(s1, s′

work page
[30]

feasible

= ICρ µ2(s2, s′ 2). Consider R(µ, s, s′) = −H(Iµ(s, s′)), where H measures the richness of the intervention set—for instance, the number of distinct programs below some length threshold that achieve the transition. This rewards transitions that are achievable in many ways, regardless of whether the best way is efficient. It measures accessibility or robus...

work page
[31]

gated corridor

<IC ρℓ µ,U1(s, s′ 2)with a gap exceeding2 c, then underU2 each value shifts by at mostc, so the ordering is preserved. For (iii): if ICρt µ,U1(s, s′ 1)and ICρt µ,U1(s, s′ 2)differ by more than a polynomial factor, the polynomial simulation overhead cannot reverse the ordering. B Metric structure of IC We examine how IC behaves under composition of transit...

work page