pith. machine review for the scientific record. sign in

arxiv: 2605.02175 · v2 · submitted 2026-05-04 · 💻 cs.AI

Recognition: 2 theorem links

· Lean Theorem

Intervention Complexity as a Canonical Reward and a Measure of Intelligence

Authors on Pith no claims yet

Pith reviewed 2026-05-12 00:46 UTC · model grok-4.3

classification 💻 cs.AI
keywords intervention complexitycanonical rewarduniversal intelligenceresource biascomputabilitylearning efficiencyagent competenceintelligence measure
0
0 comments X

The pith

Intervention complexity provides a canonical reward that completes universal intelligence measures without external normative input.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that intervention complexity satisfies five key properties—environment-derivedness, universality, minimality, sensitivity, and achievement preference—making it a suitable universal reward. This approach allows for a family of such rewards depending on the chosen resource bias, such as program length or time. A sympathetic reader would care because it offers a principled way to assess general intelligence across all possible environments. The work also separates intelligence into how competently an agent performs and how efficiently it learns from experience. Finally, it shows that the choice of bias affects whether the measure can be computed at all.

Core claim

Intervention complexity is a universal reward with the five properties of environment-derivedness, universality, minimality, sensitivity, and achievement preference. This yields a family of canonical rewards indexed by resource bias, completing the universal intelligence measure without external normative input. Intelligence is characterised in two dimensions as agent competence relative to the optimum and learning efficiency in improving that competence. A separation theorem shows that resource bias determines computability, with action-count versions being polynomial-time computable and program-length versions uncomputable without an oracle, the gap quantifying learning's information-theo

What carries the argument

Intervention complexity, which quantifies the minimal complexity of actions or interventions to achieve goals in an environment and functions as the internal reward signal derived from the environment itself.

If this is right

  • A family of canonical rewards can be defined for any chosen resource bias such as action counts or program lengths.
  • Universal intelligence can be assessed in two dimensions of competence relative to optimum and efficiency of learning improvement.
  • The computability of the resulting intelligence measure depends directly on the resource bias selected.
  • The gap between oracle-assisted and bare versions quantifies the information-theoretic content of learning.
  • This supports analysis of superintelligence development and pre-training of universal agents.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Different resource biases could produce intelligence measures suited to distinct computational or physical constraints.
  • Agents might be designed or trained to directly optimize these intrinsic intervention-based rewards in place of hand-specified ones.
  • Empirical verification in controlled environments could test whether the five properties produce intuitive rankings of agent performance.
  • The approach might connect to other areas of complexity theory by treating resource functions as variable parameters.

Load-bearing premise

The five properties of environment-derivedness, universality, minimality, sensitivity, and achievement preference are sufficient to establish intervention complexity as the canonical reward and the separation theorem follows directly from the definitions.

What would settle it

A specific computation or proof showing that program-length intervention complexity without oracle access is computable, or the identification of another reward function satisfying the five properties but distinct from intervention complexity.

read the original abstract

The Legg--Hutter universal intelligence measure provides a rigorous scalar assessment of general intelligence as expected reward across all computable environments, weighted by simplicity. However, the measure presupposes an externally specified reward function, raising the question of whether the reward primitive is inherently arbitrary or whether a canonical choice exists. We propose a new measure, called intervention complexity, that has five natural properties: environment-derivedness, universality, minimality, sensitivity, and achievement preference. Given a resource function rho encoding an inductive bias (such as program length, execution time, or energy), rho-intervention complexity is a universal reward. The result yields a family of canonical rewards indexed by resource bias, providing a principled completion of the Legg--Hutter framework that does not require external normative input. We further propose a two-dimensional characterisation of intelligence: agent competence (how well the agent performs relative to the oracle optimum) and learning efficiency (how quickly this competence improves with experience). A separation theorem establishes that the choice of resource bias determines the computability of the resulting measure: action-count IC is computable in polynomial time, while program-length IC without oracle access is uncomputable, with the gap between oracle and bare IC precisely quantifying the information-theoretic content of learning. We discuss implications for superintelligence and for pre-training universal agents.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes rho-intervention complexity as a canonical reward that completes the Legg-Hutter universal intelligence measure. It defines this measure via a resource function rho encoding an inductive bias and claims that the construction satisfies five properties (environment-derivedness, universality, minimality, sensitivity, and achievement preference), thereby supplying a family of rewards without external normative input. The paper further introduces a two-dimensional intelligence characterization (competence relative to oracle optimum and learning efficiency) and states a separation theorem asserting that the choice of rho determines computability: action-count intervention complexity is polynomial-time computable, program-length intervention complexity is uncomputable without an oracle, and the oracle-to-bare gap quantifies the information-theoretic content of learning. Implications for superintelligence and pre-training universal agents are discussed.

Significance. If the formal derivations hold, the work would supply a principled internal completion of the Legg-Hutter framework by deriving the reward primitive from environment interactions and a chosen resource bias, thereby reducing arbitrariness in universal intelligence assessment. The resulting family of measures indexed by rho and the explicit computability separation could inform both theoretical analysis of agent learning and practical questions about scalable pre-training. The manuscript correctly credits its foundation in Legg-Hutter while adding independent structure.

major comments (3)
  1. [Abstract] Abstract: The central claim that the five listed properties suffice to single out rho-intervention complexity as the canonical reward is asserted without a demonstration that the properties are jointly necessary and sufficient or that plausible alternatives (e.g., state-complexity or pure description-length variants) fail at least one property. A uniqueness or exclusion argument is load-bearing for the canonicity conclusion.
  2. [Separation theorem] Separation theorem (stated in abstract and presumably proved in §4 or §5): The manuscript asserts that action-count IC is polynomial-time computable while program-length IC is uncomputable without oracle access and that the gap quantifies learning content, yet the abstract and visible outline provide no derivation showing that this follows directly from the definitions of intervention and rho without additional assumptions on the environment class or oracle model.
  3. [§3] §3 (definitions of intervention complexity): The weakest assumption identified—that the five properties are sufficient for canonicity—remains unverified in the absence of an explicit proof that the measure is the unique (or minimal) construction satisfying all five simultaneously; without this, the claim that the framework requires no external normative input is not fully established.
minor comments (1)
  1. [Abstract] Abstract: The term 'intervention complexity' is introduced without a one-sentence operational definition or toy example, which would improve immediate readability for readers unfamiliar with the Legg-Hutter setting.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback on our manuscript. We appreciate the recognition of its potential to complete the Legg-Hutter framework. We address each major comment below and indicate the revisions we will undertake.

read point-by-point responses
  1. Referee: [Abstract] The central claim that the five listed properties suffice to single out rho-intervention complexity as the canonical reward is asserted without a demonstration that the properties are jointly necessary and sufficient or that plausible alternatives (e.g., state-complexity or pure description-length variants) fail at least one property. A uniqueness or exclusion argument is load-bearing for the canonicity conclusion.

    Authors: We agree that an explicit uniqueness argument would strengthen the canonicity claim. In the revised manuscript we will insert a new subsection immediately after the statement of the five properties in §3. This subsection will contain a formal proof that rho-intervention complexity is the unique (up to equivalence) reward function satisfying all five properties simultaneously. We will also show that plausible alternatives, such as state-complexity rewards and pure description-length variants, each violate at least one property (minimality for the former, sensitivity for the latter). This addition will make rigorous the assertion that the construction requires no external normative input. revision: yes

  2. Referee: [Separation theorem] The manuscript asserts that action-count IC is polynomial-time computable while program-length IC is uncomputable without oracle access and that the gap quantifies learning content, yet the abstract and visible outline provide no derivation showing that this follows directly from the definitions of intervention and rho without additional assumptions on the environment class or oracle model.

    Authors: The separation theorem and its derivation appear in full in Section 5. The proof proceeds directly from the definitions: for action-count rho the intervention search space is finite and bounded by a polynomial in the size of the environment description, yielding polynomial-time computability; for program-length rho the problem is Turing-equivalent to the halting problem and therefore requires an oracle. The oracle-to-bare gap is shown to equal the Kolmogorov complexity of the optimal policy relative to the chosen rho, thereby quantifying the information-theoretic content of learning. We will revise the abstract to include a one-sentence outline of this derivation and add a clarifying paragraph in §5 that explicitly states the environment class (universal computable environments) and oracle model, both of which are the standard Legg-Hutter setting. revision: partial

  3. Referee: [§3] The weakest assumption identified—that the five properties are sufficient for canonicity—remains unverified in the absence of an explicit proof that the measure is the unique (or minimal) construction satisfying all five simultaneously; without this, the claim that the framework requires no external normative input is not fully established.

    Authors: We concur that an explicit uniqueness proof is needed to fully substantiate the claim. As described in our response to the abstract comment, the revised §3 will contain a dedicated proof establishing that the five properties jointly characterize rho-intervention complexity as the unique minimal construction satisfying them all. This will directly support the assertion that the reward is derived without external normative input. revision: yes

Circularity Check

0 steps flagged

No significant circularity: derivation self-contained from stated properties and definitions

full rationale

The paper defines rho-intervention complexity directly from the five listed properties (environment-derivedness, universality, minimality, sensitivity, achievement preference) and the resource function rho, then derives the family of canonical rewards and the separation theorem on computability from those definitions. No step reduces a claimed result to a fitted parameter, self-citation chain, or input by construction; the Legg-Hutter extension adds independent structure rather than renaming or presupposing the target measure. The framework is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 1 invented entities

The central claim rests on the definition of intervention complexity via five properties, the existence of a separation theorem, and the assumption that computable environments weighted by simplicity form a valid base as in Legg-Hutter; rho is treated as an indexing choice rather than a fitted constant.

free parameters (1)
  • resource function rho
    The family of measures is indexed by choice of rho encoding inductive bias such as program length or execution time; no specific numerical value is fitted.
axioms (1)
  • domain assumption Computable environments exist and can be weighted by simplicity as in the Legg-Hutter measure
    The proposal builds directly on the Legg-Hutter framework of expected reward over computable environments.
invented entities (1)
  • Intervention complexity no independent evidence
    purpose: To act as an environment-derived canonical universal reward
    Newly defined measure characterized by five properties and used to complete the prior framework.

pith-pipeline@v0.9.0 · 5522 in / 1423 out tokens · 64583 ms · 2026-05-12T00:46:25.829302+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages · 1 internal anchor

  1. [1]

    C. H. Bennett. Logical depth and physical complexity. In R. Herken, editor,The Universal Turing Machine: A Half-Century Survey, pages 227–257. Oxford University Press, 1988

  2. [2]

    Bostrom.Superintelligence: Paths, Dangers, Strategies

    N. Bostrom.Superintelligence: Paths, Dangers, Strategies. Oxford University Press, 2014

  3. [3]

    Bylander

    T. Bylander. The computational complexity of propositional STRIPS planning.Artificial Intelligence, 69(1–2):165–204, 1994

  4. [4]

    F. Chollet. On the measure of intelligence.arXiv preprint arXiv:1911.01547, 2019

  5. [5]

    Christiano, J

    P. Christiano, J. Leike, T. B. Brown, M. Martic, S. Legg, and D. Amodei. Deep reinforcement learning from human preferences. InAdvances in Neural Information Processing Systems, 2017

  6. [6]

    M. K. Cohen, M. Hutter, and M. A. Osborne. Advanced artificial agents intervene in the provision of reward.AI Magazine, 43(3):282–293, 2022

  7. [7]

    K. Friston. The free-energy principle: a unified brain theory?Nature Reviews Neuroscience, 11(2):127–138, 2010

  8. [8]

    Hernández-Orallo.The Measure of All Minds: Evaluating Natural and Artificial Intelli- gence

    J. Hernández-Orallo.The Measure of All Minds: Evaluating Natural and Artificial Intelli- gence. Cambridge University Press, 2017

  9. [9]

    Hernández-Orallo and D

    J. Hernández-Orallo and D. L. Dowe. Measuring universal intelligence: Towards an anytime intelligence test.Artificial Intelligence, 174(18):1508–1539, 2010

  10. [10]

    Hutter.Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability

    M. Hutter.Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Springer, 2005

  11. [11]

    A theory of universal artificial intelligence based on algorithmic complexity, 2000

    Marcus Hutter. A theory of universal artificial intelligence based on algorithmic complexity, 2000

  12. [12]

    Chapman and Hall/CRC, 2024

    Marcus Hutter, David Quarel, and Elliot Catt.An introduction to universal artificial intelligence. Chapman and Hall/CRC, 2024

  13. [13]

    Empowerment: A universal agent-centric measure of control

    Alexander S Klyubin, Daniel Polani, and Chrystopher L Nehaniv. Empowerment: A universal agent-centric measure of control. In2005 ieee congress on evolutionary computation, volume 1, pages 128–135. IEEE, 2005

  14. [14]

    Legg and M

    S. Legg and M. Hutter. Universal intelligence: A definition of machine intelligence.Minds and Machines, 17(4):391–444, 2007

  15. [15]

    Universal sequential search problems.Problems of information transmission, 9(3):265–266, 1973

    Leonid A Levin. Universal sequential search problems.Problems of information transmission, 9(3):265–266, 1973

  16. [16]

    Springer, 2008

    Ming Li, Paul Vitányi, et al.An introduction to Kolmogorov complexity and its applications, volume 3. Springer, 2008. 11

  17. [17]

    Natural emergent misalignment from reward hacking in production rl, 2025

    Monte MacDiarmid, Benjamin Wright, Jonathan Uesato, Joe Benton, Jon Kutasov, Sara Price, Naia Bouscal, Sam Bowman, Trenton Bricken, Alex Cloud, et al. Natural emergent misalignment from reward hacking in production rl.arXiv preprint arXiv:2511.18397, 2025

  18. [18]

    A. Y. Ng, D. Harada, and S. Russell. Policy invariance under reward transformations: Theory and application to reward shaping. InInternational Conference on Machine Learning, pages 278–287, 1999

  19. [19]

    C. H. Papadimitriou and M. Yannakakis. On complexity as bounded rationality. In Proceedings of the 26th Annual ACM Symposium on Theory of Computing (STOC), pages 726–733, 1994

  20. [20]

    Pearl.Causality: Models, Reasoning, and Inference

    J. Pearl.Causality: Models, Reasoning, and Inference. Cambridge University Press, 2nd edition, 2009

  21. [21]

    Rafailov, A

    R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, and C. Finn. Direct preference optimization: Your language model is secretly a reward model. InAdvances in Neural Information Processing Systems, 2023

  22. [22]

    S. J. Russell. Rationality and intelligence.Artificial Intelligence, 94(1–2):57–77, 1997

  23. [23]

    Provably bounded-optimal agents.Journal of Artificial Intelligence Research, 2:575–609, 1994

    Stuart J Russell and Devika Subramanian. Provably bounded-optimal agents.Journal of Artificial Intelligence Research, 2:575–609, 1994

  24. [24]

    Schmidhuber

    J. Schmidhuber. Formal theory of creativity, fun, and intrinsic motivation (1990–2010). IEEE Transactions on Autonomous Mental Development, 2(3):230–247, 2010

  25. [25]

    H. A. Simon. A behavioral model of rational choice.The Quarterly Journal of Economics, 69(1):99–118, 1955

  26. [26]

    R. J. Solomonoff. A formal theory of inductive inference, parts 1 and 2.Information and Control, 7:1–22 and 224–254, 1964

  27. [27]

    D. H. Wolpert and W. G. Macready. No free lunch theorems for optimization.IEEE Transactions on Evolutionary Computation, 1(1):67–82, 1997. A Independence of the properties A.1 Properties of IC I claim that intervention complexity is an appropriate canonical, or universal, reward function. Following is a list of properties of IC, why the property is useful...

  28. [28]

    ρ(p, µ1, s1) = minp∈Iµ2 (s2,s′

  29. [29]

    ρ(p, µ2, s2), we haveICρ µ1(s1, s′

  30. [30]

    feasible

    = ICρ µ2(s2, s′ 2). Consider R(µ, s, s′) = −H(Iµ(s, s′)), where H measures the richness of the intervention set—for instance, the number of distinct programs below some length threshold that achieve the transition. This rewards transitions that are achievable in many ways, regardless of whether the best way is efficient. It measures accessibility or robus...

  31. [31]

    gated corridor

    <IC ρℓ µ,U1(s, s′ 2)with a gap exceeding2 c, then underU2 each value shifts by at mostc, so the ordering is preserved. For (iii): if ICρt µ,U1(s, s′ 1)and ICρt µ,U1(s, s′ 2)differ by more than a polynomial factor, the polynomial simulation overhead cannot reverse the ordering. B Metric structure of IC We examine how IC behaves under composition of transit...