Recognition: 2 theorem links
· Lean TheoremIntervention Complexity as a Canonical Reward and a Measure of Intelligence
Pith reviewed 2026-05-12 00:46 UTC · model grok-4.3
The pith
Intervention complexity provides a canonical reward that completes universal intelligence measures without external normative input.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Intervention complexity is a universal reward with the five properties of environment-derivedness, universality, minimality, sensitivity, and achievement preference. This yields a family of canonical rewards indexed by resource bias, completing the universal intelligence measure without external normative input. Intelligence is characterised in two dimensions as agent competence relative to the optimum and learning efficiency in improving that competence. A separation theorem shows that resource bias determines computability, with action-count versions being polynomial-time computable and program-length versions uncomputable without an oracle, the gap quantifying learning's information-theo
What carries the argument
Intervention complexity, which quantifies the minimal complexity of actions or interventions to achieve goals in an environment and functions as the internal reward signal derived from the environment itself.
If this is right
- A family of canonical rewards can be defined for any chosen resource bias such as action counts or program lengths.
- Universal intelligence can be assessed in two dimensions of competence relative to optimum and efficiency of learning improvement.
- The computability of the resulting intelligence measure depends directly on the resource bias selected.
- The gap between oracle-assisted and bare versions quantifies the information-theoretic content of learning.
- This supports analysis of superintelligence development and pre-training of universal agents.
Where Pith is reading between the lines
- Different resource biases could produce intelligence measures suited to distinct computational or physical constraints.
- Agents might be designed or trained to directly optimize these intrinsic intervention-based rewards in place of hand-specified ones.
- Empirical verification in controlled environments could test whether the five properties produce intuitive rankings of agent performance.
- The approach might connect to other areas of complexity theory by treating resource functions as variable parameters.
Load-bearing premise
The five properties of environment-derivedness, universality, minimality, sensitivity, and achievement preference are sufficient to establish intervention complexity as the canonical reward and the separation theorem follows directly from the definitions.
What would settle it
A specific computation or proof showing that program-length intervention complexity without oracle access is computable, or the identification of another reward function satisfying the five properties but distinct from intervention complexity.
read the original abstract
The Legg--Hutter universal intelligence measure provides a rigorous scalar assessment of general intelligence as expected reward across all computable environments, weighted by simplicity. However, the measure presupposes an externally specified reward function, raising the question of whether the reward primitive is inherently arbitrary or whether a canonical choice exists. We propose a new measure, called intervention complexity, that has five natural properties: environment-derivedness, universality, minimality, sensitivity, and achievement preference. Given a resource function rho encoding an inductive bias (such as program length, execution time, or energy), rho-intervention complexity is a universal reward. The result yields a family of canonical rewards indexed by resource bias, providing a principled completion of the Legg--Hutter framework that does not require external normative input. We further propose a two-dimensional characterisation of intelligence: agent competence (how well the agent performs relative to the oracle optimum) and learning efficiency (how quickly this competence improves with experience). A separation theorem establishes that the choice of resource bias determines the computability of the resulting measure: action-count IC is computable in polynomial time, while program-length IC without oracle access is uncomputable, with the gap between oracle and bare IC precisely quantifying the information-theoretic content of learning. We discuss implications for superintelligence and for pre-training universal agents.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes rho-intervention complexity as a canonical reward that completes the Legg-Hutter universal intelligence measure. It defines this measure via a resource function rho encoding an inductive bias and claims that the construction satisfies five properties (environment-derivedness, universality, minimality, sensitivity, and achievement preference), thereby supplying a family of rewards without external normative input. The paper further introduces a two-dimensional intelligence characterization (competence relative to oracle optimum and learning efficiency) and states a separation theorem asserting that the choice of rho determines computability: action-count intervention complexity is polynomial-time computable, program-length intervention complexity is uncomputable without an oracle, and the oracle-to-bare gap quantifies the information-theoretic content of learning. Implications for superintelligence and pre-training universal agents are discussed.
Significance. If the formal derivations hold, the work would supply a principled internal completion of the Legg-Hutter framework by deriving the reward primitive from environment interactions and a chosen resource bias, thereby reducing arbitrariness in universal intelligence assessment. The resulting family of measures indexed by rho and the explicit computability separation could inform both theoretical analysis of agent learning and practical questions about scalable pre-training. The manuscript correctly credits its foundation in Legg-Hutter while adding independent structure.
major comments (3)
- [Abstract] Abstract: The central claim that the five listed properties suffice to single out rho-intervention complexity as the canonical reward is asserted without a demonstration that the properties are jointly necessary and sufficient or that plausible alternatives (e.g., state-complexity or pure description-length variants) fail at least one property. A uniqueness or exclusion argument is load-bearing for the canonicity conclusion.
- [Separation theorem] Separation theorem (stated in abstract and presumably proved in §4 or §5): The manuscript asserts that action-count IC is polynomial-time computable while program-length IC is uncomputable without oracle access and that the gap quantifies learning content, yet the abstract and visible outline provide no derivation showing that this follows directly from the definitions of intervention and rho without additional assumptions on the environment class or oracle model.
- [§3] §3 (definitions of intervention complexity): The weakest assumption identified—that the five properties are sufficient for canonicity—remains unverified in the absence of an explicit proof that the measure is the unique (or minimal) construction satisfying all five simultaneously; without this, the claim that the framework requires no external normative input is not fully established.
minor comments (1)
- [Abstract] Abstract: The term 'intervention complexity' is introduced without a one-sentence operational definition or toy example, which would improve immediate readability for readers unfamiliar with the Legg-Hutter setting.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive feedback on our manuscript. We appreciate the recognition of its potential to complete the Legg-Hutter framework. We address each major comment below and indicate the revisions we will undertake.
read point-by-point responses
-
Referee: [Abstract] The central claim that the five listed properties suffice to single out rho-intervention complexity as the canonical reward is asserted without a demonstration that the properties are jointly necessary and sufficient or that plausible alternatives (e.g., state-complexity or pure description-length variants) fail at least one property. A uniqueness or exclusion argument is load-bearing for the canonicity conclusion.
Authors: We agree that an explicit uniqueness argument would strengthen the canonicity claim. In the revised manuscript we will insert a new subsection immediately after the statement of the five properties in §3. This subsection will contain a formal proof that rho-intervention complexity is the unique (up to equivalence) reward function satisfying all five properties simultaneously. We will also show that plausible alternatives, such as state-complexity rewards and pure description-length variants, each violate at least one property (minimality for the former, sensitivity for the latter). This addition will make rigorous the assertion that the construction requires no external normative input. revision: yes
-
Referee: [Separation theorem] The manuscript asserts that action-count IC is polynomial-time computable while program-length IC is uncomputable without oracle access and that the gap quantifies learning content, yet the abstract and visible outline provide no derivation showing that this follows directly from the definitions of intervention and rho without additional assumptions on the environment class or oracle model.
Authors: The separation theorem and its derivation appear in full in Section 5. The proof proceeds directly from the definitions: for action-count rho the intervention search space is finite and bounded by a polynomial in the size of the environment description, yielding polynomial-time computability; for program-length rho the problem is Turing-equivalent to the halting problem and therefore requires an oracle. The oracle-to-bare gap is shown to equal the Kolmogorov complexity of the optimal policy relative to the chosen rho, thereby quantifying the information-theoretic content of learning. We will revise the abstract to include a one-sentence outline of this derivation and add a clarifying paragraph in §5 that explicitly states the environment class (universal computable environments) and oracle model, both of which are the standard Legg-Hutter setting. revision: partial
-
Referee: [§3] The weakest assumption identified—that the five properties are sufficient for canonicity—remains unverified in the absence of an explicit proof that the measure is the unique (or minimal) construction satisfying all five simultaneously; without this, the claim that the framework requires no external normative input is not fully established.
Authors: We concur that an explicit uniqueness proof is needed to fully substantiate the claim. As described in our response to the abstract comment, the revised §3 will contain a dedicated proof establishing that the five properties jointly characterize rho-intervention complexity as the unique minimal construction satisfying them all. This will directly support the assertion that the reward is derived without external normative input. revision: yes
Circularity Check
No significant circularity: derivation self-contained from stated properties and definitions
full rationale
The paper defines rho-intervention complexity directly from the five listed properties (environment-derivedness, universality, minimality, sensitivity, achievement preference) and the resource function rho, then derives the family of canonical rewards and the separation theorem on computability from those definitions. No step reduces a claimed result to a fitted parameter, self-citation chain, or input by construction; the Legg-Hutter extension adds independent structure rather than renaming or presupposing the target measure. The framework is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- resource function rho
axioms (1)
- domain assumption Computable environments exist and can be weighted by simplicity as in the Legg-Hutter measure
invented entities (1)
-
Intervention complexity
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel; dAlembert_to_ODE_general_theorem echoes?
echoesECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
ρ-intervention complexity is the minimum ρ-cost of a program achieving a given state transition... five natural properties... minimality: ICρμ depends on Iμ(s,s′) only through the minimum, discarding all suboptimal interventions.
-
IndisputableMonolith/Foundation/ArithmeticFromLogic.leanembed_injective; logicNat_initial refines?
refinesRelation between the paper passage and the cited Recognition theorem.
The gap between bare and oracle IC quantifies the information content of learning the environment... separation theorem establishes that the choice of resource bias determines the computability of the resulting measure.
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
C. H. Bennett. Logical depth and physical complexity. In R. Herken, editor,The Universal Turing Machine: A Half-Century Survey, pages 227–257. Oxford University Press, 1988
work page 1988
-
[2]
Bostrom.Superintelligence: Paths, Dangers, Strategies
N. Bostrom.Superintelligence: Paths, Dangers, Strategies. Oxford University Press, 2014
work page 2014
- [3]
-
[4]
F. Chollet. On the measure of intelligence.arXiv preprint arXiv:1911.01547, 2019
work page internal anchor Pith review arXiv 1911
-
[5]
P. Christiano, J. Leike, T. B. Brown, M. Martic, S. Legg, and D. Amodei. Deep reinforcement learning from human preferences. InAdvances in Neural Information Processing Systems, 2017
work page 2017
-
[6]
M. K. Cohen, M. Hutter, and M. A. Osborne. Advanced artificial agents intervene in the provision of reward.AI Magazine, 43(3):282–293, 2022
work page 2022
-
[7]
K. Friston. The free-energy principle: a unified brain theory?Nature Reviews Neuroscience, 11(2):127–138, 2010
work page 2010
-
[8]
Hernández-Orallo.The Measure of All Minds: Evaluating Natural and Artificial Intelli- gence
J. Hernández-Orallo.The Measure of All Minds: Evaluating Natural and Artificial Intelli- gence. Cambridge University Press, 2017
work page 2017
-
[9]
J. Hernández-Orallo and D. L. Dowe. Measuring universal intelligence: Towards an anytime intelligence test.Artificial Intelligence, 174(18):1508–1539, 2010
work page 2010
-
[10]
Hutter.Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability
M. Hutter.Universal Artificial Intelligence: Sequential Decisions based on Algorithmic Probability. Springer, 2005
work page 2005
-
[11]
A theory of universal artificial intelligence based on algorithmic complexity, 2000
Marcus Hutter. A theory of universal artificial intelligence based on algorithmic complexity, 2000
work page 2000
-
[12]
Marcus Hutter, David Quarel, and Elliot Catt.An introduction to universal artificial intelligence. Chapman and Hall/CRC, 2024
work page 2024
-
[13]
Empowerment: A universal agent-centric measure of control
Alexander S Klyubin, Daniel Polani, and Chrystopher L Nehaniv. Empowerment: A universal agent-centric measure of control. In2005 ieee congress on evolutionary computation, volume 1, pages 128–135. IEEE, 2005
work page 2005
-
[14]
S. Legg and M. Hutter. Universal intelligence: A definition of machine intelligence.Minds and Machines, 17(4):391–444, 2007
work page 2007
-
[15]
Universal sequential search problems.Problems of information transmission, 9(3):265–266, 1973
Leonid A Levin. Universal sequential search problems.Problems of information transmission, 9(3):265–266, 1973
work page 1973
-
[16]
Ming Li, Paul Vitányi, et al.An introduction to Kolmogorov complexity and its applications, volume 3. Springer, 2008. 11
work page 2008
-
[17]
Natural emergent misalignment from reward hacking in production rl, 2025
Monte MacDiarmid, Benjamin Wright, Jonathan Uesato, Joe Benton, Jon Kutasov, Sara Price, Naia Bouscal, Sam Bowman, Trenton Bricken, Alex Cloud, et al. Natural emergent misalignment from reward hacking in production rl.arXiv preprint arXiv:2511.18397, 2025
-
[18]
A. Y. Ng, D. Harada, and S. Russell. Policy invariance under reward transformations: Theory and application to reward shaping. InInternational Conference on Machine Learning, pages 278–287, 1999
work page 1999
-
[19]
C. H. Papadimitriou and M. Yannakakis. On complexity as bounded rationality. In Proceedings of the 26th Annual ACM Symposium on Theory of Computing (STOC), pages 726–733, 1994
work page 1994
-
[20]
Pearl.Causality: Models, Reasoning, and Inference
J. Pearl.Causality: Models, Reasoning, and Inference. Cambridge University Press, 2nd edition, 2009
work page 2009
-
[21]
R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, and C. Finn. Direct preference optimization: Your language model is secretly a reward model. InAdvances in Neural Information Processing Systems, 2023
work page 2023
-
[22]
S. J. Russell. Rationality and intelligence.Artificial Intelligence, 94(1–2):57–77, 1997
work page 1997
-
[23]
Provably bounded-optimal agents.Journal of Artificial Intelligence Research, 2:575–609, 1994
Stuart J Russell and Devika Subramanian. Provably bounded-optimal agents.Journal of Artificial Intelligence Research, 2:575–609, 1994
work page 1994
-
[24]
J. Schmidhuber. Formal theory of creativity, fun, and intrinsic motivation (1990–2010). IEEE Transactions on Autonomous Mental Development, 2(3):230–247, 2010
work page 1990
-
[25]
H. A. Simon. A behavioral model of rational choice.The Quarterly Journal of Economics, 69(1):99–118, 1955
work page 1955
-
[26]
R. J. Solomonoff. A formal theory of inductive inference, parts 1 and 2.Information and Control, 7:1–22 and 224–254, 1964
work page 1964
-
[27]
D. H. Wolpert and W. G. Macready. No free lunch theorems for optimization.IEEE Transactions on Evolutionary Computation, 1(1):67–82, 1997. A Independence of the properties A.1 Properties of IC I claim that intervention complexity is an appropriate canonical, or universal, reward function. Following is a list of properties of IC, why the property is useful...
work page 1997
-
[28]
ρ(p, µ1, s1) = minp∈Iµ2 (s2,s′
-
[29]
ρ(p, µ2, s2), we haveICρ µ1(s1, s′
-
[30]
= ICρ µ2(s2, s′ 2). Consider R(µ, s, s′) = −H(Iµ(s, s′)), where H measures the richness of the intervention set—for instance, the number of distinct programs below some length threshold that achieve the transition. This rewards transitions that are achievable in many ways, regardless of whether the best way is efficient. It measures accessibility or robus...
-
[31]
<IC ρℓ µ,U1(s, s′ 2)with a gap exceeding2 c, then underU2 each value shifts by at mostc, so the ordering is preserved. For (iii): if ICρt µ,U1(s, s′ 1)and ICρt µ,U1(s, s′ 2)differ by more than a polynomial factor, the polynomial simulation overhead cannot reverse the ordering. B Metric structure of IC We examine how IC behaves under composition of transit...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.