A Quantitative Definition of Intelligence

Kang-Sin Choi

arXiv: 2604.10873 · v2 · submitted 2026-04-13 · 💻 cs.AI · cs.CC · cs.LG

Pith reviewed 2026-05-10 16:41 UTC · model grok-4.3

classification: 💻 cs.AI · cs.CC · cs.LG

keywords: intelligence definition · generalization · description length · memorization · Kolmogorov complexity · contextuality · semantics · Chinese Room

The pith

Intelligence density is the ratio of the logarithm of the number of independent correct outputs a system produces to its total description length; the measure distinguishes knowing from memorizing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper offers an operational definition of intelligence that applies to any physical system by comparing the number of distinct correct outputs it can generate against the length of its shortest description. A system memorizes when each new output requires adding to its description, but knows its domain when a single fixed description supports an ever-growing set of correct outputs through generalization. This places intelligence on a continuous scale from basic logic gates to brains. The definition also frames meaning as the choice and ordering of functions that yield specifiably correct results and introduces contextuality as the inverse of conditional description length given prior outputs. These elements together challenge the view that formal rules alone cannot produce meaningful behavior in domains where correctness can be checked.

Core claim

Intelligence density is defined as the ratio of the logarithm of the count of independent outputs to the total description length of the system. Systems know their domain when description length stays fixed while output count diverges via generalization; they memorize when description length must increase with each added output. Meaning over a domain is the selection and ordering of functions that produce correct outputs where correctness is specifiable. Contextuality of an output is the inverse of its conditional Kolmogorov complexity given prior outputs, which combines correctness and independence into one condition. The paper argues that this framework refutes the premise that syntax cannot suffice for meaning, over any domain where correctness is specifiable.
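The definitions in this claim can be written out compactly. The following is a sketch only: the symbols \rho, N, L, L_0, c, \kappa and y_n are notation assumed here for illustration and are not taken from the paper, and the linear-growth bound in the memorizing case is one example of "description length grows with output count", not the paper's stated condition.

```latex
% Assumed notation: S is a system, N(S) the number of its independent correct
% outputs, L(S) its total description length in bits.
\[
  \rho(S) \;=\; \frac{\log N(S)}{L(S)}
\]
% Memorizing: L(S) grows with N(S). For instance, if L(S) \ge c\,N(S), c > 0:
\[
  \rho(S) \;\le\; \frac{\log N(S)}{c\,N(S)} \;\longrightarrow\; 0
  \qquad \text{as } N(S) \to \infty .
\]
% Knowing: L(S) = L_0 stays fixed while N(S) diverges:
\[
  \rho(S) \;=\; \frac{\log N(S)}{L_0} \;\longrightarrow\; \infty
  \qquad \text{as } N(S) \to \infty .
\]
% Contextuality of the n-th output given the prior outputs, with K the
% conditional Kolmogorov complexity:
\[
  \kappa(y_n) \;=\; \frac{1}{K(y_n \mid y_1, \dots, y_{n-1})}
\]
```

Under these assumptions the two regimes separate asymptotically: the density of a memorizer tends to zero, while that of a fixed mechanism with unboundedly many independent correct outputs grows without bound.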

What carries the argument

Intelligence density ratio, which tracks whether a fixed description length can support diverging numbers of independent correct outputs through generalization rather than requiring longer descriptions for each new output.
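A toy contrast between a lookup table and a fixed rule may make the ratio concrete. This is a minimal sketch, not the paper's procedure: it assumes that the character count of a textual form is an acceptable stand-in for description length (true Kolmogorov complexity is uncomputable), it treats each distinct input's square as one independent correct output, and the names `RULE_SOURCE`, `lookup_table`, `description_bits_table`, `description_bits_rule`, and `density` are hypothetical.

```python
import math

# Hypothetical fixed mechanism: its source text never changes as outputs grow.
RULE_SOURCE = "lambda n: n * n"
rule = eval(RULE_SOURCE)  # build the function from the text we measure


def lookup_table(n_outputs):
    """Memorizer: stores every answer explicitly, one entry per output."""
    return {n: n * n for n in range(n_outputs)}


def description_bits_table(table):
    """Crude proxy (assumption): bits in the table's textual form."""
    return 8 * len(repr(table))


def description_bits_rule():
    """Crude proxy (assumption): bits in the rule's source text."""
    return 8 * len(RULE_SOURCE)


def density(n_outputs, description_bits):
    """log2(number of outputs) divided by description length in bits."""
    return math.log2(n_outputs) / description_bits


for n in (10, 100, 1000):
    table = lookup_table(n)
    # Both systems give the same correct answers on the covered inputs.
    assert all(rule(k) == table[k] for k in table)
    print(n,
          density(n, description_bits_table(table)),
          density(n, description_bits_rule()))
```

In this toy setting the table's density falls as outputs are added, because its description grows faster than the logarithm of the output count, while the rule's density rises because its description stays fixed.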

If this is right

Intelligence becomes comparable across any physical substrate, from silicon gates to biological brains, without favoring one medium.
Evaluation of systems shifts from counting stored facts to testing whether a fixed mechanism produces correct outputs over unbounded inputs.
Meaning becomes an operational property tied to producing specifiably correct outputs rather than an intrinsic feature of syntax or biology.
The Chinese Room argument is limited to domains where no objective correctness criterion can be stated.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Benchmarks for artificial systems could prioritize minimal core descriptions that still cover wide input ranges over raw scale of stored data.
The definition supplies a way to quantify how much context reduces the remaining description length needed for each new output.
Extensions to physical systems without obvious discrete outputs would require clear operational rules for counting independent correct behaviors.
If the ratio can be measured, it offers a substrate-neutral test for whether a mechanism has crossed from memorization into knowing.

Load-bearing premise

Independent outputs and total description length can be objectively identified and measured for arbitrary physical systems.

What would settle it

Demonstrating a concrete physical system where the number of independent correct outputs increases indefinitely while its minimal description length remains strictly constant.

Original abstract

We propose an operational, quantitative definition of intelligence for arbitrary physical systems. The intelligence density of a system is the ratio of the logarithm of its independent outputs to its total description length. A system memorizes if its description length grows with its output count; it knows if its description length remains fixed while its output count diverges. The criterion for knowing is generalization. A system knows its domain if a single finite mechanism can produce correct outputs across an unbounded range of inputs, rather than storing each answer individually. The definition places intelligence on a substrate-independent continuum from logic gates to brains. We then argue that meaning over a domain is a selection and ordering of functions that produces correct outputs where correctness is specifiable. We also define a measure of contextuality of an output as the inverse of its conditional Kolmogorov complexity given the context of prior outputs, which unifies correctness and independence into a single condition. Together, these refute Searle's third premise, that syntax is insufficient for semantics, over any domain where correctness is specifiable.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper offers a clean definitional package for intelligence via output count over Kolmogorov description length, but the Searle refutation is definitional and the measure cannot be applied objectively to physical systems.

The paper's central move is to define intelligence density as the logarithm of the number of independent outputs divided by the total description length of the system, with both quantities drawn from Kolmogorov complexity. A system memorizes when description length grows with output count; it knows when the length stays fixed while outputs diverge. The author treats this fixed-length case as generalization and then defines meaning as the selection of functions that produce specifiable correct outputs. The paper adds a contextuality measure as the inverse of conditional complexity and uses the whole package to argue that syntax suffices for semantics over any domain where correctness can be checked, thereby addressing Searle's third premise.

Referee Report

3 major / 3 minor

Summary. The manuscript proposes an operational, quantitative definition of intelligence applicable to arbitrary physical systems: intelligence density is the ratio of the logarithm of the number of independent outputs to the system's total description length (via Kolmogorov complexity). Systems memorize when description length grows with output count and know when length remains fixed while outputs diverge through generalization. It defines meaning as selection of functions producing specifiable correct outputs, introduces contextuality as the inverse of conditional Kolmogorov complexity, and uses these to refute Searle's third premise that syntax is insufficient for semantics in domains where correctness is specifiable. The definition aims to place intelligence on a substrate-independent continuum.

Significance. If the proposed measures could be made objective and applicable, the work would provide a formal bridge between computational complexity and philosophical questions about intelligence, generalization, and semantics, potentially enabling quantitative comparisons across logic gates, brains, and other systems. The explicit use of Kolmogorov complexity for independence and contextuality is a clear strength in attempting a parameter-free, substrate-neutral approach, though its practical impact depends on resolving measurability.

major comments (3)

[Definition of intelligence density] Definition of intelligence density (early sections): the ratio log(# independent outputs)/total description length is presented as operational for physical systems, but no canonical encoding of system state or behavior into strings is specified; different representations (e.g., I/O traces vs. internal mechanisms) yield different Kolmogorov complexities, rendering the fixed-vs-diverging distinction non-unique and non-objective.
[Refutation of Searle] Refutation of Searle's third premise (later sections): the claim that a single finite mechanism producing correct outputs across unbounded inputs refutes 'syntax insufficient for semantics' follows directly from redefining 'knowing' and 'meaning' as generalization under the proposed density measure; this makes the philosophical conclusion tautological with the definitions rather than an independent argument.
[Criterion for knowing] Operationality and measurability: Kolmogorov complexity is uncomputable and the paper provides no approximation procedure or restriction to computable cases; without this, the criterion for 'knowing' (fixed description length while outputs diverge) cannot be verified for any concrete physical device, undermining the claim that the definition is quantitative and applicable beyond abstract computation.

minor comments (3)

[Abstract/Introduction] The abstract and introduction would benefit from a concrete example (e.g., a finite automaton or lookup table vs. a rule-based system) illustrating how description length remains fixed while outputs increase.
[Contextuality definition] Notation for 'independent outputs' and 'contextuality' should be formalized with explicit equations or pseudocode to clarify how conditional Kolmogorov complexity unifies correctness and independence.
[Related work] Missing references to standard results on Kolmogorov complexity (e.g., Chaitin, Li & Vitányi) and prior quantitative intelligence measures (e.g., in algorithmic information theory or AIXI) would strengthen the positioning.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments. We address each major point below and have revised the manuscript where needed to improve clarity on encodings, philosophical implications, and practical considerations.

Referee: Definition of intelligence density (early sections): the ratio log(# independent outputs)/total description length is presented as operational for physical systems, but no canonical encoding of system state or behavior into strings is specified; different representations (e.g., I/O traces vs. internal mechanisms) yield different Kolmogorov complexities, rendering the fixed-vs-diverging distinction non-unique and non-objective.

Authors: We agree that Kolmogorov complexity is sensitive to the choice of reference machine or encoding. However, by the invariance theorem, the complexities assigned by any two universal reference machines differ by at most an additive constant, which does not affect the asymptotic distinction central to our definition: whether description length remains bounded or grows with output count. For physical systems we take the description length to be that of the shortest program, in a fixed reference language, that fully specifies the system's mechanism and its input-output behavior. We have added a subsection clarifying this canonical choice and its invariance properties.
revision: yes
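The additive-constant argument in this response is the standard invariance theorem of algorithmic information theory. The statement below is a sketch in assumed notation (U and V are universal reference machines, x_N is a string encoding the system together with its first N outputs); it is not quoted from the manuscript.

```latex
% Invariance theorem: for any two universal machines U and V there is a
% constant c_{U,V}, independent of x, such that
\[
  \bigl| K_U(x) - K_V(x) \bigr| \;\le\; c_{U,V}
  \qquad \text{for all strings } x .
\]
% Consequence for the bounded-versus-growing distinction:
\[
  \sup_N K_U(x_N) < \infty
  \;\iff\;
  \sup_N K_V(x_N) < \infty .
\]
```

The theorem covers the choice of reference machine. It does not by itself fix how a physical system is turned into the string x_N, which is the part of the referee's objection concerning I/O traces versus internal mechanisms.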
Referee: Refutation of Searle's third premise (later sections): the claim that a single finite mechanism producing correct outputs across unbounded inputs refutes 'syntax insufficient for semantics' follows directly from redefining 'knowing' and 'meaning' as generalization under the proposed density measure; this makes the philosophical conclusion tautological with the definitions rather than an independent argument.

Authors: The definitions are motivated independently by computational and information-theoretic considerations before being applied to Searle's argument. The refutation shows that, once intelligence and meaning are formalized in this way, a finite syntactic mechanism suffices for semantic correctness over unbounded inputs in domains where correctness is specifiable. This is a derived implication rather than a restatement. We have expanded the relevant section to separate the motivational justification of the definitions from the subsequent philosophical application.
revision: partial
Referee: Operationality and measurability: Kolmogorov complexity is uncomputable and the paper provides no approximation procedure or restriction to computable cases; without this, the criterion for 'knowing' (fixed description length while outputs diverge) cannot be verified for any concrete physical device, undermining the claim that the definition is quantitative and applicable beyond abstract computation.

Authors: We acknowledge that Kolmogorov complexity is uncomputable and that the manuscript did not previously discuss approximation methods. The definition remains quantitative in the theoretical sense of algorithmic information theory. In the revision we have added a dedicated paragraph noting that practical estimates can be obtained via standard compression algorithms (e.g., Lempel-Ziv or other universal compressors) and that analysis can be restricted to computable mechanisms when applying the criterion to physical devices. This preserves the formal proposal while addressing verifiability.
revision: yes
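The compression-based estimate mentioned in this response could look roughly like the following. This is a sketch under stated assumptions, not the authors' procedure: it uses Python's `zlib` (DEFLATE, an LZ77-family compressor) as the stand-in compressor, approximates conditional complexity as the extra compressed bytes an output costs once the context is known, and floors that value at 1 to avoid division by zero. The names `c`, `conditional_c`, and `contextuality` are hypothetical, and the proxy captures only predictability from context, not whether an output is correct.

```python
import hashlib
import zlib


def c(data: bytes) -> int:
    """Compressed length in bytes: a computable proxy for K(data)."""
    return len(zlib.compress(data, 9))


def conditional_c(x: bytes, context: bytes) -> int:
    """Proxy for K(x | context): extra bytes x costs after the context.

    Floored at 1 (an assumption of this sketch) so the inverse is defined.
    """
    return max(c(context + x) - c(context), 1)


def contextuality(x: bytes, context: bytes) -> float:
    """Inverse of the conditional-complexity proxy."""
    return 1.0 / conditional_c(x, context)


# Illustrative data: a repetitive context, one continuation that fits it,
# and one pseudo-random continuation that does not.
ctx = b"the cat sat on the mat. " * 20
fit = b"the cat sat on the mat. "
odd = hashlib.sha256(b"seed").digest()

print("fits context:", contextuality(fit, ctx))
print("unrelated:   ", contextuality(odd, ctx))
```

A real compressor only upper-bounds description length up to its own modeling limits, so estimates of this kind are indicative rather than exact.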

Circularity Check

1 step flagged

Intelligence density and meaning definitions render Searle refutation tautological by construction

specific steps

self-definitional [Abstract]
"A system memorizes if its description length grows with its output count; it knows if its description length remains fixed while its output count diverges. The criterion for knowing is generalization. A system knows its domain if a single finite mechanism can produce correct outputs across an unbounded range of inputs, rather than storing each answer individually. ... We then argue that meaning over a domain is a selection and ordering of functions that produces correct outputs where correctness is specifiable. ... Together, these refute Searle's third premise, that syntax is insufficient for ..."

The paper first defines 'knowing' as generalization (fixed description length, single finite mechanism for unbounded correct outputs). It then defines 'meaning' exactly as selection of functions producing such correct outputs, and states that these definitions together refute Searle. The refutation therefore reduces to a restatement of the initial definitions rather than an independent argument.

full rationale

The paper's central chain defines intelligence density via log(independent outputs)/description length (Kolmogorov-based), equates 'knowing' to fixed length with diverging outputs via finite mechanism, redefines meaning as production of specifiable correct outputs via such mechanisms, and concludes this refutes Searle. This reduces the philosophical claim directly to the chosen operational definitions without independent derivation or external verification.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The proposal rests on standard concepts from algorithmic information theory without introducing new free parameters or invented physical entities; the main load-bearing elements are the new definitions themselves.

axioms (1)

domain assumption: Kolmogorov complexity provides a meaningful, substrate-independent measure of description length and conditional complexity for outputs.
Invoked throughout the definition of intelligence density and contextuality.

invented entities (2)

intelligence density (no independent evidence)
purpose: To quantify intelligence as the ratio of the logarithm of the independent-output count to description length
Central new measure introduced in the abstract.
contextuality of an output (no independent evidence)
purpose: To unify correctness and independence via inverse conditional Kolmogorov complexity
Defined to support the meaning and semantics argument.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages

[1] Arora, S. & Barak, B. (2009). Computational Complexity: A Modern Approach. Cambridge University Press.
[2] Bender, E. M. & Koller, A. (2020). Climbing towards NLU: On meaning, form, and understanding in the age of data. Proc. ACL 2020, 5185–5198.
[3] Block, N. (1981). Psychologism and behaviorism. The Philosophical Review, 90(1), 5–43.
[4] Chalmers, D. J. (1995). Facing up to the problem of consciousness. Journal of Consciousness Studies, 2(3), 200–219.
[5] Chaitin, G. J. (1974). Information-theoretic limitations of formal systems. Journal of the ACM, 21(3), 403–424.
[6] Chase, W. G. & Simon, H. A. (1973). Perception in chess. Cognitive Psychology, 4(1), 55–81.
[7] Clark, A. & Chalmers, D. (1998). The extended mind. Analysis, 58(1), 7–19.
[8] Curry, H. B. (1934). Functionality in combinatory logic. Proceedings of the National Academy of Sciences, 20(11), 584–590.
[9] Dennett, D. C. (1987). Fast thinking. In The Intentional Stance, 324–337. MIT Press.
[10] Firth, J. R. (1957). A synopsis of linguistic theory, 1930–1955. In Studies in Linguistic Analysis (pp. 1–32). Philological Society.
[11] Fodor, J. A. (1975). The Language of Thought. Harvard University Press.
[12] Harnad, S. (1990). The symbol grounding problem. Physica D, 42(1–3), 335–346.
[13] Harris, Z. S. (1954). Distributional structure. Word, 10(2–3), 146–162.
[14] Hofstadter, D. R. (1979). Gödel, Escher, Bach: An Eternal Golden Braid. Basic Books.
[15] Howard, W. A. (1980). The formulae-as-types notion of construction. In J. P. Seldin & J. R. Hindley (Eds.), To H. B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism, 479–490. Academic Press. (Original manuscript 1969.)
[16] Hutter, M. (2005). Universal Artificial Intelligence. Springer.
[17] Kolmogorov, A. N. (1965). Three approaches to the quantitative definition of information. Problems of Information Transmission, 1(1), 1–7.
[18] Kripke, S. A. (1982). Wittgenstein on Rules and Private Language. Harvard University Press.
[19] Lambek, J. (1958). The mathematics of sentence structure. American Mathematical Monthly, 65(3), 154–170.
[20] Legg, S. & Hutter, M. (2007). Universal intelligence: A definition of machine intelligence. Minds and Machines, 17(4), 391–444.
[21] Li, M., Chen, X., Li, X., Ma, B. & Vitányi, P. M. B. (2004). The similarity metric. IEEE Transactions on Information Theory, 50(12), 3250–3264.
[22] Li, M. & Vitányi, P. M. B. (2019). An Introduction to Kolmogorov Complexity and Its Applications (4th ed.). Springer.
[23] Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 81–97.
[24] Montague, R. (1970). Universal grammar. Theoria, 36(3), 373–398.
[25] Nagel, T. (1974). What is it like to be a bat? The Philosophical Review, 83(4), 435–450.
[26] Plotkin, G. D. (1981). A structural approach to operational semantics. Technical Report DAIMI FN-19, Aarhus University. Reprinted in Journal of Logic and Algebraic Programming, 60–61, 17–139 (2004).
[27] Putnam, H. (1967). The nature of mental states. In W. H. Capitan & D. D. Merrill (Eds.), Art, Mind, and Religion, 37–48. University of Pittsburgh Press.
[28] Putnam, H. (1975). The meaning of "meaning". In K. Gunderson (Ed.), Language, Mind, and Knowledge (pp. 131–193). University of Minnesota Press.
[29] Putnam, H. (1988). Representation and Reality. MIT Press.
[30] Rissanen, J. (1978). Modeling by shortest data description. Automatica, 14(5), 465–471.
[31] Schmidhuber, J. (2003). The new AI: General & sound & relevant for physics. In B. Goertzel & C. Pennachin (Eds.), Artificial General Intelligence. Springer.
[32] Scott, D. (1970). Outline of a mathematical theory of computation. Programming Research Group Technical Monograph PRG-2, Oxford University Computing Laboratory.
[33] Searle, J. R. (1980). Minds, brains, and programs. Behavioral and Brain Sciences, 3(3), 417–457.
[34] Searle, J. R. (1992). The Rediscovery of the Mind. MIT Press.
[35] Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27(3), 379–423.
[36] Shannon, C. E. (1949). The synthesis of two-terminal switching circuits. Bell System Technical Journal, 28(1), 59–98.
[37] Seth, A. (2021). Being You: A New Science of Consciousness. Dutton.
[38] Smolensky, P. (1988). On the proper treatment of connectionism. Behavioral and Brain Sciences, 11(1), 1–23.
[39] Solomonoff, R. J. (1964). A formal theory of inductive inference. Information and Control, 7(1), 1–22.
[40] Tononi, G. (2004). An information integration theory of consciousness. BMC Neuroscience, 5, 42.
[41] Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460.
[42] Turing, A. M. (c. 1951). Letter to B. H. Worsley. Archives Centre, King's College, Cambridge.
[43] Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27(11), 1134–1142.
[44] Vapnik, V. N. (1998). Statistical Learning Theory. Wiley.
[45] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
[46] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q. V., & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 35, 24824–24837.