The New Associationism: Lessons from Deep Learning

Daniel Rothschild

arxiv: 2606.20600 · v1 · pith:D6M3DTZLnew · submitted 2026-05-19 · 💻 cs.AI · cs.LG

Pith reviewed 2026-06-30 18:30 UTC · model grok-4.3

keywords: associationism · supervised learning · deep learning · AI systems · human learning · error-driven learning · cognitive capacities

The pith

Supervised learning powers diverse AI systems and revives associationist accounts of cognition.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that modern AI successes, particularly through supervised learning driven by evaluative feedback, support a modest form of associationism. This approach relies on gradual, error-driven adjustments that work across many domains, from language to games. A sympathetic reader would see this as evidence against the idea that associationist mechanisms are too weak for human-like intelligence. The argument also acknowledges that current architectures exceed classical associationist ideas, treating supervised learning as a key component rather than the whole story.

Core claim

Supervised learning, which relies on evaluative feedback, underlies a wide range of contemporary AI systems including large language models and game-playing agents. These systems differ mainly in the effort needed to generate feedback signals. This finding supports associationist principles of uniform, gradual, error-driven learning mechanisms that operate across different domains and weakens arguments that such mechanisms cannot explain human cognitive abilities.
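
To make "effort needed to generate feedback signals" concrete, here is a minimal sketch (not taken from the paper) of the cheapest case. In next-token prediction the supervised target is simply the next item in the raw sequence, so no separate labeling step is needed. The function name `next_token_pairs` and the toy sentence are illustrative assumptions.

```python
def next_token_pairs(tokens, context_size):
    """Turn a raw token sequence into (context, target) training pairs.

    Hypothetical helper for illustration: the "label" for each context
    is just the token that follows it in the data.
    """
    pairs = []
    for i in range(context_size, len(tokens)):
        context = tuple(tokens[i - context_size:i])
        target = tokens[i]
        pairs.append((context, target))
    return pairs


# Toy data (an assumption, not from the paper).
tokens = "the cat sat on the mat".split()
pairs = next_token_pairs(tokens, context_size=2)
for context, target in pairs:
    print(context, "->", target)
```

A game-playing agent, by contrast, would have to play out games or run a search before a comparable target exists, which is one way to read the claim that the systems differ in how much work the feedback signal takes to produce.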

What carries the argument

Supervised learning as an error-driven feedback mechanism that operates within advanced computational architectures.
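
As a rough illustration of what "error-driven" means here, the sketch below uses the classic delta rule (also known as the Widrow-Hoff or LMS rule) on a one-variable linear model. It is a stand-in chosen for brevity, not the paper's method and not how any particular deep learning system is implemented; the function name, learning rate, and toy data are assumptions.

```python
def train_delta_rule(examples, learning_rate=0.1, epochs=200):
    """Fit y = w*x + b by repeated small error-driven updates.

    Each step compares the prediction with the target and nudges the
    parameters in proportion to the error (the evaluative feedback).
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in examples:
            prediction = w * x + b
            error = target - prediction      # feedback signal
            w += learning_rate * error * x   # gradual adjustment of the weight
            b += learning_rate * error       # gradual adjustment of the bias
    return w, b


# Toy data generated by y = 2x + 1 (an assumption for illustration).
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train_delta_rule(examples)
print(w, b)
```

With these settings the learned values should approach w = 2 and b = 1. The same pattern, predict, measure the error, adjust slightly, is what gradient-based training scales up inside much richer architectures.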

If this is right

Associationist learning mechanisms can handle complex, domain-spanning tasks in AI.
Human cognitive capacities may be achievable through gradual error correction rather than specialized innate structures.
Deep learning architectures extend beyond but incorporate classical associationist processes.
AI systems demonstrate that limited feedback mechanisms suffice for broad learning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Future research could test whether human learning relies on feedback-generation processes similar to those used in AI.
Architectural innovations in AI might suggest new ways to model human cognition beyond pure associationism.
Limitations in current AI could point to missing elements in associationist theories.

Load-bearing premise

That the success of AI systems provides evidence for the mechanisms that suffice for human cognitive capacities.

What would settle it

Demonstration of a human cognitive task that no supervised learning system with any architecture can perform.

Original abstract

What can the success of modern AI tell us about how humans learn? This paper argues that taking AI seriously as a model of human learning supports a modest but genuine associationism. The central finding is that supervised learning -- learning driven by evaluative feedback -- underlies a surprisingly wide range of contemporary AI systems, from large language models to game-playing agents, differing primarily in how much work is required to generate the relevant feedback signal. This vindicates associationist ideals of a uniform, gradual, error-driven learning mechanism operating across domains, and defuses the once-influential argument that associationist mechanisms are too limited to account for human cognitive capacities. At the same time, the successes of deep learning depend on computational architectures that go well beyond anything classical associationists envisaged, and supervised learning operates within these as one component rather than a complete account of learning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper reads modern AI as evidence for associationism, but the step from artificial success to human mechanisms is asserted rather than argued.

The main point is that supervised learning powers a broad set of current AI systems, and this is taken to show that associationist ideas of gradual, error-driven learning are not as limited as earlier critics claimed. The paper frames this as a modest revival rather than a full return to classical views.

It does a clear job describing how feedback signals appear in language models, game agents, and other architectures, and it rightly notes that deep learning relies on network structures and optimization methods that go beyond anything the original associationists had in mind. That keeps the claim from overstating the historical parallel.

The weaker part is the move from computational feasibility in engineered systems to sufficiency for human cognition. Success with backpropagation and large-scale data in silicon does not by itself show that the same mechanisms explain human capacities or that innate structure is unnecessary. The paper offers no specific comparisons to psychological findings, no counterexamples from cognitive development, and no additional theoretical bridge. The inference stays at the level of possibility.

This is the sort of piece that fits a philosophy of mind or cognitive science audience interested in how AI examples bear on old debates. It is coherent on its own terms and engages the literature without internal contradictions, though it is interpretive rather than empirical or formal. I would send it for peer review so the mapping issue can be pressed in detail.

Referee Report

2 major / 0 minor

Summary. The paper claims that supervised learning—error-driven learning from evaluative feedback—underlies a wide range of modern AI systems (LLMs, game agents), differing mainly in the effort needed to produce the feedback signal. This is presented as vindicating a modest associationism for human learning by showing that uniform, gradual, error-driven mechanisms can scale across domains when embedded in sufficiently powerful architectures, while acknowledging that deep learning exceeds classical associationism and that supervised learning is only one component.

Significance. If the mapping from AI computational success to human cognitive mechanisms were secured, the result would contribute to philosophy of mind and cognitive science by offering a computational existence proof that associationist learning can handle complex, cross-domain capacities, thereby weakening claims that such mechanisms are inherently too limited. The paper correctly notes architectural extensions beyond classical views, but its interpretive approach yields no new empirical predictions or formal derivations.

major comments (2)

[Abstract] Abstract and opening sections: the central claim that AI successes 'vindicate associationist ideals' and 'defuse' arguments about the limits of associationist mechanisms requires a bridging argument showing why computational feasibility in engineered systems (with backpropagation and large-scale optimization) supplies evidence that such mechanisms suffice for human cognition. No such justification—empirical, theoretical, or analogical—is supplied; the inference is asserted rather than demonstrated.
[Abstract] The discussion of supervised learning across systems correctly identifies it as one component within richer architectures, yet the paper provides no case studies or counterexamples that would test whether non-associationist mechanisms are required for the capacities exhibited by those systems.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for these comments, which help clarify the scope and evidential basis of our argument. We address each point below and note revisions that will be incorporated in the next version.

Referee: [Abstract] Abstract and opening sections: the central claim that AI successes 'vindicate associationist ideals' and 'defuse' arguments about the limits of associationist mechanisms requires a bridging argument showing why computational feasibility in engineered systems (with backpropagation and large-scale optimization) supplies evidence that such mechanisms suffice for human cognition. No such justification—empirical, theoretical, or analogical—is supplied; the inference is asserted rather than demonstrated.

Authors: We accept that the inferential step from engineered AI systems to human cognition is stated rather than fully elaborated. The manuscript treats modern AI as an existence proof that uniform error-driven learning can support complex, cross-domain performance once embedded in sufficiently rich architectures. To make this explicit, we will revise the abstract and the opening sections to include a short paragraph on the relevant analogy: just as connectionist models were offered in cognitive science as demonstrations that certain learning rules are computationally viable for tasks previously thought to require symbolic mechanisms, the scaling successes of supervised deep learning show that associationist principles are not inherently limited in the way classical critics claimed. This remains an analogical rather than direct empirical claim, but we will flag its status more clearly.

revision: yes

Referee: [Abstract] The discussion of supervised learning across systems correctly identifies it as one component within richer architectures, yet the paper provides no case studies or counterexamples that would test whether non-associationist mechanisms are required for the capacities exhibited by those systems.

Authors: The manuscript already notes that supervised learning operates as one component rather than a complete theory. To respond to the request for more concrete illustration, we will add a brief subsection that examines two systems (transformer-based language models and model-based reinforcement learning agents) to show how the core parameter updates remain error-driven even while auxiliary structures (attention, planning modules) are present. We do not claim this constitutes an exhaustive test that non-associationist mechanisms are never required; the paper's aim is narrower: to show that associationist learning suffices for the scaling that has been achieved. A fuller empirical test would require experiments outside the scope of this philosophical paper.

revision: partial

Circularity Check

0 steps flagged

No circularity: interpretive argument lacks equations, fits, or self-referential derivations

The paper advances a qualitative philosophical claim that supervised learning in modern AI systems supports associationist views of human learning. No equations, parameter fitting, or quantitative predictions appear in the abstract or described structure. The argument interprets existing AI successes (LLMs, game agents) as evidence for uniform error-driven mechanisms without reducing any step to a self-definition, a fitted input renamed as a prediction, or a load-bearing self-citation chain. The mapping from artificial to biological cognition is presented as an inference rather than a formal derivation, so the reasoning is self-contained and no step reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that AI system behaviors can be read as evidence for human learning mechanisms. No free parameters or invented entities are introduced.

axioms (1)

domain assumption: Success of artificial supervised learning systems provides evidence about the sufficiency of associationist mechanisms for human cognition.
Invoked throughout the abstract as the bridge from AI to human learning; no independent empirical test of this mapping is described.

pith-pipeline@v0.9.1-grok · 5658 in / 1266 out tokens · 27104 ms · 2026-06-30T18:30:56.519166+00:00

[1] [1]

Silver, J

Silver, David and Schrittwieser, Julian and Simonyan, Karen and Antonoglou, Ioannis and Huang, Aja and Guez, Arthur and Hubert, Thomas and Baker, Lucas and Lai, Matthew and Bolton, Adrian and Chen, Yutian and Lillicrap, Timothy and Hui, Fan and Sifre, Laurent and van den Driessche, George and Graepel, Thore and Hassabis, Demis , date-added =. Mastering th...

work page doi:10.1038/nature24270

[2] [2]

and Bates, Elizabeth A

Elman, Jeffrey L. and Bates, Elizabeth A. and Johnson, Mark H. and Karmiloff-Smith, Annette and Parisi, Domenico and Plunkett, Kim , date-added =. Rethinking Innateness: A Connectionist Perspective on Development , year =

[3] [3]

, booktitle =

Piantadosi, Steven T. , booktitle =. Modern language models refute

[4] [4]

The Building Blocks of Thought: The Foundations of Human Cognition , year =

Margolis, Eric and Laurence, Stephen , date-added =. The Building Blocks of Thought: The Foundations of Human Cognition , year =

[5] [5]

Findings of the

Warstadt, Alex and Mueller, Aaron and Choshen, Leshem and Wilcox, Ethan and Zhuang, Chengxu and Ciro, Juan and Mosquera, Rafael and Paranjabe, Bhargavi and Williams, Adina and Linzen, Tal and Cotterell, Ryan , booktitle =. Findings of the

[6] [6]

, date-added =

Mitchell, Tom M. , date-added =. The Need for Biases in Learning Generalizations , year =

[7] [7]

, date-added =

Marcus, Gary F. , date-added =. The Birth of the Mind: How a Tiny Number of Genes Creates the Complexities of Human Thought , year =

[8] [8]

arXiv , author =:2501.12948 , primaryclass =

work page internal anchor Pith review Pith/arXiv arXiv

[9] [9]

Mind Children: The Future of Robot and Human Intelligence , year =

Moravec, Hans , date-added =. Mind Children: The Future of Robot and Human Intelligence , year =

[10] [10]

Unified Theories of Cognition , year =

Newell, Allen , date-added =. Unified Theories of Cognition , year =

[11] [11]

, date-added =

Anderson, John R. , date-added =. The Architecture of Cognition , year =

[12] [12]

, date-added =

Anderson, John R. , date-added =. How Can the Human Mind Occur in the Physical Universe? , year =

[13] [13]

, date-added =

Mitchell, Tom M. , date-added =. Generalization as Search , volume =. 1982 , bdsk-url-1 =. doi:10.1016/0004-3702(82)90040-6 , journal =

work page doi:10.1016/0004-3702(82)90040-6 1982

[14] [14]

Ross , date-added =

Quinlan, J. Ross , date-added =. Induction of Decision Trees , volume =. 1986 , bdsk-url-1 =. doi:10.1007/BF00116251 , journal =

work page doi:10.1007/bf00116251 1986

[15] [15]

Ross , date-added =

Quinlan, J. Ross , date-added =. C4.5: Programs for Machine Learning , year =

[16] [16]

Inductive Logic Programming , volume =

Muggleton, Stephen , date-added =. Inductive Logic Programming , volume =. 1991 , bdsk-url-1 =. doi:10.1007/BF03037089 , journal =

work page doi:10.1007/bf03037089 1991

[17] [17]

The Society of Mind , year =

Minsky, Marvin , date-added =. The Society of Mind , year =

[18] [18]

, date-added =

Griffiths, Thomas L. , date-added =. The Laws of Thought: The Quest for a Mathematical Theory of the Mind , year =

[19] [19]

Highly accurate protein structure prediction with

Jumper, John and Evans, Richard and Pritzel, Alexander and Green, Tim and Figurnov, Michael and Ronneberger, Olaf and Tunyasuvunakool, Kathryn and Bates, Russ and. Highly accurate protein structure prediction with. 2021 , bdsk-url-1 =. doi:10.1038/s41586-021-03819-2 , journal =

work page doi:10.1038/s41586-021-03819-2 2021

[20] [20]

, date-added =

Hinton, Geoffrey E. , date-added =. Commencement Address,

[21] [21]

Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1: Foundations , year =

[22] [22]

Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 2: Psychological and Biological Models , year =

[23] [23]

Parallel Distributed Processing: Explorations in the Microstructure of Cognition , year =

[24] [24]

, date-added =

Turing, Alan M. , date-added =. On Computable Numbers, with an Application to the Entscheidungsproblem , volume =. Proceedings of the London Mathematical Society , note =

[25]

Pitt, Leonard and Valiant, Leslie G. Computational Limitations on Learning from Examples. Journal of the ACM.

[26]

Minsky, Marvin and Papert, Seymour. Perceptrons: An Introduction to Computational Geometry.

[27]

Rumelhart, David E. and Hinton, Geoffrey E. and Williams, Ronald J. Learning representations by back-propagating errors. Nature.

[28]

Karpathy, Andrej. Software 2.0. 2017.

[29]

Linnainmaa, Seppo. The representation of the cumulative rounding error of an algorithm as a

[30]

Werbos, Paul J. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences.

[31]

McCulloch, Warren S. and Pitts, Walter. A Logical Calculus of the Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biophysics.

[32]

Mitchell, Tom M. Machine Learning.

[33]

Mitchell, Melanie. Artificial Intelligence: A Guide for Thinking Humans.

[34]

Elman, Jeffrey L. Finding structure in time. Cognitive Science.

[35]

Mikolov, Tomas and Sutskever, Ilya and Chen, Kai and Corrado, Greg S. and Dean, Jeffrey. Distributed Representations of Words and Phrases and their Compositionality.

[36]

Peters, Matthew E. and Neumann, Mark and Iyyer, Mohit and Gardner, Matt and Clark, Christopher and Lee, Kenton and Zettlemoyer, Luke. Deep contextualized word representations.

[37]

Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.

[38]

Radford, Alec and Narasimhan, Karthik and Salimans, Tim and Sutskever, Ilya. Improving Language Understanding by Generative Pre-Training.

[39]

Brown, Tom and Mann, Benjamin and Ryder, Nick and Subbiah, Melanie and Kaplan, Jared D. and Dhariwal, Prafulla and Neelakantan, Arvind and Shyam, Pranav and Sastry, Girish and Askell, Amanda and others. Advances in Neural Information Processing Systems.

[40]

Harris, Zellig S. Distributional structure. Word.

[41]

Kearns, Michael and Valiant, Leslie G. Cryptographic Limitations on Learning Boolean Formulae and Finite Automata. Journal of the ACM.

[42]

Daniely, Amit and Shalev-Shwartz, Shai. Complexity Theoretic Limitations on Learning DNFs.

[43]

Cook, Stephen A. The Complexity of Theorem-Proving Procedures.

[44]

Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia. Attention Is All You Need.

[45]

Hyafil, Laurent and Rivest, Ronald L. Constructing optimal binary decision trees is. Information Processing Letters.

[46]

Masek, William J. Some

[47]

David Marr. Vision. A Computational Investigation into the Human Representation and Processing of Visual Information.

[48]

Griffiths, Thomas L and Lake, Brenden M and McCoy, R Thomas and Pavlick, Ellie and Webb, Taylor W. arXiv preprint arXiv:2508.05776.

[49]

Skinner, B. F. Science and Human Behavior.

[50]

Thorndike, Edward L. Animal Intelligence: Experimental Studies.

[51]

Clark, Andy. Associative engines: Connectionism, concepts, and representational change.

[52]

Direct Preference Optimization: Your Language Model is Secretly a Reward Model. 2024. arXiv:2305.18290

[53]

Hastie, Trevor and Tibshirani, Robert and Friedman, Jerome H. The elements of statistical learning: data mining, inference, and prediction.

[54]

Hornik, Kurt and Stinchcombe, Maxwell and White, Halbert. Multilayer feedforward networks are universal approximators. Neural networks.

[55]

Mandelbaum, Eric and Milli. The

[56]

Clark, Andy. Surfing uncertainty: Prediction, action, and the embodied mind.

[57]

Vong, Wai Keen and Wang, Wentao and Orhan, A Emin and Lake, Brenden M. Grounded language acquisition through the eyes and ears of a single child. Science.

[58]

Rescorla, Robert A. and Wagner, Allan R. A Theory of Pavlovian Conditioning: Variations in the Effectiveness of Reinforcement and Nonreinforcement.

[59]

Rescorla, Robert A. Pavlovian Conditioning: It's Not What You Think It Is.

[60]

Heyes, Cecilia and Chater, Nick and Dwyer, Dominic Michael. Sinking In: The Peripheral Baldwinisation of Human Cognition.

[61]

Rumelhart, David E. and McClelland, James L. and PDP Research Group. Parallel Distributed Processing, Volume 1: Explorations in the Microstructure of Cognition: Foundations. 1986. doi:10.7551/mitpress/5236.001.0001

[62]

Pinker, Steven and Prince, Alan. On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition.

[63]

Sohl-Dickstein, Jascha and Weiss, Eric and Maheswaranathan, Niru and Ganguli, Surya. Deep unsupervised learning using nonequilibrium thermodynamics.

[64]

Daniel A. Weiskopf. The Origins of Concepts. 2008. doi:10.1007/s11098-007-9150-8

[65]

Eric Margolis and Stephen Laurence. Learning Matters: The Role of Learning in Concept Acquisition. 2011. doi:10.1111/j.1468-0017.2011.01429.x

[66] [67]

Propositional Interpretability in Artificial Intelligence. 2025. arXiv:2501.15740

[67] [68]

Dehaene, Stanislas. How we learn: The new science of education and the brain.

[68] [69]

McClelland, James L and Botvinick, Matthew M and Noelle, David C and Plaut, David C and Rogers, Timothy T and Seidenberg, Mark S and Smith, Linda B. Letting structure emerge: connectionist and dynamical systems approaches to cognition. Trends in cognitive sciences.

[69] [70]

Eric Margolis and Stephen Laurence. In Defense of Nativism. 2013. doi:10.1007/s11098-012-9972-x

[70] [71]

Sutton, Richard S and Barto, Andrew G. Reinforcement learning: An introduction.

[71] [72]

Rescorla, Robert A and Solomon, Richard L. Two-process learning theory: Relationships between. Psychological Review.

[72] [73]

Heyes, Cecilia. Cognitive gadgets: The cultural evolution of thinking.

[73] [74]

R. Thomas McCoy and Shunyu Yao and Dan Friedman and Mathew D. Hardy and Thomas L. Griffiths. Embers of autoregression show how large language models are shaped by the problem they are trained to solve. PNAS.

[74] [75]

Christopher M. Bishop. Pattern Recognition and Machine Learning.

[75] [76]

Lillicrap, Timothy P and Santoro, Adam and Marris, Luke and Akerman, Colin J and Hinton, Geoffrey. Backpropagation and the brain. Nature Reviews Neuroscience.

[76] [77]

Shanks, David R. The psychology of associative learning.

[77] [78]

Rothschild, Daniel. Language and thought: The view from LLMs.

[78] [80]

Diederik P. Kingma and Max Welling. Stochastic gradient VB and the variational auto-encoder.

[79] [81]

Ian J. Goodfellow and Jean Pouget. Generative Adversarial Nets. NeurIPS.

[80] [82]

Krizhevsky, Alex and Sutskever, Ilya and Hinton, Geoffrey E. Advances in neural information processing systems.