The New Associationism: Lessons from Deep Learning
Pith reviewed 2026-06-30 18:30 UTC · model grok-4.3
The pith
Supervised learning powers diverse AI systems and revives associationist accounts of cognition.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Supervised learning, which relies on evaluative feedback, underlies a wide range of contemporary AI systems including large language models and game-playing agents. These systems differ mainly in the effort needed to generate feedback signals. This finding supports associationist principles of uniform, gradual, error-driven learning mechanisms that operate across different domains and weakens arguments that such mechanisms cannot explain human cognitive abilities.
What carries the argument
Supervised learning as an error-driven feedback mechanism that operates within advanced computational architectures.
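As a rough illustration of what "error-driven" means here, the following minimal sketch implements the classic delta rule (a least-mean-squares update) on a toy linear problem. It is not taken from the paper: the function names, learning rate, and toy data are assumptions chosen for illustration only. Deep learning systems apply the same basic idea through backpropagation inside far richer architectures.

```python
def delta_rule_step(weights, inputs, target, lr=0.1):
    """One gradual, error-driven update (the classic delta rule)."""
    prediction = sum(w * x for w, x in zip(weights, inputs))
    error = target - prediction  # the evaluative feedback signal
    # Each weight moves a small step in the direction that reduces the error.
    new_weights = [w + lr * error * x for w, x in zip(weights, inputs)]
    return new_weights, error


def train(examples, n_features, epochs=200, lr=0.1):
    """Repeat the same small update over the data; no task-specific rules."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for inputs, target in examples:
            weights, _ = delta_rule_step(weights, inputs, target, lr)
    return weights


# Hypothetical toy data generated by target = 2*x1 - 1*x2.
examples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
weights = train(examples, n_features=2)
print(weights)  # approaches [2.0, -1.0] as training proceeds
```

The learner is never told the generating rule; it only receives the difference between its prediction and the target, and many small corrections accumulate into the right weights.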
If this is right
- Associationist learning mechanisms can handle complex, domain-spanning tasks in AI.
- Human cognitive capacities may be achievable through gradual error correction rather than specialized innate structures.
- Deep learning architectures incorporate classical associationist processes while extending well beyond them.
- AI systems demonstrate that feedback-driven mechanisms once thought too limited can support broad learning.
Where Pith is reading between the lines
- Future research could test whether human learning relies on feedback-generation processes similar to those used in AI.
- Architectural innovations in AI might suggest new ways to model human cognition beyond pure associationism.
- Limitations in current AI could point to missing elements in associationist theories.
Load-bearing premise
The success of AI systems provides evidence about which mechanisms suffice for human cognitive capacities.
What would settle it
Demonstration of a human cognitive task that no supervised learning system with any architecture can perform.
Original abstract
What can the success of modern AI tell us about how humans learn? This paper argues that taking AI seriously as a model of human learning supports a modest but genuine associationism. The central finding is that supervised learning -- learning driven by evaluative feedback -- underlies a surprisingly wide range of contemporary AI systems, from large language models to game-playing agents, differing primarily in how much work is required to generate the relevant feedback signal. This vindicates associationist ideals of a uniform, gradual, error-driven learning mechanism operating across domains, and defuses the once-influential argument that associationist mechanisms are too limited to account for human cognitive capacities. At the same time, the successes of deep learning depend on computational architectures that go well beyond anything classical associationists envisaged, and supervised learning operates within these as one component rather than a complete account of learning.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that supervised learning—error-driven learning from evaluative feedback—underlies a wide range of modern AI systems (LLMs, game agents), differing mainly in the effort needed to produce the feedback signal. This is presented as vindicating a modest associationism for human learning by showing that uniform, gradual, error-driven mechanisms can scale across domains when embedded in sufficiently powerful architectures, while acknowledging that deep learning exceeds classical associationism and that supervised learning is only one component.
Significance. If the mapping from AI computational success to human cognitive mechanisms were secured, the result would contribute to philosophy of mind and cognitive science by offering a computational existence proof that associationist learning can handle complex, cross-domain capacities, thereby weakening claims that such mechanisms are inherently too limited. The paper correctly notes architectural extensions beyond classical views, but its interpretive approach yields no new empirical predictions or formal derivations.
major comments (2)
- [Abstract] Abstract and opening sections: the central claim that AI successes 'vindicate associationist ideals' and 'defuse' arguments about the limits of associationist mechanisms requires a bridging argument showing why computational feasibility in engineered systems (with backpropagation and large-scale optimization) supplies evidence that such mechanisms suffice for human cognition. No such justification—empirical, theoretical, or analogical—is supplied; the inference is asserted rather than demonstrated.
- [Abstract] The discussion of supervised learning across systems correctly identifies it as one component within richer architectures, yet the paper provides no case studies or counterexamples that would test whether non-associationist mechanisms are required for the capacities exhibited by those systems.
Simulated Author's Rebuttal
We thank the referee for these comments, which help clarify the scope and evidential basis of our argument. We address each point below and note revisions that will be incorporated in the next version.
Point-by-point responses
- Referee: [Abstract] Abstract and opening sections: the central claim that AI successes 'vindicate associationist ideals' and 'defuse' arguments about the limits of associationist mechanisms requires a bridging argument showing why computational feasibility in engineered systems (with backpropagation and large-scale optimization) supplies evidence that such mechanisms suffice for human cognition. No such justification (empirical, theoretical, or analogical) is supplied; the inference is asserted rather than demonstrated.
  Authors: We accept that the inferential step from engineered AI systems to human cognition is stated rather than fully elaborated. The manuscript treats modern AI as an existence proof that uniform error-driven learning can support complex, cross-domain performance once embedded in sufficiently rich architectures. To make this explicit, we will revise the abstract and the opening sections to include a short paragraph on the relevant analogy: just as connectionist models were offered in cognitive science as demonstrations that certain learning rules are computationally viable for tasks previously thought to require symbolic mechanisms, the scaling successes of supervised deep learning show that associationist principles are not inherently limited in the way classical critics claimed. This remains an analogical rather than direct empirical claim, but we will flag its status more clearly.
  Revision: yes
- Referee: [Abstract] The discussion of supervised learning across systems correctly identifies it as one component within richer architectures, yet the paper provides no case studies or counterexamples that would test whether non-associationist mechanisms are required for the capacities exhibited by those systems.
  Authors: The manuscript already notes that supervised learning operates as one component rather than a complete theory. To respond to the request for more concrete illustration, we will add a brief subsection that examines two systems (transformer-based language models and model-based reinforcement learning agents) to show how the core parameter updates remain error-driven even while auxiliary structures (attention, planning modules) are present. We do not claim this constitutes an exhaustive test that non-associationist mechanisms are never required; the paper's aim is narrower: to show that associationist learning suffices for the scaling that has been achieved. A fuller empirical test would require experiments outside the scope of this philosophical paper.
  Revision: partial
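The idea that systems share one error-driven update and differ mainly in how the feedback is produced can be sketched with a toy example. Nothing below comes from the paper or from any real system: the one-parameter predictor, the function names, and the data are hypothetical stand-ins, with "next item in a sequence" loosely standing for language-model targets and "result of playing an episode" loosely standing for game-agent targets.

```python
def update(weight, x, target, lr=0.1):
    """Shared error-driven step: move the prediction weight * x toward target."""
    error = target - weight * x
    return weight + lr * error * x


def next_item_feedback(sequence):
    """Feedback that comes free with the data: each item's target is its successor."""
    return list(zip(sequence[:-1], sequence[1:]))


def outcome_feedback(starts, play):
    """Feedback that must be generated by acting: run an episode, record its result."""
    return [(start, play(start)) for start in starts]


def fit(pairs, epochs=500):
    """Apply the same update to whatever (input, target) pairs are supplied."""
    weight = 0.0
    for _ in range(epochs):
        for x, target in pairs:
            weight = update(weight, x, target)
    return weight


# Hypothetical toy data: a doubling sequence, and a "game" whose payoff is 3x the start.
sequence_weight = fit(next_item_feedback([1.0, 2.0, 4.0]))
game_weight = fit(outcome_feedback([1.0, 2.0], play=lambda s: 3.0 * s))
print(sequence_weight, game_weight)  # approach 2.0 and 3.0 respectively
```

Only the two feedback functions differ; `update` and `fit` are identical in both cases, which is the structural point at issue. Whether this carries over to human learning is exactly the inference the referee asks the authors to justify.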
Circularity Check
No circularity: interpretive argument lacks equations, fits, or self-referential derivations
The paper advances a qualitative philosophical claim that supervised learning in modern AI systems supports associationist views of human learning. No equations, parameter fitting, or quantitative predictions appear in the abstract or described structure. The argument interprets existing AI successes (LLMs, game agents) as evidence for uniform error-driven mechanisms without reducing any step to a self-definition, a fitted input renamed as a prediction, or a load-bearing self-citation chain. The mapping from artificial to biological cognition is presented as an inference rather than a formal derivation, so the reasoning is self-contained and no step reduces to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: success of artificial supervised learning systems provides evidence about the sufficiency of associationist mechanisms for human cognition.