Misalignment Between Backpropagation and the Hierarchy of Brain Responses to Images

Huy V. Vo; Jean-Rémi King; Jérémy Rapin; Joséphine Raugel; Marc Szafraniec; Maximilian Seitzer; Patrick Labatut; Piotr Bojanowski; Valentin Wyart

arxiv: 2605.28693 · v1 · pith:ARHWLD6Knew · submitted 2026-05-27 · 🧬 q-bio.NC · cs.AI

Pith reviewed 2026-06-29 08:57 UTC · model grok-4.3

keywords: backpropagation · fMRI · MEG · visual cortex · neural encoding · deep learning · brain hierarchy · self-supervised models

The pith

Backpropagated gradients from vision models predict human brain responses in higher visual cortex and later latencies, yet their spatial layout and temporal order diverge from brain hierarchies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether backpropagation could operate in the brain by extending encoding models to map backpropagated gradients from pretrained vision networks onto fMRI and MEG recordings of human responses to natural images. Using DINOv3 and eight other models, the gradients successfully predict neural signals, particularly in higher-level visual cortex and at later response times. Despite this match in predictive power, the gradients fail to follow the expected spatial progression across cortical areas or the expected temporal progression across latencies. The results indicate that networks and brains may arrive at similar representations through different learning processes.
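To make "backpropagated gradients" concrete, here is a minimal sketch of how per-layer gradient features could be obtained. It assumes the gradient is taken with respect to each layer's activations, which this page does not confirm, and it uses a toy two-layer tanh network with a squared-error loss rather than DINOv3 or its training objective. All names (`forward`, `activation_gradients`, `loss_from_h1`) are hypothetical.

```python
import numpy as np

def forward(x, W1, W2):
    """Tiny two-layer network; returns the activations of both layers."""
    h1 = np.tanh(x @ W1)
    h2 = np.tanh(h1 @ W2)
    return h1, h2

def loss_from_h1(h1, W2, target):
    """Squared-error loss written as a function of first-layer activations."""
    h2 = np.tanh(h1 @ W2)
    return 0.5 * np.sum((h2 - target) ** 2)

def activation_gradients(x, W1, W2, target):
    """Backpropagate the loss to each layer's activations (not to the weights)."""
    h1, h2 = forward(x, W1, W2)
    g2 = h2 - target                      # dL/dh2 for the squared-error loss
    g1 = (g2 * (1.0 - h2 ** 2)) @ W2.T    # dL/dh1 via the chain rule through tanh
    return g1, g2

# Toy inputs: 3 "images" with 4 features each (synthetic, for illustration only).
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4))
W1 = rng.standard_normal((4, 5))
W2 = rng.standard_normal((5, 2))
target = rng.standard_normal((3, 2))

g1, g2 = activation_gradients(x, W1, W2, target)
```

Each row of `g1` or `g2` is one stimulus, so these arrays have the same layout as forward activations and can be passed to the same encoding analysis.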

Core claim

Although backpropagated gradients can reliably predict both fMRI and MEG signals, specifically in higher-level visual cortex and for later latencies, the spatial and temporal organization of these backpropagated gradients diverges from the temporal and spatial hierarchies of the human brain.

What carries the argument

Extension of standard encoding analyses to map backpropagated gradients onto neural data from fMRI and MEG recordings.
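A standard encoding analysis can be sketched as a regularized linear regression from model features to brain responses, scored by correlation on held-out stimuli. The sketch below is an assumption about the general shape of such an analysis, not the authors' pipeline: the closed-form ridge solver, the single train/test split, the penalty `alpha=1.0`, and the synthetic data are all illustrative choices, and the function names are hypothetical.

```python
import numpy as np

def ridge_fit(X, Y, alpha):
    """Closed-form ridge regression: W = (X^T X + alpha I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)

def pearson_per_target(Y_true, Y_pred):
    """Pearson correlation computed separately for each voxel or sensor."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom

def encoding_scores(features, brain, n_train, alpha=1.0):
    """Fit on the first n_train stimuli and score on the remaining ones."""
    X_tr, X_te = features[:n_train], features[n_train:]
    Y_tr, Y_te = brain[:n_train], brain[n_train:]
    # Center with training statistics only, so no test data leaks into the fit.
    x_mean, y_mean = X_tr.mean(axis=0), Y_tr.mean(axis=0)
    W = ridge_fit(X_tr - x_mean, Y_tr - y_mean, alpha)
    Y_hat = (X_te - x_mean) @ W + y_mean
    return pearson_per_target(Y_te, Y_hat)

# Synthetic stand-ins: 200 stimuli, 16 gradient features, 5 voxels or sensors.
rng = np.random.default_rng(0)
gradient_features = rng.standard_normal((200, 16))
true_weights = rng.standard_normal((16, 5))
noise = 0.1 * rng.standard_normal((200, 5))
brain_responses = gradient_features @ true_weights + noise

scores = encoding_scores(gradient_features, brain_responses, n_train=150)
```

Swapping `gradient_features` for forward activations of the same layer gives the comparison the review describes: same regression, same score, different feature source.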

If this is right

Backpropagated gradients predict fMRI and MEG signals reliably in higher-level visual cortex.
Backpropagated gradients predict signals for later latencies.
The spatial organization of gradients diverges from the cortical hierarchy.
The temporal ordering of gradients diverges from the latency hierarchy.
Deep networks and the brain likely rely on different mechanisms to learn similar representations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Alternative learning rules without a strict backward pass might better reproduce the observed brain hierarchies.
Training networks with explicit constraints on gradient ordering could test whether alignment with brain data becomes possible.
Extending the same gradient-mapping approach to other modalities could show whether the mismatch is vision-specific.
Hybrid learning algorithms could blend backpropagation with hierarchy-respecting update rules to reduce the divergence.

Load-bearing premise

A biologically plausible implementation of backpropagation in the brain would necessarily produce gradients whose spatial layout and temporal ordering match the observed cortical and latency hierarchies.

What would settle it

Finding a vision model in which the spatial progression of backpropagated gradients across visual cortex and their temporal ordering across latencies exactly reproduce the hierarchies measured in fMRI and MEG data.

Original abstract

Backpropagation is the core learning mechanism underlying deep learning. However, whether and how this algorithm is implemented in the brain remains highly debated. In particular, while forward activations of pretrained models reliably map onto the cortical hierarchy of visual processing, it is unknown whether backpropagated gradients exhibit a similar correspondence. Here, we address this question using functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG) recordings of human brain responses to natural images. For this, we extend standard encoding analyses of forward activations to map backpropagated gradients onto neural data. Focusing on a recent self-supervised vision model (DINOv3) and reproducing results on eight vision models, we find that backpropagated gradients can reliably predict both fMRI and MEG signals, specifically in higher-level visual cortex and for later latencies. However, the spatial and temporal organization of these backpropagated gradients in the brain diverges from the patterns expected under a biologically plausible backpropagation mechanism: specifically, both the order in which gradients are computed and their spatial organization diverge from the temporal and spatial hierarchies of the human brain. Together, these results suggest that, although deep networks and the brain may share similar representational content, they likely rely on fundamentally different mechanisms to learn those representations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Gradients from vision models predict brain signals in higher visual areas and at later latencies, but their spatial and temporal organization diverges from the cortical hierarchies. The claim of fundamentally different mechanisms rests on an unargued assumption.

The main thing here is that backpropagated gradients can be mapped onto fMRI and MEG responses, especially higher up in visual cortex and at later latencies, yet the order and layout of those gradients do not follow the brain's known spatial and temporal hierarchies the way forward activations do.

They took the standard encoding setup used for activations and applied it to gradients, then reproduced the pattern across DINOv3 and eight other models. That check across models is straightforward and helpful. Running the same analysis on both fMRI and MEG adds a bit of cross-modal support for the dissociation between successful prediction and mismatched organization. The comparison stays direct, with no extra parameters fitted to the brain data.

The softer part is the interpretive step. The abstract states that the gradient organization diverges from what would be expected under a biologically plausible backpropagation mechanism. But the provided text does not spell out or cite why any brain version of backprop would be required to produce gradients whose spatial layout and computation order match the forward ventral-stream hierarchy. If error signals could travel through different routes or with different timing while still implementing credit assignment, the observed mismatch would not necessarily rule out backprop. Without that link made explicit, the evidence against backprop as a brain learning rule is less decisive than the conclusion suggests.

This is the sort of paper that would interest people working on whether deep learning rules can explain brain learning or on building more brain-like networks. A reader already following that debate would get a clear empirical dissociation to think about.

I would send it to peer review. The core observation is new and the reproduction work is there, so referees can check the stats, the exact divergence measure, and whether the assumption about expected gradient organization holds up.

Referee Report

3 major / 2 minor

Summary. The paper extends standard encoding analyses from forward activations to backpropagated gradients of pretrained vision models (DINOv3 and eight others). It reports that these gradients reliably predict fMRI and MEG responses, particularly in higher visual cortex and at later latencies, yet finds that the spatial layout of the gradients and the order of their computation diverge from the ventral-stream and latency hierarchies observed for forward activations and brain responses. The authors conclude that deep networks and brains likely rely on fundamentally different mechanisms to learn similar representations.

Significance. If the reported divergence is shown to be robust under appropriate statistical controls and if the interpretive step linking divergence to implausibility of backprop is justified, the work would supply a new empirical constraint on biological implementations of credit assignment. The use of multiple models and two imaging modalities (fMRI/MEG) strengthens the empirical basis; however, the central claim rests on an unargued assumption about what a brain-like backprop implementation must look like.

major comments (3)

[Methods] Methods (gradient-to-brain mapping procedure): the manuscript provides no explicit definition or formula for the 'divergence metric' used to quantify misalignment between gradient organization and brain hierarchies; without this, it is impossible to evaluate whether the reported spatial and temporal divergences are statistically reliable or merely descriptive.
[Results] Results (prediction analyses): the abstract and main text state that gradients 'reliably predict' signals in higher cortex and later latencies, yet no details are given on multiple-comparison correction, family-wise error control, or cross-validation procedures across the eight models and multiple brain regions/latencies; these controls are load-bearing for the claim that gradients predict above chance in a hierarchy-specific manner.
[Discussion] Discussion (interpretation of divergence): the conclusion that the observed misalignment implies 'fundamentally different mechanisms' treats the lack of alignment between gradient maps and forward cortical/latency hierarchies as diagnostic against backprop; no derivation, citation, or formal argument is supplied showing that any biologically plausible implementation of backpropagation (e.g., with delayed error propagation or alternative feedback wiring) would be required to reproduce the forward hierarchy.

minor comments (2)

[Methods] The choice of DINOv3 and the eight additional models is listed as a free parameter; a brief justification for this model set (or sensitivity analysis) would improve reproducibility.
[Figures] Figure legends should explicitly state the number of subjects, number of images, and exact latency windows used for each MEG analysis.
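The multiple-comparison control requested in the second major comment could take several forms. One common choice is the Benjamini-Hochberg step-up procedure, sketched below in plain Python. Note that it controls the false discovery rate, not the family-wise error rate, and nothing on this page confirms which procedure the paper used. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the null is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest 1-based rank k with p_(k) <= k * q / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff = rank
    # Reject every hypothesis whose rank is at or below that cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected

# Made-up p-values, e.g. one per region-by-latency test.
example_p = [0.039, 0.001, 0.216, 0.008]
example_rejected = benjamini_hochberg(example_p, q=0.05)
```

In a setting with many models, regions, and latencies, the family of tests passed to the function has to be chosen deliberately, since correcting within each model separately is more lenient than correcting across all of them at once.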

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight areas where the manuscript can be strengthened for clarity and rigor. We address each major comment below and commit to revisions that incorporate explicit definitions, statistical details, and expanded interpretive discussion.

Referee: [Methods] Methods (gradient-to-brain mapping procedure): the manuscript provides no explicit definition or formula for the 'divergence metric' used to quantify misalignment between gradient organization and brain hierarchies; without this, it is impossible to evaluate whether the reported spatial and temporal divergences are statistically reliable or merely descriptive.

Authors: We agree that an explicit definition and formula for the divergence metric is essential for evaluation and reproducibility. The metric compares the ordering of gradient-based predictions against the known ventral stream hierarchy (for spatial) and latency hierarchy (for temporal) using rank correlations or similar measures. In the revised manuscript, we will add a dedicated Methods subsection with the precise mathematical definition, including any normalization or statistical testing applied to the divergence scores.
revision: yes

Referee: [Results] Results (prediction analyses): the abstract and main text state that gradients 'reliably predict' signals in higher cortex and later latencies, yet no details are given on multiple-comparison correction, family-wise error control, or cross-validation procedures across the eight models and multiple brain regions/latencies; these controls are load-bearing for the claim that gradients predict above chance in a hierarchy-specific manner.

Authors: The referee is correct that these procedural details were insufficiently specified. Our analyses used 10-fold cross-validation per model and region/latency, with FDR correction across tests. We will revise the Results and Methods sections to fully document the cross-validation scheme, multiple-comparison procedures (including any family-wise error controls), and how significance was assessed across the eight models.
revision: yes

Referee: [Discussion] Discussion (interpretation of divergence): the conclusion that the observed misalignment implies 'fundamentally different mechanisms' treats the lack of alignment between gradient maps and forward cortical/latency hierarchies as diagnostic against backprop; no derivation, citation, or formal argument is supplied showing that any biologically plausible implementation of backpropagation (e.g., with delayed error propagation or alternative feedback wiring) would be required to reproduce the forward hierarchy.

Authors: This comment correctly identifies that the link between observed divergence and the implausibility of backprop-like mechanisms rests on an implicit assumption that requires more explicit justification. We will expand the Discussion to include a short formal argument (with citations to the credit-assignment literature) clarifying why standard or plausible variants of backprop would be expected to produce gradient hierarchies aligned with forward activation hierarchies, while acknowledging that highly non-standard implementations might differ. This will strengthen rather than alter the central claim.
revision: yes
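The first response mentions "rank correlations or similar measures" without a formula. One way such a hierarchy-alignment score could be defined is a Spearman correlation between the rank of each brain region in the visual hierarchy and the depth of the model layer that best predicts it. The sketch below is an assumption, not the paper's metric; the region ordering and the layer indices are made-up toy values, and the function names are hypothetical.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy example: four regions ordered from early to late visual cortex.
region_rank = [1, 2, 3, 4]
best_layer_forward = [2, 5, 9, 11]     # deeper layers for later regions
best_layer_gradient = [10, 11, 9, 10]  # no clear ordering

alignment_forward = spearman(region_rank, best_layer_forward)
alignment_gradient = spearman(region_rank, best_layer_gradient)
```

Under this definition a score near 1 means the model's layer order follows the cortical order, and a score near 0 or below means it does not. The same function applies to the temporal case by replacing region rank with response latency.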

Circularity Check

0 steps flagged

No circularity: direct empirical comparison of model gradients to independent brain recordings

The paper performs standard encoding analyses that map both forward activations and backpropagated gradients from pretrained vision models onto fMRI and MEG recordings. These are independent neural datasets; no parameters are fitted to the target brain signals in a way that forces the reported spatial/temporal divergence, no self-citations supply load-bearing uniqueness theorems, and no equations define one quantity in terms of another by construction. The central claim rests on observed mismatches between gradient maps and cortical/latency hierarchies, which are falsifiable against the external recordings rather than tautological.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The paper relies on standard linear encoding assumptions from prior neuroimaging work and on the selection of particular vision models; no new free parameters are introduced to produce the misalignment result.

free parameters (1)

choice of vision models including DINOv3
The set of eight models plus DINOv3 determines the scope of the claimed generality of the misalignment.

axioms (1)

domain assumption: Linear mapping between model features/gradients and BOLD/MEG signals is sufficient to test correspondence.
The extension of encoding analyses rests on this standard assumption in systems neuroscience.

pith-pipeline@v0.9.1-grok · 5799 in / 1291 out tokens · 26121 ms · 2026-06-29T08:57:55.854481+00:00 · methodology

[1] [1]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION format.date year duplicate empty "emp...

[2] [2]

Allen, Ghislain St-Yves, Yihan Wu, Jesse L

Emily J. Allen, Ghislain St-Yves, Yihan Wu, Jesse L. Breedlove, Jacob S. Prince, Logan T. Dowdle, Matthias Nau, Brad Caron, Franco Pestilli, Ian Charest, J. Benjamin Hutchinson, Thomas Naselaris, and Kendrick Kay. A massive 7t fmri dataset to bridge cognitive neuroscience and artificial intelligence. Nature Neuroscience, 25 0 (1): 0 116–126, January 2022....

work page doi:10.1038/s41593-021-00962-x 2022

[3] [3]

Bastos, W

Andre M. Bastos, W. Martin Usrey, Rick A. Adams, George R. Mangun, Pascal Fries, and Karl J. Friston. Canonical microcircuits for predictive coding. Neuron, 76 0 (4): 0 695–711, November 2012. ISSN 0896-6273. doi:10.1016/j.neuron.2012.10.038. http://dx.doi.org/10.1016/j.neuron.2012.10.038

work page doi:10.1016/j.neuron.2012.10.038 2012

[4] [4]

Resolving human object recognition in space and time

Radoslaw Martin Cichy, Dimitrios Pantazis, and Aude Oliva. Resolving human object recognition in space and time. Nature neuroscience, 17 0 (3): 0 455--462, 2014

2014

[5] [5]

How does the brain solve visual object recognition? Neuron, 73 0 (3): 0 415--434, 2012

James J DiCarlo, Davide Zoccolan, and Nicole C Rust. How does the brain solve visual object recognition? Neuron, 73 0 (3): 0 415--434, 2012

2012

[6] [6]

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale, 2020. https://arxiv.org/abs/2010.11929

work page internal anchor Pith review Pith/arXiv arXiv 2020

[7] [7]

Predictive coding under the free-energy principle

Karl Friston and Stefan Kiebel. Predictive coding under the free-energy principle. Philosophical Transactions of the Royal Society B: Biological Sciences, 364 0 (1521): 0 1211–1221, May 2009. ISSN 1471-2970. doi:10.1098/rstb.2008.0300. http://dx.doi.org/10.1098/rstb.2008.0300

work page doi:10.1098/rstb.2008.0300 2009

[8] [8]

Meg and eeg data analysis with mne-python

Alexandre Gramfort, Martin Luessi, Eric Larson, Denis A Engemann, Daniel Strohmeier, Christian Brodbeck, Roman Goj, Mainak Jas, Teon Brooks, Lauri Parkkonen, et al. Meg and eeg data analysis with mne-python. Frontiers in Neuroinformatics, 7: 0 267, 2013

2013

[9] [9]

Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream

U. Guclu and M. A. J. van Gerven. Deep neural networks reveal a gradient in the complexity of neural representations across the ventral stream. Journal of Neuroscience, 35(27): 10005–10014, July 2015. ISSN 1529-2401. doi:10.1523/jneurosci.5023-14.2015. http://dx.doi.org/10.1523/JNEUROSCI.5023-14.2015

work page doi:10.1523/jneurosci.5023-14.2015 2015

[10] [10]

Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions

N. Halko, P. G. Martinsson, and J. A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2): 217–288, January 2011. ISSN 1095-7200. doi:10.1137/090771806. http://dx.doi.org/10.1137/090771806

work page doi:10.1137/090771806 2011

[11] [11]

Deep Residual Learning for Image Recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778. IEEE, 2016. doi:10.1109/cvpr.2016.90. http://dx.doi.org/10.1109/CVPR.2016.90

work page doi:10.1109/cvpr.2016.90 2016

[12] [12]

Masked autoencoders are scalable vision learners

Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollar, and Ross Girshick. Masked autoencoders are scalable vision learners. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 15979–15988. IEEE, 2022. doi:10.1109/cvpr52688.2022.01553. http://dx.doi.org/10.1109/CVPR52688.2022.01553

work page doi:10.1109/cvpr52688.2022.01553 2022

[13] [13]

things-meg

Martin N. Hebart, Oliver Contier, Lina Teichmann, Adam H. Rockter, Charles Zheng, Alexis Kidder, Anna Corriveau, Maryam Vaziri-Pashkam, and Chris I. Baker. "things-meg", 2023

2023

[14] [14]

A hierarchy of linguistic predictions during natural language comprehension

Micha Heilbron, Kristijan Armeni, Jan-Mathijs Schoffelen, Peter Hagoort, and Floris P De Lange. A hierarchy of linguistic predictions during natural language comprehension. Proceedings of the National Academy of Sciences, 119(32): e2201968119, 2022

2022

[15] [15]

Deep supervised, but not unsupervised, models may explain IT cortical representation

Seyed-Mahdi Khaligh-Razavi and Nikolaus Kriegeskorte. Deep supervised, but not unsupervised, models may explain IT cortical representation. PLoS Computational Biology, 10(11): e1003915, November 2014. ISSN 1553-7358. doi:10.1371/journal.pcbi.1003915. http://dx.doi.org/10.1371/journal.pcbi.1003915

work page doi:10.1371/journal.pcbi.1003915 2014

[16] [16]

J-R. King, C. Bel, L. Evanson, J. Gadonneix, S. Houhamdi, J. Lévy, J. Raugel, A. Santos Revilla, M. Zhang, J. Bonnaire, C. Caucheteux, A. Défossez, T. Desbordes, P. Diego-Simón, S. Khanna, J. Millet, P. Orhan, S. Panchavati, A. Ratouchniak, A. Thual, T. Brooks, K. Begany, Y. Benchetrit, M. Careil, H. Banville, S. d'Ascoli, S. Dahan, and J. Rap...

2026

[17] [17]

Backpropagation and the brain

Timothy P Lillicrap, Adam Santoro, Luke Marris, Colin J Akerman, and Geoffrey E Hinton. Backpropagation and the brain. Nature Reviews Neuroscience, 21: 335–346, 2020

2020

[18] [18]

A ConvNet for the 2020s

Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, and Saining Xie. A ConvNet for the 2020s. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 11966–11976. IEEE, 2022. doi:10.1109/cvpr52688.2022.01167. http://dx.doi.org/10.1109/CVPR52688.2022.01167

work page doi:10.1109/cvpr52688.2022.01167 2022

[19] [19]

Toward a realistic model of speech processing in the brain with self-supervised learning

Juliette Millet, Charlotte Caucheteux, Pierre Orhan, Yves Boubenec, Alexandre Gramfort, Ewan Dunbar, Christophe Pallier, and Jean-Remi King. Toward a realistic model of speech processing in the brain with self-supervised learning, 2023. https://arxiv.org/abs/2206.01685

work page arXiv 2023

[20] [20]

Simple Open-Vocabulary Object Detection

Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, and Neil Houlsby. Simple Open-Vocabulary Object Detection, page 728–755. Springer Nature Switzerland, 2022. ISBN 9783031200809. doi:10.1007/978-3-031...

work page doi:10.1007/978-3-031-20080-9_42 2022

[21] [21]

Encoding and decoding in fMRI

Thomas Naselaris, Kendrick N Kay, Shinji Nishimoto, and Jack L Gallant. Encoding and decoding in fMRI. NeuroImage, 56(2): 400–410, 2011

2011

[22] [22]

Representation Learning with Contrastive Predictive Coding

Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[23] [23]

Scikit-learn: Machine learning in Python

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12: 2825–2830, 2011

2011

[24] [24]

Learning Transferable Visual Models From Natural Language Supervision

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision, 2021. https://arxiv.org/abs/2103.00020

work page internal anchor Pith review Pith/arXiv arXiv 2021

[25] [25]

Disentangling the factors of convergence between brains and computer vision models

Joséphine Raugel, Marc Szafraniec, Huy V. Vo, Camille Couprie, Patrick Labatut, Piotr Bojanowski, Valentin Wyart, and Jean-Rémi King. Disentangling the factors of convergence between brains and computer vision models, 2025. https://arxiv.org/abs/2508.18226

work page arXiv 2025

[26] [26]

Learning representations by back-propagating errors

David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. Learning representations by back-propagating errors. Nature, 323(6088): 533–536, October 1986. ISSN 1476-4687. doi:10.1038/323533a0. http://dx.doi.org/10.1038/323533a0

work page doi:10.1038/323533a0 1986

[27] [27]

Brain-Score: Which artificial neural network for object recognition is most brain-like?

Martin Schrimpf, Jonas Kubilius, Ha Hong, Najib J Majaj, Rishi Rajalingham, Elias B Issa, Kohitij Kar, Pouya Bashivan, Jonathan Prescott-Roy, Franziska Geiger, et al. Brain-Score: Which artificial neural network for object recognition is most brain-like? bioRxiv, page 407007, 2018

2018

[28] [28]

Oriane Siméoni, Huy V. Vo, Maximilian Seitzer, Federico Baldassarre, Maxime Oquab, Cijo Jose, Vasil Khalidov, Marc Szafraniec, Seungeun Yi, Michaël Ramamonjisoa, Francisco Massa, Daniel Haziza, Luca Wehrstedt, Jianyuan Wang, Timothée Darcet, Théo Moutakanni, Leonel Sentana, Claire Roberts, Andrea Vedaldi, Jamie Tolan, John Brandt, Camille Couprie, Julien ...

work page internal anchor Pith review Pith/arXiv arXiv 2025

[29] [29]

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Michael Tschannen, Alexey Gritsenko, Xiao Wang, Muhammad Ferjad Naeem, Ibrahim Alabdulmohsin, Nikhil Parthasarathy, Talfan Evans, Lucas Beyer, Ye Xia, Basil Mustafa, Olivier Hénaff, Jeremiah Harmsen, Andreas Steiner, and Xiaohua Zhai. Siglip 2: Multilingual vision-language encoders with improved semantic understanding, localization, and dense features, 20...

work page internal anchor Pith review Pith/arXiv arXiv 2025

[30] [30]

Better models of human high-level visual cortex emerge from natural language supervision with a large and diverse dataset

Aria Y. Wang, Kendrick Kay, Thomas Naselaris, Michael J. Tarr, and Leila Wehbe. Better models of human high-level visual cortex emerge from natural language supervision with a large and diverse dataset. Nature Machine Intelligence, 5(12): 1415–1426, November 2023. ISSN 2522-5839. doi:10.1038/s42256-023-00753-y. http://dx.doi.org/10.1038/s42256-023-00753-y

work page doi:10.1038/s42256-023-00753-y 2023

[31] [31]

James C. R. Whittington and Rafal Bogacz. An approximation of the error backpropagation algorithm in a predictive coding network with local Hebbian synaptic plasticity. Neural Computation, 29(5): 1229–1262, May 2017. ISSN 1530-888X. doi:10.1162/neco_a_00949. http://dx.doi.org/10.1162/NECO_a_00949

work page doi:10.1162/neco_a_00949 2017

[32] [32]

Daniel L. K. Yamins, Ha Hong, Charles F. Cadieu, Ethan A. Solomon, Darren Seibert, and James J. DiCarlo. Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proceedings of the National Academy of Sciences, 111(23): 8619–8624, May 2014. ISSN 1091-6490. doi:10.1073/pnas.1403112111. http://dx.doi.org/10.1073/pnas....

work page doi:10.1073/pnas.1403112111 2014

[33] [33]

The temporal paradox of Hebbian learning and homeostatic plasticity

Friedemann Zenke, Wulfram Gerstner, and Surya Ganguli. The temporal paradox of Hebbian learning and homeostatic plasticity. Current Opinion in Neurobiology, 43: 166–176, 2017

2017

[34] [34]

Unsupervised neural network models of the ventral visual stream

Chengxu Zhuang, Siming Yan, Aran Nayebi, Martin Schrimpf, Michael C. Frank, James J. DiCarlo, and Daniel L. K. Yamins. Unsupervised neural network models of the ventral visual stream. Proceedings of the National Academy of Sciences, 118(3), January 2021. ISSN 1091-6490. doi:10.1073/pnas.2014196118. http://dx.doi.org/10.1073/pnas.2014196118

work page doi:10.1073/pnas.2014196118 2021
