Cross-Species RSA Reveals Conserved Early Visual Alignment but Divergent Higher-Area Rankings Across Human fMRI and Macaque Electrophysiology
Pith reviewed 2026-05-22 07:36 UTC · model grok-4.3
The pith
Early visual alignment holds across human fMRI and macaque electrophysiology, but learning-rule rankings diverge in higher areas, where alignment tracks model capacity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using identical model weights, all five learning rules achieve higher alignment with macaque early visual cortex than with human fMRI data, with spike-timing-dependent plasticity and predictive coding leading at V1/V2; at IT, learning-rule rankings show zero correlation across species and a pretrained ResNet-50 substantially outperforms the custom models.
What carries the argument
Cross-species representational similarity analysis (RSA) that compares model activation patterns to brain recordings while holding model weights fixed from the human study.
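As an illustration of the comparison described above, here is a minimal RSA sketch in pure Python. It assumes correlation distance for the dissimilarity matrices and a Spearman correlation between their upper triangles, which is a common choice but is not confirmed here as the authors' exact pipeline. All function names and the toy data are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def rank(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rdm_upper(responses):
    """Upper triangle of the stimulus-by-stimulus dissimilarity matrix.

    responses: one response vector per stimulus (model units or neurons).
    Dissimilarity is assumed to be 1 - Pearson correlation.
    """
    n = len(responses)
    return [1.0 - pearson(responses[i], responses[j])
            for i in range(n) for j in range(i + 1, n)]

def rsa_score(model_acts, neural_resps):
    """Spearman correlation between model and neural dissimilarities."""
    return pearson(rank(rdm_upper(model_acts)), rank(rdm_upper(neural_resps)))

# Toy usage: 6 stimuli, 8 model units, and a noisy "neural" copy.
random.seed(0)
model = [[random.gauss(0, 1) for _ in range(8)] for _ in range(6)]
neural = [[v + random.gauss(0, 0.5) for v in row] for row in model]
print(rsa_score(model, neural))
```

Because both dissimilarity matrices are indexed by stimulus, the model and the recording may have different numbers of units; only the stimulus set must be shared. Holding the model weights fixed means `model_acts` changes across species only through the stimuli presented.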
If this is right
- Alignment in early visual areas remains robust even when switching from human fMRI to macaque electrophysiological recordings.
- STDP and predictive coding, which lead among the trained rules in human V1, also lead in macaque V1/V2.
- Higher-area alignment at IT is more sensitive to overall model capacity and training dataset than to the specific learning rule used.
- The null result on rule rankings at IT is expected: with only five rules, the test has power only at Kendall's tau = ±1.0, and stimulus-set differences confound the comparison further.
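The power limitation with five rules can be checked by enumerating the exact null distribution of Kendall's tau. The sketch below assumes rankings without ties and a two-sided test; the helper names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, permutations

def kendall_tau(x, y):
    """Kendall's tau for two tie-free rankings, as an exact fraction."""
    pairs = list(combinations(range(len(x)), 2))
    s = 0
    for i, j in pairs:
        # +1 for a concordant pair, -1 for a discordant pair
        s += 1 if (x[i] - x[j]) * (y[i] - y[j]) > 0 else -1
    return Fraction(s, len(pairs))

def exact_null(n):
    """Count how often each tau value occurs over all n! permutations."""
    ref = tuple(range(n))
    return Counter(kendall_tau(ref, p) for p in permutations(ref))

def two_sided_p(tau_obs, null):
    """Exact two-sided p-value: share of permutations at least as extreme."""
    total = sum(null.values())
    extreme = sum(c for t, c in null.items() if abs(t) >= abs(tau_obs))
    return Fraction(extreme, total)

null5 = exact_null(5)
for tau in sorted({abs(t) for t in null5}, reverse=True):
    print(float(tau), float(two_sided_p(tau, null5)))
```

With five items there are 120 permutations. Only |tau| = 1 gives a two-sided p below 0.05 (2/120, about 0.017); |tau| = 0.8 already gives 10/120, about 0.083, and tau = 0 gives p = 1, matching the reported value.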
Where Pith is reading between the lines
- Shared early visual alignment suggests that basic feature extraction mechanisms can be modeled once and applied across primate species.
- Higher visual areas may require species-specific stimulus statistics or larger capacity to produce matching rankings.
- Matched-stimulus experiments across species would isolate whether the observed IT divergence reflects true species differences or experimental confounds.
Load-bearing premise
Differences in the stimulus sets used for the human fMRI and macaque electrophysiology experiments do not block meaningful comparison of learning-rule rankings at IT.
What would settle it
Repeating the IT ranking comparison after aligning the exact stimulus sets across species and still obtaining Kendall's tau near zero would confirm that the divergence is not explained by stimuli alone.
Original abstract
Does the relationship between learning rules and brain alignment generalize across species? We extend our prior finding that untrained CNNs match backpropagation at human V1 by testing the same five learning rules against macaque electrophysiology. The rules are backpropagation (BP), feedback alignment (FA), predictive coding (PC), spike-timing-dependent plasticity (STDP), and an untrained random-weights baseline. The macaque data come from two datasets: MajajHong2015 (V4/IT, 3,200 stimulus presentations, 88/168 neurons) and FreemanZiemba2013 (V1/V2, 135 stimuli, 102/103 neurons). Using RSA with identical model weights from our human study, we find: (1) all models achieve higher alignment with macaque early visual cortex (rho = 0.15-0.30 at V1/V2) than with human fMRI (rho = 0.01-0.08), consistent with the higher signal-to-noise ratio of electrophysiology; (2) STDP and PC produce the highest macaque V1/V2 alignment (rho ~ 0.30 and 0.28), consistent with their leading position among trained rules in human V1; (3) at IT, learning rule rankings show no detectable correlation across species (Kendall's tau = 0.00, p = 1.00), though this null result is expected given that n = 5 provides power only at tau = +/-1.0, and is further confounded by stimulus set differences; (4) a pretrained ResNet-50 (ImageNet) achieves rho = 0.25 at macaque IT, substantially above all custom CNN conditions (rho = 0.07-0.14), suggesting IT alignment is limited by model capacity and training data rather than by the learning rule. Noise ceilings, multi-seed variability (5 seeds), and a stimulus-control analysis are reported. These results demonstrate that early visual alignment is robust across species, while higher-area alignment is modulated by model capacity and stimulus domain.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript extends prior RSA findings on learning-rule alignment with human V1 by evaluating the same five rules (backpropagation, feedback alignment, predictive coding, STDP, and random weights) against macaque electrophysiological recordings from MajajHong2015 (V4/IT) and FreemanZiemba2013 (V1/V2) using identical model weights. It reports higher early-area alignment in macaque (rho 0.15-0.30 at V1/V2) than human fMRI, consistent top rankings for STDP and PC, a null cross-species correlation at IT (Kendall tau = 0.00, p = 1.00) attributed to low power (n=5) and stimulus-set differences, and superior IT performance by a pretrained ResNet-50 (rho = 0.25) over custom models (rho 0.07-0.14). Noise ceilings, 5-seed variability, and a stimulus-control analysis are included as controls.
Significance. If the documented controls hold, the work indicates that early visual alignment is robust across species and measurement modalities despite SNR differences, while IT alignment is more strongly modulated by model capacity and stimulus domain than by learning rule. Credit is due for reusing identical weights, reporting noise ceilings, multi-seed checks, and performing a stimulus-control analysis, all of which support direct cross-species comparison and reproducibility.
major comments (2)
- [Abstract and Results (IT section)] The null result for learning-rule rankings at IT (Kendall's tau = 0.00, p = 1.00) is presented as expected given n = 5 (power only for |tau| = 1.0) and stimulus confounds; because this null underpins the claim of divergent higher-area rankings, a quantitative power analysis or bootstrap simulation of detectable effect sizes should be added to confirm the null is not merely an artifact of insufficient power.
- [Methods (stimulus-control analysis)] The stimulus-control analysis is invoked to address stimulus-set differences between the human fMRI and macaque experiments, yet without explicit details on the matching procedure or subset used, it is hard to evaluate whether this control adequately supports interpreting the IT null as reflecting species or domain differences rather than stimulus mismatch.
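One way to approach the requested power analysis is a Monte Carlo simulation, sketched below. This is a swapped-in illustration, not the authors' procedure: it assumes the five rules' alignment scores in the two species are bivariate normal with a latent correlation `r`, and it rejects the null only when |tau| = 1, which is the only outcome significant at alpha = 0.05 with five tie-free items. The function and parameter names are hypothetical.

```python
import math
import random
from itertools import combinations

N_RULES = 5  # the rejection rule below is specific to five items

def kendall_tau(x, y):
    """Kendall's tau for two score vectors (ties have probability zero here)."""
    pairs = list(combinations(range(len(x)), 2))
    s = sum(1 if (x[i] - x[j]) * (y[i] - y[j]) > 0 else -1 for i, j in pairs)
    return s / len(pairs)

def simulate_power(r, n_sims=4000, seed=0):
    """Share of simulated studies in which the tau test rejects at alpha = 0.05.

    r is the assumed latent correlation between species-level scores.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        human = [rng.gauss(0, 1) for _ in range(N_RULES)]
        # Macaque scores share a fraction r of the human signal.
        macaque = [r * h + math.sqrt(1 - r * r) * rng.gauss(0, 1) for h in human]
        # With n = 5, only a perfect (anti-)ordering reaches p < 0.05.
        if abs(kendall_tau(human, macaque)) == 1.0:
            hits += 1
    return hits / n_sims

for r in (0.0, 0.5, 0.9, 0.99):
    print(r, simulate_power(r))
```

Sweeping `r` shows how strong the underlying cross-species agreement would need to be before a five-rule comparison detects it reliably. Under `r = 0` the rejection rate should sit near the exact false-positive rate of 2/120.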
minor comments (3)
- [Abstract] Report the precise neuron counts and stimulus presentation numbers separately for V1/V2 and V4/IT rather than aggregated ranges to improve precision.
- [Results] Clarify whether the reported rho ranges for early visual cortex combine the two macaque datasets or are shown per dataset.
- [Figure legends] Indicate the number of random seeds (5) and how variability is visualized (e.g., error bars) for all alignment plots.
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the manuscript and for the constructive suggestions. We address each major comment below and agree to incorporate the requested additions and clarifications in a revised version.
- Referee: [Abstract and Results (IT section)] The null result for learning-rule rankings at IT (Kendall's tau = 0.00, p = 1.00) is presented as expected given n = 5 (power only for |tau| = 1.0) and stimulus confounds; because this null underpins the claim of divergent higher-area rankings, a quantitative power analysis or bootstrap simulation of detectable effect sizes should be added to confirm the null is not merely an artifact of insufficient power.
  Authors: We agree that a quantitative power analysis would strengthen the interpretation of the null result at IT. Although the manuscript already notes that n = 5 provides power only for |tau| = 1.0, we will add a bootstrap simulation in the revised Results section. This simulation will resample from the observed model-brain RSA correlations (across the 5 seeds) to estimate the sampling distribution of Kendall's tau and report the minimum detectable effect size at 80% power. This addition will provide explicit quantitative support for our claim that the observed tau = 0.00 is consistent with low power rather than an artifact.
  revision: yes
- Referee: [Methods (stimulus-control analysis)] The stimulus-control analysis is invoked to address stimulus-set differences between the human fMRI and macaque experiments, yet without explicit details on the matching procedure or subset used, it is hard to evaluate whether this control adequately supports interpreting the IT null as reflecting species or domain differences rather than stimulus mismatch.
  Authors: We thank the referee for highlighting the need for greater detail. The stimulus-control analysis selected a subset of images from the MajajHong2015 set that matched the FreemanZiemba2013 stimuli on semantic category and low-level image statistics (mean luminance, RMS contrast, and spatial frequency content). The matched subset contained 112 images. We will expand the Methods section to describe the exact matching criteria, the size of the retained subset, and the RSA correlations obtained on this controlled stimulus set, allowing readers to directly evaluate the adequacy of the control.
  revision: yes
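To make the matching idea in the second response concrete, here is a minimal sketch of filtering images by low-level statistics. It is an assumption-laden illustration rather than the authors' procedure: images are grayscale nested lists with values in [0, 1], RMS contrast is taken as the standard deviation of pixel intensities (definitions vary), the tolerances are arbitrary, and semantic category and spatial frequency content are omitted. All names are hypothetical.

```python
import math

def mean_luminance(img):
    """Average pixel intensity of a grayscale image (nested lists)."""
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)

def rms_contrast(img):
    """Standard deviation of pixel intensities (one common RMS definition)."""
    pixels = [p for row in img for p in row]
    mu = sum(pixels) / len(pixels)
    return math.sqrt(sum((p - mu) ** 2 for p in pixels) / len(pixels))

def match_subset(candidates, targets, lum_tol=0.05, con_tol=0.05):
    """Keep candidate images whose statistics are close to some target image.

    candidates: dict mapping image name -> image
    targets: list of reference images
    """
    target_stats = [(mean_luminance(t), rms_contrast(t)) for t in targets]
    kept = []
    for name, img in candidates.items():
        lum, con = mean_luminance(img), rms_contrast(img)
        if any(abs(lum - tl) <= lum_tol and abs(con - tc) <= con_tol
               for tl, tc in target_stats):
            kept.append(name)
    return kept

# Toy usage: a flat gray patch matches the flat target; a checkerboard does not.
flat = [[0.5, 0.5], [0.5, 0.5]]
checker = [[0.0, 1.0], [1.0, 0.0]]
print(match_subset({"flat": flat, "checker": checker}, [flat]))
```

Reporting the tolerances and the size of the retained subset alongside the RSA scores on that subset is what lets a reader judge whether the control is adequate.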
Circularity Check
No significant circularity; analysis uses independent macaque recordings on prior model weights
The paper applies RSA to evaluate the same five learning-rule models (with identical weights from the authors' prior human fMRI study) against independent macaque electrophysiological datasets from MajajHong2015 and FreemanZiemba2013. Alignment scores, species comparisons, and IT null results are computed directly from the new neural recordings and stimulus presentations rather than being derived from or forced by the prior human equations. Reported controls including noise ceilings, multi-seed runs, and stimulus-control analysis further ground the claims empirically without self-referential reduction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Representational similarity analysis (RSA) provides a reliable measure of alignment between model layer activations and neural population responses.
- domain assumption: The same CNN weights trained or initialized under each learning rule can be directly compared across species without species-specific retraining.
Reference graph
Works this paper leans on
- [1] Bi, G.-q. and Poo, M.-m. (1998). Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type. J. Neurosci., 18:10464–10472.
- [2] Freeman, J., Ziemba, C. M., Heeger, D. J., Simoncelli, E. P., and Movshon, J. A. (2013). A functional and perceptual signature of the second visual area in primates. Nature Neuroscience, 16:974–981.
- [3] He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. CVPR, pp. 770–778.
- [4] Leutenegger, N. (2026). Untrained CNNs match backpropagation at V1: A systematic RSA comparison of four learning rules against human fMRI. arXiv:2604.16875v2.
- [5] Lillicrap, T. P., Cownden, D., Tweed, D. B., and Akerman, C. J. (2016). Random synaptic feedback weights support error backpropagation for deep learning. Nature Communications, 7:13276.
- [6] Majaj, N. J., Hong, H., Solomon, E. A., and DiCarlo, J. J. (2015). Simple learned weighted sums of inferior temporal neuronal firing rates accurately predict human core object recognition performance. J. Neurosci., 35:13402–13418.
- [7] Schrimpf, M., Kubilius, J., Hong, H., et al. (2020). Brain-Score: Which artificial neural network for object recognition is most brain-like? bioRxiv.
- [8] Whittington, J. C. R. and Bogacz, R. (2017). An approximation of the error backpropagation algorithm in a predictive coding network with local Hebbian synaptic plasticity. Neural Computation, 29:1229–1262.
- [9] Yamins, D. L. K. and DiCarlo, J. J. (2016). Using goal-driven deep learning models to understand sensory cortex. Nature Neuroscience, 19:356–365.