Meta-learning as a principle for human-like visual representations

Alireza Modirshanechi; Can Demircan; Eric Schulz; Marcel Binz

arxiv: 2606.28399 · v1 · pith:UF43PFBInew · submitted 2026-06-24 · 💻 cs.CV · cs.LG· q-bio.NC

Meta-learning as a principle for human-like visual representations

Can Demircan , Marcel Binz , Alireza Modirshanechi , Eric Schulz This is my paper

Pith reviewed 2026-06-30 01:25 UTC · model grok-4.3

classification 💻 cs.CV cs.LGq-bio.NC

keywords meta-learningvisual representationshuman similarity judgementsvisual cortexsemantic rule learningsequence modeltask distributions

0 comments

The pith

Meta-learning across thousands of tasks yields visual representations that better predict human similarity judgments and brain activity than standard pretraining.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes that human visual representations gain flexibility because they are shaped by meta-learning pressure to support rapid acquisition of new tasks. To test this, a sequence model is trained without any human data across thousands of semantically rich image-to-concept mapping tasks. The resulting representations outperform their pretrained base encoders at predicting human similarity judgments, semantic rule learning performance, and alignment with high-level visual cortex responses. Behavioural improvements depend on using disentangled high-level task distributions, while brain alignment stems mainly from the learning-to-learn component itself.

Core claim

Training a sequence model across thousands of semantically rich tasks mapping images to high-level concepts, without supervision from human data, produces representations that better predict human similarity judgements, semantic rule learning, and high-level visual cortex than their pretrained base encoders.

What carries the argument

Meta-learning across a distribution of tasks that shapes representations to acquire new semantic relationships from few observations.

If this is right

Meta-learned representations better predict human similarity judgements than pretrained encoders.
Meta-learned representations support improved semantic rule learning compared to pretrained encoders.
Meta-learned representations show stronger alignment with high-level visual cortex responses.
Behavioural gains require disentangled, high-level task distributions.
Brain alignment is driven primarily by the learning-to-learn pressure rather than task content alone.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Standard single-objective pretraining misses the open-ended adaptability that human vision exhibits.
Meta-learning could be applied to other modalities to test whether similar pressures produce human-like representations.
The approach highlights a possible reason current vision models struggle with rapid concept acquisition in new domains.

Load-bearing premise

The flexibility of human visual representations arises from meta-learning pressure that shapes them to acquire new tasks from few observations.

What would settle it

If a new set of human similarity judgments or brain scans shows the meta-learned model performing no better or worse than the base encoder, the central claim would be falsified.

Figures

Figures reproduced from arXiv: 2606.28399 by Alireza Modirshanechi, Can Demircan, Eric Schulz, Marcel Binz.

**Figure 1.** Figure 1: Overview of the meta-learning framework. A. Semantically rich tasks are generated at scale using Sparse Autoencoders (SAEs) applied to a pretrained CLIP model. Each SAE latent defines a task over images: images with non-zero activations for a given latent are positive examples, with activation magnitude serving as the graded target. Example tasks are shown with images from the THINGSplus database 31, displ… view at source ↗

**Figure 2.** Figure 2: Meta-learning over semantically rich tasks aligns visual representations with human behaviour. A. Behavioural alignment with human odd-one-out choices on the THINGS dataset, quantified by McFadden’s R2 based on the negative log-likelihood (NLL) of human choices (R2 = 1: perfect prediction; R2 = 0: chance-level). B–D. Behavioural alignment on the Levels dataset across different hierarchical depths, includin… view at source ↗

**Figure 3.** Figure 3: Meta-learning outperforms multitask learning on the same task distribution. Comparison of meta-learned representations (purple) against a non-sequential multitask model (pink) trained on the same task distribution but without episodic structure, processing each image independently via per-task linear heads. The multitask model receives the same image-task supervision as the meta-learner, isolating the cont… view at source ↗

**Figure 4.** Figure 4: Disentangled and high-level tasks drive human alignment. A. Visualisations of the three task distributions used for meta-learning: high-level disentangled tasks (SAE latents from CLIP Layer 11; left), highlevel entangled tasks (raw residual stream from CLIP Layer 11; middle), and mid-level disentangled tasks (SAE latents from CLIP Layer 6; right). Examples show images that maximally and minimally activate… view at source ↗

**Figure 5.** Figure 5: Meta-learning improves alignment with the human visual cortex. A. Whole-brain visualisation for Participant 1 showing the difference in noise-ceiling corrected predictive power (R2 meta-learned − R2 base) using the SigLIP2 backbone. Brighter colours indicate regions where meta-learning leads to a better fit of the fMRI data. B. Group-level predictive power (R2 ) for base (red) and meta-learned (purple) Si… view at source ↗

**Figure 6.** Figure 6: Meta-learning reorganises representational geometry to mirror human conceptual structure. A–C. t-SNE visualisations of THINGS images for (A) the SigLIP2 base model, (B) the meta-learned model, and (C) the 66-dimensional SPoSE human similarity embedding derived from the THINGS odd-one-out dataset. Meta-learning exaggerates the global separation between organic (e.g., animals, plants) and inorganic categori… view at source ↗

read the original abstract

The structure of human visual representations underpins our capacity for adaptive behaviour. While pretrained neural networks model human visual representations with unprecedented success, a large discrepancy remains. We propose one reason: these networks optimise a single fixed objective, whereas human representations must support open-ended tasks. We hypothesise this flexibility arises from meta-learning (learning to learn), a pressure shaping representations to acquire new tasks from few observations. To test this, we train a sequence model, without any supervision from human data, across thousands of semantically rich tasks mapping images to high-level concepts. Compared to their pretrained base encoders, meta-learned representations better predict human similarity judgements, semantic rule learning, and high-level visual cortex. Behavioural gains depend on disentangled, high-level task distributions, while brain alignment is driven primarily by the learning-to-learn pressure. Our results suggest the flexibility of human visual representations reflects the functional demand to learn new semantic relationships on the fly.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Meta-learning on thousands of image-to-concept tasks improves alignment with human judgments and brain data over base encoders, but the design leaves open whether the gains require the sequential meta-learning mechanism or just multi-task exposure to rich distributions.

read the letter

The core result is that a sequence model trained across many unsupervised tasks mapping images to concepts ends up closer to human similarity judgments, semantic rule learning, and high-level visual cortex responses than the original pretrained encoders. Behavioral improvements track the use of disentangled high-level tasks, while brain alignment tracks the learning-to-learn aspect.

What stands out is the scale and the lack of human supervision in the training loop. The setup tests whether pressure to acquire new tasks quickly can shape representations that match human flexibility, and it reports measurable gains on three external benchmarks. That is a concrete empirical step beyond standard pretraining comparisons.

The soft spot is the missing control the stress-test flags. The abstract claims brain alignment comes from the learning-to-learn pressure and behavior from task distributions, yet it gives no sign of a joint multi-task baseline on the identical task set without the sequential structure. If that baseline already produces most of the gains, the specific claim about meta-learning pressure as the driver weakens. The paper would be stronger with that comparison shown explicitly.

The work is aimed at researchers building computational accounts of visual representation that link AI training regimes to human data. A reader already working on meta-learning or on bridging models to neuroscience data will find the empirical test useful even if the isolation of the mechanism needs tightening.

It deserves peer review. The idea is grounded enough and the benchmarks are external, so referees can check the controls and statistics directly.

Referee Report

2 major / 2 minor

Summary. The paper claims that the flexibility of human visual representations arises from meta-learning pressure (learning to learn) that enables rapid acquisition of new tasks, rather than optimization of a single fixed objective. To test this, the authors train a sequence model without human supervision across thousands of semantically rich tasks that map images to high-level concepts. They report that the resulting meta-learned representations outperform their pretrained base encoders in predicting human similarity judgments, semantic rule learning performance, and alignment with high-level visual cortex activity. Behavioral gains are attributed to disentangled high-level task distributions, while brain alignment is attributed primarily to the learning-to-learn pressure.

Significance. If the central results hold after isolating the meta-learning mechanism, the work would offer a functional account of why human visual representations support open-ended adaptation, bridging computational neuroscience and machine learning. The unsupervised training regime on diverse tasks and the direct comparison to base encoders are strengths that could inform both model design and interpretations of cortical representations.

major comments (2)

[Methods] Methods section (task training and controls): The central claim attributes superior alignment to meta-learning pressure specifically, yet the manuscript provides no control condition that trains on the identical task set using a joint multi-task objective without the sequential meta-learning structure. Without this ablation, gains cannot be distinguished from benefits of multi-task exposure to disentangled semantic distributions, which directly undermines the load-bearing hypothesis that flexibility arises from the learning-to-learn mechanism.
[Results] Results on brain alignment: The abstract states that brain alignment is driven primarily by learning-to-learn pressure while behavioral gains depend on task distributions, but the manuscript does not report the statistical interaction or ablation that would support separating these two effects; this separation is required to sustain the differential attribution.

minor comments (2)

[Methods] The description of the sequence model architecture and how task sequences are constructed should include explicit pseudocode or a diagram to clarify the distinction between the meta-learning objective and standard multi-task training.
[Figures] Figure captions for the human benchmark comparisons should report exact sample sizes, statistical tests, and effect sizes alongside the qualitative statements of improvement.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and will revise the manuscript accordingly to strengthen the isolation of the meta-learning mechanism.

read point-by-point responses

Referee: [Methods] Methods section (task training and controls): The central claim attributes superior alignment to meta-learning pressure specifically, yet the manuscript provides no control condition that trains on the identical task set using a joint multi-task objective without the sequential meta-learning structure. Without this ablation, gains cannot be distinguished from benefits of multi-task exposure to disentangled semantic distributions, which directly undermines the load-bearing hypothesis that flexibility arises from the learning-to-learn mechanism.

Authors: We agree that the absence of a joint multi-task baseline on the identical task set limits the ability to isolate the sequential meta-learning structure from multi-task exposure. The current comparisons are to base encoders trained on standard objectives, but this does not fully address the referee's concern. We will add the requested joint multi-task control condition in the revised manuscript. revision: yes
Referee: [Results] Results on brain alignment: The abstract states that brain alignment is driven primarily by learning-to-learn pressure while behavioral gains depend on task distributions, but the manuscript does not report the statistical interaction or ablation that would support separating these two effects; this separation is required to sustain the differential attribution.

Authors: The differential attribution draws from observed patterns across experiments (behavioral sensitivity to task disentanglement versus brain alignment gains tied to the meta-learning procedure). We acknowledge that a formal statistical interaction test and supporting ablations are not currently reported. We will include these analyses in the revised manuscript to substantiate the separation of effects. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained against external benchmarks

full rationale

The paper trains sequence models on independent task distributions (images to high-level concepts) without human supervision, then compares meta-learned representations to pretrained encoders on separate human similarity judgements, rule learning tasks, and fMRI data. No equations or claims reduce a reported gain to a fitted parameter, self-citation chain, or definitional equivalence. Behavioural and brain-alignment results are presented as empirical outcomes of the training procedure against external data, satisfying the default expectation of non-circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review performed on abstract only; full details on training hyperparameters, task construction, and model architecture unavailable. No free parameters, axioms, or invented entities can be extracted beyond the central hypothesis.

axioms (1)

domain assumption Meta-learning pressure shapes representations to support few-shot acquisition of new semantic tasks
Core hypothesis stated in the abstract; not derived from prior results within the provided text.

pith-pipeline@v0.9.1-grok · 5700 in / 1158 out tokens · 34523 ms · 2026-06-30T01:25:55.685199+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

64 extracted references · 13 canonical work pages · 5 internal anchors

[1]

Deep neural networks: a new framework for modeling biological vision and brain information processing.Annual review of vision science1, 417–446 (2015)

Kriegeskorte, N. Deep neural networks: a new framework for modeling biological vision and brain information processing.Annual review of vision science1, 417–446 (2015)

2015
[2]

URLhttps://openreview.net/forum?id=Hiq7lUh4Yn

Sucholutsky, I.et al.Getting aligned on representational alignment.Transactions on Machine Learn- ing Research(2025). URLhttps://openreview.net/forum?id=Hiq7lUh4Yn

2025
[3]

Yamins, D. L. K.et al.Performance-optimized hierarchical models predict neural re- sponses in higher visual cortex.Proceedings of the National Academy of Sciences111, 8619–8624 (2014). URLhttps://www.pnas.org/doi/abs/10.1073/pnas.1403112111. eprint: https://www.pnas.org/doi/pdf/10.1073/pnas.1403112111

work page doi:10.1073/pnas.1403112111 2014
[4]

Schrimpf, M.et al.Brain-score: Which artificial neural network for object recognition is most brain- like?BioRxiv407007 (2018)

2018
[5]

& Griffiths, T

Sucholutsky, I. & Griffiths, T. L. Alignment with human representations supports robust few-shot learning. InThirty-seventh Conference on Neural Information Processing Systems(2023). URL https://openreview.net/forum?id=HYGnmSLBCf

2023
[6]

Muttenthaler, L., Linhardt, L., Dippel, J., Vandermeulen, R. A. & Kornblith, S. Human alignment of neural network representations. InSVRHM 2022 Workshop @ NeurIPS(2022). URLhttps: //openreview.net/forum?id=b2DmQYY-XY

2022
[7]

S., Kay, K

Conwell, C., Prince, J. S., Kay, K. N., Alvarez, G. A. & Konkle, T. A large-scale examination of induc- tive biases shaping high-level visual representation in brains and machines.Nature Communications 15, 9383 (2024). URLhttps://www.nature.com/articles/s41467-024-53147-y

2024
[8]

Demircan, C.et al.Evaluating alignment between humans and neural network representations in image-based learning tasks.Advances in Neural Information Processing Systems37, 122406–122433 (2024)

2024
[9]

Nature647, 349–355 (2025)

Muttenthaler, L.et al.Aligning machine and human visual representations across abstraction levels. Nature647, 349–355 (2025)

2025
[10]

FEL, T., Rodriguez, I. F. R., Linsley, D. & Serre, T. Harmonizing the object recognition strategies of deep neural networks with humans. In Oh, A. H., Agarwal, A., Belgrave, D. & Cho, K. (eds.) Advances in Neural Information Processing Systems(2022). URLhttps://openreview.net/forum? id=ZYKWi6Ylfg

2022
[11]

InThirty-seventh Conference on Neural Information Processing Systems(2023)

Fu, S.et al.DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data. InThirty-seventh Conference on Neural Information Processing Systems(2023). URLhttps: //openreview.net/forum?id=DEiNSfh1k7

2023
[12]

URLhttps: //openreview.net/forum?id=NmlnmLYMZ4

Sundaram, S.et al.When does perceptual alignment benefit vision representations? InThe Thirty-eighth Annual Conference on Neural Information Processing Systems(2024). URLhttps: //openreview.net/forum?id=NmlnmLYMZ4

2024
[13]

Roads, B. D. & Love, B. C. Enriching imagenet with human similarity judgments and psychological embeddings. InProceedings of the ieee/cvf conference on computer vision and pattern recognition, 3547–3557 (2021)

2021
[14]

InThirty-seventh Conference on Neural Information Processing Systems(2023)

Muttenthaler, L.et al.Improving neural network representations using human similarity judgments. InThirty-seventh Conference on Neural Information Processing Systems(2023). URLhttps:// openreview.net/forum?id=Nh5dp6Uuvx

2023
[15]

Binz, M.et al.Meta-learned models of cognition.Behavioral and Brain Sciences47, e147 (2024)

2024
[16]

L.et al.Doing more with less: meta-reasoning and meta-learning in humans and machines.Current Opinion in Behavioral Sciences29, 24–30 (2019)

Griffiths, T. L.et al.Doing more with less: meta-reasoning and meta-learning in humans and machines.Current Opinion in Behavioral Sciences29, 24–30 (2019)

2019
[17]

X.et al.Prefrontal cortex as a meta-reinforcement learning system.Nature neuroscience 21, 860–868 (2018)

Wang, J. X.et al.Prefrontal cortex as a meta-reinforcement learning system.Nature neuroscience 21, 860–868 (2018)

2018
[18]

Lake, B. M. & Baroni, M. Human-like systematic generalization through a meta-learning neural network.Nature623, 115–121 (2023)

2023
[19]

K., Coda-Forno, J., Thalmann, M., Schulz, E

Jagadish, A. K., Coda-Forno, J., Thalmann, M., Schulz, E. & Binz, M. Human-like category learn- ing by injecting ecological priors from large language models into neural networks.arXiv preprint arXiv:2402.01821(2024)

work page arXiv 2024
[20]

Meta-learning ecological priors from large language models explains human learning and decision making

Jagadish, A. K., Thalmann, M., Coda-Forno, J., Binz, M. & Schulz, E. Meta-learning ecological priors from large language models explains human learning and decision making.arXiv preprint arXiv:2509.00116(2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025
[21]

Sparse Autoencoders Find Highly Interpretable Features in Language Models

Cunningham, H., Ewart, A., Riggs, L., Huben, R. & Sharkey, L. Sparse autoencoders find highly interpretable features in language models.arXiv preprint arXiv:2309.08600(2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023
[22]

Bricken, T.et al.Towards Monosemanticity: Decomposing Language Models With Dictionary Learn- ing.Transformer Circuits Thread(2023)

2023
[23]

Elhage, N.et al.Toy Models of Superposition.Transformer Circuits Thread(2022)

2022
[24]

& Torralba, A

Bau, D., Zhou, B., Khosla, A., Oliva, A. & Torralba, A. Network dissection: Quantifying inter- pretability of deep visual representations. InProceedings of the IEEE conference on computer vision and pattern recognition, 6541–6549 (2017)

2017
[25]

Tschannen, M.et al.Siglip 2: Multilingual vision-language encoders with improved semantic under- standing, localization, and dense features.arXiv preprint arXiv:2502.14786(2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025
[26]

InProceedings of the IEEE/CVF International Conference on Computer Vision, 370–382 (2025)

Fan, D.et al.Scaling language-free visual representation learning. InProceedings of the IEEE/CVF International Conference on Computer Vision, 370–382 (2025)

2025
[27]

InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 16000–16009 (2022)

He, K.et al.Masked autoencoders are scalable vision learners. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 16000–16009 (2022)

2022
[28]

Sim´ eoni, O.et al.Dinov3.arXiv preprint arXiv:2508.10104(2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025
[29]

org/10.5281/zenodo.5143773

Ilharco, G.et al.OpenCLIP (2021). URLhttps://doi.org/10.5281/zenodo.5143773

work page doi:10.5281/zenodo.5143773 2021
[30]

InInter- national conference on machine learning, 8748–8763 (PmLR, 2021)

Radford, A.et al.Learning transferable visual models from natural language supervision. InInter- national conference on machine learning, 8748–8763 (PmLR, 2021)

2021
[31]

M., Perkuhn, J

Stoinski, L. M., Perkuhn, J. & Hebart, M. N. THINGSplus: New norms and metadata for the THINGS database of 1854 object concepts and 26,107 natural object images.Behavior Research Methods56, 1583–1603 (2024)

2024
[32]

N., Zheng, C

Hebart, M. N., Zheng, C. Y., Pereira, F. & Baker, C. I. Revealing the multidimensional mental representations of natural objects underlying human similarity judgements.Nature human behaviour 4, 1173–1185 (2020)

2020
[33]

N.et al.THINGS-data, a multimodal collection of large-scale datasets for investigating object representations in human brain and behavior.Elife12, e82580 (2023)

Hebart, M. N.et al.THINGS-data, a multimodal collection of large-scale datasets for investigating object representations in human brain and behavior.Elife12, e82580 (2023)

2023
[34]

Muttenthaler, L.et al.The Levels Dataset (2024)

2024
[35]

Multitask learning.Machine learning28, 41–75 (1997)

Caruana, R. Multitask learning.Machine learning28, 41–75 (1997)

1997
[36]

& Hinton, G

Van der Maaten, L. & Hinton, G. Visualizing data using t-SNE.Journal of machine learning research 9(2008)

2008
[37]

& Hinton, G

Kornblith, S., Norouzi, M., Lee, H. & Hinton, G. Similarity of neural network representations revis- ited. InInternational conference on machine learning, 3519–3529 (PMlR, 2019)

2019
[38]

Kubilius, J.et al.Brain-like object recognition with high-performing shallow recurrent ANNs.Ad- vances in neural information processing systems32(2019)

2019
[39]

C.et al.Recurrence is required to capture the representational dynam- ics of the human visual system.Proceedings of the National Academy of Sciences116, 21854–21863 (2019)

Kietzmann, T. C.et al.Recurrence is required to capture the representational dynam- ics of the human visual system.Proceedings of the National Academy of Sciences116, 21854–21863 (2019). URLhttps://www.pnas.org/doi/abs/10.1073/pnas.1905544116. eprint: https://www.pnas.org/doi/pdf/10.1073/pnas.1905544116

work page doi:10.1073/pnas.1905544116 2019
[40]

K., Wang, W., Orhan, A

Vong, W. K., Wang, W., Orhan, A. E. & Lake, B. M. Grounded language acquisition through the eyes and ears of a single child.Science383, 504–511 (2024). URLhttps://www.science.org/doi/abs/ 10.1126/science.adi1374. eprint: https://www.science.org/doi/pdf/10.1126/science.adi1374

work page doi:10.1126/science.adi1374 2024
[41]

J., Jones, E

Mehrer, J., Spoerer, C. J., Jones, E. C., Kriegeskorte, N. & Kietzmann, T. C. An ecologically motivated image dataset for deep learning yields better models of human vision.Proceedings of the National Academy of Sciences118, e2011417118 (2021). URLhttps://www.pnas.org/doi/abs/10. 1073/pnas.2011417118. eprint: https://www.pnas.org/doi/pdf/10.1073/pnas.2011417118

work page doi:10.1073/pnas.2011417118 2021
[42]

& Alvarez, G

Konkle, T. & Alvarez, G. A. A self-supervised domain-general learning framework for human ventral stream representation.Nature Communications13, 491 (2022). URLhttps://doi.org/10.1038/ s41467-022-28091-4

2022
[43]

& Richards, B

Bakhtiari, S., Mineault, P., Lillicrap, T., Pack, C. & Richards, B. The functional special- ization of visual cortex emerges from training parallel pathways with self-supervised predictive learning. In Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P. S. & Vaughan, J. W. (eds.)Advances in Neural Information Processing Systems, vol. 34, 25164–25178 (Cur...

2021
[44]

J., Yamins, D

Tang, Y., Gokce, A., Al-Karkari, K. J., Yamins, D. & Schrimpf, M. Diverse perceptual representations across visual pathways emerge from a single objective.bioRxiv2025–07 (2025)

2025
[45]

Dasgupta, I., Schulz, E., Tenenbaum, J. B. & Gershman, S. J. A theory of learning to infer.Psycho- logical review127, 412 (2020)

2020
[46]

J., Schulz, E

Binz, M., Gershman, S. J., Schulz, E. & Endres, D. Heuristics from bounded meta-learned inference. Psychological review129, 1042 (2022)

2022
[47]

T., Grant, E., Smolensky, P., Griffiths, T

McCoy, R. T., Grant, E., Smolensky, P., Griffiths, T. L. & Linzen, T. Universal linguistic inductive biases via meta-learning.arXiv preprint arXiv:2006.16324(2020)

work page arXiv 2006
[48]

ViT Prisma: A Mechanistic Interpretability Library for Vision Transformers (2023)

Joseph, S. ViT Prisma: A Mechanistic Interpretability Library for Vision Transformers (2023). URL https://github.com/soniajoseph/vit-prisma. Publication Title: GitHub repository

2023
[49]

URLhttps://arxiv.org/abs/2504.19475

Joseph, S.et al.Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video (2025). URLhttps://arxiv.org/abs/2504.19475. eprint: 2504.19475

work page arXiv 2025
[50]

InThe Thirteenth International Conference on Learning Representations(2025)

Gao, L.et al.Scaling and evaluating sparse autoencoders. InThe Thirteenth International Conference on Learning Representations(2025). URLhttps://openreview.net/forum?id=tcsZt9ZNKD

2025
[51]

In2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255 (2009)

Deng, J.et al.ImageNet: A large-scale hierarchical image database. In2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255 (2009)

2009
[52]

InEuropean conference on computer vision, 740–755 (Springer, 2014)

Lin, T.-Y.et al.Microsoft coco: Common objects in context. InEuropean conference on computer vision, 740–755 (Springer, 2014)

2014
[53]

Vaswani, A.et al.Attention is all you need.Advances in neural information processing systems30 (2017)

2017
[54]

Su, J.et al.Roformer: Enhanced transformer with rotary position embedding.Neurocomputing568, 127063 (2024)

2024
[55]

Layer Normalization

Ba, J. L., Kiros, J. R. & Hinton, G. E. Layer normalization.arXiv preprint arXiv:1607.06450(2016)

work page internal anchor Pith review Pith/arXiv arXiv 2016
[56]

& Schmidhuber, J

Hochreiter, S. & Schmidhuber, J. Long short-term memory.Neural computation9, 1735–1780 (1997)

1997
[57]

& Cipolla, R

Kendall, A., Gal, Y. & Cipolla, R. Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. InProceedings of the IEEE conference on computer vision and pattern recognition, 7482–7491 (2018)

2018
[58]

Defazio, A.et al.The road less scheduled.Advances in Neural Information Processing Systems37, 9974–10007 (2024)

2024
[59]

PyTorch Image Models (2019)

Wightman, R. PyTorch Image Models (2019). URLhttps://github.com/rwightman/ pytorch-image-models. Publication Title: GitHub repository

2019
[60]

Conditional logit analysis of qualitative choice behavior (1972)

McFadden, D. Conditional logit analysis of qualitative choice behavior (1972)

1972
[61]

N.et al.THINGS: A database of 1,854 object concepts and more than 26,000 naturalistic object images.PloS one14, e0223792 (2019)

Hebart, M. N.et al.THINGS: A database of 1,854 object concepts and more than 26,000 naturalistic object images.PloS one14, e0223792 (2019)

2019
[62]

InProceedings of the Annual Meeting of the Cognitive Science Society, vol

Demircan, C.et al.Decision-making with naturalistic options. InProceedings of the Annual Meeting of the Cognitive Science Society, vol. 44 (2022)

2022
[63]

E., Friston, K

Rigoux, L., Stephan, K. E., Friston, K. J. & Daunizeau, J. Bayesian model selection for group studies—revisited.Neuroimage84, 971–985 (2014)

2014
[64]

& Norouzi, M

Kornblith, S., Chen, T., Lee, H. & Norouzi, M. Why do better loss functions lead to less transferable features?Advances in Neural Information Processing Systems34, 28648–28662 (2021). Supplementary Information Meta-learning as a principle for human-like visual representations 1 Robustness to architectural variations We investigated whether the emergence o...

2021

[1] [1]

Deep neural networks: a new framework for modeling biological vision and brain information processing.Annual review of vision science1, 417–446 (2015)

Kriegeskorte, N. Deep neural networks: a new framework for modeling biological vision and brain information processing.Annual review of vision science1, 417–446 (2015)

2015

[2] [2]

URLhttps://openreview.net/forum?id=Hiq7lUh4Yn

Sucholutsky, I.et al.Getting aligned on representational alignment.Transactions on Machine Learn- ing Research(2025). URLhttps://openreview.net/forum?id=Hiq7lUh4Yn

2025

[3] [3]

Yamins, D. L. K.et al.Performance-optimized hierarchical models predict neural re- sponses in higher visual cortex.Proceedings of the National Academy of Sciences111, 8619–8624 (2014). URLhttps://www.pnas.org/doi/abs/10.1073/pnas.1403112111. eprint: https://www.pnas.org/doi/pdf/10.1073/pnas.1403112111

work page doi:10.1073/pnas.1403112111 2014

[4] [4]

Schrimpf, M.et al.Brain-score: Which artificial neural network for object recognition is most brain- like?BioRxiv407007 (2018)

2018

[5] [5]

& Griffiths, T

Sucholutsky, I. & Griffiths, T. L. Alignment with human representations supports robust few-shot learning. InThirty-seventh Conference on Neural Information Processing Systems(2023). URL https://openreview.net/forum?id=HYGnmSLBCf

2023

[6] [6]

Muttenthaler, L., Linhardt, L., Dippel, J., Vandermeulen, R. A. & Kornblith, S. Human alignment of neural network representations. InSVRHM 2022 Workshop @ NeurIPS(2022). URLhttps: //openreview.net/forum?id=b2DmQYY-XY

2022

[7] [7]

S., Kay, K

Conwell, C., Prince, J. S., Kay, K. N., Alvarez, G. A. & Konkle, T. A large-scale examination of induc- tive biases shaping high-level visual representation in brains and machines.Nature Communications 15, 9383 (2024). URLhttps://www.nature.com/articles/s41467-024-53147-y

2024

[8] [8]

Demircan, C.et al.Evaluating alignment between humans and neural network representations in image-based learning tasks.Advances in Neural Information Processing Systems37, 122406–122433 (2024)

2024

[9] [9]

Nature647, 349–355 (2025)

Muttenthaler, L.et al.Aligning machine and human visual representations across abstraction levels. Nature647, 349–355 (2025)

2025

[10] [10]

FEL, T., Rodriguez, I. F. R., Linsley, D. & Serre, T. Harmonizing the object recognition strategies of deep neural networks with humans. In Oh, A. H., Agarwal, A., Belgrave, D. & Cho, K. (eds.) Advances in Neural Information Processing Systems(2022). URLhttps://openreview.net/forum? id=ZYKWi6Ylfg

2022

[11] [11]

InThirty-seventh Conference on Neural Information Processing Systems(2023)

Fu, S.et al.DreamSim: Learning New Dimensions of Human Visual Similarity using Synthetic Data. InThirty-seventh Conference on Neural Information Processing Systems(2023). URLhttps: //openreview.net/forum?id=DEiNSfh1k7

2023

[12] [12]

URLhttps: //openreview.net/forum?id=NmlnmLYMZ4

Sundaram, S.et al.When does perceptual alignment benefit vision representations? InThe Thirty-eighth Annual Conference on Neural Information Processing Systems(2024). URLhttps: //openreview.net/forum?id=NmlnmLYMZ4

2024

[13] [13]

Roads, B. D. & Love, B. C. Enriching imagenet with human similarity judgments and psychological embeddings. InProceedings of the ieee/cvf conference on computer vision and pattern recognition, 3547–3557 (2021)

2021

[14] [14]

InThirty-seventh Conference on Neural Information Processing Systems(2023)

Muttenthaler, L.et al.Improving neural network representations using human similarity judgments. InThirty-seventh Conference on Neural Information Processing Systems(2023). URLhttps:// openreview.net/forum?id=Nh5dp6Uuvx

2023

[15] [15]

Binz, M.et al.Meta-learned models of cognition.Behavioral and Brain Sciences47, e147 (2024)

2024

[16] [16]

L.et al.Doing more with less: meta-reasoning and meta-learning in humans and machines.Current Opinion in Behavioral Sciences29, 24–30 (2019)

Griffiths, T. L.et al.Doing more with less: meta-reasoning and meta-learning in humans and machines.Current Opinion in Behavioral Sciences29, 24–30 (2019)

2019

[17] [17]

X.et al.Prefrontal cortex as a meta-reinforcement learning system.Nature neuroscience 21, 860–868 (2018)

Wang, J. X.et al.Prefrontal cortex as a meta-reinforcement learning system.Nature neuroscience 21, 860–868 (2018)

2018

[18] [18]

Lake, B. M. & Baroni, M. Human-like systematic generalization through a meta-learning neural network.Nature623, 115–121 (2023)

2023

[19] [19]

K., Coda-Forno, J., Thalmann, M., Schulz, E

Jagadish, A. K., Coda-Forno, J., Thalmann, M., Schulz, E. & Binz, M. Human-like category learn- ing by injecting ecological priors from large language models into neural networks.arXiv preprint arXiv:2402.01821(2024)

work page arXiv 2024

[20] [20]

Meta-learning ecological priors from large language models explains human learning and decision making

Jagadish, A. K., Thalmann, M., Coda-Forno, J., Binz, M. & Schulz, E. Meta-learning ecological priors from large language models explains human learning and decision making.arXiv preprint arXiv:2509.00116(2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025

[21] [21]

Sparse Autoencoders Find Highly Interpretable Features in Language Models

Cunningham, H., Ewart, A., Riggs, L., Huben, R. & Sharkey, L. Sparse autoencoders find highly interpretable features in language models.arXiv preprint arXiv:2309.08600(2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023

[22] [22]

Bricken, T.et al.Towards Monosemanticity: Decomposing Language Models With Dictionary Learn- ing.Transformer Circuits Thread(2023)

2023

[23] [23]

Elhage, N.et al.Toy Models of Superposition.Transformer Circuits Thread(2022)

2022

[24] [24]

& Torralba, A

Bau, D., Zhou, B., Khosla, A., Oliva, A. & Torralba, A. Network dissection: Quantifying inter- pretability of deep visual representations. InProceedings of the IEEE conference on computer vision and pattern recognition, 6541–6549 (2017)

2017

[25] [25]

Tschannen, M.et al.Siglip 2: Multilingual vision-language encoders with improved semantic under- standing, localization, and dense features.arXiv preprint arXiv:2502.14786(2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025

[26] [26]

InProceedings of the IEEE/CVF International Conference on Computer Vision, 370–382 (2025)

Fan, D.et al.Scaling language-free visual representation learning. InProceedings of the IEEE/CVF International Conference on Computer Vision, 370–382 (2025)

2025

[27] [27]

InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 16000–16009 (2022)

He, K.et al.Masked autoencoders are scalable vision learners. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 16000–16009 (2022)

2022

[28] [28]

Sim´ eoni, O.et al.Dinov3.arXiv preprint arXiv:2508.10104(2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025

[29] [29]

org/10.5281/zenodo.5143773

Ilharco, G.et al.OpenCLIP (2021). URLhttps://doi.org/10.5281/zenodo.5143773

work page doi:10.5281/zenodo.5143773 2021

[30] [30]

InInter- national conference on machine learning, 8748–8763 (PmLR, 2021)

Radford, A.et al.Learning transferable visual models from natural language supervision. InInter- national conference on machine learning, 8748–8763 (PmLR, 2021)

2021

[31] [31]

M., Perkuhn, J

Stoinski, L. M., Perkuhn, J. & Hebart, M. N. THINGSplus: New norms and metadata for the THINGS database of 1854 object concepts and 26,107 natural object images.Behavior Research Methods56, 1583–1603 (2024)

2024

[32] [32]

N., Zheng, C

Hebart, M. N., Zheng, C. Y., Pereira, F. & Baker, C. I. Revealing the multidimensional mental representations of natural objects underlying human similarity judgements.Nature human behaviour 4, 1173–1185 (2020)

2020

[33] [33]

N.et al.THINGS-data, a multimodal collection of large-scale datasets for investigating object representations in human brain and behavior.Elife12, e82580 (2023)

Hebart, M. N.et al.THINGS-data, a multimodal collection of large-scale datasets for investigating object representations in human brain and behavior.Elife12, e82580 (2023)

2023

[34] [34]

Muttenthaler, L.et al.The Levels Dataset (2024)

2024

[35] [35]

Multitask learning.Machine learning28, 41–75 (1997)

Caruana, R. Multitask learning.Machine learning28, 41–75 (1997)

1997

[36] [36]

& Hinton, G

Van der Maaten, L. & Hinton, G. Visualizing data using t-SNE.Journal of machine learning research 9(2008)

2008

[37] [37]

& Hinton, G

Kornblith, S., Norouzi, M., Lee, H. & Hinton, G. Similarity of neural network representations revis- ited. InInternational conference on machine learning, 3519–3529 (PMlR, 2019)

2019

[38] [38]

Kubilius, J.et al.Brain-like object recognition with high-performing shallow recurrent ANNs.Ad- vances in neural information processing systems32(2019)

2019

[39] [39]

C.et al.Recurrence is required to capture the representational dynam- ics of the human visual system.Proceedings of the National Academy of Sciences116, 21854–21863 (2019)

Kietzmann, T. C.et al.Recurrence is required to capture the representational dynam- ics of the human visual system.Proceedings of the National Academy of Sciences116, 21854–21863 (2019). URLhttps://www.pnas.org/doi/abs/10.1073/pnas.1905544116. eprint: https://www.pnas.org/doi/pdf/10.1073/pnas.1905544116

work page doi:10.1073/pnas.1905544116 2019

[40] [40]

K., Wang, W., Orhan, A

Vong, W. K., Wang, W., Orhan, A. E. & Lake, B. M. Grounded language acquisition through the eyes and ears of a single child.Science383, 504–511 (2024). URLhttps://www.science.org/doi/abs/ 10.1126/science.adi1374. eprint: https://www.science.org/doi/pdf/10.1126/science.adi1374

work page doi:10.1126/science.adi1374 2024

[41] [41]

J., Jones, E

Mehrer, J., Spoerer, C. J., Jones, E. C., Kriegeskorte, N. & Kietzmann, T. C. An ecologically motivated image dataset for deep learning yields better models of human vision.Proceedings of the National Academy of Sciences118, e2011417118 (2021). URLhttps://www.pnas.org/doi/abs/10. 1073/pnas.2011417118. eprint: https://www.pnas.org/doi/pdf/10.1073/pnas.2011417118

work page doi:10.1073/pnas.2011417118 2021

[42] [42]

& Alvarez, G

Konkle, T. & Alvarez, G. A. A self-supervised domain-general learning framework for human ventral stream representation.Nature Communications13, 491 (2022). URLhttps://doi.org/10.1038/ s41467-022-28091-4

2022

[43] [43]

& Richards, B

Bakhtiari, S., Mineault, P., Lillicrap, T., Pack, C. & Richards, B. The functional special- ization of visual cortex emerges from training parallel pathways with self-supervised predictive learning. In Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P. S. & Vaughan, J. W. (eds.)Advances in Neural Information Processing Systems, vol. 34, 25164–25178 (Cur...

2021

[44] [44]

J., Yamins, D

Tang, Y., Gokce, A., Al-Karkari, K. J., Yamins, D. & Schrimpf, M. Diverse perceptual representations across visual pathways emerge from a single objective.bioRxiv2025–07 (2025)

2025

[45] [45]

Dasgupta, I., Schulz, E., Tenenbaum, J. B. & Gershman, S. J. A theory of learning to infer.Psycho- logical review127, 412 (2020)

2020

[46] [46]

J., Schulz, E

Binz, M., Gershman, S. J., Schulz, E. & Endres, D. Heuristics from bounded meta-learned inference. Psychological review129, 1042 (2022)

2022

[47] [47]

T., Grant, E., Smolensky, P., Griffiths, T

McCoy, R. T., Grant, E., Smolensky, P., Griffiths, T. L. & Linzen, T. Universal linguistic inductive biases via meta-learning.arXiv preprint arXiv:2006.16324(2020)

work page arXiv 2006

[48] [48]

ViT Prisma: A Mechanistic Interpretability Library for Vision Transformers (2023)

Joseph, S. ViT Prisma: A Mechanistic Interpretability Library for Vision Transformers (2023). URL https://github.com/soniajoseph/vit-prisma. Publication Title: GitHub repository

2023

[49] [49]

URLhttps://arxiv.org/abs/2504.19475

Joseph, S.et al.Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video (2025). URLhttps://arxiv.org/abs/2504.19475. eprint: 2504.19475

work page arXiv 2025

[50] [50]

InThe Thirteenth International Conference on Learning Representations(2025)

Gao, L.et al.Scaling and evaluating sparse autoencoders. InThe Thirteenth International Conference on Learning Representations(2025). URLhttps://openreview.net/forum?id=tcsZt9ZNKD

2025

[51] [51]

In2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255 (2009)

Deng, J.et al.ImageNet: A large-scale hierarchical image database. In2009 IEEE Conference on Computer Vision and Pattern Recognition, 248–255 (2009)

2009

[52] [52]

InEuropean conference on computer vision, 740–755 (Springer, 2014)

Lin, T.-Y.et al.Microsoft coco: Common objects in context. InEuropean conference on computer vision, 740–755 (Springer, 2014)

2014

[53] [53]

Vaswani, A.et al.Attention is all you need.Advances in neural information processing systems30 (2017)

2017

[54] [54]

Su, J.et al.Roformer: Enhanced transformer with rotary position embedding.Neurocomputing568, 127063 (2024)

2024

[55] [55]

Layer Normalization

Ba, J. L., Kiros, J. R. & Hinton, G. E. Layer normalization.arXiv preprint arXiv:1607.06450(2016)

work page internal anchor Pith review Pith/arXiv arXiv 2016

[56] [56]

& Schmidhuber, J

Hochreiter, S. & Schmidhuber, J. Long short-term memory.Neural computation9, 1735–1780 (1997)

1997

[57] [57]

& Cipolla, R

Kendall, A., Gal, Y. & Cipolla, R. Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. InProceedings of the IEEE conference on computer vision and pattern recognition, 7482–7491 (2018)

2018

[58] [58]

Defazio, A.et al.The road less scheduled.Advances in Neural Information Processing Systems37, 9974–10007 (2024)

2024

[59] [59]

PyTorch Image Models (2019)

Wightman, R. PyTorch Image Models (2019). URLhttps://github.com/rwightman/ pytorch-image-models. Publication Title: GitHub repository

2019

[60] [60]

Conditional logit analysis of qualitative choice behavior (1972)

McFadden, D. Conditional logit analysis of qualitative choice behavior (1972)

1972

[61] [61]

N.et al.THINGS: A database of 1,854 object concepts and more than 26,000 naturalistic object images.PloS one14, e0223792 (2019)

Hebart, M. N.et al.THINGS: A database of 1,854 object concepts and more than 26,000 naturalistic object images.PloS one14, e0223792 (2019)

2019

[62] [62]

InProceedings of the Annual Meeting of the Cognitive Science Society, vol

Demircan, C.et al.Decision-making with naturalistic options. InProceedings of the Annual Meeting of the Cognitive Science Society, vol. 44 (2022)

2022

[63] [63]

E., Friston, K

Rigoux, L., Stephan, K. E., Friston, K. J. & Daunizeau, J. Bayesian model selection for group studies—revisited.Neuroimage84, 971–985 (2014)

2014

[64] [64]

& Norouzi, M

Kornblith, S., Chen, T., Lee, H. & Norouzi, M. Why do better loss functions lead to less transferable features?Advances in Neural Information Processing Systems34, 28648–28662 (2021). Supplementary Information Meta-learning as a principle for human-like visual representations 1 Robustness to architectural variations We investigated whether the emergence o...

2021