Characterizing Universal Object Representations Across Vision Models

Florian P. Mahner; Francisco Pereira; Johannes Roth; Ka Chun Lam; Martin N. Hebart; Michael F. Bonner

arxiv: 2605.13675 · v1 · pith:VV6ZO7CAnew · submitted 2026-05-13 · 💻 cs.CV · cs.LG· q-bio.NC

Characterizing Universal Object Representations Across Vision Models

Florian P. Mahner , Johannes Roth , Ka Chun Lam , Michael F. Bonner , Francisco Pereira , Martin N. Hebart This is my paper

Pith reviewed 2026-05-14 20:19 UTC · model grok-4.3

classification 💻 cs.CV cs.LGq-bio.NC

keywords vision modelsobject representationsuniversal dimensionssimilarity structurebiological alignmentdeep neural networksinterpretabilitysemantic properties

0 comments

The pith

Vision models converge on universal object dimensions that are more interpretable and align better with biological vision.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper decomposes the similarity structures of objects as represented in 162 different vision models into a small set of non-negative dimensions. It then measures how frequently each dimension appears across the models to distinguish universal from model-specific ones. Universal dimensions turn out to be more interpretable and more closely linked to conceptual properties of the images. Models that exhibit more of these universal dimensions are better at predicting activity in macaque inferior temporal cortex and human similarity judgments. Variations in model architecture, training objective, dataset, size, or accuracy do not explain why certain dimensions become universal.

Core claim

Decomposing object similarity structures from 162 vision models into non-negative dimensions and identifying universal ones by their reappearance frequency reveals that these shared dimensions are more interpretable and driven by semantic image properties than model-specific dimensions. Models with a higher number of universal dimensions also show stronger alignment with macaque IT neural activity and human perceptual similarity judgments, while factors such as architecture, objectives, and training data do not account for the emergence of universality.

What carries the argument

Non-negative decomposition of pairwise object similarity matrices from vision models, with universality defined by the frequency of dimension reappearance across the 162 models.

If this is right

Universal dimensions reflect conceptual and semantic properties more strongly than model-specific dimensions.
Models with more universal dimensions provide better predictions of biological vision responses in macaque IT and human judgments.
Convergence on universal representations occurs independently of differences in architecture, objective function, training data, model size, and performance.
Interpretability and semantic content act as implicit factors promoting universality across diverse models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The findings suggest that biological vision may prioritize these universal semantic dimensions for efficient object recognition.
Designing models to emphasize universal dimensions could improve their alignment with human and primate visual systems.
Applying similar decomposition to other AI domains might reveal analogous universal conceptual representations across modalities.
Universal dimensions could serve as a basis for more robust visual features that transfer across tasks and datasets.

Load-bearing premise

That counting how often dimensions reappear across the 162 models reliably identifies universal ones, and that the non-negative decomposition captures the essential object similarity structure without major information loss.

What would settle it

A test on a new collection of vision models where the same universal dimensions do not emerge at high frequency, or where models with more universal dimensions fail to better predict IT activity and human judgments.

Figures

Figures reproduced from arXiv: 2605.13675 by Florian P. Mahner, Francisco Pereira, Johannes Roth, Ka Chun Lam, Martin N. Hebart, Michael F. Bonner.

**Figure 1.** Figure 1: Overview of the analysis pipeline and universality framework. (a) For each of 162 vision models, we extract penultimate-layer representations for object images, compute pairwise object similarity matrices, and apply symmetric nonnegative matrix factorization to obtain non-negative embeddings. (b) We compute a universality score that quantifies how consistently each dimension of a model’s embedding reappear… view at source ↗

**Figure 2.** Figure 2: Universal dimensions are more driven by conceptual object properties, and modelspecific dimensions by visual properties or lack interpretable structure. (a) Representative model dimensions for each label category (columns) at low, mid, and high universality (rows). Each grid shows the top-weighted THINGS images for that dimension. (b) Proportion of dimensions assigned each human label (semantic, visual, m… view at source ↗

**Figure 3.** Figure 3: Controlled comparisons of per-dimension universality. Each panel varies one factor while holding the others approximately constant. Box plots show the distribution of per-dimension universality scores; individual dots are dimensions, colored by model. The dashed line and gray band indicate the grand mean ± 1 SD across all models in that group. (a) Architecture: 34 CNNs, 21 transformers, 6 MLP-Mixers, and 3… view at source ↗

**Figure 4.** Figure 4: Universality predicts neural and behavioral alignment. (a) Left: model universality vs IT encoding accuracy (mean across two macaques); Right IT encoding accuracy from the universal vs specific half of each model’s dimensions. (b) Left: model universality vs human triplet accuracy; Right: triplet accuracy from the universal vs specific half. Dotted lines are chance level (1/3). Architecture. We compared 34… view at source ↗

read the original abstract

Deep neural networks trained with different architectures, objectives, and datasets have been reported to converge on similar visual representations. However, what remains unknown is which visual properties models actually converge on and which factors may underlie this convergence. To address this, we decompose the object similarity structure of 162 diverse vision models into a small set of non-negative dimensions. To determine universal versus model-specific dimensions, we then estimate how often each dimension reappears across models. In contrast to model-specific dimensions, universal dimensions are more interpretable and more strongly driven by conceptual image properties, indicating the relevance of interpretability and semantic content as implicit factors driving universality across models. Differences in architecture, objective function, training data, model size, and model performance do not explain the emergence of universal dimensions. However, models with more universal dimensions also better predict macaque IT activity and human similarity judgments, suggesting that universality reflects representations relevant to biological vision. These findings have important implications for understanding the emergent representations underlying deep neural network models and their alignment with biological vision.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper isolates universal dimensions via reappearance frequency across 162 models and links them to better biological alignment, but the non-negative decomposition step needs explicit stability checks.

read the letter

The main point is that dimensions reappearing often across many vision models turn out more interpretable, more driven by semantic content, and better at predicting macaque IT responses plus human similarity judgments than the model-specific ones. Models carrying more of these universal dimensions also show stronger alignment with biological data, while architecture, objective, data, size, and performance do not predict which dimensions become universal.

Referee Report

3 major / 2 minor

Summary. The paper decomposes object similarity structures (RDMs) from 162 diverse vision models via non-negative matrix factorization into a small set of dimensions, then labels dimensions as universal if they reappear frequently across models. It reports that universal dimensions are more interpretable and more strongly driven by conceptual image properties than model-specific ones, that architectural, objective, data, size, and performance factors do not explain universality, and that models with more universal dimensions better predict macaque IT responses and human similarity judgments.

Significance. If the frequency-based labeling is robust, the work offers a concrete empirical characterization of convergent representations across models and links universality to biological relevance, which could guide future model development and alignment studies. The scale (162 models) and downstream biological correlations are strengths.

major comments (3)

[Methods] Methods (decomposition and matching procedure): Non-negative matrix factorization is initialization-sensitive and non-unique; the manuscript must specify the exact procedure used to match dimensions across models (e.g., correlation threshold, Procrustes alignment, or pooled clustering) and report stability analyses (multiple random initializations, cross-validation of reappearance frequency). Without this, the universal vs. model-specific classification risks being an algorithmic artifact rather than a property of the representations.
[Results] Results (reconstruction fidelity): The claim that the decomposition captures the relevant object similarity structure without significant loss is load-bearing for all downstream interpretations, yet no reconstruction error, explained variance, or held-out similarity prediction metrics are referenced. Rank selection criteria and sensitivity to the chosen number of dimensions must be reported.
[Results] Results (biological prediction): The reported advantage of models with more universal dimensions in predicting macaque IT and human judgments requires explicit statistical controls (e.g., partial correlations removing model performance or size) to rule out confounds; the current description leaves open whether the correlation is driven by the universality count itself.

minor comments (2)

[Methods] Clarify the exact definition of 'reappearance' (e.g., cosine similarity threshold) in the main text rather than supplementary material.
[Figures] Figure legends should include the precise number of models and dimensions used in each panel.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and have revised the manuscript to incorporate the requested clarifications and analyses.

read point-by-point responses

Referee: [Methods] Methods (decomposition and matching procedure): Non-negative matrix factorization is initialization-sensitive and non-unique; the manuscript must specify the exact procedure used to match dimensions across models (e.g., correlation threshold, Procrustes alignment, or pooled clustering) and report stability analyses (multiple random initializations, cross-validation of reappearance frequency). Without this, the universal vs. model-specific classification risks being an algorithmic artifact rather than a property of the representations.

Authors: We thank the referee for this important methodological point. In our implementation, NMF was run with 20 random initializations per model, retaining the solution with the lowest reconstruction error. Dimensions were matched across models via pairwise Pearson correlation, with a dimension considered reappearing if its correlation exceeded 0.65 with a dimension in another model; frequency was then computed as the fraction of models containing a match. We will add a dedicated subsection in Methods describing this procedure in full and include new stability analyses (consistency of frequencies across initializations, alternative thresholds, and bootstrap resampling of models). These additions will appear in the revised Methods and Supplementary Information. revision: yes
Referee: [Results] Results (reconstruction fidelity): The claim that the decomposition captures the relevant object similarity structure without significant loss is load-bearing for all downstream interpretations, yet no reconstruction error, explained variance, or held-out similarity prediction metrics are referenced. Rank selection criteria and sensitivity to the chosen number of dimensions must be reported.

Authors: We agree that explicit reconstruction metrics are required. The number of dimensions (k=10) was selected via the elbow method on the reconstruction error curve computed for k from 5 to 20. Mean reconstruction R² across the 162 models was 0.84, with sensitivity analyses showing stable downstream results for k between 8 and 12. We will add these quantitative results, the elbow plot, and held-out similarity prediction metrics (on a 20% image subset) to the revised Results section. revision: yes
Referee: [Results] Results (biological prediction): The reported advantage of models with more universal dimensions in predicting macaque IT and human judgments requires explicit statistical controls (e.g., partial correlations removing model performance or size) to rule out confounds; the current description leaves open whether the correlation is driven by the universality count itself.

Authors: We acknowledge the need for explicit confound controls. We have computed partial correlations between the count of universal dimensions and biological predictivity while controlling for model performance (ImageNet top-1 accuracy) and size (parameter count). The partial correlations remain significant (r_partial = 0.31, p < 0.001 for macaque IT; r_partial = 0.27, p < 0.01 for human judgments). These controlled analyses and statistics will be added to the revised Results section. revision: yes

Circularity Check

0 steps flagged

No circularity: universality defined empirically via cross-model frequency

full rationale

The paper decomposes each model's object similarity structure via non-negative factorization and labels dimensions as universal solely by their reappearance frequency across the 162 independent models. This labeling step is not self-definitional, does not rename a fitted parameter as a prediction, and does not rely on load-bearing self-citations or imported uniqueness theorems; the subsequent claims (greater interpretability, semantic drive, and better prediction of macaque IT and human judgments) are tested against separate external measures. No equation or procedure reduces the reported result to its own inputs by construction, and the method remains falsifiable against held-out biological data.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

The central claim rests on the validity of non-negative decomposition for similarity structures and on the operational definition of universality via cross-model frequency; no free parameters are explicitly named but the choice of dimensionality is implicit.

free parameters (1)

number of dimensions in decomposition
The small set size is chosen to capture the similarity structure; exact selection criterion not stated in abstract.

axioms (2)

domain assumption Object similarity structure of vision models can be meaningfully decomposed into a small set of non-negative dimensions
Core methodological step used to enable the universal-versus-specific classification.
domain assumption Frequency of dimension reappearance across models indicates universality
Directly used to label dimensions as universal.

pith-pipeline@v0.9.0 · 5493 in / 1186 out tokens · 47064 ms · 2026-05-14T20:19:41.183498+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel echoes

?

echoes
ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

We decomposed each similarity matrix into r non-negative dimensions using symmetric NMF... min Wm≥0 ½‖Sm − Wm Wm⊤‖²F

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

54 extracted references · 54 canonical work pages

[1]

LeCun, Y

Y . LeCun, Y . Bengio, and G. Hinton. Deep learning.Nature, 521:436–444, 2015

work page 2015
[2]

Kanwisher, M

N. Kanwisher, M. Khosla, and K. Dobs. Using artificial neural networks to ask ’why’ questions of minds and brains.Trends in Neurosciences, 46:240–254, 2023

work page 2023
[3]

Doerig, R

A. Doerig, R. P. Sommers, K. Seeliger, B. Richards, J. Ismael, G. W. Lindsay, K. P. Kording, T. Konkle, M. A. J. van Gerven, N. Kriegeskorte, and T. C. Kietzmann. The neuroconnectionist research programme. Nature Reviews Neuroscience, 24:431–450, 2023

work page 2023
[4]

D. L. K. Yamins and J. J. DiCarlo. Using goal-driven deep learning models to understand sensory cortex. Nature Neuroscience, 19:356–365, 2016

work page 2016
[5]

M. Huh, B. Cheung, T. Wang, and P. Isola. Position: The platonic representation hypothesis.Proceedings of Machine Learning Research, 235:20617–20642, 2024

work page 2024
[6]

Guth and B

F. Guth and B. Ménard. On the universality of neural encodings in CNNs. InICLR 2024 Workshop on Representational Alignment (Re-Align), 2024

work page 2024
[7]

Hosseini, C

E. Hosseini, C. Casto, N. Zaslavsky, C. Conwell, M. Richardson, and E. Fedorenko. Universality of representation in biological and artificial neural networks.bioRxiv preprint, 2024.12.26.629294, 2024

work page 2024
[8]

Chen and M

Z. Chen and M. F. Bonner. Universal dimensions of visual representation.Science Advances, 11:eadw7697, 2025

work page 2025
[9]

F. P. Mahner, L. Muttenthaler, U. Güçlü, and M. N. Hebart. Dimensions underlying the representational alignment of deep neural networks with humans.Nature Machine Intelligence, 7:848–859, 2025

work page 2025
[10]

Muttenthaler, L

L. Muttenthaler, L. Linhardt, J. Dippel, R. A. Vandermeulen, K. Hermann, A. Lampinen, and S. Korn- blith. Improving neural network representations using human similarity judgments.Advances in Neural Information Processing Systems, 36:50978–51007, 2023

work page 2023
[11]

Conwell, J

C. Conwell, J. S. Prince, K. N. Kay, G. A. Alvarez, and T. Konkle. A large-scale examination of inductive biases shaping high-level visual representation in brains and machines.Nature Communications, 15:9383, 2024

work page 2024
[12]

M. N. Hebart, A. H. Dickter, A. Kidder, W. Y . Kwok, A. Corriveau, C. Van Wicklin, and C. I. Baker. THINGS: A database of 1,854 object concepts and more than 26,000 naturalistic object images.PLOS ONE, 14:e0223792, 2019

work page 2019
[13]

M. N. Hebart, C. Y . Zheng, F. Pereira, and C. I. Baker. Revealing the multidimensional mental representa- tions of natural objects underlying human similarity judgements.Nature Human Behaviour, 4:1173–1185, 2020

work page 2020
[14]

Geirhos, K

R. Geirhos, K. Narayanappa, B. Mitzkus, T. Thieringer, M. Bethge, F. A. Wichmann, and W. Brendel. Partial success in closing the gap between human and machine vision.Advances in Neural Information Processing Systems, 34:23885–23899, 2021

work page 2021
[15]

K. L. Hermann, T. Chen, and S. Kornblith. The origins and prevalence of texture bias in convolutional neural networks. InAdvances in Neural Information Processing Systems, volume 33, pages 19000–19015. Curran Associates, 2020

work page 2020
[16]

Y . Li, J. Yosinski, J. Clune, H. Lipson, and J. Hopcroft. Convergent learning: Do different neural networks learn the same representations? InInternational Conference on Learning Representations, 2016

work page 2016
[17]

Kornblith, M

S. Kornblith, M. Norouzi, H. Lee, and G. Hinton. Similarity of neural network representations revisited. Proceedings of Machine Learning Research, 97:3519–3529, 2019

work page 2019
[18]

Revisiting the platonic representation hypothesis: An aristotelian view.arXiv preprint arXiv:2602.14486,

F. Gröger, S. Wen, and M. Brbi´c. Revisiting the platonic representation hypothesis: an aristotelian view. arXiv preprint, 2602.14486, 2026

work page arXiv 2026
[19]

Kumar, J

A. Kumar, J. Clune, J. Lehman, and K. O. Stanley. Questioning representational optimism in deep learning: The fractured entangled representation hypothesis.arXiv preprint, 2505.11581, 2025

work page arXiv 2025
[20]

Tjandrasuwita, C

M. Tjandrasuwita, C. Ekbote, L. Ziyin, and P. P. Liang. Understanding the emergence of multimodal representation alignment. InInternational Conference on Machine Learning, 2025. 11

work page 2025
[21]

A. Y . Wang, K. Kay, T. Naselaris, M. J. Tarr, and L. Wehbe. Better models of human high-level visual cortex emerge from natural language supervision with a large and diverse dataset.Nature Machine Intelligence, 5: 1415–1426, 2023

work page 2023
[22]

J. S. Prince, G. A. Alvarez, and T. Konkle. Contrastive learning explains the emergence and function of visual category-selective regions.Science Advances, 10:eadl1776, 2024

work page 2024
[23]

Yerxa, J

T. Yerxa, J. Feather, E. P. Simoncelli, and S. Chung. Contrastive-equivariant self-supervised learning improves alignment with primate visual area IT.Advances in Neural Information Processing Systems, 37: 96045–96070, 2024

work page 2024
[24]

Muttenthaler, K

L. Muttenthaler, K. Greff, F. Born, B. Spitzer, S. Kornblith, M. C. Mozer, K.-R. Müller, T. Unterthiner, and A. K. Lampinen. Aligning machine and human visual representations across abstraction levels.Nature, 647:349–355, 2025

work page 2025
[25]

D. Bau, B. Zhou, A. Khosla, A. Oliva, and A. Torralba. Network dissection: quantifying interpretability of deep visual representations. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3319–3327. IEEE, 2017

work page 2017
[26]

B. Kim, M. Wattenberg, J. Gilmer, C. Cai, J. Wexler, F. Viegas, and R. Sayres. Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCA V).Proceedings of Machine Learning Research, 80:2668–2677, 2018

work page 2018
[27]

M. D. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. InEuropean Conference on Computer Vision, volume 8689, pages 818–833. Springer, 2014

work page 2014
[28]

T. Fel, V . Boutin, L. Béthune, R. Cadène, M. Moayeri, L. Andéol, M. Chalvidal, and T. Serre. A holistic approach to unifying automatic concept extraction and concept importance estimation. InAdvances in Neural Information Processing Systems, volume 36, pages 54805–54818. Curran Associates, 2023

work page 2023
[29]

Ghorbani, J

A. Ghorbani, J. Wexler, J. Y . Zou, and B. Kim. Towards automatic concept-based explanations. InAdvances in Neural Information Processing Systems, volume 32, pages 9273–9282. Curran Associates, 2019

work page 2019
[30]

Zhang, P

R. Zhang, P. Madumal, T. Miller, K. A. Ehinger, and B. I. P. Rubinstein. Invertible concept-based explanations for CNN models with non-negative concept activation vectors.Proceedings of the AAAI Conference on Artificial Intelligence, 35:11682–11690, 2021

work page 2021
[31]

Graziani, A.-P

M. Graziani, A.-P. Nguyen, L. O’Mahony, H. Müller, and V . Andrearczyk. Concept discovery and dataset exploration with singular value decomposition. InICLR 2023 Workshop on Pitfalls of Limited Data and Computation for Trustworthy ML, 2023

work page 2023
[32]

Vielhaben, S

J. Vielhaben, S. Blücher, and N. Strodthoff. Multi-dimensional concept discovery (MCD): A unifying framework with completeness guarantees.Transactions on Machine Learning Research, 2023

work page 2023
[33]

Kowal, A

M. Kowal, A. Dave, R. Ambrus, A. Gaidon, K. G. Derpanis, and P. Tokmakov. Understanding video transformers via universal concept discovery. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10946–10956. IEEE, 2024

work page 2024
[34]

T. Fel, A. Picard, L. Béthune, T. Boissin, D. Vigouroux, J. Colin, R. Cadène, and T. Serre. CRAFT: Concept recursive activation FacTorization for explainability. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2711–2721. IEEE, 2023

work page 2023
[35]

D. A. Klindt, S. Sanborn, F. Acosta, F. Poitevin, and N. Miolane. Identifying interpretable visual features in artificial and biological neural systems.arXiv preprint, 2310.11431, 2023

work page arXiv 2023
[36]

Cunningham, A

H. Cunningham, A. Ewart, L. Riggs, R. Huben, and L. Sharkey. Sparse autoencoders find highly inter- pretable features in language models. InInternational Conference on Learning Representations, 2024

work page 2024
[37]

Bricken, A

T. Bricken, A. Templeton, J. Batson, B. Chen, A. Jermyn, T. Conerly, N. Turner, C. Anil, C. Deni- son, A. Askell, R. Lasenby, Y . Wu, S. Kravec, N. Schiefer, T. Maxwell, N. Joseph, Z. Hatfield-Dodds, A. Tamkin, K. Nguyen, B. McLean, J. E. Burke, T. Hume, S. Carter, T. Henighan, and C. Olah. Towards monosemanticity: Decomposing language models with diction...

work page 2023
[38]

L. Gao, T. Dupré la Tour, H. Tillman, G. Goh, R. Troll, A. Radford, I. Sutskever, J. Leike, and J. Wu. Scaling and evaluating sparse autoencoders. InInternational Conference on Learning Representations, 2025. 12

work page 2025
[39]

T. Fel, E. S. Lubana, J. S. Prince, M. Kowal, V . Boutin, I. Papadimitriou, B. Wang, M. Wattenberg, D. Ba, and T. Konkle. Archetypal sae: adaptive and stable dictionary learning for concept extraction in large vision models.Proceedings of Machine Learning Research, 267:16543–16572, 2025

work page 2025
[40]

Bhalla, A

U. Bhalla, A. Oesterling, S. Srinivas, F. P. Calmon, and H. Lakkaraju. Interpreting CLIP with sparse linear concept embeddings (SpLiCE). InAdvances in Neural Information Processing Systems, volume 37, pages 84298–84328. Curran Associates, 2024

work page 2024
[41]

Thasarathan, J

H. Thasarathan, J. Forsyth, T. Fel, M. Kowal, and K. G. Derpanis. Universal sparse autoencoders: Interpretable cross-model concept alignment.Proceedings of Machine Learning Research, 267, 2025

work page 2025
[42]

Cherti, R

M. Cherti, R. Beaumont, R. Wightman, M. Wortsman, G. Ilharco, C. Gordon, C. Schuhmann, L. Schmidt, and J. Jitsev. Reproducible scaling laws for contrastive language-image learning. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2818–2829. IEEE, 2023

work page 2023
[43]

A. H. Williams, E. Kunz, S. Kornblith, and S. Linderman. Generalized shape metrics on neural represen- tations. InAdvances in Neural Information Processing Systems, volume 34, pages 4738–4750. Curran Associates, 2021

work page 2021
[44]

Kriegeskorte, M

N. Kriegeskorte, M. Mur, D. A. Ruff, R. Kiani, J. Bodurka, H. Esteky, K. Tanaka, and P. A. Bandettini. Matching categorical object representations in inferior temporal cortex of man and monkey.Neuron, 60: 1126–1141, 2008

work page 2008
[45]

Schölkopf, A

B. Schölkopf, A. Smola, and K.-R. Müller. Kernel principal component analysis. InInternational Conference on Artificial Neural Networks, volume 1327, pages 583–588. Springer, 1997

work page 1997
[46]

D. D. Lee and H. S. Seung. Algorithms for non-negative matrix factorization.Advances in Neural Information Processing Systems, 13:556–562, 2000

work page 2000
[47]

Q. Shi, H. Sun, S. Lu, M. Hong, and M. Razaviyayn. Inexact block coordinate descent methods for symmetric nonnegative matrix factorization.IEEE Transactions on Signal Processing, 65:5995–6008, 2017

work page 2017
[48]

Barbu, D

A. Barbu, D. Mayo, J. Alverio, W. Luo, C. Wang, D. Gutfreund, J. Tenenbaum, and B. Katz. ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models.Advances in Neural Information Processing Systems, 32:9448–9458, 2019

work page 2019
[49]

A. R. Zamir, A. Sax, W. Shen, L. J. Guibas, J. Malik, and S. Savarese. Taskonomy: disentangling task transfer learning. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3712–3722. IEEE, 2018

work page 2018
[50]

N. Mu, A. Kirillov, D. Wagner, and S. Xie. SLIP: Self-supervision meets language-image pre-training. In European Conference on Computer Vision, pages 529–544. Springer, 2022

work page 2022
[51]

Sucholutsky, L

I. Sucholutsky, L. Muttenthaler, A. Weller, A. Peng, A. Bobu, B. Kim, B. C. Love, C. J. Cueva, E. Grant, I. Groen, J. Achterberg, J. B. Tenenbaum, K. M. Collins, K. L. Hermann, K. Oktar, K. Greff, M. N. Hebart, N. Cloos, N. Kriegeskorte, N. Jacoby, Q. Zhang, R. Marjieh, R. Geirhos, S. Chen, S. Kornblith, S. Rane, T. Konkle, T. P. O’Connell, T. Unterthiner...

work page 2025
[52]

Papale, F

P. Papale, F. Wang, M. W. Self, and P. R. Roelfsema. An extensive dataset of spiking activity to reveal the syntax of the ventral stream.Neuron, 113:539–553.e5, 2025

work page 2025
[53]

Schoppe, N

O. Schoppe, N. S. Harper, B. D. B. Willmore, A. J. King, and J. W. H. Schnupp. Measuring the performance of neural models.Frontiers in Computational Neuroscience, 10:10, 2016. 13 A Model Overview While no finite benchmark can represent all conceivable vision models, our set of 162 models span major contemporary sources of variation in visual representatio...

work page 2016
[54]

Leave-family-out stability.We also test whether the universality of dimensions in a given model depends on other models from the same architectural family being present in the set

vs the mean across 1,000 bootstrap resamples (20% of models, n= 32 ).(b)Leave-family-out stability: universality from the full set vs recomputed after excluding all models from the same architecture family. Leave-family-out stability.We also test whether the universality of dimensions in a given model depends on other models from the same architectural fa...

work page

[1] [1]

LeCun, Y

Y . LeCun, Y . Bengio, and G. Hinton. Deep learning.Nature, 521:436–444, 2015

work page 2015

[2] [2]

Kanwisher, M

N. Kanwisher, M. Khosla, and K. Dobs. Using artificial neural networks to ask ’why’ questions of minds and brains.Trends in Neurosciences, 46:240–254, 2023

work page 2023

[3] [3]

Doerig, R

A. Doerig, R. P. Sommers, K. Seeliger, B. Richards, J. Ismael, G. W. Lindsay, K. P. Kording, T. Konkle, M. A. J. van Gerven, N. Kriegeskorte, and T. C. Kietzmann. The neuroconnectionist research programme. Nature Reviews Neuroscience, 24:431–450, 2023

work page 2023

[4] [4]

D. L. K. Yamins and J. J. DiCarlo. Using goal-driven deep learning models to understand sensory cortex. Nature Neuroscience, 19:356–365, 2016

work page 2016

[5] [5]

M. Huh, B. Cheung, T. Wang, and P. Isola. Position: The platonic representation hypothesis.Proceedings of Machine Learning Research, 235:20617–20642, 2024

work page 2024

[6] [6]

Guth and B

F. Guth and B. Ménard. On the universality of neural encodings in CNNs. InICLR 2024 Workshop on Representational Alignment (Re-Align), 2024

work page 2024

[7] [7]

Hosseini, C

E. Hosseini, C. Casto, N. Zaslavsky, C. Conwell, M. Richardson, and E. Fedorenko. Universality of representation in biological and artificial neural networks.bioRxiv preprint, 2024.12.26.629294, 2024

work page 2024

[8] [8]

Chen and M

Z. Chen and M. F. Bonner. Universal dimensions of visual representation.Science Advances, 11:eadw7697, 2025

work page 2025

[9] [9]

F. P. Mahner, L. Muttenthaler, U. Güçlü, and M. N. Hebart. Dimensions underlying the representational alignment of deep neural networks with humans.Nature Machine Intelligence, 7:848–859, 2025

work page 2025

[10] [10]

Muttenthaler, L

L. Muttenthaler, L. Linhardt, J. Dippel, R. A. Vandermeulen, K. Hermann, A. Lampinen, and S. Korn- blith. Improving neural network representations using human similarity judgments.Advances in Neural Information Processing Systems, 36:50978–51007, 2023

work page 2023

[11] [11]

Conwell, J

C. Conwell, J. S. Prince, K. N. Kay, G. A. Alvarez, and T. Konkle. A large-scale examination of inductive biases shaping high-level visual representation in brains and machines.Nature Communications, 15:9383, 2024

work page 2024

[12] [12]

M. N. Hebart, A. H. Dickter, A. Kidder, W. Y . Kwok, A. Corriveau, C. Van Wicklin, and C. I. Baker. THINGS: A database of 1,854 object concepts and more than 26,000 naturalistic object images.PLOS ONE, 14:e0223792, 2019

work page 2019

[13] [13]

M. N. Hebart, C. Y . Zheng, F. Pereira, and C. I. Baker. Revealing the multidimensional mental representa- tions of natural objects underlying human similarity judgements.Nature Human Behaviour, 4:1173–1185, 2020

work page 2020

[14] [14]

Geirhos, K

R. Geirhos, K. Narayanappa, B. Mitzkus, T. Thieringer, M. Bethge, F. A. Wichmann, and W. Brendel. Partial success in closing the gap between human and machine vision.Advances in Neural Information Processing Systems, 34:23885–23899, 2021

work page 2021

[15] [15]

K. L. Hermann, T. Chen, and S. Kornblith. The origins and prevalence of texture bias in convolutional neural networks. InAdvances in Neural Information Processing Systems, volume 33, pages 19000–19015. Curran Associates, 2020

work page 2020

[16] [16]

Y . Li, J. Yosinski, J. Clune, H. Lipson, and J. Hopcroft. Convergent learning: Do different neural networks learn the same representations? InInternational Conference on Learning Representations, 2016

work page 2016

[17] [17]

Kornblith, M

S. Kornblith, M. Norouzi, H. Lee, and G. Hinton. Similarity of neural network representations revisited. Proceedings of Machine Learning Research, 97:3519–3529, 2019

work page 2019

[18] [18]

Revisiting the platonic representation hypothesis: An aristotelian view.arXiv preprint arXiv:2602.14486,

F. Gröger, S. Wen, and M. Brbi´c. Revisiting the platonic representation hypothesis: an aristotelian view. arXiv preprint, 2602.14486, 2026

work page arXiv 2026

[19] [19]

Kumar, J

A. Kumar, J. Clune, J. Lehman, and K. O. Stanley. Questioning representational optimism in deep learning: The fractured entangled representation hypothesis.arXiv preprint, 2505.11581, 2025

work page arXiv 2025

[20] [20]

Tjandrasuwita, C

M. Tjandrasuwita, C. Ekbote, L. Ziyin, and P. P. Liang. Understanding the emergence of multimodal representation alignment. InInternational Conference on Machine Learning, 2025. 11

work page 2025

[21] [21]

A. Y . Wang, K. Kay, T. Naselaris, M. J. Tarr, and L. Wehbe. Better models of human high-level visual cortex emerge from natural language supervision with a large and diverse dataset.Nature Machine Intelligence, 5: 1415–1426, 2023

work page 2023

[22] [22]

J. S. Prince, G. A. Alvarez, and T. Konkle. Contrastive learning explains the emergence and function of visual category-selective regions.Science Advances, 10:eadl1776, 2024

work page 2024

[23] [23]

Yerxa, J

T. Yerxa, J. Feather, E. P. Simoncelli, and S. Chung. Contrastive-equivariant self-supervised learning improves alignment with primate visual area IT.Advances in Neural Information Processing Systems, 37: 96045–96070, 2024

work page 2024

[24] [24]

Muttenthaler, K

L. Muttenthaler, K. Greff, F. Born, B. Spitzer, S. Kornblith, M. C. Mozer, K.-R. Müller, T. Unterthiner, and A. K. Lampinen. Aligning machine and human visual representations across abstraction levels.Nature, 647:349–355, 2025

work page 2025

[25] [25]

D. Bau, B. Zhou, A. Khosla, A. Oliva, and A. Torralba. Network dissection: quantifying interpretability of deep visual representations. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3319–3327. IEEE, 2017

work page 2017

[26] [26]

B. Kim, M. Wattenberg, J. Gilmer, C. Cai, J. Wexler, F. Viegas, and R. Sayres. Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCA V).Proceedings of Machine Learning Research, 80:2668–2677, 2018

work page 2018

[27] [27]

M. D. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. InEuropean Conference on Computer Vision, volume 8689, pages 818–833. Springer, 2014

work page 2014

[28] [28]

T. Fel, V . Boutin, L. Béthune, R. Cadène, M. Moayeri, L. Andéol, M. Chalvidal, and T. Serre. A holistic approach to unifying automatic concept extraction and concept importance estimation. InAdvances in Neural Information Processing Systems, volume 36, pages 54805–54818. Curran Associates, 2023

work page 2023

[29] [29]

Ghorbani, J

A. Ghorbani, J. Wexler, J. Y . Zou, and B. Kim. Towards automatic concept-based explanations. InAdvances in Neural Information Processing Systems, volume 32, pages 9273–9282. Curran Associates, 2019

work page 2019

[30] [30]

Zhang, P

R. Zhang, P. Madumal, T. Miller, K. A. Ehinger, and B. I. P. Rubinstein. Invertible concept-based explanations for CNN models with non-negative concept activation vectors.Proceedings of the AAAI Conference on Artificial Intelligence, 35:11682–11690, 2021

work page 2021

[31] [31]

Graziani, A.-P

M. Graziani, A.-P. Nguyen, L. O’Mahony, H. Müller, and V . Andrearczyk. Concept discovery and dataset exploration with singular value decomposition. InICLR 2023 Workshop on Pitfalls of Limited Data and Computation for Trustworthy ML, 2023

work page 2023

[32] [32]

Vielhaben, S

J. Vielhaben, S. Blücher, and N. Strodthoff. Multi-dimensional concept discovery (MCD): A unifying framework with completeness guarantees.Transactions on Machine Learning Research, 2023

work page 2023

[33] [33]

Kowal, A

M. Kowal, A. Dave, R. Ambrus, A. Gaidon, K. G. Derpanis, and P. Tokmakov. Understanding video transformers via universal concept discovery. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10946–10956. IEEE, 2024

work page 2024

[34] [34]

T. Fel, A. Picard, L. Béthune, T. Boissin, D. Vigouroux, J. Colin, R. Cadène, and T. Serre. CRAFT: Concept recursive activation FacTorization for explainability. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2711–2721. IEEE, 2023

work page 2023

[35] [35]

D. A. Klindt, S. Sanborn, F. Acosta, F. Poitevin, and N. Miolane. Identifying interpretable visual features in artificial and biological neural systems.arXiv preprint, 2310.11431, 2023

work page arXiv 2023

[36] [36]

Cunningham, A

H. Cunningham, A. Ewart, L. Riggs, R. Huben, and L. Sharkey. Sparse autoencoders find highly inter- pretable features in language models. InInternational Conference on Learning Representations, 2024

work page 2024

[37] [37]

Bricken, A

T. Bricken, A. Templeton, J. Batson, B. Chen, A. Jermyn, T. Conerly, N. Turner, C. Anil, C. Deni- son, A. Askell, R. Lasenby, Y . Wu, S. Kravec, N. Schiefer, T. Maxwell, N. Joseph, Z. Hatfield-Dodds, A. Tamkin, K. Nguyen, B. McLean, J. E. Burke, T. Hume, S. Carter, T. Henighan, and C. Olah. Towards monosemanticity: Decomposing language models with diction...

work page 2023

[38] [38]

L. Gao, T. Dupré la Tour, H. Tillman, G. Goh, R. Troll, A. Radford, I. Sutskever, J. Leike, and J. Wu. Scaling and evaluating sparse autoencoders. InInternational Conference on Learning Representations, 2025. 12

work page 2025

[39] [39]

T. Fel, E. S. Lubana, J. S. Prince, M. Kowal, V . Boutin, I. Papadimitriou, B. Wang, M. Wattenberg, D. Ba, and T. Konkle. Archetypal sae: adaptive and stable dictionary learning for concept extraction in large vision models.Proceedings of Machine Learning Research, 267:16543–16572, 2025

work page 2025

[40] [40]

Bhalla, A

U. Bhalla, A. Oesterling, S. Srinivas, F. P. Calmon, and H. Lakkaraju. Interpreting CLIP with sparse linear concept embeddings (SpLiCE). InAdvances in Neural Information Processing Systems, volume 37, pages 84298–84328. Curran Associates, 2024

work page 2024

[41] [41]

Thasarathan, J

H. Thasarathan, J. Forsyth, T. Fel, M. Kowal, and K. G. Derpanis. Universal sparse autoencoders: Interpretable cross-model concept alignment.Proceedings of Machine Learning Research, 267, 2025

work page 2025

[42] [42]

Cherti, R

M. Cherti, R. Beaumont, R. Wightman, M. Wortsman, G. Ilharco, C. Gordon, C. Schuhmann, L. Schmidt, and J. Jitsev. Reproducible scaling laws for contrastive language-image learning. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2818–2829. IEEE, 2023

work page 2023

[43] [43]

A. H. Williams, E. Kunz, S. Kornblith, and S. Linderman. Generalized shape metrics on neural represen- tations. InAdvances in Neural Information Processing Systems, volume 34, pages 4738–4750. Curran Associates, 2021

work page 2021

[44] [44]

Kriegeskorte, M

N. Kriegeskorte, M. Mur, D. A. Ruff, R. Kiani, J. Bodurka, H. Esteky, K. Tanaka, and P. A. Bandettini. Matching categorical object representations in inferior temporal cortex of man and monkey.Neuron, 60: 1126–1141, 2008

work page 2008

[45] [45]

Schölkopf, A

B. Schölkopf, A. Smola, and K.-R. Müller. Kernel principal component analysis. InInternational Conference on Artificial Neural Networks, volume 1327, pages 583–588. Springer, 1997

work page 1997

[46] [46]

D. D. Lee and H. S. Seung. Algorithms for non-negative matrix factorization.Advances in Neural Information Processing Systems, 13:556–562, 2000

work page 2000

[47] [47]

Q. Shi, H. Sun, S. Lu, M. Hong, and M. Razaviyayn. Inexact block coordinate descent methods for symmetric nonnegative matrix factorization.IEEE Transactions on Signal Processing, 65:5995–6008, 2017

work page 2017

[48] [48]

Barbu, D

A. Barbu, D. Mayo, J. Alverio, W. Luo, C. Wang, D. Gutfreund, J. Tenenbaum, and B. Katz. ObjectNet: A large-scale bias-controlled dataset for pushing the limits of object recognition models.Advances in Neural Information Processing Systems, 32:9448–9458, 2019

work page 2019

[49] [49]

A. R. Zamir, A. Sax, W. Shen, L. J. Guibas, J. Malik, and S. Savarese. Taskonomy: disentangling task transfer learning. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3712–3722. IEEE, 2018

work page 2018

[50] [50]

N. Mu, A. Kirillov, D. Wagner, and S. Xie. SLIP: Self-supervision meets language-image pre-training. In European Conference on Computer Vision, pages 529–544. Springer, 2022

work page 2022

[51] [51]

Sucholutsky, L

I. Sucholutsky, L. Muttenthaler, A. Weller, A. Peng, A. Bobu, B. Kim, B. C. Love, C. J. Cueva, E. Grant, I. Groen, J. Achterberg, J. B. Tenenbaum, K. M. Collins, K. L. Hermann, K. Oktar, K. Greff, M. N. Hebart, N. Cloos, N. Kriegeskorte, N. Jacoby, Q. Zhang, R. Marjieh, R. Geirhos, S. Chen, S. Kornblith, S. Rane, T. Konkle, T. P. O’Connell, T. Unterthiner...

work page 2025

[52] [52]

Papale, F

P. Papale, F. Wang, M. W. Self, and P. R. Roelfsema. An extensive dataset of spiking activity to reveal the syntax of the ventral stream.Neuron, 113:539–553.e5, 2025

work page 2025

[53] [53]

Schoppe, N

O. Schoppe, N. S. Harper, B. D. B. Willmore, A. J. King, and J. W. H. Schnupp. Measuring the performance of neural models.Frontiers in Computational Neuroscience, 10:10, 2016. 13 A Model Overview While no finite benchmark can represent all conceivable vision models, our set of 162 models span major contemporary sources of variation in visual representatio...

work page 2016

[54] [54]

Leave-family-out stability.We also test whether the universality of dimensions in a given model depends on other models from the same architectural family being present in the set

vs the mean across 1,000 bootstrap resamples (20% of models, n= 32 ).(b)Leave-family-out stability: universality from the full set vs recomputed after excluding all models from the same architecture family. Leave-family-out stability.We also test whether the universality of dimensions in a given model depends on other models from the same architectural fa...

work page