Characterizing Universal Object Representations Across Vision Models

2026 · cs.CV · arXiv 2605.13675

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Deep neural networks trained with different architectures, objectives, and datasets have been reported to converge on similar visual representations. However, what remains unknown is which visual properties models actually converge on and which factors may underlie this convergence. To address this, we decompose the object similarity structure of 162 diverse vision models into a small set of non-negative dimensions. To determine universal versus model-specific dimensions, we then estimate how often each dimension reappears across models. In contrast to model-specific dimensions, universal dimensions are more interpretable and more strongly driven by conceptual image properties, indicating the relevance of interpretability and semantic content as implicit factors driving universality across models. Differences in architecture, objective function, training data, model size, and model performance do not explain the emergence of universal dimensions. However, models with more universal dimensions also better predict macaque IT activity and human similarity judgments, suggesting that universality reflects representations relevant to biological vision. These findings have important implications for understanding the emergent representations underlying deep neural network models and their alignment with biological vision.
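The two analysis steps named in the abstract (decomposing a similarity structure into non-negative dimensions, then estimating how often each dimension reappears across models) can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a symmetric non-negative factorization S ≈ W Wᵀ fitted with damped multiplicative updates, and it counts a dimension as "reappearing" in another model when its best Pearson correlation with that model's dimensions exceeds a threshold. The function names, the 0.8 threshold, and the toy data are all hypothetical; the abstract specifies neither the factorization method nor the matching criterion.

```python
import numpy as np

def symmetric_nmf(S, k, n_iter=500, beta=0.5, seed=0, eps=1e-9):
    """Approximate a non-negative similarity matrix S by W @ W.T with W >= 0.

    Assumption: damped multiplicative updates; S must have non-negative
    entries so that the update keeps W non-negative.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((S.shape[0], k))          # non-negative random start
    for _ in range(n_iter):
        numer = S @ W
        denom = W @ (W.T @ W) + eps          # eps avoids division by zero
        W *= (1.0 - beta) + beta * numer / denom
    return W

def reappearance_rate(W_ref, others, threshold=0.8):
    """Fraction of other models in which each reference dimension reappears.

    Assumption: a dimension "reappears" in a model if its best Pearson
    correlation with any of that model's dimensions exceeds `threshold`.
    All matrices must describe the same objects in the same row order.
    """
    k = W_ref.shape[1]
    hits = np.zeros(k)
    for W in others:
        # Cross-correlation block between reference and other dimensions.
        corr = np.corrcoef(W_ref.T, W.T)[:k, k:]
        hits += corr.max(axis=1) > threshold
    return hits / len(others)

if __name__ == "__main__":
    # Toy data: four hypothetical "models" sharing three latent dimensions,
    # each with one private dimension of its own.
    rng = np.random.default_rng(0)
    shared = rng.random((40, 3))
    models = []
    for seed in range(4):
        private = rng.random((40, 1))
        X = np.hstack([shared, private])
        models.append(symmetric_nmf(X @ X.T, k=4, seed=seed))
    print(reappearance_rate(models[0], models[1:]))
```

In this sketch a rate near 1 marks a dimension as a candidate universal dimension and a rate near 0 marks it as model-specific. The threshold is a free parameter, so the resulting split should be read as illustrative only.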

representative citing papers

cs.CV · 2026-05-26 · unverdicted · novelty 6.0

SRF factorizes similarity matrices into low-dimensional non-negative interpretable dimensions, shown to work on sparse data and match task-specific models across simulations and real datasets.

citing papers explorer

Similarity-based representation factorization for revealing interpretable dimensions in representational data
cs.CV · 2026-05-26 · unverdicted · none · ref 41 · internal anchor
