Convergent Learning: Do different neural networks learn the same representations?

· 2015 · cs.LG · arXiv 1511.07543

6 Pith papers cite this work. Polarity classification is still indexing.

abstract

Recent successes in training deep neural networks have prompted active investigation into the features learned on their intermediate layers. Such research is difficult because it requires making sense of non-linear computations performed by millions of parameters, but valuable because it increases our ability to understand current models and create improved versions of them. In this paper we investigate the extent to which neural networks exhibit what we call convergent learning, which is when the representations learned by multiple nets converge to a set of features which are either individually similar between networks or where subsets of features span similar low-dimensional spaces. We propose a specific method of probing representations: training multiple networks and then comparing and contrasting their individual, learned representations at the level of neurons or groups of neurons. We begin research into this question using three techniques to approximately align different neural networks on a feature level: a bipartite matching approach that makes one-to-one assignments between neurons, a sparse prediction approach that finds one-to-many mappings, and a spectral clustering approach that finds many-to-many mappings. This initial investigation reveals a few previously unknown properties of neural networks, and we argue that future research into the question of convergent learning will yield many more. The insights described here include (1) that some features are learned reliably in multiple networks, yet other features are not consistently learned; (2) that units learn to span low-dimensional subspaces and, while these subspaces are common to multiple networks, the specific basis vectors learned are not; (3) that the representation codes show evidence of being a mix between a local code and slightly, but not fully, distributed codes across multiple units.
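
The one-to-one alignment mentioned in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: units are compared by the Pearson correlation of their activations on shared inputs, and the matching is solved with SciPy's `linear_sum_assignment`. The function name `match_units` and the synthetic activations are hypothetical and exist only for this example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_units(acts_a, acts_b):
    """Pair units of two networks one-to-one by maximizing total correlation.

    acts_a, acts_b: arrays of shape (n_samples, n_units) holding activations
    of one layer in each network on the same inputs (assumed setup).
    Returns the matched unit index in B for each unit in A, and the
    correlation of each matched pair.
    """
    n_samples = acts_a.shape[0]
    # Standardize each unit so the cross product is a Pearson correlation.
    za = (acts_a - acts_a.mean(axis=0)) / acts_a.std(axis=0)
    zb = (acts_b - acts_b.mean(axis=0)) / acts_b.std(axis=0)
    corr = za.T @ zb / n_samples  # shape (units in A, units in B)
    # Maximum-weight bipartite matching on the correlation matrix.
    rows, cols = linear_sum_assignment(corr, maximize=True)
    return cols, corr[rows, cols]


# Synthetic check: network B is a unit-permuted, noisy copy of network A.
rng = np.random.default_rng(0)
acts_a = rng.normal(size=(500, 8))
perm = rng.permutation(8)
acts_b = acts_a[:, perm] + 0.1 * rng.normal(size=(500, 8))

match, scores = match_units(acts_a, acts_b)
print("matched unit in B for each unit in A:", match)
print("matched correlations:", np.round(scores, 3))
```

In this toy setting the matching should recover the inverse of the hidden permutation. With real networks, the matched correlations would be expected to vary from unit to unit, which is one way to read the abstract's observation that some features are learned reliably while others are not.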

representative citing papers

Signal-to-Noise Ratio and Sample Size Govern Representational Alignment in Neural Networks

stat.ML · 2026-05-26 · unverdicted · novelty 6.0

Representational alignment varies monotonically with SNR and non-monotonically with sample size (minimized near interpolation threshold) across linear and nonlinear networks, and is decoupled from generalization error.

Entropy-Based Characterisation of the Polarised Regime in Latent Variable Models

cs.LG · 2026-05-15 · unverdicted · novelty 6.0

An entropy criterion on mean representations characterises the polarised regime in VAEs and related models, with theoretical links to KL minimisation and empirical tests across several architectures.

Enabling Federated Inference via Unsupervised Consensus Embedding

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

CE-FI maps heterogeneous model representations to a shared embedding space via unsupervised training on unlabeled data, enabling privacy-preserving federated inference that outperforms solo models on image classification benchmarks.

The Platonic Representation Hypothesis

cs.LG · 2024-05-13 · unverdicted · novelty 5.0

Representations learned by large AI models are converging toward a shared statistical model of reality.

Are Face Embeddings Compatible Across Deep Neural Network Models?

cs.CV · 2026-04-08 · unverdicted · novelty 5.0

Simple affine transformations align face embeddings across different DNN models, substantially improving cross-model identification and verification performance.

At the Edge of Understanding: Sparse Autoencoders Trace The Limits of Transformer Generalization

cs.LG · 2026-06-24 · unverdicted · novelty 4.0

Sparse autoencoders show OOD prompts increase fallacious concept activation in transformers, offering a mechanistic measure of shift and a path to robust fine-tuning.
