Representation Learning: A Review and New Perspectives

Aaron Courville; Pascal Vincent; Yoshua Bengio

arxiv: 1206.5538 · v3 · pith:ACXPOPGUnew · submitted 2012-06-24 · 💻 cs.LG

Yoshua Bengio, Aaron Courville, Pascal Vincent

classification 💻 cs.LG

keywords learning, representations, representation, algorithms, data, deep, design, different

The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, auto-encoders, manifold learning, and deep networks. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation and manifold learning.

Forward citations

Cited by 11 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Generative models on phase space
hep-ph 2026-04 unverdicted novelty 8.0

Generative diffusion and flow models are constructed to remain exactly on the Lorentz-invariant massless N-particle phase space manifold during sampling for particle physics applications.
Unifying Dynamical Systems and Graph Theory to Mechanistically Understand Computation in Neural Networks
cs.NE 2026-05 unverdicted novelty 7.0

RNN computation is recovered from multi-hop graph pathways, and constraining these pathways via resolvent regularization yields improved temporal sparsity and task performance over standard L1.
Unifying Dynamical Systems and Graph Theory to Mechanistically Understand Computation in Neural Networks
cs.NE 2026-05 unverdicted novelty 7.0

Multi-hop graph analysis of RNNs reveals temporal information routing and motivates resolvent regularization that outperforms L1 by enforcing pathway-level sparsity aligned with task structure.
Nothing Deceives Like Success: Social Learning and the Illusion of Understanding in Science
physics.soc-ph 2026-04 unverdicted novelty 6.0

Success bias in collective theory-building leads to systematic overestimation of theory quality, narrower search, and paradoxically lower performance when agents optimize for apparent success.
CLEAR-HPV: Interpretable concept discovery for human-papillomavirus-associated morphology in whole-slide histology
cs.CV 2026-02 unverdicted novelty 6.0

CLEAR-HPV uses attention-weighted latent space in MIL to discover 10 interpretable concepts from HPV histopathology slides, producing spatial maps and compact vectors that retain predictive power across TCGA and CPTAC...
To Use AI as Dice of Possibilities with Timing Computation
cs.AI 2026-05 unverdicted novelty 5.0

Proposes verb-based paradigm with timing computation to enable data-driven discovery of patient trajectories and counterfactual timing from EHR data without domain knowledge.
Computational Hermeneutics: Evaluating generative AI as a cultural technology
cs.AI 2026-03 unverdicted novelty 5.0

Generative AI should be evaluated through computational hermeneutics using iterative, human-inclusive benchmarks that measure cultural context rather than isolated model outputs.
CLEAR-HPV: Interpretable concept discovery for human-papillomavirus-associated morphology in whole-slide histology
cs.CV 2026-02 unverdicted novelty 5.0

CLEAR-HPV restructures the latent space of attention-based MIL models to discover 10 label-free morphologic concepts that preserve slide-level HPV prediction performance and generalize across TCGA-HNSCC, TCGA-CESC, an...
On mechanisms for transfer using landmark value functions in multi-task lifelong reinforcement learning
cs.LG 2019-07 unverdicted novelty 5.0

Landmark topological coverings derived from traversability metrics enable three transfer mechanisms with theoretical Q-value bounds in goal-based multi-task lifelong RL.
Learning-based Hamilton-Jacobi-Bellman Methods for Optimal Control
math.OC 2019-07 unverdicted novelty 4.0

Supervised and reinforcement learning are used to find initial adjoint variables for real-time solution of Hamilton-Jacobi-Bellman equations in two-point boundary value problems.
Survey on Disaster Management Datasets for Remote Sensing Based Emergency Applications
cs.CV 2026-05 unverdicted novelty 3.0

A survey providing an overview of publicly available image-based datasets for ML/DL-based disaster management pipelines covering pre-disaster, during, and post-disaster phases.