Recognition: unknown
An exact mapping between the Variational Renormalization Group and Deep Learning
Original abstract
Deep learning is a broad set of techniques that uses multiple layers of representation to automatically learn relevant features directly from structured data. Recently, such techniques have yielded record-breaking results on a diverse set of difficult machine learning tasks in computer vision, speech recognition, and natural language processing. Despite the enormous success of deep learning, relatively little is understood theoretically about why these techniques are so successful at feature learning and compression. Here, we show that deep learning is intimately related to one of the most important and successful techniques in theoretical physics, the renormalization group (RG). RG is an iterative coarse-graining scheme that allows for the extraction of relevant features (i.e. operators) as a physical system is examined at different length scales. We construct an exact mapping between the variational renormalization group, first introduced by Kadanoff, and deep learning architectures based on Restricted Boltzmann Machines (RBMs). We illustrate these ideas using the nearest-neighbor Ising model in one and two dimensions. Our results suggest that deep learning algorithms may be employing a generalized RG-like scheme to learn relevant features from data.
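The core relation behind the "exact mapping" is short enough to sketch here. The following is reconstructed from the abstract and the standard presentation of the Mehta-Schwab construction; the notation ($H[v]$ for the spin Hamiltonian, $T_\lambda(v,h)$ for Kadanoff's variational operator, $E(v,h)$ for the RBM energy) is assumed for illustration rather than quoted from the paper. In variational RG, a coarse-grained Hamiltonian over hidden (block) spins $h$ is defined by

\[ e^{-H^{\mathrm{RG}}_\lambda[h]} \;=\; \mathrm{Tr}_{v}\, e^{\,T_\lambda(v,h) - H[v]}, \]

with the variational parameters $\lambda$ chosen so that the transformation preserves the free energy, which holds exactly when $\mathrm{Tr}_{h}\, e^{T_\lambda(v,h)} = 1$. An RBM with joint distribution $p_\lambda(v,h) = e^{-E(v,h)}/Z$ assigns its hidden units a marginal Hamiltonian $e^{-H^{\mathrm{RBM}}_\lambda[h]} \propto \mathrm{Tr}_{v}\, e^{-E(v,h)}$. Identifying

\[ T_\lambda(v,h) \;=\; -E(v,h) + H[v] \]

makes the two hidden-spin Hamiltonians coincide up to an additive constant, so each RBM layer implements one step of variational RG and stacking layers iterates the coarse-graining.

As a concrete (non-paper) illustration of what an iterative coarse-graining step looks like, here is a minimal Python sketch of decimation RG for the 1D nearest-neighbor Ising model mentioned in the abstract; the function name and starting coupling are arbitrary choices, not taken from the paper.

import numpy as np

def decimate(J):
    # One decimation RG step for the 1D nearest-neighbor Ising model:
    # summing out every other spin renormalizes the coupling J -> J'
    # with tanh(J') = tanh(J)**2, i.e. J' = 0.5 * log(cosh(2J)).
    return np.arctanh(np.tanh(J) ** 2)

# Follow the RG flow from a moderate coupling: J flows toward 0,
# the high-temperature fixed point of the 1D model.
J = 1.0
for step in range(5):
    J = decimate(J)
    print(f"step {step + 1}: J = {J:.6f}")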
This paper has not been read by Pith yet.
Forward citations
Cited by 3 Pith papers
- Scaling Laws from Sequential Feature Recovery: A Solvable Hierarchical Model
  A solvable hierarchical model with power-law feature strengths yields explicit power-law scaling of prediction error through sequential recovery of latent directions by a layer-wise spectral algorithm.
- Borrowed Geometry: Computational Reuse of Frozen Text-Pretrained Transformer Weights Across Modalities
  Frozen text-pretrained transformer weights transfer across modalities through a thin interface, achieving SOTA on a robotic task and parity on decision-making with far fewer trainable parameters.
- Lecture Notes on Statistical Physics and Neural Networks
  Lecture notes that treat statistical physics as probability theory and connect Ising models, spin glasses, and renormalization group ideas to Hopfield networks, restricted Boltzmann machines, and large language models.