Deep Learning and Hierarchal Generative Models

Elchanan Mossel

arxiv: 1612.09057 · v4 · pith:UCA3QLRUnew · submitted 2016-12-29 · 💻 cs.LG

keywords: deep, models, data, generative, hierarchal, learning, algorithm, algorithms

It is argued that deep learning is efficient for data that is generated from hierarchal generative models. Examples of such generative models include wavelet scattering networks, functions of compositional structure, and deep rendering models. Unfortunately, for all such models so far, it is either not rigorously known that they can be learned efficiently, or it is not known that "deep algorithms" are required in order to learn them. We propose a simple family of "generative hierarchal models" which can be efficiently learned and where "deep" algorithms are necessary for learning. Our definition of "deep" algorithms is based on the empirical observation that deep nets necessarily use correlations between features. More formally, we show that in a semi-supervised setting, given access to low-order moments of the labeled data and all of the unlabeled data, it is information-theoretically impossible to perform classification, while at the same time there is an efficient algorithm that, given all labeled and unlabeled data, perfectly labels all unlabeled data with high probability. For the proof, we use and strengthen the fact that Belief Propagation does not admit a good approximation in terms of linear functions.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Scaling Laws from Sequential Feature Recovery: A Solvable Hierarchical Model
stat.ML 2026-05 accept novelty 7.0

A solvable hierarchical model with power-law feature strengths yields explicit power-law scaling of prediction error through sequential recovery of latent directions by a layer-wise spectral algorithm.

Self-supervised local learning rules learn the hidden hierarchical structure of high-dimensional data
cs.LG 2026-05 unverdicted novelty 6.0

Layerwise self-supervised local rules learn the hierarchical structure of the Random Hierarchy Model as data-efficiently as supervised backpropagation, while direct feedback approximations fail due to missing masking ...