Implicit Density Estimation by Local Moment Matching to Sample from Auto-Encoders

Guillaume Alain; Salah Rifai; Yoshua Bengio

arxiv: 1207.0057 · v1 · pith:MTL5TKDHnew · submitted 2012-06-30 · 💻 cs.LG · stat.ML



keywords local · auto-encoder · density · covariance · contractive · estimation · good · matching


Recent work suggests that some auto-encoder variants do a good job of capturing the local manifold structure of the unknown data-generating density. This paper contributes to the mathematical understanding of this phenomenon and helps define better-justified sampling algorithms for deep learning based on auto-encoder variants. We consider an MCMC in which each step samples from a Gaussian whose mean and covariance matrix depend on the previous state, and which defines a target density through its asymptotic distribution. First, we show that good choices (in the sense of consistency) for these mean and covariance functions are the local expected value and local covariance under that target density. Then we show that an auto-encoder with a contractive penalty captures estimators of these local moments in its reconstruction function and its Jacobian. A contribution of this work is thus a novel alternative to maximum-likelihood density estimation, which we call local moment matching. It also justifies a recently proposed sampling algorithm for the Contractive Auto-Encoder and extends it to the Denoising Auto-Encoder.
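The kind of chain the abstract describes, where each step draws from a Gaussian whose mean and covariance depend on the previous state, can be illustrated with a minimal sketch. The names `sample_chain`, `toy_mean`, and `toy_cov` are hypothetical and do not come from the paper. The toy mean and covariance functions are stand-ins chosen so that the asymptotic distribution is known in closed form; they are not the auto-encoder-derived estimators the paper studies.

```python
import numpy as np


def sample_chain(x0, mean_fn, cov_fn, n_steps, rng):
    """Run a Markov chain with transition x_next ~ N(mean_fn(x), cov_fn(x))."""
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    for t in range(n_steps):
        # The mean and covariance of each Gaussian step depend only on the previous state.
        x = rng.multivariate_normal(mean_fn(x), cov_fn(x))
        samples[t] = x
    return samples


# Toy stand-ins (assumptions for illustration, not the paper's estimators):
# a linear pull toward the origin and a constant isotropic covariance.
def toy_mean(x):
    return 0.5 * x


def toy_cov(x):
    return 0.75 * np.eye(x.size)


rng = np.random.default_rng(0)
samples = sample_chain(np.zeros(2), toy_mean, toy_cov, n_steps=5000, rng=rng)

# Empirical moments of the visited states approximate the asymptotic distribution.
print(samples.mean(axis=0))
print(samples.var(axis=0))
```

For this toy choice the stationary variance v satisfies v = 0.25 v + 0.75, so the chain's asymptotic distribution is a standard Gaussian. In the setting of the abstract, the mean and covariance functions would instead be estimated from a trained auto-encoder, which this sketch does not attempt.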


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Score-based Membership Inference on Diffusion Models
cs.LG 2025-09 unverdicted novelty 7.0

Presents SimA, a score-based single-query membership inference attack for diffusion models and LDMs that uses denoiser output norm to reveal training set proximity and outperforms multi-query baselines on eight datasets.