arxiv: 1406.2751 · v4 · pith:YHFOILJWnew · submitted 2014-06-11 · 💻 cs.LG

Reweighted Wake-Sleep

classification 💻 cs.LG
keywords inference, network, wake-sleep, latent, algorithm, better, distribution, generative

Training deep directed graphical models with many hidden variables and performing inference remains a major challenge. Helmholtz machines and deep belief networks are such models, and the wake-sleep algorithm has been proposed to train them. The wake-sleep algorithm relies on training not just the directed generative model but also a conditional generative model (the inference network) that runs backward from visible to latent, estimating the posterior distribution of latent given visible. We propose a novel interpretation of the wake-sleep algorithm which suggests that better estimators of the gradient can be obtained by sampling latent variables multiple times from the inference network. This view is based on importance sampling as an estimator of the likelihood, with the approximate inference network as a proposal distribution. This interpretation is confirmed experimentally, showing that better likelihood can be achieved with this reweighted wake-sleep procedure. Based on this interpretation, we propose that a sigmoidal belief network is not sufficiently powerful for the layers of the inference network in order to recover a good estimator of the posterior distribution of latent variables. Our experiments show that using a more powerful layer model, such as NADE, yields substantially better generative models.
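
The importance-sampling view described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a toy model with one binary latent variable h and one binary visible variable x, and the names `log_bernoulli`, `importance_estimate`, `prior`, `likelihood`, and `proposal` are hypothetical. The proposal q(h | x) stands in for the inference network. Drawing K samples from it gives an estimate of p(x) as the average of the weights p(x, h_k) / q(h_k | x), and the normalized weights show how much each sample would count in a reweighted update.

```python
import numpy as np


def log_bernoulli(value, prob):
    """Log-probability of a binary `value` under Bernoulli(prob)."""
    return value * np.log(prob) + (1 - value) * np.log(1 - prob)


def importance_estimate(x, prior, likelihood, proposal, num_samples, rng):
    """Estimate log p(x) from K samples h_k ~ q(h | x) (toy model, assumed).

    prior:      p(h = 1)
    likelihood: array [p(x = 1 | h = 0), p(x = 1 | h = 1)]
    proposal:   q(h = 1 | x), playing the role of the inference network
    Returns the log-likelihood estimate and the normalized importance weights.
    """
    # Sample the latent variable K times from the proposal.
    h = (rng.random(num_samples) < proposal).astype(int)

    # Unnormalized log-weights: log p(x, h_k) - log q(h_k | x).
    log_joint = log_bernoulli(h, prior) + log_bernoulli(x, likelihood[h])
    log_q = log_bernoulli(h, proposal)
    log_w = log_joint - log_q

    # log of the mean weight, computed stably by subtracting the maximum.
    shift = log_w.max()
    log_px = shift + np.log(np.mean(np.exp(log_w - shift)))

    # Normalized weights sum to one across the K samples.
    norm_w = np.exp(log_w - shift)
    norm_w /= norm_w.sum()
    return log_px, norm_w


# Usage: compare the estimate against the exact marginal of the toy model.
rng = np.random.default_rng(0)
prior = 0.3
likelihood = np.array([0.2, 0.9])
exact_px = prior * likelihood[1] + (1 - prior) * likelihood[0]

log_px, weights = importance_estimate(1, prior, likelihood, 0.5, 1000, rng)
print("estimated log p(x=1):", log_px)
print("exact     log p(x=1):", np.log(exact_px))
```

In this sketch, when the proposal equals the exact posterior p(h | x), every weight equals p(x), so the estimate has zero variance. This is one way to read the abstract's argument that a more expressive inference network should yield better estimates.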

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Emergence of Nonequilibrium Latent Cycles in Unsupervised Generative Modeling

    cond-mat.stat-mech 2025-12 unverdicted novelty 7.0

    A nonequilibrium latent-variable Markov model spontaneously develops cycles during likelihood training that enhance generative performance over equilibrium approaches.

  2. Efficient Inference for Coupled Hidden Markov Models in Continuous Time and Discrete Space

    stat.ML 2025-10 unverdicted novelty 6.0

    Proposes Latent Interacting Particle Systems with an efficient parameterization of twist potentials to enable approximate posterior inference for coupled continuous-time hidden Markov models via twisted sequential Mon...