Ladder Variational Autoencoders

Casper Kaae Sønderby; Lars Maaløe; Ole Winther; Søren Kaae Sønderby; Tapani Raiko

arxiv: 1602.02282 · v3 · pith:J2E6VPE2new · submitted 2016-02-06 · 📊 stat.ML · cs.LG

keywords: models, variational, autoencoders, inference, ladder, model, dependent, generative

Abstract

Variational Autoencoders are powerful models for unsupervised learning. However, deep models with several layers of dependent stochastic variables are difficult to train, which limits the improvements obtained using these highly expressive models. We propose a new inference model, the Ladder Variational Autoencoder, that recursively corrects the generative distribution by a data-dependent approximate likelihood in a process resembling the recently proposed Ladder Network. We show that this model provides state-of-the-art predictive log-likelihood and a tighter log-likelihood lower bound compared to the purely bottom-up inference in layered Variational Autoencoders and other generative models. We provide a detailed analysis of the learned hierarchical latent representation and show that our new inference model is qualitatively different and utilizes a deeper, more distributed hierarchy of latent variables. Finally, we observe that batch normalization and deterministic warm-up (gradually turning on the KL-term) are crucial for training variational models with many stochastic layers.
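The two mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract only says that the KL term is turned on gradually and that the generative distribution is corrected by a data-dependent approximate likelihood. The linear ramp, the univariate Gaussian form, the precision-weighted combination, and all function and parameter names below are assumptions made for illustration, not details taken from this page.

```python
def kl_warmup_weight(step, warmup_steps):
    """Hypothetical linear warm-up: KL weight ramps from 0 to 1 over warmup_steps."""
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, max(0.0, step / warmup_steps))


def warmed_up_loss(recon_nll, kl, step, warmup_steps):
    """Negative lower bound with the KL term scaled by the warm-up weight."""
    return recon_nll + kl_warmup_weight(step, warmup_steps) * kl


def precision_weighted_merge(mu_hat, var_hat, mu_p, var_p):
    """Combine a data-dependent Gaussian estimate (mu_hat, var_hat) with a
    generative Gaussian (mu_p, var_p) by weighting each mean with its precision.
    Assumed form of the 'correction' step, shown for scalar Gaussians only."""
    prec_hat = 1.0 / var_hat
    prec_p = 1.0 / var_p
    var_q = 1.0 / (prec_hat + prec_p)
    mu_q = (mu_hat * prec_hat + mu_p * prec_p) * var_q
    return mu_q, var_q
```

Under these assumptions, the KL term contributes nothing at step 0 and its full value once `step` reaches `warmup_steps`, and the merged distribution always has a smaller variance than either input, with a mean pulled toward whichever input is more certain.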

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Universal audio synthesizer control with normalizing flows
cs.LG · 2019-07 · unverdicted · novelty 7.0

A VAE+NF model with disentangling flows for unified audio synthesizer control, claiming better parameter inference and reconstruction than baselines while disentangling audio factors into macro-parameters.

Ensemble-Based Dirichlet Modeling for Predictive Uncertainty and Selective Classification
stat.ML · 2026-04 · unverdicted · novelty 6.0

Ensemble-based method of moments on softmax outputs produces stable Dirichlet predictive distributions that improve uncertainty-guided tasks like selective classification over evidential deep learning.

Autoencoding sensory substitution
q-bio.NC · 2019-07 · unverdicted · novelty 4.0

Deep recurrent autoencoders convert images to shortened audio signals that incorporate hearing models, enabling above-chance hand posture discrimination and object reaching after a few hours of training instead of months.