arxiv: 1611.02731 · v2 · pith:RLDMPK6Inew · submitted 2016-11-08 · 💻 cs.LG · stat.ML

Variational Lossy Autoencoder

classification 💻 cs.LG stat.ML
keywords global · representation · autoencoder · autoregressive · code · data · distribution · images

Representation learning seeks to expose certain aspects of observed data in a learned representation that's amenable to downstream tasks like classification. For instance, a good representation for 2D images might be one that describes only global structure and discards information about detailed texture. In this paper, we present a simple but principled method to learn such global representations by combining the Variational Autoencoder (VAE) with neural autoregressive models such as RNN, MADE and PixelRNN/CNN. Our proposed VAE model allows us to control what the global latent code can learn and, by designing the architecture accordingly, we can force the global latent code to discard irrelevant information such as texture in 2D images, so that the VAE only "autoencodes" data in a lossy fashion. In addition, by leveraging autoregressive models as both the prior distribution $p(z)$ and the decoding distribution $p(x|z)$, we can greatly improve the generative modeling performance of VAEs, achieving new state-of-the-art results on MNIST, OMNIGLOT and Caltech-101 Silhouettes density estimation tasks.
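
As a rough sketch of the setup the abstract describes, written in standard VAE notation: the approximate posterior $q(z|x)$, the pixel index $i$, and the local window $W(i)$ are assumed symbols for illustration and do not appear in the abstract. A VAE is trained by maximizing the evidence lower bound

$$\log p(x) \;\ge\; \mathbb{E}_{q(z|x)}\big[\log p(x|z)\big] \;-\; D_{\mathrm{KL}}\big(q(z|x)\,\|\,p(z)\big),$$

and an autoregressive decoder factorizes the decoding distribution over data dimensions as

$$p(x|z) \;=\; \prod_{i} p\big(x_i \mid z,\, x_{<i}\big).$$

One plausible reading of "designing the architecture accordingly" is to restrict each conditional to a small neighborhood $W(i) \subset \{1, \dots, i-1\}$ of previously generated dimensions:

$$p(x|z) \;=\; \prod_{i} p\big(x_i \mid z,\, x_{W(i)}\big).$$

Under this assumption the decoder can model short-range structure such as texture on its own, while any dependency that spans more than the window has to be carried by $z$, which is what would make the latent code global and the autoencoding lossy.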

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Understanding Multimodal Failure in Action-Chunking Behavioral Cloning

    cs.LG 2026-05 unverdicted novelty 7.0

    The paper identifies distinct failure mechanisms: excessive posterior-prior regularization erases mode information in latent policies, while smooth base-to-action maps limit mode coverage in generative policies.

  2. Tessellations of Semi-Discrete Flow Matching

    cs.LG 2026-05 unverdicted novelty 7.0

    Semi-discrete Flow Matching produces terminal assignment regions that are topologically simple (open, simply connected, homeomorphic to the ball under assumption) yet geometrically distinct from optimal transport Lagu...

  3. A renormalization-group inspired lattice-based framework for piecewise generalized linear models

    stat.ME 2026-05 unverdicted novelty 6.0

    RG-inspired lattice models for piecewise GLMs provide explicit interpretable partitions and a replica-analysis-derived scaling law for regularization that allows increasing complexity without expected rise in generali...

  4. Axiomatizing Neural Networks via Pursuit of Subspaces

    cs.LG 2026-05 unverdicted novelty 5.0

    Authors introduce the Pursuit of Subspaces (PoS) hypothesis, an axiomatic geometric framework that unifies explanations for representation, computation, and generalization in shallow and deep neural networks.