Disentangling Disentanglement in Variational Autoencoders

Emile Mathieu; Tom Rainforth; N. Siddharth; Yee Whye Teh

arXiv: 1812.02833 · v3 · pith:ZNFW47DOnew · submitted 2018-12-06 · stat.ML · cs.LG

keywords: disentanglement, latent, prior, decomposition, allows, control, data, factors

We develop a generalisation of disentanglement in VAEs---decomposition of the latent representation---characterising it as the fulfilment of two factors: a) the latent encodings of the data having an appropriate level of overlap, and b) the aggregate encoding of the data conforming to a desired structure, represented through the prior. Decomposition permits disentanglement, i.e. explicit independence between latents, as a special case, but also allows for a much richer class of properties to be imposed on the learnt representation, such as sparsity, clustering, independent subspaces, or even intricate hierarchical dependency relationships. We show that the $\beta$-VAE varies from the standard VAE predominantly in its control of latent overlap and that for the standard choice of an isotropic Gaussian prior, its objective is invariant to rotations of the latent representation. Viewed from the decomposition perspective, breaking this invariance with simple manipulations of the prior can yield better disentanglement with little or no detriment to reconstructions. We further demonstrate how other choices of prior can assist in producing different decompositions and introduce an alternative training objective that allows the control of both decomposition factors in a principled manner.
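
The rotation-invariance claim in the abstract can be illustrated numerically, at least in part. The NumPy sketch below checks only the KL term of a VAE objective: for a Gaussian posterior N(mu, cov) and an isotropic prior N(0, I), the KL divergence is unchanged when the latent space is rotated. It assumes a full-covariance Gaussian posterior, and the function names `kl_to_standard_normal` and `random_rotation` are hypothetical helpers, not code from the paper. The paper's argument concerns the whole objective, so this is a sanity check under those assumptions rather than a reproduction of its result.

```python
import numpy as np


def kl_to_standard_normal(mu, cov):
    """KL( N(mu, cov) || N(0, I) ) for a full-covariance Gaussian."""
    d = mu.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (np.trace(cov) + mu @ mu - d - logdet)


def random_rotation(d, rng):
    """Random orthogonal matrix with determinant +1 (QR of a Gaussian matrix)."""
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    q = q * np.sign(np.diag(r))      # resolve the sign ambiguity of QR
    if np.linalg.det(q) < 0:         # flip one column to turn a reflection into a rotation
        q[:, 0] = -q[:, 0]
    return q


rng = np.random.default_rng(0)
d = 4

# Illustrative posterior parameters (assumed values, not taken from the paper).
mu = rng.standard_normal(d)
a = rng.standard_normal((d, d))
cov = a @ a.T + 0.1 * np.eye(d)      # symmetric positive definite

rot = random_rotation(d, rng)

kl_original = kl_to_standard_normal(mu, cov)
kl_rotated = kl_to_standard_normal(rot @ mu, rot @ cov @ rot.T)

print("KL before rotation:", kl_original)
print("KL after rotation: ", kl_rotated)
```

The two printed values should agree up to floating-point error, because the trace, the squared norm of the mean, and the log-determinant are all preserved by an orthogonal change of basis. A prior that is not isotropic, for example one with different variances per dimension, would generally break this equality, which is the kind of prior manipulation the abstract refers to.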

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Deep Attention Reweighting: Post-Hoc Attention-Based Feature Aggregation in CNNs for Disentangling Core and Spurious Features under Spurious Correlations
cs.CV · 2026-05 · unverdicted · novelty 5.0

DAR replaces GAP with an attention-based aggregation module retrained jointly with the classifier head to disentangle core from spurious features and outperforms DFR on multiple datasets.