Product of Orthogonal Spheres Parameterization for Disentangled Representation Learning

Ankita Shukla; Pavan Turaga; Saket Anand; Sarthak Bhagat; Shagun Uppal

arXiv: 1907.09554 · v1 · pith:LMULMVPQnew · submitted 2019-07-22 · cs.CV · cs.LG

Pith reviewed 2026-05-24 17:50 UTC · model grok-4.3

classification: cs.CV · cs.LG

keywords: disentangled representation learning · latent space parameterization · orthogonal spheres · VAE · disentanglement metrics · image generation · representation learning

The pith

Parameterizing the latent space as a product of orthogonal spheres improves disentanglement quality in learned representations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a latent representation structured as a product of orthogonal spheres, called PrOSe, to better disentangle explanatory factors in data. The structure draws from relaxed physical assumptions about image formation, where independent factors map to separate spheres whose orthogonality follows from independence. This constraint reduces to a simple orthonormality term in the training loss when each factor occupies equal-sized latent blocks. The method applies to standard frameworks such as VAEs and auto-encoders and produces consistent gains over prior disentanglement approaches on multiple benchmarks and evaluation metrics. The parameterization remains general enough to extend past the original physical motivation.

Core claim

The central claim is that representing the latent space as a product of orthogonal spheres, enforced via an orthonormality penalty, yields significantly higher-quality disentangled representations than existing structured priors, with the improvement holding across several benchmarks and standard disentanglement metrics.

What carries the argument

The PrOSe parameterization: a latent space formed as the product of spheres with an orthogonality constraint between them, realized as an orthonormality term in the loss under equal block sizes.
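
The orthonormality term can be illustrated with a minimal sketch. It assumes the loss has the form quoted elsewhere on this page, L_orth = ||Z^T Z - I||_F^2, and that the equal-sized latent blocks play the role of the columns of Z; the function name `orthonormality_penalty` and the list-of-blocks input are hypothetical choices for illustration, not the authors' implementation.

```python
def orthonormality_penalty(blocks):
    """Squared Frobenius norm of Z^T Z - I, with the blocks as columns of Z.

    blocks: list of k equal-length lists of floats, one per latent factor.
    (Hypothetical helper; the layout of Z is an assumption.)
    """
    k = len(blocks)
    dim = len(blocks[0])
    if any(len(b) != dim for b in blocks):
        raise ValueError("all blocks must have the same size")

    total = 0.0
    for i in range(k):
        for j in range(k):
            # Entry (i, j) of Z^T Z is the inner product of blocks i and j.
            dot = sum(a * b for a, b in zip(blocks[i], blocks[j]))
            target = 1.0 if i == j else 0.0  # entry (i, j) of the identity
            total += (dot - target) ** 2
    return total


# Two orthogonal unit-norm blocks incur no penalty.
good = [[1.0, 0.0], [0.0, 1.0]]
# Two identical unit-norm blocks are penalized through the off-diagonal terms.
bad = [[1.0, 0.0], [1.0, 0.0]]

print(orthonormality_penalty(good))  # 0.0
print(orthonormality_penalty(bad))   # 2.0
```

In a training pipeline this quantity would typically be added to the reconstruction or VAE objective with a weight; the abstract does not state the weighting, so none is shown here.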

If this is right

The approach supplies a simpler alternative to full physical image-formation models while remaining extensible to additional factors.
It applies directly inside existing VAE, GAN, and auto-encoder training pipelines.
The closed-form orthonormality loss replaces more complex structural priors on the latent space.
Disentanglement quality rises consistently across multiple datasets and evaluation metrics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The spherical product form may transfer to non-image domains such as audio or sensor data if analogous independence relations can be identified.
Removing the equal-sized block requirement would allow factors of unequal latent dimensionality and widen applicability.
The same orthogonality regularizer could be tested as a plug-in module inside other representation-learning objectives beyond disentanglement.

Load-bearing premise

Latent variables tied to the physics of image formation can be modeled as spherical spaces whose orthogonality follows from physical independence under relaxed assumptions.

What would settle it

Running the same models and benchmarks with and without the orthogonal-sphere constraint and finding no improvement or a drop in disentanglement scores on the standard metrics would falsify the central claim.

Figures

Figures reproduced from arXiv: 1907.09554 by Ankita Shukla, Pavan Turaga, Saket Anand, Sarthak Bhagat, Shagun Uppal.

**Figure 2.** A visualization grid of image synthesis using attribute transfer. In each grid of the …

**Figure 3.** Interpolation across disentangled gender attribute for CelebA dataset with MIX …

**Figure 4.** Interpolation across disentangled individual partitions signifying different at…

**Figure 5.** t-SNE Plots for MNIST (Column 1), 2D Sprites (Column 2) and CelebA face …

**Figure 6.** Results of predicting identity (for MNIST) and hair colour (for 2D Sprites) using …

Original abstract

Learning representations that can disentangle explanatory attributes underlying the data improves interpretability as well as provides control on data generation. Various learning frameworks such as VAEs, GANs and auto-encoders have been used in the literature to learn such representations. Most often, the latent space is constrained to a partitioned representation or structured by a prior to impose disentangling. In this work, we advance the use of a latent representation based on a product space of Orthogonal Spheres PrOSe. The PrOSe model is motivated by the reasoning that latent-variables related to the physics of image-formation can under certain relaxed assumptions lead to spherical-spaces. Orthogonality between the spheres is motivated via physical independence models. Imposing the orthogonal-sphere constraint is much simpler than other complicated physical models, is fairly general and flexible, and extensible beyond the factors used to motivate its development. Under further relaxed assumptions of equal-sized latent blocks per factor, the constraint can be written down in closed form as an ortho-normality term in the loss function. We show that our approach improves the quality of disentanglement significantly. We find consistent improvement in disentanglement compared to several state-of-the-art approaches, across several benchmarks and metrics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

PrOSe gives a clean ortho-normality regularizer under equal blocks but the physics-to-sphere step is asserted without derivation and the abstract shows no numbers or metrics.

The main thing to know is that this paper puts forward a product of orthogonal spheres (PrOSe) parameterization for the latent space in disentanglement work, which reduces to a closed-form ortho-normality term when blocks are equal-sized, and claims that this yields consistent gains over prior methods. The construction is presented as simpler than full physical image-formation models and extensible beyond the motivating factors. That is the actual novelty: the abstract does not describe the specific product-space setup or the reduction to the loss term as prior art. The abstract does a reasonable job of framing the constraint as lightweight and general rather than locked to one domain.

The soft spots are clear and material. The abstract states that the approach improves disentanglement significantly and finds consistent gains across benchmarks and metrics, yet supplies no quantitative results, metric definitions, error bars, or information on splits or controls. The central motivation, that latent variables tied to image-formation physics lead to spherical spaces under relaxed assumptions with orthogonality following from physical independence, is stated but not derived; no equations or mapping from factors such as pose or lighting to unit-norm orthogonal subspaces appear. The stress-test concern therefore lands: without that step the method collapses to partitioned normalization plus an auxiliary loss whose contribution is unisolated. The equal-block assumption for the closed form further limits generality.

This is aimed at people working on latent constraints inside VAEs and GANs who want alternatives to priors or hard partitions. A reader focused on new regularizers could extract value if the full experiments hold up. It deserves peer review because the parameterization is distinct enough to check, even though the evidence in the abstract is thin.
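
What "partitioned normalization plus an auxiliary loss" could look like in practice is sketched below. This is an illustration under stated assumptions, not the paper's procedure: it assumes a hard projection of each equal-sized block onto the unit sphere (the paper may instead impose the sphere constraint softly through the loss), and the helper names `split_and_normalize` and `cross_block_penalty` are hypothetical.

```python
import math


def split_and_normalize(z, num_blocks, eps=1e-12):
    """Split a latent vector into equal blocks and scale each to unit norm.

    Hypothetical helper: hard projection onto a product of unit spheres.
    """
    if len(z) % num_blocks != 0:
        raise ValueError("latent size must be divisible by num_blocks")
    block_dim = len(z) // num_blocks

    blocks = []
    for i in range(num_blocks):
        block = z[i * block_dim:(i + 1) * block_dim]
        norm = math.sqrt(sum(v * v for v in block))
        # eps guards against division by zero for an all-zero block.
        blocks.append([v / max(norm, eps) for v in block])
    return blocks


def cross_block_penalty(blocks):
    """Sum of squared inner products over all ordered pairs of distinct blocks.

    Once every block has unit norm, this is the only part of an
    orthonormality penalty that can still be non-zero.
    """
    total = 0.0
    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):
            dot = sum(a * b for a, b in zip(blocks[i], blocks[j]))
            total += 2.0 * dot ** 2  # pairs (i, j) and (j, i) contribute equally
    return total


orthogonal = split_and_normalize([3.0, 4.0, 4.0, -3.0], num_blocks=2)
aligned = split_and_normalize([3.0, 4.0, 3.0, 4.0], num_blocks=2)
```

Here `orthogonal` yields a penalty of essentially zero, while `aligned` (two parallel blocks) yields a penalty close to 2. Isolating the contribution of the auxiliary term, as the letter asks, would amount to training with and without `cross_block_penalty` while keeping the normalization fixed.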

Referee Report

2 major / 1 minor

Summary. The manuscript proposes the PrOSe (Product of Orthogonal Spheres) parameterization for latent representations in disentangled learning. It is motivated by the claim that factors in the physics of image formation map to spherical latent spaces under relaxed assumptions, with orthogonality between spheres following from physical independence; under the further assumption of equal-sized latent blocks per factor, this yields a closed-form ortho-normality loss term. The central claim is that this approach produces consistent, significant improvements in disentanglement quality over several state-of-the-art methods across multiple benchmarks and metrics.

Significance. If the empirical gains are robustly demonstrated with proper controls and the physical motivation can be made rigorous, the method would supply a comparatively simple, closed-form constraint that is extensible beyond the motivating factors, potentially offering a practical alternative to more complex priors or architectural constraints in VAEs and related models.

major comments (2)

[Abstract] Abstract: the central empirical claim ('consistent improvement in disentanglement' and 'improves the quality of disentanglement significantly') is stated without any quantitative numbers, metrics, error bars, data-split details, or statistical tests. This absence prevents assessment of whether the reported gains are load-bearing or merely incremental.
[Motivation] Motivation (opening paragraphs): the mapping from 'latent-variables related to the physics of image-formation' to 'spherical-spaces' under 'relaxed assumptions' is asserted without a forward model, derivation, or explicit construction showing how factors such as pose or lighting become unit-norm vectors whose subspaces are orthogonal. The same holds for the step from 'physical independence models' to the orthogonality constraint. Because the closed-form loss is introduced only after these steps, the justification for the method rests on an undischarged assumption; if the physics-to-sphere link does not hold, the contribution of the ortho-normality term remains unisolated.

minor comments (1)

[Abstract] The acronym PrOSe follows its expansion without parentheses on first use ("product space of Orthogonal Spheres PrOSe"); it should be set off as "(PrOSe)".

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point-by-point below, indicating planned revisions where appropriate. The empirical results are the primary contribution and stand independently of the heuristic motivation.

Referee: [Abstract] Abstract: the central empirical claim ('consistent improvement in disentanglement' and 'improves the quality of disentanglement significantly') is stated without any quantitative numbers, metrics, error bars, data-split details, or statistical tests. This absence prevents assessment of whether the reported gains are load-bearing or merely incremental.

Authors: We agree that quantitative support should appear in the abstract. In the revision we will insert specific metrics (e.g., average MIG or DCI scores and relative gains versus baselines), standard deviations across runs, and dataset details while remaining within length limits.
Revision: yes

Referee: [Motivation] Motivation (opening paragraphs): the mapping from 'latent-variables related to the physics of image-formation' to 'spherical-spaces' under 'relaxed assumptions' is asserted without a forward model, derivation, or explicit construction showing how factors such as pose or lighting become unit-norm vectors whose subspaces are orthogonal. The same holds for the step from 'physical independence models' to the orthogonality constraint. Because the closed-form loss is introduced only after these steps, the justification for the method rests on an undischarged assumption; if the physics-to-sphere link does not hold, the contribution of the ortho-normality term remains unisolated.

Authors: The opening paragraphs present a heuristic motivation rather than a rigorous derivation; no forward model or explicit construction from image-formation physics is provided in the manuscript. We will revise the text to state explicitly that the spherical and orthogonal structure is inspired by, but not derived from, physical considerations, and that the main technical contribution is the closed-form ortho-normality loss together with the empirical results. A short paragraph clarifying the scope of the assumptions will be added.
Revision: yes

Circularity Check

0 steps flagged

No circularity; parameterization is an explicit modeling choice derived from stated assumptions

The paper presents the PrOSe model as motivated by relaxed physical assumptions about image formation, which suggest spherical latent spaces, with orthogonality suggested by independence; the ortho-normality loss is written in closed form under the equal-block assumption. No equations reduce to fitted inputs by construction, no self-citation carries the central premise, and no uniqueness theorems or ansatzes are imported from prior work by the authors. The empirical gains are measured against external baselines on standard benchmarks, so the argument is self-contained rather than tautological.

Axiom & Free-Parameter Ledger

0 free parameters · 3 axioms · 0 invented entities

Central claim rests on domain assumptions about image-formation physics producing spherical latent spaces and independence producing orthogonality; equal block size is an additional modeling choice that enables the closed-form loss. No free parameters or invented entities are introduced in the abstract.

axioms (3)

domain assumption: latent-variables related to the physics of image-formation can under certain relaxed assumptions lead to spherical-spaces
Explicitly stated as motivation for the spherical geometry.
domain assumption: orthogonality between the spheres is motivated via physical independence models
Used to justify the orthogonality constraint.
ad hoc to paper: equal-sized latent blocks per factor
Required to obtain the closed-form ortho-normality term in the loss.
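
One way to see why the equal-block assumption matters is to expand the ortho-normality term entry by entry. The notation below is an assumption made for illustration and is not taken from the paper: the k latent blocks are stacked as the columns z_1, ..., z_k of a d × k matrix Z, and I_k is the k × k identity.

```latex
\[
L_{\mathrm{orth}}
  = \lVert Z^{\top} Z - I_k \rVert_F^2
  = \sum_{i=1}^{k} \bigl( \lVert z_i \rVert_2^2 - 1 \bigr)^2
  + \sum_{i \neq j} \bigl( z_i^{\top} z_j \bigr)^2 ,
\qquad
Z = [\, z_1 \;\cdots\; z_k \,] \in \mathbb{R}^{d \times k}.
\]
```

The diagonal terms push each block toward the unit sphere and the off-diagonal terms push distinct blocks toward mutual orthogonality. The inner products between blocks are only defined when every block has the same dimension d, which is where the equal-size requirement enters.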

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

Paper passage: "latent-variables related to the physics of image-formation can under certain relaxed assumptions lead to spherical-spaces. Orthogonality between the spheres is motivated via physical independence models... L_orth = ||Z^T Z - I||_F^2"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear

Paper passage: "We propose to parameterize the latent space representation as a product of orthogonal spheres, motivated by physical models of illumination, deformation, and motion."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

Latent space oddity: on the curvature of deep generative models

Georgios Arvanitidis, Lars Kai Hansen, and Soren Hauberg. Latent space oddity: on the curvature of deep generative models. In International Conference on Learning Representations (ICLR), 2018

work page 2018

[2] [2]

MGGAN: Solving Mode Collapse using Manifold Guided Training

Duhyeon Bang and Hyunjung Shim. MGGAN: Solving mode collapse using manifold guided training. CoRR, abs/1804.04391, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[3] [3]

Courville, and Pascal Vincent

Yoshua Bengio, Aaron C. Courville, and Pascal Vincent. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35:1798–1828, 2013. SHUKLA ET AL:: PRODUCT OF ORTHOGONAL SPHERES PARAMETRIZA TION 11

work page 2013

[4] [4]

Multi-level variational autoencoder: Learning disentangled representations from grouped observations

Diane Bouchacourt, Ryota Tomioka, and Sebastian Nowozin. Multi-level variational autoencoder: Learning disentangled representations from grouped observations. In Proceedings of the Thirty-Second Conference on Artiﬁcial Intelligence (AAAI) , pages 2095–2102, 2018

work page 2095

[5] [5]

Why deep learning works: A manifold disentanglement perspective

Pratik Prabhanjan Brahma, Dapeng Wu, and Yiyuan She. Why deep learning works: A manifold disentanglement perspective. IEEE Transactions on Neural Networks and Learning Systems, 27:1997–2008, 2016

work page 1997

[6] [6]

Rudrasis Chakraborty and Baba C. Vemuri. Recursive Frechet mean computation on the Grassmannian and its applications to computer vision. In IEEE International Con- ference on Computer Vision, (ICCV), pages 4229–4237, 2015

work page 2015

[7] [7]

Isolating sources of disentanglement in variational autoencoders

Tian Qi Chen, Xuechen Li, Roger B Grosse, and David K Duvenaud. Isolating sources of disentanglement in variational autoencoders. In Neural Information Processing Sys- tems (NeuRIPS), pages 2610–2620, 2018

work page 2018

[8] [8]

InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets

Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In International Conference on Neural Information Pro- cessing Systems, 2016

work page 2016

[9] [9]

StarGAN: Uniﬁed generative adversarial networks for multi-domain image-to- image translation

Yunjey Choi, Min-Je Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, and Jaegul Choo. StarGAN: Uniﬁed generative adversarial networks for multi-domain image-to- image translation. In IEEE Conference on Computer Vision and Pattern Recognition, (CVPR), pages 8789–8797, 2018

work page 2018

[10] [10]

The quaternions and the spaces S3, SU(2), SO(3), and RP3

Jean Gallier. The quaternions and the spaces S3, SU(2), SO(3), and RP3. In Geometric Methods and Applications, pages 248–266. Springer, 2001

work page 2001

[11] [11]

Gatys, Alexander S

Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge. Image style transfer us- ing convolutional neural networks. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2414–2423, 2016

work page 2016

[12] [12]

From few to many: Illumination cone models for face recognition under variable lighting and pose

Athinodoros S Georghiades, Peter N Belhumeur, and David J Kriegman. From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis & Machine Intelligence, 23(6):643–660, 2001

work page 2001

[13] [13]

Towards a Definition of Disentangled Representations

Irina Higgins, David Amos, David Pfau, Sebastien Racaniere, Loic Matthey, Danilo Rezende, and Alexander Lerchner. Towards a deﬁnition of disentangled representa- tions. arXiv preprint arXiv:1812.02230, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[14] [14]

Disentangling factors of variation by mixing them

Qiyang Hu, Attila Szabó, Tiziano Portenier, Paolo Favaro, and Matthias Zwicker. Disentangling factors of variation by mixing them. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3399–3407, 2018

work page 2018

[15] [15]

Orthogonal weight normalization: Solution to optimization over multiple dependent Stiefel manifolds in deep neural networks

Lei Huang, Xianglong Liu, Bo Lang, Adams Wei Yu, Yongliang Wang, and Bo Li. Orthogonal weight normalization: Solution to optimization over multiple dependent Stiefel manifolds in deep neural networks. In Thirty-Second Conference on Artificial Intelligence (AAAI), 2018

work page 2018

[16] [16]

Elastic shape matching of parameterized surfaces using square root normal fields

Ian H. Jermyn, Sebastian Kurtek, Eric Klassen, and Anuj Srivastava. Elastic shape matching of parameterized surfaces using square root normal fields. In European Conference on Computer Vision (ECCV), pages 804–817, 2012

work page 2012

[17] [17]

Ananya Harsh Jha, Saket Anand, Maneesh Kumar Singh, and V. S. R. Veeravasarapu. Disentangling factors of variation with cycle-consistent variational auto-encoders. In European Conference on Computer Vision (ECCV), 2018

work page 2018

[18] [18]

Disentangled Representation Learning for Non-Parallel Text Style Transfer

Vineet John, Lili Mou, Hareesh Bahuleyan, and Olga Vechtomova. Disentangled representation learning for text style transfer. arXiv preprint arXiv:1808.04339, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[19] [19]

Disentangling by factorising

Hyunjik Kim and Andriy Mnih. Disentangling by factorising. In International Conference on Machine Learning (ICML), pages 2654–2663, 2018

work page 2018

[20] [20]

Latent Space Non-Linear Statistics

Line Kühnel, Tom Fletcher, Sarang C. Joshi, and Stefan Sommer. Latent space non-linear statistics. CoRR, abs/1805.07632, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[21] [21]

Gradient-based learning applied to document recognition

Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, Nov 1998

work page 1998

[22] [22]

MR-GAN: Manifold Regularized Generative Adversarial Networks

Qunwei Li, Bhavya Kailkhura, Rushil Anirudh, Yi Zhou, Yingbin Liang, and Pramod K. Varshney. MR-GAN: Manifold regularized generative adversarial networks. CoRR, abs/1811.10427, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[23] [23]

Disentangling Pose from Appearance in Monochrome Hand Images

Yikang Li, Chris Twigg, Yuting Ye, Lingling Tao, and Xiaogang Wang. Disentangling pose from appearance in monochrome hand images. arXiv preprint arXiv:1904.07528, 2019

work page internal anchor Pith review Pith/arXiv arXiv 2019

[24] [24]

Disentangled sequential autoencoder

Yingzhen Li and Stephan Mandt. Disentangled sequential autoencoder. In International Conference on Machine Learning (ICML), pages 5656–5665, 2018

work page 2018

[25] [25]

Deep learning face attributes in the wild

Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In Proceedings of International Conference on Computer Vision (ICCV), pages 3730–3738, Dec 2015

work page 2015

[26] [26]

Learning invariant Riemannian geometric representations using deep nets

Suhas Lohit and Pavan Turaga. Learning invariant Riemannian geometric representations using deep nets. In ICCV Workshop on Manifold Learning: From Euclid to Riemann, pages 1329–1338, 2017

work page 2017

[27] [27]

Disentangling factors of variation in deep representations using adversarial training

Michaël Mathieu, Junbo Jake Zhao, Pablo Sprechmann, Aditya Ramesh, and Yann LeCun. Disentangling factors of variation in deep representations using adversarial training. In Advances in Neural Information Processing Systems (NIPS), pages 5041–5049, 2016

work page 2016

[28] [28]

beta-VAE: Learning basic visual concepts with a constrained variational framework

Loïc Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, and Alexander Lerchner. beta-VAE: Learning basic visual concepts with a constrained variational framework. In International Conference on Learning Representations (ICLR), 2017

work page 2017

[29] [29]

A note on Riemannian optimization methods on the Stiefel and the Grassmann manifolds

Yasunori Nishimori. A note on Riemannian optimization methods on the Stiefel and the Grassmann manifolds. In International Symposium on Nonlinear Theory and its Applications (NOLTA2005), volume 1, pages 349–352, 2005

work page 2005

[30] [30]

Emerging disentanglement in auto-encoder based unsupervised image content transfer

Ori Press, Tomer Galanti, Sagie Benaim, and Lior Wolf. Emerging disentanglement in auto-encoder based unsupervised image content transfer. In International Conference on Learning Representations (ICLR), 2019

work page 2019

[31] [31]

Deep visual analogy-making

Scott E. Reed, Yi Zhang, Yuting Zhang, and Honglak Lee. Deep visual analogy-making. In Advances in Neural Information Processing Systems (NIPS), pages 1252–1260, 2015

work page 2015

[32] [32]

Learning Disentangled Representations with Reference-Based Variational Autoencoders

Adrià Ruiz, Oriol Martinez, Xavier Binefa, and Jakob Verbeek. Learning disentangled representations with reference-based variational autoencoders. arXiv preprint arXiv:1901.08534, 2019

work page internal anchor Pith review Pith/arXiv arXiv 2019

[33] [33]

The Riemannian geometry of deep generative models

Hang Shao, Abhishek Kumar, and P. Thomas Fletcher. The Riemannian geometry of deep generative models. In CVPR Workshop on Differential Geometry in Computer Vision and Machine Learning (DiffCVML), pages 315–323, 2018

work page 2018

[34] [34]

Deforming autoencoders: Unsupervised disentangling of shape and appearance

Zhixin Shu, Mihir Sahasrabudhe, Riza Alp Güler, Dimitris Samaras, Nikos Paragios, and Iasonas Kokkinos. Deforming autoencoders: Unsupervised disentangling of shape and appearance. In European Conference on Computer Vision (ECCV), Part X, pages 664–680, 2018

work page 2018

[35] [35]

Distance metric learning by optimization on the Stiefel manifold

Ankita Shukla and Saket Anand. Distance metric learning by optimization on the Stiefel manifold. In BMVC workshop on Differential Geometry in Computer Vision (DiffCV), 2015

work page 2015

[36] [36]

Ankita Shukla, Shagun Uppal, Sarthak Bhagat, Saket Anand, and Pavan K. Turaga. Geometry of deep generative models for disentangled representations. In Indian Conference on Computer Vision, Graphics, and Image Processing (ICVGIP), 2018

work page 2018

[37] [37]

Challenges in disentangling independent factors of variation

Attila Szabó, Qiyang Hu, Tiziano Portenier, Matthias Zwicker, and Paolo Favaro. Challenges in disentangling independent factors of variation. In 6th International Conference on Learning Representations (ICLR), Workshop Track Proceedings, 2018

work page 2018

[38] [38]

Domain Adaptation Meets Disentangled Representation Learning and Style Transfer

Hoang Tran Vu and Ching-Chun Huang. Domain adaptation meets disentangled representation learning and style transfer. arXiv preprint arXiv:1712.09025, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[39] [39]

On orthogonality and learning recurrent networks with long term dependencies

Eugene Vorontsov, Chiheb Trabelsi, Samuel Kadoury, and Chris Pal. On orthogonality and learning recurrent networks with long term dependencies. In International Conference on Machine Learning (ICML), pages 3570–3578, 2017

work page 2017