Recognition: 3 theorem links
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
Pith reviewed 2026-05-13 16:39 UTC · model grok-4.3
The pith
Deep convolutional GANs learn hierarchical image representations from object parts to full scenes without any labels.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs) that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks, demonstrating their applicability as general image representations.
What carries the argument
The deep convolutional adversarial pair of generator and discriminator networks, each built with strided convolutions, batch normalization, and chosen activations, which together drive stable training and the emergence of hierarchical features.
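The strided (fractionally-strided) convolutions that carry the generator's part-to-scene synthesis follow simple upsampling arithmetic. The sketch below is a plain-Python illustration: the 4 → 64 spatial progression matches the paper's 64×64 generator, while the kernel size and padding values are illustrative assumptions, not figures taken from the paper.

```python
def deconv_out(size, kernel=4, stride=2, pad=1):
    """Output spatial size of a fractionally-strided (transposed)
    convolution: (size - 1) * stride - 2 * pad + kernel.
    kernel/stride/pad defaults here are illustrative."""
    return (size - 1) * stride - 2 * pad + kernel

# Generator path: project z to a small 4x4 feature map, then apply
# four stride-2 upsampling layers; each one doubles the spatial extent.
sizes = [4]
for _ in range(4):
    sizes.append(deconv_out(sizes[-1]))

print(sizes)
```

With these assumed hyperparameters the spatial sizes run 4, 8, 16, 32, 64, which is the doubling ladder along which low-level parts get composed into scenes.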
If this is right
- The discriminator network supplies features that work as general representations for new image classification or detection tasks.
- The generator composes learned low-level parts into coherent higher-level scenes during image synthesis.
- The same constrained architecture succeeds across multiple distinct image collections, indicating the method is not tied to one dataset.
- Unsupervised pre-training with DCGANs becomes a viable starting point before supervised fine-tuning on limited labeled data.
Where Pith is reading between the lines
- The same style of architectural constraints might stabilize unsupervised training for other data types such as audio waveforms or video frames.
- Inspecting the intermediate layers could reveal which specific visual concepts the generator assembles at each stage of synthesis.
- Scaling the same constrained design to larger image resolutions could test whether the part-to-scene hierarchy continues to hold.
Load-bearing premise
The chosen architectural constraints of strided convolutions, batch normalization, and specific activations are what produce stable training and the observed hierarchy of representations.
What would settle it
Train an otherwise identical pair of networks on the same image datasets after removing batch normalization from all layers and check whether training diverges or the learned features lose their part-to-scene hierarchy.
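One way to operationalize "check whether training diverges" in such an ablation is a heuristic test on the loss trace. The sketch below is a minimal stand-in, not the paper's procedure; the window size and blow-up threshold are illustrative assumptions.

```python
import math

def diverged(loss_trace, window=100, blowup=10.0):
    """Heuristic divergence check for a training run: flag NaN/inf
    losses, or a late-window mean that blows up relative to the
    early-window mean. Thresholds are illustrative."""
    if any(math.isnan(x) or math.isinf(x) for x in loss_trace):
        return True
    early = loss_trace[:window]
    late = loss_trace[-window:]
    return sum(late) / len(late) > blowup * (sum(early) / len(early))

# Synthetic traces: one settling toward a plateau, one exploding.
stable = [1.0 - 0.5 * math.exp(-t / 50) for t in range(500)]
unstable = [math.exp(t / 60) for t in range(500)]
print(diverged(stable), diverged(unstable))
```

A real version of the experiment would apply the same check to the batch-norm-ablated run and the baseline run under identical seeds and optimizer settings.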
Original abstract
In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks - demonstrating their applicability as general image representations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces deep convolutional generative adversarial networks (DCGANs) that impose specific architectural constraints (strided convolutions, batch normalization, and chosen activations) on standard GANs. It reports experimental results on multiple image datasets (LSUN, ImageNet, CelebA) showing that the generator and discriminator learn hierarchical representations progressing from object parts to scenes, and demonstrates the utility of these features on downstream tasks.
Significance. If the central empirical claims hold, the work is significant for bridging the gap between supervised CNN successes and unsupervised representation learning. It provides concrete evidence that a constrained adversarial framework can produce stable training and semantically meaningful features without labels, influencing subsequent generative modeling research.
Major comments (1)
- [Experiments] Experiments section: The central claim that the listed architectural constraints are responsible for stable training and the observed hierarchy of representations is not supported by ablation experiments. Results are reported only for the full constrained architecture; no variants are shown that remove or alter one constraint at a time (e.g., disabling batch normalization or replacing strided convolutions) while holding dataset, optimizer, and initialization fixed. This leaves the causal role of the constraints unisolated.
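The missing ablation can be specified as a one-factor-at-a-time grid over the named constraints. The sketch below enumerates such a grid in plain Python; the config keys and the alternative values (e.g. swapping the activation pair) are hypothetical labels, not settings from the paper.

```python
# Baseline constraint set, as named in the review (labels hypothetical).
BASELINE = {
    "batchnorm": True,                 # batch norm in G and D
    "strided_conv": True,              # strided convs instead of pooling
    "activations": "relu/leakyrelu",   # G/D activation choice
}

# One hypothetical alternative per constraint to remove or swap.
ABLATIONS = {
    "batchnorm": [False],
    "strided_conv": [False],           # i.e. revert to pooling layers
    "activations": ["tanh/tanh"],
}

def one_at_a_time(baseline, ablations):
    """Yield configs that change exactly one constraint at a time,
    holding dataset, optimizer, and initialization fixed elsewhere."""
    yield dict(baseline)
    for key, values in ablations.items():
        for v in values:
            cfg = dict(baseline)
            cfg[key] = v
            yield cfg

configs = list(one_at_a_time(BASELINE, ABLATIONS))
print(len(configs))  # baseline plus one run per single-constraint change
```

Running each config under a fixed seed and comparing stability and feature quality is exactly the isolation the comment asks for.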
Minor comments (3)
- [Architecture] §3 (Architecture): A summary table listing exact layer counts, filter sizes, strides, and activation choices for both generator and discriminator would improve reproducibility and clarity.
- [Figures] Figure captions: Captions for the visualization figures should explicitly name the dataset, model variant, and training epoch to allow readers to match visuals to the quantitative claims.
- [Downstream tasks] Downstream evaluation: The reported feature-transfer results would benefit from explicit comparison tables against contemporaneous unsupervised baselines (e.g., autoencoders or sparse coding) with standard metrics such as classification accuracy.
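The requested feature-transfer comparison amounts to a linear-probe-style evaluation: freeze the discriminator's features, fit a simple classifier on top, and report accuracy. The sketch below uses a nearest-centroid classifier on synthetic numpy features as a crude stand-in for real discriminator activations; all data here is synthetic and illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def nearest_centroid_accuracy(train_x, train_y, test_x, test_y):
    """Fit one centroid per class on frozen features, then classify
    test features by nearest centroid (a crude stand-in for the
    linear probes used in feature-transfer evaluations)."""
    classes = np.unique(train_y)
    centroids = np.stack([train_x[train_y == c].mean(axis=0) for c in classes])
    dists = ((test_x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dists.argmin(axis=1)]
    return (pred == test_y).mean()

# Synthetic, well-separated "features" for two classes.
x0 = rng.normal(0.0, 0.3, size=(50, 8))
x1 = rng.normal(2.0, 0.3, size=(50, 8))
train_x = np.vstack([x0[:40], x1[:40]])
train_y = np.array([0] * 40 + [1] * 40)
test_x = np.vstack([x0[40:], x1[40:]])
test_y = np.array([0] * 10 + [1] * 10)
print(nearest_centroid_accuracy(train_x, train_y, test_x, test_y))
```

Swapping in autoencoder or sparse-coding features under the same probe would give the baseline table the comment asks for.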
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and positive overall assessment. We address the major comment on the experiments section below.
Point-by-point responses
Referee: Experiments section: The central claim that the listed architectural constraints are responsible for stable training and the observed hierarchy of representations is not supported by ablation experiments. Results are reported only for the full constrained architecture; no variants are shown that remove or alter one constraint at a time (e.g., disabling batch normalization or replacing strided convolutions) while holding dataset, optimizer, and initialization fixed. This leaves the causal role of the constraints unisolated.
Authors: We agree that the manuscript does not contain ablation experiments that isolate the contribution of each individual constraint while holding all other factors fixed. The DCGAN architecture is presented as an integrated set of choices (strided convolutions, batch normalization, and specific activations) that together enable stable training and the emergence of hierarchical representations, with each element motivated by iterative empirical observations during model development. We will revise the text in the experiments and architecture sections to explicitly qualify the claims: the constraints are described as a combination that produces the reported outcomes, without asserting independent causality for any single component. We will also add a brief discussion of the rationale for each choice based on observations from our development process. This constitutes a partial revision, as new ablation experiments are not included.
Revision: partial
Circularity Check
No circularity: empirical results on image datasets support hierarchy claims without reduction to fitted inputs or self-citations
Full rationale
The paper introduces DCGAN architecture with constraints (strided convs, batch norm, ReLU/LeakyReLU) and validates via training on LSUN/CelebA/ImageNet, showing feature visualizations and transfer tasks. No equations derive predictions from fitted parameters; no self-citation chains justify uniqueness; claims rest on observed training stability and representations, not self-referential definitions or renamings. Central hierarchy evidence is experimental, not constructed from inputs.
Axiom & Free-Parameter Ledger
free parameters (2)
- learning rate and optimizer settings
- batch normalization momentum and epsilon
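The second free-parameter pair becomes concrete in a batch-norm forward pass: momentum steers the running-statistics update and epsilon guards the division. The numpy sketch below is a minimal training-mode illustration; the momentum and epsilon values are illustrative, not the paper's settings.

```python
import numpy as np

def batchnorm_forward(x, running_mean, running_var,
                      momentum=0.9, eps=1e-5):
    """Training-mode batch norm over the batch axis. `momentum`
    controls the running-statistics update and `eps` guards the
    division; both are free parameters, values here illustrative."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    running_mean = momentum * running_mean + (1 - momentum) * mean
    running_var = momentum * running_var + (1 - momentum) * var
    return x_hat, running_mean, running_var

# A batch of off-center, high-variance activations gets renormalized
# to roughly zero mean and unit variance per feature.
x = np.random.default_rng(1).normal(3.0, 2.0, size=(256, 4))
x_hat, rm, rv = batchnorm_forward(x, np.zeros(4), np.ones(4))
print(round(float(x_hat.mean()), 6), round(float(x_hat.std()), 3))
```

The divergence check proposed above under "What would settle it" would remove exactly this operation from every layer.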
axioms (1)
- domain assumption Convolutional networks with the listed constraints will converge to useful hierarchical representations when trained adversarially on natural images.
Lean theorems connected to this paper
- Cost.FunctionalEquation.washburn_uniqueness_aczel (tagged: unclear)
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Passage: "We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints... Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes"
- Foundation.DimensionForcing.alexander_duality_circle_linking (tagged: unclear)
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Passage: "Architecture guidelines for stable Deep Convolutional GANs: Replace any pooling layers with strided convolutions... Use batchnorm in both the generator and the discriminator... Use ReLU activation in generator... Use LeakyReLU activation in the discriminator"
- Foundation.HierarchyEmergence.hierarchy_emergence_forces_phi (tagged: unclear)
  Relation between the paper passage and the cited Recognition theorem is unclear.
  Passage: "We use the trained discriminators for image classification tasks, showing competitive performance with other unsupervised algorithms"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 26 Pith papers
- Toy Models of Superposition
  Toy models demonstrate that polysemanticity arises when neural networks store more sparse features than neurons via superposition, producing a phase transition tied to polytope geometry and increased adversarial vulne...
- Density estimation using Real NVP
  Real NVP uses affine coupling layers to create invertible transformations that support exact density estimation, sampling, and latent inference without approximations.
- One-Step Generative Modeling via Wasserstein Gradient Flows
  W-Flow achieves state-of-the-art one-step ImageNet 256x256 generation at 1.29 FID by training a static neural network to follow a Wasserstein gradient flow that minimizes Sinkhorn divergence, delivering roughly 100x f...
- Discriminative Span as a Predictor of Synthetic Data Utility via Classifier Reconstruction
  A relative projection error metric in foundation-model embedding space predicts the downstream utility of synthetic positive samples for binary classifiers.
- Curated Synthetic Data Doesn't Have to Collapse: A Theoretical Study of Generative Retraining with Pluralistic Preferences
  Recursive generative retraining with pluralistic preferences converges to a stable diverse distribution that satisfies a weighted Nash bargaining solution.
- Active Learning for Conditional Generative Compressed Sensing
  Prompts can be split into separate roles for sampling design and recovery modeling in generative compressed sensing, with stable recovery bounds for matched prompts and an explicit penalty for mismatch, validated on S...
- Physics-informed, Generative Adversarial Design of Funicular Shells
  A modified DCGAN with an auxiliary discriminator using the membrane factor generates stable, previously unseen funicular shells optimized for pure compression in three dimensions.
- SurFITR: A Dataset for Surveillance Image Forgery Detection and Localisation
  SurFITR is a new collection of 137k+ surveillance-style forged images that causes existing detectors to degrade while enabling substantial gains when used for training in both in-domain and cross-domain settings.
- Progressive Growing of GANs for Improved Quality, Stability, and Variation
  Progressive growing stabilizes GAN training to produce high-resolution images of unprecedented quality and achieves a record unsupervised inception score of 8.80 on CIFAR10.
- Mixed Precision Training
  Mixed precision training uses FP16 for most computations, FP32 master weights for accumulation, and loss scaling to enable accurate training of large DNNs with halved memory usage.
- Neural Fields for NV-Center Inverse Sensing
  NeTMY neural fields with annealed encoding, multiscale optimization, and spectrum-fidelity losses achieve superior localization and distributional accuracy in NV-center inverse sensing by using a tensor power-summed d...
- Discriminative Span as a Predictor of Synthetic Data Utility via Classifier Reconstruction
  A relative projection error metric in foundation model embeddings predicts whether synthetic positive samples will improve downstream CNN classification performance on real-negative plus synthetic-positive mixtures.
- Enabling Federated Inference via Unsupervised Consensus Embedding
  CE-FI maps heterogeneous model representations to a shared embedding space via unsupervised training on unlabeled data, enabling privacy-preserving federated inference that outperforms solo models on image classificat...
- A Dual Perspective on Synthetic Trajectory Generators: Utility Framework and Privacy Vulnerabilities
  A new framework evaluates utility of synthetic mobility trajectories while a membership inference attack reveals privacy vulnerabilities in generative models thought to be safe.
- Embedding Arithmetic: A Lightweight, Tuning-Free Framework for Post-hoc Bias Mitigation in Text-to-Image Models
  Embedding Arithmetic performs vector operations in the embedding space of T2I models to mitigate bias at inference time, outperforming baselines on diversity while preserving coherence via a new Concept Coherence Score.
- FatigueFusion: Latent Space Fusion for Fatigue-Driven Motion Synthesis
  FatigueFusion fuses fatigue features in latent space using algorithmic, data-driven, and PINN modules to synthesize novel fatigued motions from non-fatigued joint sequences in an end-to-end pipeline.
- MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework
  MetaGPT embeds human SOPs into LLM prompts to create role-specialized agent teams that produce more coherent solutions on collaborative software engineering tasks than prior chat-based multi-agent systems.
- VideoGPT: Video Generation using VQ-VAE and Transformers
  VideoGPT generates competitive natural videos by learning discrete latents with VQ-VAE and modeling them autoregressively with a transformer.
- Demystifying MMD GANs
  MMD GANs have unbiased critic gradients but biased generator gradients from sample-based learning, and the Kernel Inception Distance provides a practical new measure for GAN convergence and dynamic learning rate adaptation.
- Are Candidate Models Really Needed for Active Learning?
  Active learning with randomly initialized models achieves comparable results to traditional candidate-model methods, with low-confidence sampling proving most effective.
- Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
  Visual generation models are evolving from passive renderers to interactive agentic world modelers, but current systems lack spatial reasoning, temporal consistency, and causal understanding, with evaluations overemph...
- ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance
  ACPO uses anchor-based regularization with NR-IQA guidance to enable stable perceptual quality improvements in diffusion model fine-tuning.
- Improving Diversity in Black-box Few-shot Knowledge Distillation
  An adaptive high-confidence image selection scheme during GAN training expands diversity in the distillation set for black-box few-shot KD and yields SOTA student accuracy on seven image datasets.
- A Geometric Algebra-informed NeRF Framework for Generalizable Wireless Channel Prediction
  GAI-NeRF combines geometric algebra attention and an adaptive ray tracing module inside a NeRF model to deliver more accurate and generalizable wireless channel predictions across varied indoor environments.
- Enhancing the accuracy of under-resolved numerical simulations of atmospheric flows with super resolution
  A multi-scale CNN super-resolution model outperforms baseline CNN, attention CNN, and diffusion-based approaches in reconstructing fine-scale features from under-resolved atmospheric flow simulations on standard benchmarks.
- Synthetic data in cryptocurrencies using generative models
  CGANs with LSTM generator can produce synthetic crypto price series that reproduce temporal patterns and preserve market trends and dynamics.
Reference graph
Works this paper leans on
- [1] Denton, Emily, Chintala, Soumith, Szlam, Arthur, and Fergus, Rob. Deep generative image models using a Laplacian pyramid of adversarial networks. arXiv preprint arXiv:1506.05751.
- [2] Dosovitskiy, Alexey, Springenberg, Jost Tobias, and Brox, Thomas. Learning to generate chairs with convolutional neural networks. arXiv preprint arXiv:1411.5928.
- [3] Dosovitskiy, Alexey, Fischer, Philipp, Springenberg, Jost Tobias, Riedmiller, Martin, and Brox, Thomas. Discriminative unsupervised feature learning with exemplar convolutional neural networks. IEEE Transactions on Pattern Analysis and Machine Intelligence.
- [4] Goodfellow, Ian J, Warde-Farley, David, Mirza, Mehdi, Courville, Aaron, and Bengio, Yoshua. Maxout networks. arXiv preprint arXiv:1302.4389.
- [5] Gregor, Karol, Danihelka, Ivo, Graves, Alex, and Wierstra, Daan. DRAW: A recurrent neural network for image generation. arXiv preprint arXiv:1502.04623.
- [6] Hardt, Moritz, Recht, Benjamin, and Singer, Yoram. Train faster, generalize better: Stability of stochastic gradient descent. arXiv preprint arXiv:1509.01240.
- [7] Hauberg, Søren, Freifeld, Oren, Larsen, Anders Boesen Lindbo, Fisher III, John W., and Hansen, Lars Kai. Dreaming more data: Class-dependent distributions over diffeomorphisms for learned data augmentation. arXiv preprint arXiv:1510.02795.
- [8] Ioffe, Sergey and Szegedy, Christian. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167.
- [9] Kingma, Diederik P and Ba, Jimmy Lei. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
- [10] Kingma, Diederik P and Welling, Max. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.
- [11] Maas, Andrew L, Hannun, Awni Y, and Ng, Andrew Y. Rectifier nonlinearities improve neural network acoustic models. In Proc. ICML, volume 30.
- [12] Mordvintsev, Alexander, Olah, Christopher, and Tyka, Mike. Inceptionism: Going deeper into neural networks. http://googleresearch.blogspot.com/2015/06/inceptionism-going-deeper-into-neural.html. Accessed: 2015-06-17.
- Nair, Vinod and Hinton, Geoffrey E. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML).
- [13] Netzer, Yuval, Wang, Tao, Coates, Adam, Bissacco, Alessandro, Wu, Bo, and Ng, Andrew Y. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop on Deep Learning and Unsupervised Feature Learning, volume 2011.
- [14] Rasmus, Antti, Valpola, Harri, Honkala, Mikko, Berglund, Mathias, and Raiko, Tapani. Semi-supervised learning with ladder network. arXiv preprint arXiv:1507.02672.
- [15] Sohl-Dickstein, Jascha, Weiss, Eric A, Maheswaranathan, Niru, and Ganguli, Surya. Deep unsupervised learning using nonequilibrium thermodynamics. arXiv preprint arXiv:1503.03585.
- [16] Springenberg, Jost Tobias, Dosovitskiy, Alexey, Brox, Thomas, and Riedmiller, Martin. Striving for simplicity: The all convolutional net. arXiv preprint arXiv:1412.6806.
- [17] Srivastava, Rupesh Kumar, Masci, Jonathan, Gomez, Faustino, and Schmidhuber, Jürgen. Understanding locally competitive networks. arXiv preprint arXiv:1410.1165.
- [19] A note on the evaluation of generative models. URL http://arxiv.org/abs/1511.01844.
- Vincent, Pascal, Larochelle, Hugo, Lajoie, Isabelle, Bengio, Yoshua, and Manzagol, Pierre-Antoine. Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. The Journal of Machine Learning Research, 11:3371–3408.
- [20] Xu, Bing, Wang, Naiyan, Chen, Tianqi, and Li, Mu. Empirical evaluation of rectified activations in convolutional network. arXiv preprint arXiv:1505.00853.
- [21] Yu, Fisher, Zhang, Yinda, Song, Shuran, Seff, Ari, and Xiao, Jianxiong. LSUN: Construction of a large-scale image dataset using deep learning with humans in the loop. arXiv preprint arXiv:1506.03365.
- [22] Zeiler, Matthew D and Fergus, Rob. Visualizing and understanding convolutional networks. In Computer Vision–ECCV 2014, pp. 818–833. Springer.
- [23] Zhao, Junbo, Mathieu, Michael, Goroshin, Ross, and LeCun, Yann. Stacked what-where auto-encoders. arXiv preprint arXiv:1506.02351.
discussion (0)