arXiv: 1710.10196 · v3 · submitted 2017-10-27 · 💻 cs.NE · cs.LG · stat.ML

Progressive Growing of GANs for Improved Quality, Stability, and Variation

Tero Karras, Timo Aila, Samuli Laine, Jaakko Lehtinen

Pith reviewed 2026-05-12 12:18 UTC · model grok-4.3

classification 💻 cs.NE · cs.LG · stat.ML

keywords generative adversarial networks · progressive growing · high resolution image synthesis · training stability · image variation · CelebA dataset · inception score

The pith

Progressive growing of GANs from low to high resolution stabilizes training and yields higher quality images.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a training approach for generative adversarial networks in which the generator and discriminator begin at low resolution and new layers are added progressively to model finer details. This change speeds up the overall process and reduces instability during optimization. The result is the ability to generate images at much higher resolutions with improved visual quality, such as 1024 by 1024 pixels on CelebA, plus higher variation among samples. Additional implementation choices are given to keep the generator and discriminator from falling into destructive patterns.

Core claim

Starting both networks at low resolution and progressively adding layers that capture increasingly fine details makes training faster and more stable. This permits generation of 1024 by 1024 CelebA images of high visual quality and an inception score of 8.80 on unsupervised CIFAR-10. The paper also describes several implementation practices that discourage unhealthy generator-discriminator competition, proposes a new quality-and-variation metric, and provides a higher-quality version of the CelebA dataset.

What carries the argument

Progressive growing, where new layers are faded in to handle finer image details while preserving previously learned coarse features.
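
The fade-in step can be illustrated with a minimal sketch. It assumes the transition is a linear blend, weighted by a factor alpha that ramps from 0 to 1, between the upsampled output of the existing coarse layers and the output of the newly added layer. The function names, the nearest-neighbour upsampling, and the nested-list image format are illustrative assumptions, not the authors' code.

```python
def upsample_nearest_2x(image):
    """Double height and width by repeating every pixel (nearest neighbour)."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fade_in(low_res, high_res, alpha):
    """Blend the upsampled coarse output with the new layer's output.

    alpha = 0.0 returns only the upsampled coarse image;
    alpha = 1.0 returns only the new high-resolution image.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    upsampled = upsample_nearest_2x(low_res)
    if len(upsampled) != len(high_res) or any(
        len(u) != len(h) for u, h in zip(upsampled, high_res)
    ):
        raise ValueError("high_res must be exactly twice the size of low_res")
    return [
        [(1.0 - alpha) * u + alpha * h for u, h in zip(up_row, hi_row)]
        for up_row, hi_row in zip(upsampled, high_res)
    ]


def alpha_at(images_seen, fade_images):
    """Linear ramp for alpha (assumed schedule): 0 at the start, 1 at the end."""
    return min(1.0, images_seen / fade_images)
```

At alpha = 0 the blended output equals the upsampled coarse result, which is one way the features learned at lower resolution can be left undisturbed when a layer is first added.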

Load-bearing premise

That new layers can be added and faded in without erasing the coarse features already learned at lower resolutions.

What would settle it

Training a standard GAN directly at 1024 by 1024 resolution on CelebA and observing whether it reaches comparable visual quality and training stability without collapse.

Original abstract

We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses. This both speeds the training up and greatly stabilizes it, allowing us to produce images of unprecedented quality, e.g., CelebA images at 1024^2. We also propose a simple way to increase the variation in generated images, and achieve a record inception score of 8.80 in unsupervised CIFAR10. Additionally, we describe several implementation details that are important for discouraging unhealthy competition between the generator and discriminator. Finally, we suggest a new metric for evaluating GAN results, both in terms of image quality and variation. As an additional contribution, we construct a higher-quality version of the CelebA dataset.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Progressive growing stabilizes high-res GAN training and delivers clear empirical gains on CelebA and CIFAR-10.

The one thing to know is that progressively growing both the generator and the discriminator lets the authors produce stable, high-quality 1024x1024 images on CelebA and set a new record inception score on CIFAR-10. They do this by starting small and adding layers progressively with fade-in blending. As a GAN training methodology this is new: it builds on multi-scale ideas but makes them work empirically for stability.

The paper shows good results with visuals, metrics, and some ablations on the growth process. It also details tricks, such as an equalized learning rate, that discourage bad generator-discriminator dynamics. It provides enough implementation specifics for others to follow, and the outcomes support the claims of faster and more stable training. The new metric and the dataset contribution are useful additions.

Soft spots are limited. The method is empirical, so it relies on the listed hyperparameters and schedule working out, and there is no formal proof that layer addition always preserves features. The ablations help, but generality to other setups is not fully tested here. Within the reported experiments, the central argument holds up without internal inconsistencies.

This paper is for GAN researchers and vision practitioners who want to generate high-resolution images, and readers in that group will get practical value from it. Given the clear empirical advances, I would recommend engaging with the work and sending it through peer review.

Referee Report

0 major / 3 minor

Summary. The paper introduces a progressive growing methodology for training GANs in which both the generator and discriminator begin at low resolution and new layers are incrementally added to model finer details as training proceeds. This is claimed to accelerate training, greatly improve stability, and enable generation of high-quality images at resolutions up to 1024² on CelebA, while also achieving a record unsupervised Inception score of 8.80 on CIFAR-10. Additional contributions include implementation details (an equalized learning rate, a minibatch standard deviation layer, fade-in blending) to discourage unhealthy generator-discriminator dynamics, a new metric for assessing both image quality and variation, and the release of a higher-quality CelebA dataset.

Significance. If the reported results hold, the work constitutes a substantial practical advance in stabilizing and scaling GAN training for high-resolution synthesis. The empirical evidence—training curves, fade-in ablations, visual results on CelebA and CIFAR-10, and the explicit procedural description of the growth schedule—directly supports the central claims of improved speed, stability, and output quality. The provision of concrete implementation details, the new evaluation metric, and the improved dataset are clear strengths that enhance reproducibility and utility for the community.
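
One of the implementation details named in the summary, the equalized learning rate, can be sketched roughly as follows. The sketch assumes the usual description of the technique: weights are drawn from a unit normal distribution and multiplied at run time by the per-layer constant from He initialization, sqrt(2 / fan_in). The class and function names are hypothetical, and a dense layer stands in for the convolutions a real GAN would use.

```python
import math
import random


def he_constant(fan_in):
    """Per-layer scale from He initialization: sqrt(2 / fan_in)."""
    return math.sqrt(2.0 / fan_in)


class EqualizedLinear:
    """Dense layer whose weights are stored unscaled and scaled on every call.

    Illustrative only: the stored parameters keep unit variance, so an
    adaptive optimizer sees a similar dynamic range for every layer.
    """

    def __init__(self, fan_in, fan_out, rng=None):
        if rng is None:
            rng = random.Random(0)
        # Unscaled N(0, 1) weights, one row per output unit.
        self.weights = [
            [rng.gauss(0.0, 1.0) for _ in range(fan_in)] for _ in range(fan_out)
        ]
        self.scale = he_constant(fan_in)

    def forward(self, inputs):
        # The scale is applied at run time rather than baked into the weights.
        return [
            self.scale * sum(w * x for w, x in zip(row, inputs))
            for row in self.weights
        ]
```

The design choice illustrated here is that the scaling lives in the forward pass instead of the initializer, so the optimizer updates parameters that all share the same nominal scale.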

minor comments (3)

The abstract states that a 'simple way to increase the variation' is proposed; the main text should supply explicit pseudocode or a numbered algorithmic listing for this component to aid direct implementation.
Figure captions for the high-resolution CelebA samples should explicitly state the exact resolution, number of training iterations, and whether the images are from the final model or an intermediate stage.
A compact table comparing the reported Inception score of 8.80 against the best prior unsupervised results on CIFAR-10 would help readers immediately contextualize the claimed record.
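
As a rough illustration of the kind of listing the first minor comment asks for, the variation component can be sketched as a minibatch standard deviation statistic. The sketch assumes a simplified setting: each sample is a flat feature vector, the standard deviation of every feature is taken across the whole minibatch, the results are averaged into one scalar, and that scalar is appended to every sample as an extra feature. The function names and the flat-vector layout are assumptions made for brevity, not the authors' implementation.

```python
import math


def minibatch_stddev_scalar(batch):
    """Average, over features, of each feature's std. dev. across the batch."""
    if not batch:
        raise ValueError("batch must contain at least one sample")
    n = len(batch)
    num_features = len(batch[0])
    total = 0.0
    for j in range(num_features):
        column = [sample[j] for sample in batch]
        mean = sum(column) / n
        variance = sum((v - mean) ** 2 for v in column) / n  # population variance
        total += math.sqrt(variance)
    return total / num_features


def append_minibatch_stddev(batch):
    """Return a copy of the batch with the shared statistic appended to each sample."""
    stat = minibatch_stddev_scalar(batch)
    return [list(sample) + [stat] for sample in batch]
```

A batch of identical samples yields a statistic of zero, which gives the discriminator a direct signal when generated samples lack variation.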

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive and accurate summary of our work on progressive growing of GANs. We are pleased that the referee recommends acceptance and agrees that the empirical evidence supports the claims of improved training speed, stability, and image quality.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper introduces a procedural training methodology for GANs based on progressive resolution growth, with implementation details for layer addition, fade-in, and stabilization techniques. No mathematical derivations, predictions, or first-principles results are claimed that could reduce to inputs by construction. All central claims (improved quality, stability, variation) rest on experimental outcomes such as training curves, ablation studies, and metrics on external datasets like CelebA and CIFAR10, which serve as independent validation rather than tautological reductions. No self-citations exist as load-bearing elements since this is the original work, and no ansatzes or uniqueness theorems are invoked in a self-referential manner.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

This is an empirical methods paper; the central claim rests on the assumption that progressive layer addition stabilizes adversarial dynamics. Free parameters include the exact growth schedule, fade-in rates, and standard GAN hyperparameters tuned experimentally. No new physical or mathematical entities are postulated.

free parameters (2)

progressive growth schedule and fade-in parameters
The resolution steps and transition durations are chosen experimentally to maintain stability.
GAN training hyperparameters (learning rates, batch sizes, etc.)
Standard hyperparameters are tuned specifically for the progressive regime.
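
To make the first free parameter concrete, a growth schedule can be sketched as a list of alternating fade-in and stabilization phases over doubling resolutions. The 1024 end point matches the 1024 by 1024 output discussed above; the 4 by 4 starting resolution, the per-phase image budget, and all names are placeholder assumptions for illustration and should be replaced by the values actually used in a given setup.

```python
def growth_schedule(start_res=4, final_res=1024, phase_images=800_000):
    """Build a list of (phase, resolution, images) tuples.

    Assumed pattern: stabilize at the starting resolution, then for each
    doubling first fade the new layers in and afterwards stabilize them.
    """
    if start_res <= 0 or final_res < start_res:
        raise ValueError("resolutions must satisfy 0 < start_res <= final_res")
    phases = [("stabilize", start_res, phase_images)]
    res = start_res
    while res < final_res:
        res *= 2
        phases.append(("fade_in", res, phase_images))
        phases.append(("stabilize", res, phase_images))
    if res != final_res:
        raise ValueError("final_res must be start_res times a power of two")
    return phases
```

With the placeholder defaults, going from 4 to 1024 takes eight doublings, so the schedule holds one initial phase plus sixteen fade-in and stabilization phases.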

axioms (1)

domain assumption: Starting at low resolution and progressively adding layers stabilizes GAN training without catastrophic interference with prior features.
Invoked to justify the core methodology; supported only by the reported experiments.

pith-pipeline@v0.9.0 · 5454 in / 1350 out tokens · 46782 ms · 2026-05-12T12:18:49.348295+00:00 · methodology

Forward citations

Cited by 37 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

AI safety via debate
stat.ML 2018-05 conditional novelty 8.0

AI agents trained through competitive debate can allow polynomial-time human judges to oversee PSPACE-level questions, with MNIST experiments boosting sparse classifier accuracy from 59% to 89% using only 6 pixels.
Training-Free Generative Sampling via Moment-Matched Score Smoothing
stat.ML 2026-05 unverdicted novelty 7.0

MM-SOLD is a training-free particle sampler whose large-particle limit converges to a moment-matched Gibbs distribution obtained by exponentially tilting a score-smoothed target.
ImageAttributionBench: How Far Are We from Generalizable Attribution?
cs.CV 2026-05 unverdicted novelty 7.0

ImageAttributionBench is a benchmark dataset demonstrating that state-of-the-art image attribution methods lack robustness to image degradation and fail to generalize to semantically disjoint domains.
What Cohort INRs Encode and Where to Freeze Them
cs.LG 2026-05 unverdicted novelty 7.0

Optimal INR freeze depth matches highest weight stable rank layer; SAEs reveal SIREN atoms are localized while FFMLP atoms trace cohort contours with causal impact on PSNR.
LEGO: LoRA-Enabled Generator-Oriented Framework for Synthetic Image Detection
cs.CV 2026-05 unverdicted novelty 7.0

LEGO uses multiple generator-specific LoRA modules modulated by an MLP and fused with attention to detect synthetic images, achieving better performance than prior methods while using under 10% of the training data.
Structured Diffusion Bridges: Inductive Bias for Denoising Diffusion Bridges
cs.LG 2026-05 unverdicted novelty 7.0

Structured diffusion bridges with alignment constraints achieve near fully-paired quality in modality translation while working effectively in unpaired and semi-paired regimes.
ID-Eraser: Proactive Defense Against Face Swapping via Identity Perturbation
cs.CV 2026-04 unverdicted novelty 7.0

A feature-space method that erases usable identity information from face images via learnable perturbations and a Face Revive Generator, rendering them ineffective for deepfake swapping while preserving visual quality.
ReImagine: Rethinking Controllable High-Quality Human Video Generation via Image-First Synthesis
cs.CV 2026-04 unverdicted novelty 7.0

ReImagine decouples human appearance from temporal consistency via pretrained image backbones, SMPL-X motion guidance, and training-free video diffusion refinement to generate high-quality controllable videos.
MetaCloak-JPEG: JPEG-Robust Adversarial Perturbation for Preventing Unauthorized DreamBooth-Based Deepfake Generation
cs.CV 2026-04 unverdicted novelty 7.0

MetaCloak-JPEG uses a DiffJPEG layer with straight-through estimator inside a JPEG-aware EOT and curriculum meta-learning loop to produce l-inf bounded perturbations that retain 91.3% effectiveness after real JPEG com...
ExpertEdit: Learning Skill-Aware Motion Editing from Expert Videos
cs.CV 2026-04 unverdicted novelty 7.0

ExpertEdit edits novice motions to expert skill levels by learning a motion prior from unpaired videos and infilling masked skill-critical spans.
LPNSR: Optimal Noise-Guided Diffusion Image Super-Resolution Via Learnable Noise Prediction
cs.CV 2026-03 conditional novelty 7.0

LPNSR derives optimal intermediate noise for diffusion SR via MLE and implements it with an LR-guided noise predictor, reaching SOTA perceptual quality in 4 steps without text priors.
Diffusion Posterior Sampling for General Noisy Inverse Problems
stat.ML 2022-09 unverdicted novelty 7.0

Diffusion models solve noisy (non)linear inverse problems via approximated posterior sampling that blends diffusion steps with manifold gradients without strict consistency projection.
High-Resolution Image Synthesis with Latent Diffusion Models
cs.CV 2021-12 conditional novelty 7.0

Latent diffusion models achieve state-of-the-art inpainting and competitive results on unconditional generation, scene synthesis, and super-resolution by performing the diffusion process in the latent space of pretrai...
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
cs.CV 2021-08 conditional novelty 7.0

SDEdit performs guided image synthesis and editing by adding noise to inputs and refining them via denoising with a diffusion model's SDE prior, outperforming GAN methods in human studies without task-specific training.
Reduce the Artifacts Bias for More Generalizable AI-Generated Image Detection
cs.CV 2026-05 conditional novelty 6.0

SEF introduces GAN upsampling for diverse artifacts and expert fusion to reduce domain interference, yielding stronger generalization on 13 benchmarks for AI-generated image detection.
The Diffusion Encoder
cs.LG 2026-05 unverdicted novelty 6.0

A diffusion model serves as the encoder in an autoencoder when trained alternately with the decoder to resolve opposing update directions while retaining the standard diffusion training objective.
Deep Probabilistic Unfolding for Quantized Compressive Sensing
cs.CV 2026-05 unverdicted novelty 6.0

A probabilistic unfolding network with stable likelihood projection and dual-domain Mamba achieves state-of-the-art reconstruction in quantized compressive sensing.
DiffATS: Diffusion in Aligned Tensor Space
cs.LG 2026-05 unverdicted novelty 6.0

DiffATS trains diffusion models directly on aligned Tucker tensor primitives that are proven to be homeomorphisms, delivering efficient unconditional and conditional generation across images, videos, and PDE data with...
Decoupling Semantics and Fingerprints: A Universal Representation for AI-Generated Image Detection
cs.CV 2026-05 unverdicted novelty 6.0

ODP-Net structurally disentangles universal forgery traces from generator fingerprints and semantics via orthogonal decomposition and purification, delivering state-of-the-art generalization to unseen AI image generat...
Conditional Diffusion Under Linear Constraints: Langevin Mixing and Information-Theoretic Guarantees
cs.LG 2026-05 unverdicted novelty 6.0

Error in approximating the tangent conditional score by the unconditional score in diffusion models is bounded by dimension-free conditional mutual information, with a projected-Langevin method outperforming baselines...
A Few-Step Generative Model on Cumulative Flow Maps
cs.LG 2026-05 unverdicted novelty 6.0

Cumulative flow maps unify few-step generative modeling for diffusion and flow models via cumulative transport and parameterization with minimal changes to time embeddings and objectives.
Which Face and Whose Identity? Solving the Dual Challenge of Deepfake Proactive Forensics in Multi-Face Scenarios
cs.CV 2026-04 unverdicted novelty 6.0

DAWF embeds identity watermarks via a parallel multi-face architecture and uses selective loss to answer which face was forged and whose identity was used.
Selective Depthwise Separable Convolution for Lightweight Joint Source-Channel Coding in Wireless Image Transmission
eess.IV 2026-04 unverdicted novelty 6.0

A selective replacement of convolutional layers by depthwise separable convolutions in JSCC systems cuts parameters substantially while keeping reconstruction performance nearly intact for wireless image transmission.
Combating Pattern and Content Bias: Adversarial Feature Learning for Generalized AI-Generated Image Detection
cs.CV 2026-04 unverdicted novelty 6.0

MAFL uses adversarial training to suppress pattern and content biases, guiding models to learn shared generative features for better cross-model generalization in detecting AI images.
SyncBreaker: Stage-Aware Multimodal Adversarial Attacks on Audio-Driven Talking Head Generation
cs.CV 2026-04 unverdicted novelty 6.0

SyncBreaker jointly attacks image and audio streams with Multi-Interval Sampling and Cross-Attention Fooling to degrade speech-driven talking head generation more than single-modality baselines.
VideoGPT: Video Generation using VQ-VAE and Transformers
cs.CV 2021-04 accept novelty 6.0

VideoGPT generates competitive natural videos by learning discrete latents with VQ-VAE and modeling them autoregressively with a transformer.
Evidence-based Decision Modeling for Synthetic Face Detection with Uncertainty-driven Active Learning
cs.CV 2026-05 unverdicted novelty 5.0

EMSFD uses Dirichlet-based evidence modeling to capture prediction uncertainty in synthetic face detection and applies uncertainty-driven active learning to achieve 15% higher accuracy than prior methods.
Evidence-based Decision Modeling for Synthetic Face Detection with Uncertainty-driven Active Learning
cs.CV 2026-05 unverdicted novelty 5.0

EMSFD models synthetic face detection via Dirichlet evidence and uncertainty-driven active learning, reporting 15% higher accuracy than prior state-of-the-art methods while improving reliability on out-of-distribution images.
Exploring and Exploiting Stability in Latent Flow Matching
cs.LG 2026-05 unverdicted novelty 5.0

Latent Flow Matching models exhibit inherent stability to data reduction and model shrinkage due to the flow matching objective, enabling reduced-dataset training and two-stage inference with over 2x speedup while pre...
Structured Diffusion Bridges: Inductive Bias for Denoising Diffusion Bridges
cs.LG 2026-05 unverdicted novelty 5.0

A structured diffusion bridge method achieves near fully-paired modality translation quality using alignment constraints even in unpaired or semi-paired regimes.
Mesh Based Simulations with Spatial and Temporal awareness
cs.LG 2026-05 unverdicted novelty 5.0

A unified training framework for mesh-based ML surrogates in CFD improves accuracy and long-horizon stability by enforcing spatial derivative consistency via multi-node prediction, using temporal cross-attention corre...
IdentiFace: Multi-Modal Iterative Diffusion Framework for Identifiable Suspect Face Generation in Crime Investigations
cs.CV 2026-05 unverdicted novelty 5.0

IdentiFace is a multi-modal iterative diffusion framework that generates identifiable suspect faces with improved identity retrieval for law enforcement applications.
HiMix: Hierarchical Artifact-aware Mixup for Generalized Synthetic Image Detection
cs.CV 2026-04 unverdicted novelty 5.0

HiMix combines mixup augmentation to create transitional real-fake samples with hierarchical global-local artifact feature fusion to achieve better generalization in detecting AI-generated images from unseen generators.
The Amazing Stability of Flow Matching
cs.CV 2026-04 unverdicted novelty 5.0

Flow matching generative models preserve sample quality, diversity, and latent representations despite pruning 50% of the CelebA-HQ dataset or altering architecture and training configurations.
DiffMagicFace: Identity Consistent Facial Editing of Real Videos
cs.CV 2026-04 unverdicted novelty 5.0

DiffMagicFace uses concurrent fine-tuned text and image diffusion models plus a rendered multi-view dataset to achieve identity-consistent text-conditioned editing of real facial videos.
Adaptive Forensic Feature Refinement via Intrinsic Importance Perception
cs.CV 2026-04 unverdicted novelty 4.0

I2P adaptively selects the most discriminative layers from visual foundation models for synthetic image detection and constrains task updates to low-sensitivity parameter subspaces to improve specificity without harmi...
Evolution of Video Generative Foundations
cs.CV 2026-04 unverdicted novelty 2.0

This survey traces video generation technology from GANs to diffusion models and then to autoregressive and multimodal approaches while analyzing principles, strengths, and future trends.
