Optimizing the Latent Space of Generative Networks

Armand Joulin; Arthur Szlam; David Lopez-Paz; Piotr Bojanowski

arXiv: 1707.05776 · v2 · pith:VKG4BDTQnew · submitted 2017-07-18 · stat.ML · cs.CV · cs.LG

Piotr Bojanowski, Armand Joulin, David Lopez-Paz, Arthur Szlam

classification: stat.ML, cs.CV, cs.LG

keywords: adversarial, gans, generative, networks, optimization, convolutional, deep, discriminator

Generative Adversarial Networks (GANs) have achieved remarkable results in the task of generating realistic natural images. In most successful applications, GAN models share two common aspects: solving a challenging saddle point optimization problem, interpreted as an adversarial game between a generator function and a discriminator function; and parameterizing the generator and the discriminator as deep convolutional neural networks. The goal of this paper is to disentangle the contribution of these two factors to the success of GANs. In particular, we introduce Generative Latent Optimization (GLO), a framework to train deep convolutional generators using simple reconstruction losses. Through a variety of experiments, we show that GLO enjoys many of the desirable properties of GANs: synthesizing visually appealing samples, interpolating meaningfully between samples, and performing linear arithmetic with noise vectors; all of this without the adversarial optimization scheme.
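
The abstract does not spell out the training procedure, so the following is only a rough sketch of the idea suggested by the title: one free latent vector per training sample is optimized jointly with the generator under a plain reconstruction loss, with no discriminator involved. Everything concrete here is an assumption made for illustration: the function name `glo_fit`, the linear generator (a stand-in for a deep convolutional network), the squared-error loss, the toy data, and the learning rate.

```python
import random


def glo_fit(data, latent_dim=2, steps=500, lr=0.05, seed=0):
    """Jointly fit per-sample latent codes Z and a linear generator W.

    Hypothetical helper: minimizes mean squared reconstruction error
    sum_i ||W z_i - x_i||^2 by gradient descent on both W and every z_i.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    # One learnable latent vector per training sample (assumption).
    Z = [[rng.gauss(0, 0.1) for _ in range(latent_dim)] for _ in range(n)]
    # Linear generator x_hat = W z, standing in for a convolutional generator.
    W = [[rng.gauss(0, 0.1) for _ in range(latent_dim)] for _ in range(d)]
    losses = []
    for _ in range(steps):
        total = 0.0
        grad_W = [[0.0] * latent_dim for _ in range(d)]
        for i in range(n):
            x_hat = [sum(W[j][k] * Z[i][k] for k in range(latent_dim))
                     for j in range(d)]
            err = [x_hat[j] - data[i][j] for j in range(d)]
            total += sum(e * e for e in err)
            # Gradient of the squared error with respect to z_i: 2 W^T err.
            grad_z = [2 * sum(W[j][k] * err[j] for j in range(d))
                      for k in range(latent_dim)]
            # Accumulate the gradient with respect to W: 2 err z_i^T.
            for j in range(d):
                for k in range(latent_dim):
                    grad_W[j][k] += 2 * err[j] * Z[i][k]
            # Update this sample's latent code.
            for k in range(latent_dim):
                Z[i][k] -= lr * grad_z[k]
        # Update the generator with the gradient averaged over samples.
        for j in range(d):
            for k in range(latent_dim):
                W[j][k] -= lr * grad_W[j][k] / n
        losses.append(total / n)
    return Z, W, losses


# Toy 4-dimensional data that lies on a 2-dimensional subspace.
data = [[a, b, a + b, a - b]
        for a, b in [(1, 0), (0, 1), (1, 1), (-1, 1), (0.5, -0.5), (-1, -1)]]
Z, W, losses = glo_fit(data)
print(f"first loss {losses[0]:.4f}, last loss {losses[-1]:.6f}")
```

In this toy setting the learned codes in `Z` play the role of the noise vectors mentioned in the abstract: averaging two rows of `Z` and mapping the result through `W` gives a simple interpolation between the corresponding samples.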

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Tessellations of Semi-Discrete Flow Matching
cs.LG 2026-05 unverdicted novelty 7.0

Semi-discrete Flow Matching produces terminal assignment regions that are topologically simple (open, simply connected, homeomorphic to the ball under assumption) yet geometrically distinct from optimal transport Lagu...

Predicting 3D structure by latent posterior sampling
cs.CV 2026-05 unverdicted novelty 5.0

A two-stage latent-variable model uses diffusion-based score matching to sample 3D scenes from posteriors conditioned on varied observations via volumetric rendering likelihoods.

Predicting 3D structure by latent posterior sampling
cs.CV 2026-05 unverdicted novelty 5.0

A latent-variable approach uses diffusion models on NeRF-encoded scene representations to perform posterior sampling for 3D reconstruction from single-view, multi-view, noisy, sparse-pixel, or sparse-depth inputs.

Predicting 3D structure by latent posterior sampling
cs.CV 2026-05 unverdicted novelty 5.0

A two-stage method trains NeRF latents, then a diffusion prior, to sample posteriors for 3D reconstruction from varied observations including single-view, multi-view, noisy, sparse-pixel, and sparse-depth inputs.

ChatSR: Multimodal Large Language Models for Scientific Formula Discovery
cs.AI 2024-06 unverdicted novelty 5.0

ChatSR aligns scientific data encoders with LLMs to produce formulas that fit data and satisfy explicit priors, reporting SOTA results on 13 symbolic regression benchmarks plus zero-shot handling of unseen prior types.

NeRF: Neural Radiance Field in 3D Vision: A Comprehensive Review (Updated Post-Gaussian Splatting)
cs.CV 2022-10 unverdicted novelty 2.0

A literature survey of NeRF and neural field methods from 2020-2025, organized by architecture and application taxonomies with benchmarks and dataset overviews, covering both pre- and post-Gaussian Splatting periods.