Connecting Generative Adversarial Networks and Actor-Critic Methods

David Pfau; Oriol Vinyals

arxiv: 1610.01945 · v3 · pith:ZEUXVTI5new · submitted 2016-10-06 · 💻 cs.LG · stat.ML

keywords actor-critic · methods · networks · adversarial · algorithms · communities · gans · generative

Both generative adversarial networks (GAN) in unsupervised learning and actor-critic methods in reinforcement learning (RL) have gained a reputation for being difficult to optimize. Practitioners in both fields have amassed a large number of strategies to mitigate these instabilities and improve training. Here we show that GANs can be viewed as actor-critic methods in an environment where the actor cannot affect the reward. We review the strategies for stabilizing training for each class of models, both those that generalize between the two and those that are particular to that model. We also review a number of extensions to GANs and RL algorithms with even more complicated information flow. We hope that by highlighting this formal connection we will encourage both GAN and RL communities to develop general, scalable, and stable algorithms for multilevel optimization with deep networks, and to draw inspiration across communities.
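
The two-level structure the abstract refers to can be sketched in standard notation. This is an illustrative formulation, not one quoted from the paper: the symbols D (discriminator), G (generator), p_data, p_z, and the generic objectives F and f are assumptions introduced here.

```latex
% Standard GAN minimax objective (Goodfellow et al., 2014)
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% Generic bilevel problem: the outer variable x is optimized
% against the best response y^*(x) of the inner variable y
x^{*} = \arg\min_{x} F\bigl(x, y^{*}(x)\bigr),
\qquad
y^{*}(x) = \arg\min_{y} f(x, y)
```

Under this reading, a GAN assigns its two networks to the two levels, and an actor-critic method plausibly has the same shape with the actor and critic in those roles. The exact objectives each model uses at each level are left to the paper.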

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

On the Stability and Generalization of First-order Bilevel Minimax Optimization
cs.LG 2026-04 unverdicted novelty 7.0

Provides the first systematic generalization analysis via algorithmic stability for single-timescale and two-timescale stochastic gradient descent-ascent in bilevel minimax problems.

Shuffling the Data, Stretching the Step-size: Sharper Bias in constant step-size SGD
math.OC 2026-04 unverdicted novelty 7.0

Combining random reshuffling and Richardson-Romberg extrapolation yields cubic bias refinement and better MSE for constant-step SGD on structured non-monotone variational inequalities.