GAN Dissection: Visualizing and Understanding Generative Adversarial Networks

Antonio Torralba; Bolei Zhou; David Bau; Hendrik Strobelt; Joshua B. Tenenbaum; Jun-Yan Zhu; William T. Freeman

arxiv: 1811.10597 · v2 · pith:XX6C4KXCnew · submitted 2018-11-26 · cs.CV · cs.AI · cs.GR · cs.LG

David Bau, Jun-Yan Zhu, Hendrik Strobelt, Bolei Zhou, Joshua B. Tenenbaum, William T. Freeman, Antonio Torralba

classification cs.CV cs.AI cs.GR cs.LG

keywords units, gans, models, adversarial, applications, better, concepts, dissection

Generative Adversarial Networks (GANs) have recently achieved impressive results for many real-world applications, and many GAN variants have emerged with improvements in sample quality and training stability. However, they have not been well visualized or understood. How does a GAN represent our visual world internally? What causes the artifacts in GAN results? How do architectural choices affect GAN learning? Answering such questions could enable us to develop new insights and better models. In this work, we present an analytic framework to visualize and understand GANs at the unit-, object-, and scene-level. We first identify a group of interpretable units that are closely related to object concepts using a segmentation-based network dissection method. Then, we quantify the causal effect of interpretable units by measuring the ability of interventions to control objects in the output. We examine the contextual relationship between these units and their surroundings by inserting the discovered object concepts into new images. We show several practical applications enabled by our framework, from comparing internal representations across different layers, models, and datasets, to improving GANs by locating and removing artifact-causing units, to interactively manipulating objects in a scene. We provide open source interpretation tools to help researchers and practitioners better understand their GAN models.
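
The two core steps described in the abstract, matching units to object concepts and then intervening on those units, can be illustrated with a toy sketch. This is not the authors' released tooling. It assumes that agreement between a unit and a concept is scored by intersection-over-union (IoU) between a thresholded activation map and a segmentation mask, and that an intervention means zeroing a unit's activations; the abstract does not spell out either detail. The function names `unit_concept_iou` and `ablate_units`, the fixed threshold, and the synthetic data are all hypothetical.

```python
import numpy as np


def unit_concept_iou(activation, concept_mask, threshold):
    """IoU between a thresholded unit activation map and a binary concept mask.

    Assumption: both arrays already share the same spatial resolution.
    """
    unit_mask = activation > threshold
    concept_mask = concept_mask.astype(bool)
    intersection = np.logical_and(unit_mask, concept_mask).sum()
    union = np.logical_or(unit_mask, concept_mask).sum()
    return float(intersection) / float(union) if union > 0 else 0.0


def ablate_units(features, unit_ids):
    """Return a copy of a (channels, height, width) feature map with the
    selected units set to zero. The input array is left unchanged."""
    ablated = features.copy()
    ablated[list(unit_ids)] = 0.0
    return ablated


# Synthetic stand-in for one layer of generator features: 4 units on an 8x8 grid.
rng = np.random.default_rng(0)
features = rng.random((4, 8, 8))

# Hypothetical segmentation mask: the "tree" concept covers the top half.
tree_mask = np.zeros((8, 8), dtype=bool)
tree_mask[:4, :] = True

# Make unit 2 fire strongly inside the mask and weakly outside it.
features[2] = np.where(tree_mask, 0.9, 0.1)

# Dissection step: score every unit against the concept and pick the best match.
scores = [unit_concept_iou(features[u], tree_mask, threshold=0.5) for u in range(4)]
best_unit = int(np.argmax(scores))

# Intervention step: switch the best-matching unit off.
ablated = ablate_units(features, [best_unit])
```

In a real setting the ablated features would be passed through the remaining generator layers, and the change in the segmented output would indicate how strongly the unit causes the concept to appear.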

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Correcting Influence: Unboxing LLM Outputs with Orthogonal Latent Spaces
cs.LG 2026-05 unverdicted novelty 6.0

A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.