Self-Attention Generative Adversarial Networks

Augustus Odena; Dimitris Metaxas; Han Zhang; Ian Goodfellow

arXiv: 1805.08318 · v2 · pith:T7LH63CDnew · submitted 2018-05-21 · stat.ML · cs.LG

keywords: generator, sagan, adversarial, details, feature, generative, image, inception

In this paper, we propose the Self-Attention Generative Adversarial Network (SAGAN), which allows attention-driven, long-range dependency modeling for image generation tasks. Traditional convolutional GANs generate high-resolution details as a function of only spatially local points in lower-resolution feature maps. In SAGAN, details can be generated using cues from all feature locations. Moreover, the discriminator can check that highly detailed features in distant portions of the image are consistent with each other. Furthermore, recent work has shown that generator conditioning affects GAN performance. Leveraging this insight, we apply spectral normalization to the GAN generator and find that this improves training dynamics. The proposed SAGAN achieves state-of-the-art results, boosting the best published Inception score from 36.8 to 52.52 and reducing the Fréchet Inception distance from 27.62 to 18.65 on the challenging ImageNet dataset. Visualization of the attention layers shows that the generator leverages neighborhoods that correspond to object shapes rather than local regions of fixed shape.
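
The two ideas named in the abstract, attention over all feature locations and spectral normalization of weights, can be illustrated with a small standard-library Python sketch. It is not the authors' implementation: the abstract does not give the layer definitions, so the projection matrices w_q, w_k, w_v, the residual scale gamma, the helper names, and the toy sizes are all assumptions chosen for illustration. The attention shown is generic dot-product self-attention over a flattened feature map, and the spectral norm (the largest singular value of a weight matrix, by which spectral normalization divides the weights) is estimated with plain power iteration.

```python
import math
import random


def matvec(w, x):
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features, w_q, w_k, w_v, gamma):
    """Generic dot-product self-attention (assumed form, not the paper's exact layer).

    features: list of N location vectors, i.e. a flattened H*W feature map.
    Each output location mixes values from ALL locations, then adds the
    input back through a residual connection scaled by gamma.
    """
    queries = [matvec(w_q, x) for x in features]
    keys = [matvec(w_k, x) for x in features]
    values = [matvec(w_v, x) for x in features]
    outputs, attention = [], []
    for q, x in zip(queries, features):
        # One attention weight per location in the map, summing to 1.
        weights = softmax([sum(a * b for a, b in zip(q, k)) for k in keys])
        mixed = [sum(w * v[c] for w, v in zip(weights, values))
                 for c in range(len(values[0]))]
        outputs.append([gamma * m + xc for m, xc in zip(mixed, x)])
        attention.append(weights)
    return outputs, attention


def spectral_norm(w, iters=50):
    """Estimate the largest singular value of w by power iteration.

    Sketch only: assumes w is not the zero matrix.
    """
    rows, cols = len(w), len(w[0])
    v = [1.0 / math.sqrt(cols)] * cols
    sigma = 0.0
    for _ in range(iters):
        u = matvec(w, v)
        u_norm = math.sqrt(sum(a * a for a in u))
        u = [a / u_norm for a in u]
        wt_u = [sum(w[r][c] * u[r] for r in range(rows)) for c in range(cols)]
        sigma = math.sqrt(sum(a * a for a in wt_u))
        v = [a / sigma for a in wt_u]
    return sigma


def rand_matrix(rows, cols):
    """Hypothetical helper: small random weight matrix for the demo."""
    return [[random.gauss(0, 0.5) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
C, N = 4, 6  # channels and number of spatial locations (illustrative sizes)
feats = [[random.gauss(0, 1) for _ in range(C)] for _ in range(N)]
w_q, w_k, w_v = rand_matrix(2, C), rand_matrix(2, C), rand_matrix(C, C)

# With gamma = 0 the block is an identity map; a nonzero gamma mixes in
# information gathered from every location.
out, attn = self_attention(feats, w_q, w_k, w_v, gamma=0.0)

# Dividing a weight matrix by its spectral norm bounds its largest
# singular value at roughly 1.
sigma = spectral_norm(w_v)
w_v_normalized = [[a / sigma for a in row] for row in w_v]
```

In this sketch every row of attn is a distribution over all N locations, which is the sense in which details at one position can draw on cues from anywhere in the feature map rather than only from a local neighborhood.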

Forward citations

Cited by 9 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

AGAN: Towards Automated Design of Generative Adversarial Networks
cs.LG · 2019-06 · unverdicted · novelty 8.0

AGAN is the first neural architecture search method for GANs that discovers architectures that outperform the state of the art on unsupervised CIFAR-10 image generation and are competitive on supervised tasks.

Large Scale GAN Training for High Fidelity Natural Image Synthesis
cs.LG · 2018-09 · accept · novelty 7.0

BigGANs achieve state-of-the-art class-conditional synthesis on ImageNet at 128x128 resolution, with an Inception Score of 166.5 and an FID of 7.4, by scaling up GANs and applying orthogonal regularization plus truncation.

From DES to KiDS: Domain adaptation for cross-survey detection of low-surface-brightness galaxies
astro-ph.GA · 2026-05 · unverdicted · novelty 6.0

Domain adaptation with an ensemble of CNN and transformer models trained on DES detects 20,180 LSBGs and 434 UDGs in KiDS DR5, with structural parameters and environmental trends consistent with known samples.

DASGAN -- Joint Domain Adaptation and Segmentation for the Analysis of Epithelial Regions in Histopathology PD-L1 Images
eess.IV · 2019-06 · unverdicted · novelty 6.0

DASGAN trains a segmentation network on semi-automatically labeled CK images via unpaired translation to PD-L1, enabling epithelium segmentation and TC score estimation without serial sections.

Deep Exemplar-based Video Colorization
cs.CV · 2019-06 · unverdicted · novelty 6.0

A recurrent end-to-end network for exemplar-based video colorization that unifies semantic correspondence and color propagation with a temporal consistency loss.

Deep Learning for MRI Slice Interpolation: The Critical Role of Problem Formulation
eess.IV · 2026-05 · unverdicted · novelty 5.0

Reformulating the input to adjacent slices for deep learning MRI interpolation yields a 58% SSIM gain and a 10.1% improvement over a linear baseline, with problem formulation outweighing architecture choice.

Latent Adversarial Defence with Boundary-guided Generation
cs.LG · 2019-07 · unverdicted · novelty 5.0

LAD generates diverse adversarial examples in latent space by perturbing along normals to an SVM-defined decision boundary and uses them for adversarial training to improve DNN robustness.

Layer Selection in Feature-Based Losses Affects Image Quality and Microstructural Consistency in Deep Learning Super-Resolution of Brain Diffusion MRI
eess.IV · 2026-05 · unverdicted · novelty 4.0

Deeper VGG16 layers in feature losses for diffusion MRI super-resolution introduce persistent grid artifacts in images and anisotropy maps, whereas the shallowest layer preserves consistency with ground truth at high ...

Incremental Concept Learning via Online Generative Memory Recall
cs.LG · 2019-07 · unverdicted · novelty 4.0

A pseudo-rehearsal method for class-incremental learning on MNIST, Fashion-MNIST, and SVHN that combines cGAN-generated old-concept samples, balanced online recall, and a concept contrastive loss.