hub Mixed citations

Improved Techniques for Training GANs

Salimans, T · 2016 · cs.LG · arXiv 1606.03498

Mixed citation behavior. Most common role is background (60%).

17 Pith papers citing it

Background: 60% of classified citations (3 of 5)


abstract

We present a variety of new architectural features and training procedures that we apply to the generative adversarial networks (GANs) framework. We focus on two applications of GANs: semi-supervised learning, and the generation of images that humans find visually realistic. Unlike most work on generative models, our primary goal is not to train a model that assigns high likelihood to test data, nor do we require the model to be able to learn well without using any labels. Using our new techniques, we achieve state-of-the-art results in semi-supervised classification on MNIST, CIFAR-10 and SVHN. The generated images are of high quality as confirmed by a visual Turing test: our model generates MNIST samples that humans cannot distinguish from real data, and CIFAR-10 samples that yield a human error rate of 21.3%. We also present ImageNet samples with unprecedented resolution and show that our methods enable the model to learn recognizable features of ImageNet classes.


citation-role summary

background: 3 · method: 2

citation-polarity summary

background: 3 · use method: 2

representative citing papers

AGAN: Towards Automated Design of Generative Adversarial Networks

cs.LG · 2019-06-25 · unverdicted · novelty 8.0

AGAN is the first neural architecture search method for GANs that discovers architectures outperforming state-of-the-art on CIFAR-10 unsupervised image generation and competitive on supervised tasks.

Setting-Matched and Semantics-Scaled Benchmarking of One-Step Generative Models Against Multistep Diffusion and Flow Models

cs.CV · 2026-03-15 · unverdicted · novelty 7.0

Matched benchmarking reveals FID misleads in few-step regimes under CFG, prompting CLIP-scaled and PickScore-scaled FID and IS variants for better semantic evaluation of one-step image generators.

Omni2Sound: Towards Unified Video-Text-to-Audio Generation

cs.SD · 2026-01-06 · unverdicted · novelty 7.0

A single DiT-based diffusion model unifies video-to-audio, text-to-audio, and joint video-text-to-audio generation, supported by a new 470k-pair dataset and three-stage progressive training that resolves task competition.

Physics-informed, Generative Adversarial Design of Funicular Shells

cs.CE · 2026-04-17 · unverdicted · novelty 7.0

A modified DCGAN with an auxiliary discriminator using the membrane factor generates stable, previously unseen funicular shells optimized for pure compression in three dimensions.

Diffusion Models Beat GANs on Image Synthesis

cs.LG · 2021-05-11 · accept · novelty 7.0

Diffusion models with architecture improvements and classifier guidance achieve superior FID scores to GANs on unconditional and conditional ImageNet image synthesis.

Quantitative Video World Model Evaluation for Geometric-Consistency

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

PDI-Bench computes 3D projective residuals from segmented and tracked points to quantify geometric inconsistency in AI-generated videos.

EnScale: Temporally-consistent multivariate generative downscaling via proper scoring rules

physics.ao-ph · 2025-09-30 · unverdicted · novelty 6.0

EnScale emulates high-resolution regional climate model outputs from global circulation models for multiple variables using a two-step generative process with sparse local stochastic layers and energy score optimization, including a temporally consistent variant.

Demystifying MMD GANs

stat.ML · 2018-01-04 · accept · novelty 6.0

MMD GANs have unbiased critic gradients but biased generator gradients from sample-based learning, and the Kernel Inception Distance provides a practical new measure for GAN convergence and dynamic learning rate adaptation.

MaMe & MaRe: Matrix-Based Token Merging and Restoration for Efficient Visual Perception and Synthesis

cs.CV · 2026-04-15 · unverdicted · novelty 6.0

MaMe is a differentiable matrix-only token merging method that doubles ViT-B throughput with a 2% accuracy drop on pre-trained models and enables faster, higher-quality image synthesis when paired with MaRe.

ELT: Elastic Looped Transformers for Visual Generation

cs.CV · 2026-04-10 · unverdicted · novelty 6.0

Elastic Looped Transformers share weights across recurrent blocks and apply intra-loop self-distillation to deliver 4x parameter reduction while matching competitive FID and FVD scores on ImageNet and UCF-101.

SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

cs.CV · 2023-07-04 · conditional · novelty 6.0

SDXL improves upon prior Stable Diffusion versions through a larger UNet backbone, dual text encoders, novel conditioning, and a refinement model, producing higher-fidelity images competitive with black-box state-of-the-art generators.

VideoGPT: Video Generation using VQ-VAE and Transformers

cs.CV · 2021-04-20 · accept · novelty 6.0

VideoGPT generates competitive natural videos by learning discrete latents with VQ-VAE and modeling them autoregressively with a transformer.

Spatial sensitivity analysis for urban land use prediction with physics-constrained conditional generative adversarial networks

cs.LG · 2019-07-22 · unverdicted · novelty 5.0

A physics-constrained cGAN is trained as an image-to-image translator on remote-sensing layers to recover spatial sensitivities of urban land-use change to macroeconomic indicators via backpropagation gradients.

On the Tradeoffs of On-Device Generative Models in Federated Predictive Maintenance Systems

cs.LG · 2026-05-08 · unverdicted · novelty 5.0

Experiments on real industrial time series show that partial model sharing improves diffusion model performance in bandwidth-limited non-IID settings, while full sharing stabilizes GAN training but offers less robustness than VAE or DDPM alternatives.

Protecting and Preserving Protest Dynamics for Responsible Analysis

cs.CV · 2026-04-06 · unverdicted · novelty 5.0

A responsible computing framework substitutes real protest imagery with labeled synthetic reproductions from conditional image synthesis to enable privacy-aware analysis of collective action patterns.

A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence

cs.LG · 2026-04-22 · unverdicted · novelty 4.0

A conditional Wasserstein GAN generates plausible future SWI drought trajectories for French insurance risk management under climate change.

Measuring the Transferability of Adversarial Examples

cs.LG · 2019-07-14 · unverdicted · novelty 3.0

Empirical measurement of adversarial example transferability between VGG and Inception model classes with methodological refinements to attack strength selection, perturbation clipping, and evaluation via SSIM.

citing papers explorer

Showing 17 of 17 citing papers.

AGAN: Towards Automated Design of Generative Adversarial Networks · cs.LG · 2019-06-25 · unverdicted · none · ref 21 · internal anchor
Setting-Matched and Semantics-Scaled Benchmarking of One-Step Generative Models Against Multistep Diffusion and Flow Models · cs.CV · 2026-03-15 · unverdicted · none · ref 20 · internal anchor
Omni2Sound: Towards Unified Video-Text-to-Audio Generation · cs.SD · 2026-01-06 · unverdicted · none · ref 47 · internal anchor
Physics-informed, Generative Adversarial Design of Funicular Shells · cs.CE · 2026-04-17 · unverdicted · none · ref 27
Diffusion Models Beat GANs on Image Synthesis · cs.LG · 2021-05-11 · accept · none · ref 54
Quantitative Video World Model Evaluation for Geometric-Consistency · cs.CV · 2026-05-14 · unverdicted · none · ref 25 · internal anchor
EnScale: Temporally-consistent multivariate generative downscaling via proper scoring rules · physics.ao-ph · 2025-09-30 · unverdicted · none · ref 48 · internal anchor
Demystifying MMD GANs · stat.ML · 2018-01-04 · accept · none · ref 45 · internal anchor
MaMe & MaRe: Matrix-Based Token Merging and Restoration for Efficient Visual Perception and Synthesis · cs.CV · 2026-04-15 · unverdicted · none · ref 39
ELT: Elastic Looped Transformers for Visual Generation · cs.CV · 2026-04-10 · unverdicted · none · ref 61
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis · cs.CV · 2023-07-04 · conditional · none · ref 42
VideoGPT: Video Generation using VQ-VAE and Transformers · cs.CV · 2021-04-20 · accept · none · ref 31
Spatial sensitivity analysis for urban land use prediction with physics-constrained conditional generative adversarial networks · cs.LG · 2019-07-22 · unverdicted · none · ref 23 · internal anchor
On the Tradeoffs of On-Device Generative Models in Federated Predictive Maintenance Systems · cs.LG · 2026-05-08 · unverdicted · none · ref 47
Protecting and Preserving Protest Dynamics for Responsible Analysis · cs.CV · 2026-04-06 · unverdicted · none · ref 54
A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence · cs.LG · 2026-04-22 · unverdicted · none · ref 48
Measuring the Transferability of Adversarial Examples · cs.LG · 2019-07-14 · unverdicted · none · ref 14 · internal anchor
