10 Pith papers cite this work. Polarity classification is still indexing.
abstract
Generative Adversarial Nets (GANs) represent an important milestone for effective generative models and have inspired numerous variants that seem different from each other. One of the main contributions of this paper is to reveal a unified geometric structure in GANs and their variants. Specifically, we show that adversarial generative model training can be decomposed into three geometric steps: a separating hyperplane search, a discriminator parameter update away from the separating hyperplane, and a generator update along the normal vector direction of the separating hyperplane. This geometric intuition reveals the limitations of the existing approaches and leads us to propose a new formulation called geometric GAN, which uses an SVM separating hyperplane that maximizes the margin. Our theoretical analysis shows that the geometric GAN converges to a Nash equilibrium between the discriminator and generator. In addition, extensive numerical results demonstrate the superior performance of geometric GAN.
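The abstract does not spell out the loss functions, so the following is only a minimal sketch. It assumes the SVM-style hinge formulation commonly associated with geometric GAN: the discriminator is penalized when real scores fall below the margin +1 or fake scores rise above -1, and the generator moves its samples toward the real side of the hyperplane. The function names and the plain-list inputs are hypothetical, chosen for illustration.

```python
# Sketch of hinge-style GAN losses (an assumed formulation, not taken
# from the abstract). Scores are raw discriminator outputs, i.e. signed
# distances from the separating hyperplane up to scaling.

def discriminator_hinge_loss(real_scores, fake_scores):
    """Mean hinge penalty: real scores should be >= +1, fake scores <= -1."""
    real_term = sum(max(0.0, 1.0 - s) for s in real_scores) / len(real_scores)
    fake_term = sum(max(0.0, 1.0 + s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term


def generator_loss(fake_scores):
    """Negative mean fake score: minimizing it pushes fakes toward the real side."""
    return -sum(fake_scores) / len(fake_scores)


if __name__ == "__main__":
    real = [2.0, 0.5]   # second sample lies inside the margin
    fake = [-2.0, 0.0]  # second sample lies inside the margin
    print(discriminator_hinge_loss(real, fake))
    print(generator_loss(fake))
```

Only samples inside the margin contribute to the discriminator loss, which mirrors the support-vector view of the separating hyperplane described in the abstract.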
citing papers explorer
- AGAN: Towards Automated Design of Generative Adversarial Networks
  AGAN is the first neural architecture search method for GANs; it discovers architectures that outperform the state of the art on CIFAR-10 unsupervised image generation and are competitive on supervised tasks.
- Large Scale GAN Training for High Fidelity Natural Image Synthesis
  BigGANs achieve state-of-the-art class-conditional synthesis on ImageNet 128x128 with Inception Score 166.5 and FID 7.4 by scaling GANs and applying orthogonal regularization plus truncation.
- Tadpole: Autoencoders as Foundation Models for 3D PDEs with Online Learning
  Tadpole is a pre-trained autoencoder foundation model for 3D PDEs that learns transferable representations from online-generated data and supports efficient fine-tuning for dynamics prediction and other tasks.
- Learning Stratigraphically Consistent Relative Geologic Time from 3D Seismic Data via Sinusoidal Mapping
  RGT-Est transforms relative geologic time estimation into a sinusoidal space and applies pointwise, perceptual, and adversarial losses to achieve better stratigraphic consistency and horizon correlation on seismic data.
- Lightweight Unpaired Smartphone ISP Transfer with Semantic Pseudo-Pairing
  Semantic pseudo-pairing via DINOv2 embeddings and fused Gromov-Wasserstein optimal transport enables training a 7K-parameter CNN for unpaired smartphone ISP, achieving 22.569 PSNR on the NTIRE 2026 challenge test set.
- Continuous Adversarial Flow Models
  Continuous adversarial flow models replace MSE in flow matching with adversarial training via a discriminator, improving guidance-free FID on ImageNet from 8.26 to 3.63 for SiT, with similar gains for JiT and on text-to-image benchmarks.
- Exploring Clustering Capability of Inpainting Model Embeddings for Pattern-based Individual Identification
  An inpainting auxiliary task improves the clustering of embeddings for individual zebrafish identification based on skin patterns.
- Woosh: A Sound Effects Foundation Model
  Woosh is a new publicly released foundation model optimized for high-quality sound effect generation from text or video, showing competitive or better results than open alternatives like Stable Audio Open.
- Neural Embedding for Physical Manipulations
  A generative model with a normalized pairwise distance constraint discovers output space topologies from sparse data and outperforms GANs and VAEs by avoiding mode collapse.
- Venom: A PyTorch Generative Modeling Toolkit
  Venom is an educational PyTorch toolkit that packages multiple generative modeling families under a single MNIST-first interface with reproducible scripts and tutorials.