Gaussian mixture generative adversarial networks for diverse datasets, and the unsupervised clustering of images
2 Pith papers cite this work. Polarity classification is still indexing.
abstract
Generative Adversarial Networks (GANs) have been shown to produce realistic-looking synthetic images with remarkable success, yet their performance seems less impressive when the training set is highly diverse. To provide a better fit to the target data distribution when the dataset includes many different classes, we propose a variant of the basic GAN model, called Gaussian Mixture GAN (GM-GAN), where the probability distribution over the latent space is a mixture of Gaussians. We also propose a supervised variant which is capable of conditional sample synthesis. To evaluate the model's performance, we propose a new scoring method which separately takes into account two (typically conflicting) measures: the diversity and the quality of the generated data. Through a series of empirical experiments, using both synthetic and real-world datasets, we quantitatively show that GM-GANs outperform baselines, both when evaluated using the commonly used Inception Score and when evaluated using our own alternative scoring method. In addition, we qualitatively demonstrate how the unsupervised variant of GM-GAN tends to map latent vectors sampled from different Gaussians in the latent space to samples of different classes in the data space. We show how this phenomenon can be exploited for the task of unsupervised clustering, and provide quantitative evaluation showing the superiority of our method for the unsupervised clustering of image datasets. Finally, we demonstrate a feature which further sets our model apart from other GAN models: the option to control the quality-diversity trade-off by altering, post-training, the probability distribution of the latent space. This allows one to sample higher-quality, lower-diversity samples, or vice versa, according to one's needs.
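The latent sampling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the uniform mixture weights, the isotropic components, the fixed (rather than learned) means, and the names `sample_gm_latent` and `sigma_scale` are all assumptions made for illustration. Scaling the standard deviation is one plausible way to alter the latent distribution after training, with smaller values keeping samples closer to a component mean.

```python
import random


def sample_gm_latent(means, sigma=1.0, sigma_scale=1.0, rng=None):
    """Draw one latent vector from a uniform mixture of isotropic Gaussians.

    Hypothetical helper: returns (component_index, latent_vector).
    """
    if rng is None:
        rng = random.Random()
    k = rng.randrange(len(means))   # pick a mixture component uniformly (assumption)
    std = sigma * sigma_scale       # shrink or widen the chosen Gaussian
    z = [rng.gauss(mu, std) for mu in means[k]]
    return k, z


# Hypothetical setup: 3 components in a 4-dimensional latent space.
rng = random.Random(0)
means = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(3)]

# Default sampling: each latent vector is spread around its component mean.
k, z = sample_gm_latent(means, rng=rng)

# Collapsed sampling: with sigma_scale=0 the latent vector equals the mean.
k_tight, z_tight = sample_gm_latent(means, sigma_scale=0.0, rng=rng)
```

In this sketch the component index `k` plays the role of the cluster identity that the abstract says tends to align with data classes in the unsupervised variant; a trained generator, which is not shown here, would map `z` to an image.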
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
A language-driven system generates semantically consistent multimodal textures from text prompts by linking autoregressive haptic models and diffusion-based visuals through a shared latent representation.
citing papers explorer
- CoFi-UCGen: Coarse-to-Fine Unsupervised Conditional Generation without Label Priors
CoFi-UCGen achieves both coarse- and fine-grained unsupervised conditional image generation by using bit-codes for structured latent space and hierarchical modulation in diffusion models.
- Language-Guided Multimodal Texture Authoring via Generative Models
A language-driven system generates semantically consistent multimodal textures from text prompts by linking autoregressive haptic models and diffusion-based visuals through a shared latent representation.