ciwGAN and fiwGAN models trained on isolated words spontaneously generate concatenated multi-word outputs and display early compositionality precursors.
hub Mixed citations
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
Mixed citation behavior. Most common role is background (67%).
abstract
In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks - demonstrating their applicability as general image representations.
representative citing papers
Toy models demonstrate that polysemanticity arises when neural networks store more sparse features than neurons via superposition, producing a phase transition tied to polytope geometry and increased adversarial vulnerability.
GPT-f, a transformer-based prover for Metamath, generated new short proofs that were accepted into the main library, the first such contribution from a deep-learning system.
AGAN is the first neural architecture search method for GANs that discovers architectures which outperform the state of the art on CIFAR-10 unsupervised image generation and remain competitive on supervised tasks.
Real NVP uses affine coupling layers to create invertible transformations that support exact density estimation, sampling, and latent inference without approximations.
A relative projection error metric in foundation-model embedding space predicts the downstream utility of synthetic positive samples for binary classifiers.
Prompts can be split into separate roles for sampling design and recovery modeling in generative compressed sensing, with stable recovery bounds for matched prompts and an explicit penalty for mismatch, validated on Stable Diffusion.
A modified DCGAN with an auxiliary discriminator using the membrane factor generates stable, previously unseen funicular shells optimized for pure compression in three dimensions.
SurFITR is a new collection of 137k+ surveillance-style forged images that causes existing detectors to degrade while enabling substantial gains when used for training in both in-domain and cross-domain settings.
A pre-training diagnostic map based on spectral correlation resemblance to IQP circuits and excess structural complexity identifies suitable datasets like turbulence data for quantum generative models, yielding competitive low-resource performance.
ASTRA is a plug-and-play training-free method for precise multi-subject video editing that uses prompt-guided multimodal alignment and prior-based mask retargeting to avoid attention dilution and boundary issues.
Progressive growing stabilizes GAN training to produce high-resolution images of unprecedented quality and achieves a record unsupervised inception score of 8.80 on CIFAR-10.
Mixed precision training uses FP16 for most computations, FP32 master weights for accumulation, and loss scaling to enable accurate training of large DNNs with halved memory usage.
VFMTok builds a generalist image tokenizer on frozen VFMs using adaptive quantization and semantic alignment, delivering gFID 1.36 for autoregressive and 1.25 for continuous generation on ImageNet with 3x faster convergence.
NeTMY neural fields with annealed encoding, multiscale optimization, and spectrum-fidelity losses achieve superior localization and distributional accuracy in NV-center inverse sensing by using a tensor power-summed dipolar operator that exposes and mitigates center-collapse failures.
CE-FI maps heterogeneous model representations to a shared embedding space via unsupervised training on unlabeled data, enabling privacy-preserving federated inference that outperforms solo models on image classification benchmarks.
A new framework evaluates utility of synthetic mobility trajectories while a membership inference attack reveals privacy vulnerabilities in generative models thought to be safe.
Embedding Arithmetic performs vector operations in the embedding space of T2I models to mitigate bias at inference time, outperforming baselines on diversity while preserving coherence via a new Concept Coherence Score.
FatigueFusion fuses fatigue features in latent space using algorithmic, data-driven, and PINN modules to synthesize novel fatigued motions from non-fatigued joint sequences in an end-to-end pipeline.
Finetuning generative models on limited instance segmentation data produces zero-shot generalization to unseen object categories and styles, matching or exceeding supervised baselines like SAM on ambiguous boundaries.
Scaling noise magnitude in NCE aligns gradients with MLE, enabling a practical approximation that improves performance on CIFAR-10 and ImageNet image modeling with fewer training steps.
MetaGPT embeds human SOPs into LLM prompts to create role-specialized agent teams that produce more coherent solutions on collaborative software engineering tasks than prior chat-based multi-agent systems.
VideoGPT generates competitive natural videos by learning discrete latents with VQ-VAE and modeling them autoregressively with a transformer.
DASCN uses a unified primal-dual GAN architecture to generate semantics-consistent visual features for generalized zero-shot learning, claiming state-of-the-art gains.
citing papers explorer
- Physics-informed, Generative Adversarial Design of Funicular Shells
A modified DCGAN with an auxiliary discriminator using the membrane factor generates stable, previously unseen funicular shells optimized for pure compression in three dimensions.
- One-Step Generative Modeling via Wasserstein Gradient Flows