A Note on the Inception Score

Shane Barratt, Rishi Sharma · 2018 · stat.ML · arXiv 1801.01973

12 Pith papers cite this work. Polarity classification is still indexing.

abstract

Deep generative models are powerful tools that have produced impressive results in recent years. These advances have been for the most part empirically driven, making it essential that we use high quality evaluation metrics. In this paper, we provide new insights into the Inception Score, a recently proposed and widely used evaluation metric for generative models, and demonstrate that it fails to provide useful guidance when comparing models. We discuss both suboptimalities of the metric itself and issues with its application. Finally, we call for researchers to be more systematic and careful when evaluating and comparing generative models, as the advancement of the field depends upon it.
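
For readers unfamiliar with the metric, the Inception Score is commonly defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's predicted label distribution for a generated sample x and p(y) is the average of those predictions over all samples. The following is a minimal sketch in pure Python, assuming the per-sample class probabilities have already been computed (in practice they come from a pretrained Inception network). The function name `inception_score` and the toy inputs are illustrative assumptions, not code from the paper.

```python
import math


def inception_score(probs):
    """Inception Score from per-sample class probabilities p(y|x).

    probs: list of rows, each row a probability distribution over classes.
    Hypothetical helper for illustration; real evaluations obtain these rows
    from a pretrained Inception classifier applied to generated images.
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y): the mean prediction over all samples.
    marginal = [sum(row[k] for row in probs) / n for k in range(num_classes)]
    # Mean KL(p(y|x) || p(y)); terms with p == 0 contribute nothing.
    mean_kl = 0.0
    for row in probs:
        mean_kl += sum(
            p * math.log(p / q) for p, q in zip(row, marginal) if p > 0.0
        )
    mean_kl /= n
    return math.exp(mean_kl)


# Toy inputs (made up): confident and diverse predictions vs. uniform ones.
confident_diverse = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
uninformative = [[1 / 3, 1 / 3, 1 / 3]] * 3

score_high = inception_score(confident_diverse)
score_low = inception_score(uninformative)
```

With these toy inputs, confident predictions spread evenly over three classes give a score close to the number of classes, while uniform predictions give a score close to 1, which is the lower bound of the metric.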

citation-role summary

background 2 · method 1 · other 1

citation-polarity summary

background 2 · unclear 1 · use method 1

representative citing papers

OccDirector: Language-Guided Behavior and Interaction Generation in 4D Occupancy Space

cs.CV · 2026-04-24 · unverdicted · novelty 7.0

OccDirector uses a VLM-guided Spatio-Temporal MMDiT model with history anchoring to generate physically plausible 4D occupancy from language scripts, supported by the new OccInteract-85k dataset.

Diffusion Models Beat GANs on Image Synthesis

cs.LG · 2021-05-11 · accept · novelty 7.0

Diffusion models with architecture improvements and classifier guidance achieve superior FID scores to GANs on unconditional and conditional ImageNet image synthesis.

Large Scale GAN Training for High Fidelity Natural Image Synthesis

cs.LG · 2018-09-28 · accept · novelty 7.0

BigGANs achieve state-of-the-art class-conditional synthesis on ImageNet 128x128 with Inception Score 166.5 and FID 7.4 by scaling GANs and applying orthogonal regularization plus truncation.

Optimizing Visual Generative Models via Distribution-wise Rewards

cs.LG · 2026-07-02 · unverdicted · novelty 6.0

Distribution-wise rewards with subset-replace strategy and post-hoc merging improve FID-50K on SiT (8.30 to 5.77) and EDM2 (3.74 to 3.52) while preserving diversity.

Post-Training Pruning for Diffusion Transformers

cs.CV · 2026-07-01 · unverdicted · novelty 6.0

DiT-Pruning introduces an energy-based saliency metric balancing weights and activations plus clustering-aware granularity for post-training pruning of DiTs, showing near-zero CLIP score degradation at 50% sparsity on FLUX.1-dev.

Diffusion Fine-tuning with Rewarded Moment Matching Distillation

cs.LG · 2026-06-29 · unverdicted · novelty 6.0

RMMD simultaneously distills diffusion models and optimizes rewards, yielding better FID-reward trade-offs on ImageNet than DI++, DRaFT and HyperNoise, and a 7.5x faster GenCast model that beats its teacher on 93% of weather variables while improving calibration.

UAT: Unified Audio-Text Diffusion for Audio Generation, Editing, and Captioning

eess.AS · 2026-06-03 · unverdicted · novelty 6.0

UAT presents a diffusion-centric framework coupling continuous latent diffusion for audio with masked discrete diffusion for text in a shared dual-stream backbone to enable unified generation, editing, and captioning.

Generative Recursive Reasoning

cs.AI · 2026-05-19 · unverdicted · novelty 6.0 · 2 refs

GRAM is a latent-variable generative model that performs recursive reasoning via stochastic trajectories, trained with amortized variational inference to support multi-hypothesis reasoning and unconditional generation.

Speculative Coupled Decoding for Training-Free Lossless Acceleration of Autoregressive Visual Generation

cs.CV · 2025-10-28 · unverdicted · novelty 6.0

Speculative Coupled Decoding stabilizes draft sampling in Speculative Jacobi Decoding via an information-theoretic coupling step, delivering up to 4.2x image and 13.6x video speedups with no quality loss or training.

TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders

cs.CV · 2026-04-08 · unverdicted · novelty 6.0

TC-AE improves reconstruction and generative performance in deep compression by decomposing token-to-latent compression into two stages and using joint self-supervised training.

Movie Gen: A Cast of Media Foundation Models

cs.CV · 2024-10-17 · unverdicted · novelty 5.0

A 30B-parameter transformer and related models generate high-quality videos and audio, claiming state-of-the-art results on text-to-video, video editing, personalization, and audio generation tasks.

Image-to-Video Diffusion: From Foundations to Open Frontiers

cs.CV · 2026-05-17 · unverdicted · novelty 3.0

A survey that organizes diffusion image-to-video methods into a taxonomy, distills core designs in condition encoding, temporal modeling, noise prior, and upsampling, and discusses applications plus challenges.
