A Note on the Inception Score

8 Pith papers cite this work. Polarity classification is still indexing.

abstract

Deep generative models are powerful tools that have produced impressive results in recent years. These advances have been for the most part empirically driven, making it essential that we use high quality evaluation metrics. In this paper, we provide new insights into the Inception Score, a recently proposed and widely used evaluation metric for generative models, and demonstrate that it fails to provide useful guidance when comparing models. We discuss both suboptimalities of the metric itself and issues with its application. Finally, we call for researchers to be more systematic and careful when evaluating and comparing generative models, as the advancement of the field depends upon it.
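For readers unfamiliar with the metric named in the abstract, the Inception Score is commonly defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's label distribution for a generated image x and p(y) is the average of those distributions over all generated images. The sketch below computes that quantity from a list of class-probability rows. The function name `inception_score` and the toy inputs are illustrative assumptions; in practice the probabilities would come from a pretrained Inception classifier applied to generated samples. It illustrates the standard definition only and does not reproduce the paper's analysis.

```python
import math


def inception_score(probs):
    """Compute exp(mean_x KL(p(y|x) || p(y))) from per-image class probabilities.

    probs: list of rows, one per image; each row is p(y|x) and sums to 1.
    (Hypothetical helper for illustration, not code from the paper.)
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal p(y): average of the conditional distributions over all images.
    marginal = [sum(row[j] for row in probs) / n for j in range(k)]
    # Mean KL divergence; terms with p(y|x) = 0 contribute 0 by convention.
    mean_kl = 0.0
    for row in probs:
        mean_kl += sum(
            p * math.log(p / marginal[j]) for j, p in enumerate(row) if p > 0
        )
    mean_kl /= n
    return math.exp(mean_kl)


# Toy inputs (assumed): two images, two classes.
uniform = [[0.5, 0.5], [0.5, 0.5]]      # classifier is maximally unsure
confident = [[1.0, 0.0], [0.0, 1.0]]    # confident and balanced across classes
print(inception_score(uniform), inception_score(confident))
```

Under this definition the score is 1 when every prediction equals the marginal, and it approaches the number of classes (about 2 in the second toy case) when predictions are confident and evenly spread across classes.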

citation-role summary

background: 2 · method: 1 · other: 1

representative citing papers

Diffusion Models Beat GANs on Image Synthesis

cs.LG · 2021-05-11 · accept · novelty 7.0

Diffusion models with architecture improvements and classifier guidance achieve superior FID scores to GANs on unconditional and conditional ImageNet image synthesis.

Generative Recursive Reasoning

cs.AI · 2026-05-19 · unverdicted · novelty 6.0 · 2 refs

GRAM is a latent-variable generative model that performs recursive reasoning via stochastic trajectories, trained with amortized variational inference to support multi-hypothesis reasoning and unconditional generation.

Movie Gen: A Cast of Media Foundation Models

cs.CV · 2024-10-17 · unverdicted · novelty 5.0

A 30B-parameter transformer and related models generate high-quality videos and audio, claiming state-of-the-art results on text-to-video, video editing, personalization, and audio generation tasks.

Image-to-Video Diffusion: From Foundations to Open Frontiers

cs.CV · 2026-05-17 · unverdicted · novelty 3.0

A survey that organizes diffusion image-to-video methods into a taxonomy, distills core designs in condition encoding, temporal modeling, noise prior, and upsampling, and discusses applications plus challenges.

citing papers explorer

Showing 8 of 8 citing papers.

  • OccDirector: Language-Guided Behavior and Interaction Generation in 4D Occupancy Space
    cs.CV · 2026-04-24 · unverdicted · none · ref 1

    OccDirector uses a VLM-guided Spatio-Temporal MMDiT model with history anchoring to generate physically plausible 4D occupancy from language scripts, supported by the new OccInteract-85k dataset.

  • Diffusion Models Beat GANs on Image Synthesis
    cs.LG · 2021-05-11 · accept · none · ref 3

    Diffusion models with architecture improvements and classifier guidance achieve superior FID scores to GANs on unconditional and conditional ImageNet image synthesis.

  • Large Scale GAN Training for High Fidelity Natural Image Synthesis
    cs.LG · 2018-09-28 · accept · none · ref 1

    BigGANs achieve state-of-the-art class-conditional synthesis on ImageNet 128x128 with Inception Score 166.5 and FID 7.4 by scaling GANs and applying orthogonal regularization plus truncation.

  • Generative Recursive Reasoning
    cs.AI · 2026-05-19 · unverdicted · none · ref 42 · 2 links · internal anchor

    GRAM is a latent-variable generative model that performs recursive reasoning via stochastic trajectories, trained with amortized variational inference to support multi-hypothesis reasoning and unconditional generation.

  • Speculative Coupled Decoding for Training-Free Lossless Acceleration of Autoregressive Visual Generation
    cs.CV · 2025-10-28 · unverdicted · none · ref 5 · internal anchor

    Speculative Coupled Decoding stabilizes draft sampling in Speculative Jacobi Decoding via an information-theoretic coupling step, delivering up to 4.2x image and 13.6x video speedups with no quality loss or training.

  • TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders
    cs.CV · 2026-04-08 · unverdicted · none · ref 1

    TC-AE improves reconstruction and generative performance in deep compression by decomposing token-to-latent compression into two stages and using joint self-supervised training.

  • Movie Gen: A Cast of Media Foundation Models
    cs.CV · 2024-10-17 · unverdicted · none · ref 4

    A 30B-parameter transformer and related models generate high-quality videos and audio, claiming state-of-the-art results on text-to-video, video editing, personalization, and audio generation tasks.

  • Image-to-Video Diffusion: From Foundations to Open Frontiers
    cs.CV · 2026-05-17 · unverdicted · none · ref 124 · internal anchor

    A survey that organizes diffusion image-to-video methods into a taxonomy, distills core designs in condition encoding, temporal modeling, noise prior, and upsampling, and discusses applications plus challenges.