
On buggy resizing libraries and surprising subtleties in FID calculation. arXiv preprint arXiv:2104.11222, 5:14

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1 · method 1

citation-polarity summary

fields

cs.CV 3 · cs.LG 1

representative citing papers

High-Resolution Image Synthesis with Latent Diffusion Models

cs.CV · 2021-12-20 · conditional · novelty 7.0

Latent diffusion models achieve state-of-the-art inpainting and competitive results on unconditional generation, scene synthesis, and super-resolution by performing the diffusion process in the latent space of pretrained autoencoders with cross-attention conditioning, while cutting computational and

Diffusion Models Beat GANs on Image Synthesis

cs.LG · 2021-05-11 · accept · novelty 7.0

Diffusion models with architecture improvements and classifier guidance achieve superior FID scores to GANs on unconditional and conditional ImageNet image synthesis.

Latte: Latent Diffusion Transformer for Video Generation

cs.CV · 2024-01-05 · unverdicted · novelty 6.0

Latte achieves state-of-the-art video generation on FaceForensics, SkyTimelapse, UCF101, and Taichi-HD by using a latent diffusion transformer with four efficient spatial-temporal decomposition variants and best-practice training choices.

citing papers explorer

Showing 4 of 4 citing papers.

  • iTryOn: Mastering Interactive Video Virtual Try-On with Spatial-Semantic Guidance
    cs.CV · 2026-05-20 · unverdicted · none · ref 23

    iTryOn is a video diffusion Transformer that injects spatial 3D hand guidance and semantic action captions to enable interactive garment replacement in videos.

  • High-Resolution Image Synthesis with Latent Diffusion Models
    cs.CV · 2021-12-20 · conditional · none · ref 64

    Latent diffusion models achieve state-of-the-art inpainting and competitive results on unconditional generation, scene synthesis, and super-resolution by performing the diffusion process in the latent space of pretrained autoencoders with cross-attention conditioning, while cutting computational and

  • Diffusion Models Beat GANs on Image Synthesis
    cs.LG · 2021-05-11 · accept · none · ref 45

    Diffusion models with architecture improvements and classifier guidance achieve superior FID scores to GANs on unconditional and conditional ImageNet image synthesis.

  • Latte: Latent Diffusion Transformer for Video Generation
    cs.CV · 2024-01-05 · unverdicted · none · ref 9

    Latte achieves state-of-the-art video generation on FaceForensics, SkyTimelapse, UCF101, and Taichi-HD by using a latent diffusion transformer with four efficient spatial-temporal decomposition variants and best-practice training choices.