Align your latents: High-resolution video synthesis with latent diffusion models

Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis · 2023

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: background 1

representative citing papers

FreeSpec: Training-Free Long Video Generation via Singular-Spectrum Reconstruction

cs.CV · 2026-05-07 · unverdicted · novelty 7.0

FreeSpec uses SVD-based spectral reconstruction to fuse global low-rank and local high-rank features, reducing content drift and preserving temporal dynamics in long video generation.

Immune2V: Image Immunization Against Dual-Stream Image-to-Video Generation

cs.CV · 2026-04-12 · unverdicted · novelty 7.0

Immune2V immunizes images against dual-stream I2V generation by enforcing temporally balanced latent divergence and aligning generative features to a precomputed collapse trajectory, yielding stronger persistent degradation than image-level baselines.

LaWM: Least Action World Models for Long-Horizon Physical Consistency from Visual Observations

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

LaWM induces latent transitions from a learned discrete variational principle rather than an unconstrained neural predictor, yielding improved physical consistency on synthetic dynamics and robot benchmarks.

Intermediate Representations are Strong AI-Generated Image Detectors

cs.CV · 2026-05-05 · unverdicted · novelty 6.0

Intermediate layer embedding sensitivity to perturbations distinguishes AI-generated images from real ones, yielding higher AUROC on GenImage and Forensics Small benchmarks than prior methods.

ST-Gen4D: Embedding 4D Spatiotemporal Cognition into World Model for 4D Generation

cs.CV · 2026-05-08 · unverdicted · novelty 5.0

ST-Gen4D uses a world model that fuses global appearance and local dynamic graphs into a 4D cognition representation to guide consistent 4D Gaussian generation.

The Amazing Stability of Flow Matching

cs.CV · 2026-04-17 · unverdicted · novelty 5.0

Flow matching generative models preserve sample quality, diversity, and latent representations despite pruning 50% of the CelebA-HQ dataset or altering architecture and training configurations.

citing papers explorer

FreeSpec: Training-Free Long Video Generation via Singular-Spectrum Reconstruction · cs.CV · 2026-05-07 · unverdicted · none · ref 1
Immune2V: Image Immunization Against Dual-Stream Image-to-Video Generation · cs.CV · 2026-04-12 · unverdicted · none · ref 22
LaWM: Least Action World Models for Long-Horizon Physical Consistency from Visual Observations · cs.LG · 2026-05-08 · unverdicted · none · ref 2
Intermediate Representations are Strong AI-Generated Image Detectors · cs.CV · 2026-05-05 · unverdicted · none · ref 5
ST-Gen4D: Embedding 4D Spatiotemporal Cognition into World Model for 4D Generation · cs.CV · 2026-05-08 · unverdicted · none · ref 3
The Amazing Stability of Flow Matching · cs.CV · 2026-04-17 · unverdicted · none · ref 4
