High-resolution image synthesis with latent diffusion models

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer · 2022

13 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1 · method 1

citation-polarity summary

background 1 · use method 1

representative citing papers

Learning to Track Instance from Single Nature Language Description

cs.CV · 2026-05-08 · unverdicted · novelty 7.0

Tracker is a self-supervised VL tracker that uses a Dynamic Token Aggregation Module to learn instance tracking from single language descriptions in unlabeled videos and outperforms prior self-supervised methods.

C-GenReg: Training-Free 3D Point Cloud Registration by Multi-View-Consistent Geometry-to-Image Generation with Probabilistic Modalities Fusion

cs.CV · 2026-04-17 · unverdicted · novelty 7.0

C-GenReg achieves training-free 3D point cloud registration by generating multi-view-consistent images from geometry, extracting VFM correspondences, and probabilistically fusing them with raw geometric matches for zero-shot performance on indoor and outdoor benchmarks.

NeuroFlow: Toward Unified Visual Encoding and Decoding from Neural Activity

cs.LG · 2026-04-10 · unverdicted · novelty 7.0

NeuroFlow is the first unified flow model for bidirectional visual encoding and decoding from neural activity using NeuroVAE and cross-modal flow matching.

VOSR: A Vision-Only Generative Model for Image Super-Resolution

cs.CV · 2026-04-03 · conditional · novelty 7.0

VOSR shows that competitive generative image super-resolution with faithful structures can be achieved by training a diffusion-style model from scratch on visual data alone, using a vision encoder for guidance and a restoration-oriented sampling strategy.

MultiAnimate: Pose-Guided Image Animation Made Extensible

cs.CV · 2026-02-25 · unverdicted · novelty 7.0

MultiAnimate adds Identifier Assigner and Identifier Adapter modules to diffusion video models so they can handle multiple characters without identity mix-ups, generalizing from two-character training data to more characters.

ATATA: One Algorithm to Align Them All

cs.CV · 2026-01-16 · unverdicted · novelty 7.0

ATATA enables fast joint inference of structurally aligned pairs using Rectified Flow models via segment transport, improving state-of-the-art for image and video generation while matching 3D quality at much higher speed.

One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer

cs.CV · 2025-11-28 · unverdicted · novelty 7.0

One-to-All Animation enables alignment-free character animation and image pose transfer via self-supervised outpainting reformulation, reference extraction, hybrid fusion attention, identity-robust pose control, and token replacement for long videos.

PartDiffuser: Part-wise 3D Mesh Generation via Discrete Diffusion

cs.CV · 2025-11-24 · unverdicted · novelty 7.0

PartDiffuser is a semi-autoregressive discrete diffusion framework that generates high-fidelity 3D meshes from point clouds by combining inter-part autoregression with intra-part parallel diffusion using a part-aware DiT architecture.

GOR-IS: 3D Gaussian Object Removal in the Intrinsic Space

cs.CV · 2026-05-01 · unverdicted · novelty 6.0

GOR-IS removes objects from 3D Gaussian Splatting reconstructions by performing inpainting in an intrinsic decomposition space that explicitly models light transport for consistent global lighting and non-Lambertian surfaces.

EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure

cs.CV · 2026-04-10 · unverdicted · novelty 6.0

EGLOCE erases target concepts in diffusion models at inference time by optimizing latents with dual energy guidance that repels unwanted concepts while retaining prompt alignment.

From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite Images

cs.CV · 2025-12-08 · unverdicted · novelty 6.0

A technique reconstructs large urban areas from sparse extreme off-nadir satellite images by modeling geometry as a Z-monotonic 2.5D height map SDF and applying a generative network to restore plausible textures on the resulting mesh.

GlowGS: Generative Semantic Feature Learning for 3D Gaussian Splatting in Nighttime Glow Scenes

cs.CV · 2026-05-22 · unverdicted · novelty 5.0

GlowGS improves 3D Gaussian Splatting in nighttime glow scenes via semantic feature generation from diffusion models and novel-view semantic learning with vision foundation models.

AHS: Adaptive Head Synthesis via Synthetic Data Augmentations

cs.CV · 2026-04-17 · unverdicted · novelty 4.0

Adaptive Head Synthesis (AHS) employs head-reenacted synthetic data augmentation to enable robust head swapping on full upper-body images without paired training data.

citing papers explorer

Showing 13 of 13 citing papers.

Learning to Track Instance from Single Nature Language Description cs.CV · 2026-05-08 · unverdicted · none · ref 36
C-GenReg: Training-Free 3D Point Cloud Registration by Multi-View-Consistent Geometry-to-Image Generation with Probabilistic Modalities Fusion cs.CV · 2026-04-17 · unverdicted · none · ref 24
NeuroFlow: Toward Unified Visual Encoding and Decoding from Neural Activity cs.LG · 2026-04-10 · unverdicted · none · ref 49
VOSR: A Vision-Only Generative Model for Image Super-Resolution cs.CV · 2026-04-03 · conditional · none · ref 28
MultiAnimate: Pose-Guided Image Animation Made Extensible cs.CV · 2026-02-25 · unverdicted · none · ref 24
ATATA: One Algorithm to Align Them All cs.CV · 2026-01-16 · unverdicted · none · ref 46
One-to-All Animation: Alignment-Free Character Animation and Image Pose Transfer cs.CV · 2025-11-28 · unverdicted · none · ref 39
PartDiffuser: Part-wise 3D Mesh Generation via Discrete Diffusion cs.CV · 2025-11-24 · unverdicted · none · ref 30
GOR-IS: 3D Gaussian Object Removal in the Intrinsic Space cs.CV · 2026-05-01 · unverdicted · none · ref 35
EGLOCE: Training-Free Energy-Guided Latent Optimization for Concept Erasure cs.CV · 2026-04-10 · unverdicted · none · ref 40
From Orbit to Ground: Generative City Photogrammetry from Extreme Off-Nadir Satellite Images cs.CV · 2025-12-08 · unverdicted · none · ref 48
GlowGS: Generative Semantic Feature Learning for 3D Gaussian Splatting in Nighttime Glow Scenes cs.CV · 2026-05-22 · unverdicted · none · ref 50
AHS: Adaptive Head Synthesis via Synthetic Data Augmentations cs.CV · 2026-04-17 · unverdicted · none · ref 46
