Adding conditional control to text-to-image diffusion models

Lvmin Zhang, Anyi Rao, Maneesh Agrawala · 2023

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

StyleText: A Large-Scale Dataset and Benchmark for Stylized Scene Text Inpainting

cs.CV · 2026-05-17 · unverdicted · novelty 7.0

StyleText is a new large-scale dataset and benchmark for stylized scene text inpainting, constructed via an automated pipeline and paired with a FluxFill+LoRA baseline that improves OCR accuracy.

Diff4Splat: Controllable 4D Scene Generation with Latent Dynamic Reconstruction Models

cs.CV · 2025-11-01 · unverdicted · novelty 6.0

A feed-forward video latent transformer that predicts time-varying 3D Gaussian primitives from one image to produce controllable 4D scenes with appearance, geometry, and motion.

citing papers explorer

Showing 2 of 2 citing papers.

StyleText: A Large-Scale Dataset and Benchmark for Stylized Scene Text Inpainting

cs.CV · 2026-05-17 · unverdicted · none · ref 24

Diff4Splat: Controllable 4D Scene Generation with Latent Dynamic Reconstruction Models

cs.CV · 2025-11-01 · unverdicted · none · ref 106
