TextSculptor supplies an automated data synthesis pipeline yielding 3.2M samples plus a four-task benchmark that raises open-source scene text editing performance.
Ultraedit: Instruction-based fine-grained image editing at scale
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
fields
cs.CV 2years
2026 2verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
VAE-LFA suppresses semantic drift in multi-turn DiT image editing by low-pass filtering latent discrepancies and aligning low-frequency components to an EMA of previous rounds in VAE space.
citing papers explorer
-
TextSculptor: Training and Benchmarking Scene Text Editing
TextSculptor supplies an automated data synthesis pipeline yielding 3.2M samples plus a four-task benchmark that raises open-source scene text editing performance.
-
Why Do DiT Editors Drift? Plug-and-Play Low Frequency Alignment in VAE Latent Space
VAE-LFA suppresses semantic drift in multi-turn DiT image editing by low-pass filtering latent discrepancies and aligning low-frequency components to an EMA of previous rounds in VAE space.