TextDiffuser: Diffusion Models as Text Painters

Chen, J · 2023 · arXiv 2305.10855

7 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset: 1

citation-polarity summary

use dataset: 1

representative citing papers

SVGDreamer: Text Guided SVG Generation with Diffusion Model

cs.CV · 2023-12-27 · unverdicted · novelty 7.0

SVGDreamer introduces semantic-driven image vectorization (SIVE) and vectorized particle-based score distillation (VPSD) to produce editable, high-quality, diverse SVGs from text.

Training-Free Occluded Text Rendering via Glyph Priors and Attention-Guided Semantic Blending

cs.CV · 2026-05-16 · unverdicted · novelty 6.0

A restarted dual-stream inference approach with glyph priors and attention-guided masks improves occluded text rendering in pretrained diffusion models without fine-tuning.

StyleTextGen: Style-Conditioned Multilingual Scene Text Generation

cs.CV · 2026-05-14 · unverdicted · novelty 6.0

StyleTextGen proposes a dual-branch style encoder, text style consistency loss, and mask-guided inference to achieve superior style consistency and cross-lingual performance in multilingual scene text generation on a new bilingual benchmark.

FontFusion: Enhancing Generative Text in Diffusion Models with Typographic Conditioning

cs.CV · 2026-06-04 · unverdicted · novelty 5.0

FontFusion adds hierarchical token conditioning, position-aware embeddings, and multi-level dropping to DiT diffusion models, yielding 76% relative gains on decorative fonts and 68-76% consistency improvements via a dual DeepFont+DINOv2 encoder.

Evaluating Reasoning Fidelity in Visual Text Generation

cs.CV · 2026-06-03 · unverdicted · novelty 5.0

T2I models frequently exhibit semantic errors, logical inconsistencies, and incorrect reasoning steps in visual text generation tasks, unlike text-only models.

Decomposing Subject-Driven Image Generation via Intermediate Structural Prediction

cs.CV · 2026-05-20 · unverdicted · novelty 5.0

A two-stage method predicts an intermediate Canny map for structure then renders the image conditioned on appearance and structure, paired with a 100k text-aware dataset, to improve detail preservation in subject-driven generation.

FineEdit: Fine-Grained Image Edit with Bounding Box Guidance

cs.CV · 2026-04-13 · unverdicted · novelty 5.0

FineEdit adds multi-level bounding box injection to diffusion image editing, releases a 1.2M-pair dataset with box annotations, and shows better instruction following and background consistency than prior open models on new and existing benchmarks.

citing papers explorer

Showing 7 of 7 citing papers.

SVGDreamer: Text Guided SVG Generation with Diffusion Model · cs.CV · 2023-12-27 · unverdicted · none · ref 2
SVGDreamer introduces semantic-driven image vectorization (SIVE) and vectorized particle-based score distillation (VPSD) to produce editable, high-quality, diverse SVGs from text.
Training-Free Occluded Text Rendering via Glyph Priors and Attention-Guided Semantic Blending · cs.CV · 2026-05-16 · unverdicted · none · ref 3
A restarted dual-stream inference approach with glyph priors and attention-guided masks improves occluded text rendering in pretrained diffusion models without fine-tuning.
StyleTextGen: Style-Conditioned Multilingual Scene Text Generation · cs.CV · 2026-05-14 · unverdicted · none · ref 5
StyleTextGen proposes a dual-branch style encoder, text style consistency loss, and mask-guided inference to achieve superior style consistency and cross-lingual performance in multilingual scene text generation on a new bilingual benchmark.
FontFusion: Enhancing Generative Text in Diffusion Models with Typographic Conditioning · cs.CV · 2026-06-04 · unverdicted · none · ref 4
FontFusion adds hierarchical token conditioning, position-aware embeddings, and multi-level dropping to DiT diffusion models, yielding 76% relative gains on decorative fonts and 68-76% consistency improvements via a dual DeepFont+DINOv2 encoder.
Evaluating Reasoning Fidelity in Visual Text Generation · cs.CV · 2026-06-03 · unverdicted · none · ref 3
T2I models frequently exhibit semantic errors, logical inconsistencies, and incorrect reasoning steps in visual text generation tasks, unlike text-only models.
Decomposing Subject-Driven Image Generation via Intermediate Structural Prediction · cs.CV · 2026-05-20 · unverdicted · none · ref 4
A two-stage method predicts an intermediate Canny map for structure then renders the image conditioned on appearance and structure, paired with a 100k text-aware dataset, to improve detail preservation in subject-driven generation.
FineEdit: Fine-Grained Image Edit with Bounding Box Guidance · cs.CV · 2026-04-13 · unverdicted · none · ref 7
FineEdit adds multi-level bounding box injection to diffusion image editing, releases a 1.2M-pair dataset with box annotations, and shows better instruction following and background consistency than prior open models on new and existing benchmarks.
