DiffSketcher: Text Guided Vector Sketch Synthesis through Latent Diffusion Models

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it
abstract

We demonstrate that pre-trained text-to-image diffusion models, despite being trained on raster images, possess a remarkable capacity to guide vector sketch synthesis. In this paper, we introduce DiffSketcher, a novel algorithm for generating vectorized free-hand sketches directly from natural language prompts. Our method optimizes a set of Bézier curves via an extended Score Distillation Sampling (SDS) loss, successfully bridging a raster-level diffusion prior with a parametric vector generator. To further accelerate the generation process, we propose a stroke initialization strategy driven by the diffusion model's intrinsic attention maps. Results show that DiffSketcher produces sketches across varying levels of abstraction while maintaining the structural integrity and essential visual details of the subject. Experiments confirm that our approach yields superior perceptual quality and controllability over existing methods. The code and demo are available at https://ximinng.github.io/DiffSketcher-project/
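
The abstract mentions two ingredients that are easy to picture in code: strokes parameterized as Bézier curves, and stroke initialization guided by attention maps. The sketch below is only an illustration under stated assumptions, not the authors' implementation. It assumes cubic curves (the abstract does not state the degree) and assumes that start positions are drawn with probability proportional to the attention weight. The function names `cubic_bezier` and `sample_stroke_starts` are hypothetical, and the SDS optimization itself is not shown.

```python
import random


def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1].

    Assumption: cubic degree is chosen for illustration only.
    Each control point is an (x, y) tuple.
    """
    u = 1.0 - t
    # Bernstein basis weights for degree 3.
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def sample_stroke_starts(attention, num_strokes, seed=0):
    """Sample (row, col) start cells for strokes from an attention map.

    Assumption: cells are drawn with probability proportional to their
    attention weight; the paper's actual strategy may differ.
    `attention` is a list of rows of non-negative numbers.
    """
    rng = random.Random(seed)
    cells = [(r, c) for r, row in enumerate(attention) for c in range(len(row))]
    weights = [w for row in attention for w in row]
    return rng.choices(cells, weights=weights, k=num_strokes)


# Toy 3x3 attention map: all mass sits on two cells.
toy_attention = [
    [0.0, 0.0, 0.0],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 0.0],
]
starts = sample_stroke_starts(toy_attention, num_strokes=8)
midpoint = cubic_bezier((0, 0), (0, 1), (1, 1), (1, 0), 0.5)
```

In a full pipeline the sampled cells would seed the control points of each curve, and those control points would then be updated by gradient steps on the SDS loss through a differentiable rasterizer.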

fields

cs.AI 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

PhyDrawGen: Physically Grounded Diagram Generation from Natural Language

cs.AI · 2026-05-28 · unverdicted · novelty 5.0

PhyDrawGen is a neuro-symbolic pipeline that extracts typed scene graphs with an LLM, converts them to physically constrained PSLGs with a deterministic solver, and refines them with a fine-tuned Qwen-VL model. The authors claim it outperforms GPT-5-image and Gemini models on 1,449 physics problems.

citing papers explorer

Showing 1 of 1 citing paper.

  • PhyDrawGen: Physically Grounded Diagram Generation from Natural Language cs.AI · 2026-05-28 · unverdicted · none · ref 35 · internal anchor