Stroke of Surprise: Progressive Semantic Illusions in Vector Sketching
Pith reviewed 2026-05-21 12:40 UTC · model grok-4.3
The pith
Stroke of Surprise is a framework that generates vector sketches undergoing semantic transformation from one concept to another by adding strokes, using dual-branch SDS and overlay loss for optimization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Extensive experiments demonstrate that our method significantly outperforms state-of-the-art baselines in recognizability and illusion strength, successfully expanding visual anagrams from the spatial to the temporal dimension.
Load-bearing premise
The dual-constraint that initial prefix strokes must form a coherent object (e.g., a duck) while simultaneously serving as the structural foundation for a second concept (e.g., a sheep) upon adding delta strokes, with the existence of a discoverable common structural subspace.
Figures
read the original abstract
Visual illusions traditionally rely on spatial manipulations such as multi-view consistency. In this work, we introduce Progressive Semantic Illusions, a novel vector sketching task where a single sketch undergoes a dramatic semantic transformation through the sequential addition of strokes. We present Stroke of Surprise, a generative framework that optimizes vector strokes to satisfy distinct semantic interpretations at different drawing stages. The core challenge lies in the "dual-constraint": initial prefix strokes must form a coherent object (e.g., a duck) while simultaneously serving as the structural foundation for a second concept (e.g., a sheep) upon adding delta strokes. To address this, we propose a sequence-aware joint optimization framework driven by a dual-branch Score Distillation Sampling (SDS) mechanism. Unlike sequential approaches that freeze the initial state, our method dynamically adjusts prefix strokes to discover a "common structural subspace" valid for both targets. Furthermore, we introduce a novel Overlay Loss that enforces spatial complementarity, ensuring structural integration rather than occlusion. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art baselines in recognizability and illusion strength, successfully expanding visual anagrams from the spatial to the temporal dimension. Project page: https://stroke-of-surprise.github.io/
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Progressive Semantic Illusions as a new vector sketching task in which a single sketch undergoes semantic transformation via sequential stroke addition. It proposes the Stroke of Surprise framework, which employs sequence-aware joint optimization driven by a dual-branch Score Distillation Sampling (SDS) mechanism to discover a common structural subspace satisfying two distinct semantic targets (e.g., prefix strokes forming a duck that later supports a sheep), together with a novel Overlay Loss to enforce spatial complementarity rather than occlusion. The authors claim that extensive experiments show the method significantly outperforms state-of-the-art baselines in recognizability and illusion strength, thereby extending visual anagrams from the spatial to the temporal domain.
Significance. If the experimental claims hold, the work would be a meaningful contribution to generative computer vision by formalizing and solving a temporal extension of visual anagrams. The dual-branch SDS and Overlay Loss constitute concrete technical advances for handling the dual-constraint problem, and the absence of free parameters in the core optimization is a positive feature. The approach could influence downstream applications in creative tools and interactive illustration if the progressive coherence is robustly demonstrated.
major comments (2)
- [Abstract and §3] Abstract and §3 (dual-branch SDS and dual-constraint description): the central claim that prefix strokes achieve independent coherence for the first concept while serving as foundation for the second relies on joint optimization; no explicit loss term (beyond the final Overlay Loss) is described that enforces standalone recognizability of the prefix alone. This leaves open the possibility that observed success arises from simultaneous gradient balancing rather than discovery of a truly independent common structural subspace, which is load-bearing for the dual-constraint formulation.
- [§4] §4 (experiments): the strongest claim of significant outperformance in recognizability and illusion strength is presented without reference to specific quantitative metrics, ablation tables isolating the contribution of the dual-branch versus sequential freezing, or user-study protocols for illusion strength. Concrete results (e.g., Table X or Figure Y) are required to substantiate the temporal-expansion claim.
minor comments (2)
- [§3.1] Clarify the precise definition and parameterization of 'delta strokes' and the temporal sequencing mechanism to ensure reproducibility of the progressive addition process.
- [§5] Add a short discussion of failure modes, such as cases where no common structural subspace exists for the chosen concept pairs.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript to improve clarity on the optimization mechanism and to provide explicit experimental details and references.
read point-by-point responses
-
Referee: [Abstract and §3] Abstract and §3 (dual-branch SDS and dual-constraint description): the central claim that prefix strokes achieve independent coherence for the first concept while serving as foundation for the second relies on joint optimization; no explicit loss term (beyond the final Overlay Loss) is described that enforces standalone recognizability of the prefix alone. This leaves open the possibility that observed success arises from simultaneous gradient balancing rather than discovery of a truly independent common structural subspace, which is load-bearing for the dual-constraint formulation.
Authors: We appreciate the referee highlighting this point of potential ambiguity. The dual-branch SDS applies the first semantic target’s score distillation loss exclusively to the prefix strokes (enforcing standalone coherence for the initial concept) while the second branch applies the loss to the full stroke sequence. Joint optimization then discovers the common structural subspace by allowing prefix adjustments under both constraints simultaneously. We agree the description could be more explicit and have added a clarifying subsection in §3 that isolates the contribution of each SDS branch to the dual constraints, along with an updated abstract sentence referencing this mechanism. revision: yes
-
Referee: [§4] §4 (experiments): the strongest claim of significant outperformance in recognizability and illusion strength is presented without reference to specific quantitative metrics, ablation tables isolating the contribution of the dual-branch versus sequential freezing, or user-study protocols for illusion strength. Concrete results (e.g., Table X or Figure Y) are required to substantiate the temporal-expansion claim.
Authors: We thank the referee for noting the need for clearer substantiation. The submitted manuscript contains quantitative recognizability metrics (CLIP similarity), an ablation comparing dual-branch SDS to sequential freezing, and a user study on illusion strength, but cross-references were insufficient. We have revised §4 to explicitly cite Table 2 (quantitative results), Figure 5 (ablation isolating dual-branch contribution), and the supplementary material (user-study protocol with 100 participants and pairwise comparison design). These additions directly support the outperformance claims and the temporal extension of visual anagrams. revision: yes
Circularity Check
No circularity: derivation extends SDS via independent dual-branch optimization and overlay loss without reducing to self-defined inputs.
full rationale
The paper's core contribution is a sequence-aware joint optimization using a dual-branch SDS mechanism plus a novel Overlay Loss to enforce spatial complementarity for progressive semantic illusions. This extends prior SDS work with new components (dual-branch, overlay) to address the dual-constraint of prefix strokes being coherent for both initial and final concepts. No equations or claims reduce by construction to fitted parameters renamed as predictions, self-citations that bear the load of uniqueness, or ansatzes smuggled from prior author work. The method is presented as an engineering extension validated by experiments, remaining self-contained against external benchmarks like standard SDS baselines.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Existence of a common structural subspace valid for both semantic targets that can be discovered by dynamic adjustment of prefix strokes
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
sequence-aware joint optimization framework driven by a dual-branch Score Distillation Sampling (SDS) mechanism … novel Overlay Loss that enforces spatial complementarity
-
IndisputableMonolith/Foundation/AlexanderDuality.leanalexander_duality_circle_linking unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
discover a 'common structural subspace' valid for both targets
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Cose: Compositional stroke embeddings
Emre Aksan, Thomas Deselaers, Andrea Tagliasacchi, and Otmar Hilliges. Cose: Compositional stroke embeddings. Advances in Neural Information Processing Systems, 33: 10041–10052, 2020. 2
work page 2020
-
[2]
Abstracting sketches through simple primitives
Stephan Alaniz, Massimiliano Mancini, Anjan Dutta, Diego Marcos, and Zeynep Akata. Abstracting sketches through simple primitives. InEuropean Conference on Computer Vision, pages 396–412. Springer, 2022. 3
work page 2022
-
[3]
As-rigid- as-possible shape interpolation
Marc Alexa, Daniel Cohen-Or, and David Levin. As-rigid- as-possible shape interpolation. InSeminal Graphics Papers: Pushing the Boundaries, Volume 2, pages 165–172. 2023. 3
work page 2023
-
[4]
Swiftsketch: A diffusion model for image- to-vector sketch generation
Ellie Arar, Yarden Frenkel, Daniel Cohen-Or, Ariel Shamir, and Yael Vinker. Swiftsketch: A diffusion model for image- to-vector sketch generation. InProceedings of the Special Interest Group on Computer Graphics and Interactive Tech- niques Conference Conference Papers, pages 1–12, 2025. 2
work page 2025
-
[5]
Break-a-scene: Extracting multiple concepts from a single image
Omri Avrahami, Kfir Aberman, Ohad Fried, Daniel Cohen- Or, and Dani Lischinski. Break-a-scene: Extracting multiple concepts from a single image. InSIGGRAPH Asia 2023 Conference Papers, pages 1–12, 2023. 3
work page 2023
-
[6]
4d-fy: Text-to-4d generation using hybrid score distillation sampling
Sherwin Bahmani, Ivan Skorokhodov, Victor Rong, Gordon Wetzstein, Leonidas Guibas, Peter Wonka, Sergey Tulyakov, Jeong Joon Park, Andrea Tagliasacchi, and David B Lindell. 4d-fy: Text-to-4d generation using hybrid score distillation sampling. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7996–8006,
-
[7]
Sketchinr: A first look into sketches as implicit neural representations
Hmrishav Bandyopadhyay, Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Tao Xiang, Timothy Hospedales, and Yi-Zhe Song. Sketchinr: A first look into sketches as implicit neural representations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12565–12574, 2024. 3
work page 2024
-
[8]
Feature-based image metamorphosis
Thaddeus Beier and Shawn Neely. Feature-based image metamorphosis. InSeminal Graphics Papers: Pushing the Boundaries, Volume 2, pages 529–536. 2023. 3
work page 2023
-
[9]
How renault uses numerical control for car body design and tooling
Pierre E B´ezier. How renault uses numerical control for car body design and tooling. Technical report, SAE Technical Paper, 1968. 3
work page 1968
-
[10]
#$%&: 33.0; CLIP'()): 31.7; CLIP*#)+,: 16.1ΦIR!
Ayan Kumar Bhunia, Ayan Das, Umar Riaz Muhammad, Yongxin Yang, Timothy M Hospedales, Tao Xiang, Yulia Gryaditskaya, and Yi-Zhe Song. Pixelor: A competitive 8 bear cat chicken dog cow angel dolphin peacock horse monkey lighthouse firefighter fox cow koala horse rabbit greekstatue sheep pig dog detective flamingo giraffe Figure 14.Additional 2-phase progres...
work page 2020
-
[11]
Doodleformer: Creative sketch drawing with transformers
Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Jorma Laak- sonen, and Michael Felsberg. Doodleformer: Creative sketch drawing with transformers. InEuropean Conference on Computer Vision, pages 338–355. Springer, 2022. 2
work page 2022
-
[12]
Sketch2saliency: Learning to detect salient objects from human drawings
Ayan Kumar Bhunia, Subhadeep Koley, Amandeep Ku- mar, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, and Yi-Zhe Song. Sketch2saliency: Learning to detect salient objects from human drawings. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2733–2743, 2023. 3
work page 2023
-
[13]
Irving Biederman. Recognition-by-components: a theory of human image understanding.Psychological review, 94(2): 115, 1987. 3
work page 1987
-
[14]
Surface versus edge-based determinants of visual recognition.Cognitive psychology, 20(1):38–64, 1988
Irving Biederman and Ginny Ju. Surface versus edge-based determinants of visual recognition.Cognitive psychology, 20(1):38–64, 1988. 3
work page 1988
-
[15]
Diffusion illusions: Hiding images in plain sight
Ryan Burgert, Xiang Li, Abe Leite, Kanchana Ranasinghe, and Michael Ryoo. Diffusion illusions: Hiding images in plain sight. InACM SIGGRAPH 2024 Conference Papers, pages 1–11, 2024. 3
work page 2024
-
[16]
A computational approach to edge detection
John Canny. A computational approach to edge detection. IEEE Transactions on pattern analysis and machine intelli- gence, (6):679–698, 2009. 2 10 rabbit elephant chicken monkey Figure 19.Extension on variable-width B- spline. rabbit horse pikachu sunflower Figure 20.Extension on vector graph. carrot rabbit apple pig Figure 21.Extension on colored strokes
work page 2009
-
[17]
Alexandre Carlier, Martin Danelljan, Alexandre Alahi, and Radu Timofte. Deepsvg: A hierarchical generative network for vector graphics animation.Advances in Neural Informa- tion Processing Systems, 33:16351–16361, 2020. 2
work page 2020
-
[18]
The artist as neuroscientist.Nature, 434 (7031):301–307, 2005
Patrick Cavanagh. The artist as neuroscientist.Nature, 434 (7031):301–307, 2005. 3
work page 2005
-
[19]
Lookingglass: Generative anamor- phoses via laplacian pyramid warping
Pascal Chang, Sergio Sancho, Jingwei Tang, Markus Gross, and Vinicius Azevedo. Lookingglass: Generative anamor- phoses via laplacian pyramid warping. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 24–33, 2025. 3
work page 2025
-
[20]
Chen-Hao Chao, Wei-Fang Sun, Bo-Wun Cheng, Yi-Chen Lo, Chia-Che Chang, Yu-Lun Liu, Yu-Lin Chang, Chia- Ping Chen, and Chun-Yi Lee. Denoising likelihood score matching for conditional score-based data generation.arXiv preprint arXiv:2203.14206, 2022. 3
-
[21]
Hila Chefer, Yuval Alaluf, Yael Vinker, Lior Wolf, and Daniel Cohen-Or. Attend-and-excite: Attention-based se- mantic guidance for text-to-image diffusion models.ACM transactions on Graphics (TOG), 42(4):1–10, 2023. 3
work page 2023
-
[22]
Svgbuilder: Component-based colored svg generation with text-guided autoregressive trans- formers
Zehao Chen and Rong Pan. Svgbuilder: Component-based colored svg generation with text-guided autoregressive trans- formers. InProceedings of the AAAI Conference on Artificial Intelligence, pages 2358–2366, 2025. 2
work page 2025
-
[23]
Images that sound: Composing images and sounds on a single canvas
Ziyang Chen, Daniel Geng, and Andrew Owens. Images that sound: Composing images and sounds on a single canvas. Advances in Neural Information Processing Systems, 37: 85045–85073, 2024. 3
work page 2024
-
[24]
Hung-Kuo Chu, Wei-Hsin Hsu, Niloy J Mitra, Daniel Cohen- Or, Tien-Tsin Wong, and Tong-Yee Lee. Camouflage images. ACM Trans. Graph., 29(4):51–1, 2010. 3
work page 2010
-
[25]
B´eziersketch: A generative model for scal- able vector sketches
Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, and Yi-Zhe Song. B´eziersketch: A generative model for scal- able vector sketches. InEuropean conference on computer vision, pages 632–647. Springer, 2020. 2
work page 2020
-
[26]
Drawing ap- prentice: An enactive co-creative agent for artistic collabora- tion
Nicholas Davis, Chih-PIn Hsiao, Kunwar Yashraj Singh, Lisa Li, Sanat Moningi, and Brian Magerko. Drawing ap- prentice: An enactive co-creative agent for artistic collabora- tion. InProceedings of the 2015 ACM SIGCHI Conference on Creativity and Cognition, pages 185–186, 2015. 3
work page 2015
-
[27]
Outillages m´ethodes calcul.Andr e Citro en Automobiles SA, Paris, 4:25, 1959
Paul De Casteljau. Outillages m´ethodes calcul.Andr e Citro en Automobiles SA, Paris, 4:25, 1959. 3
work page 1959
-
[28]
Rasp: Revisiting 3d anamor- phic art for shadow-guided packing of irregular objects
Soumyaratna Debnath, Ashish Tiwari, Kaustubh Sadekar, and Shanmuganathan Raman. Rasp: Revisiting 3d anamor- phic art for shadow-guided packing of irregular objects. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 5849–5858, 2025. 3
work page 2025
-
[29]
How do humans sketch objects?ACM Transactions on graphics (TOG), 31 (4):1–10, 2012
Mathias Eitz, James Hays, and Marc Alexa. How do humans sketch objects?ACM Transactions on graphics (TOG), 31 (4):1–10, 2012. 2, 3
work page 2012
-
[30]
Drawing as a versatile cognitive tool.Nature Reviews Psychology, 2(9):556–568, 2023
Judith E Fan, Wilma A Bainbridge, Rebecca Chamberlain, and Jeffrey D Wammes. Drawing as a versatile cognitive tool.Nature Reviews Psychology, 2(9):556–568, 2023. 3
work page 2023
-
[31]
Illusion3d: 3d mul- tiview illusion with 2d diffusion priors.arXiv preprint arXiv:2412.09625, 2024
Yue Feng, Vaibhav Sanjay, Spencer Lutz, Badour AlBa- har, Songwei Ge, and Jia-Bin Huang. Illusion3d: 3d mul- tiview illusion with 2d diffusion priors.arXiv preprint arXiv:2412.09625, 2024. 3
-
[32]
Kevin Frans, Lisa Soros, and Olaf Witkowski. Clipdraw: Ex- ploring text-to-drawing synthesis through language-image encoders.Advances in Neural Information Processing Sys- tems, 35:5207–5218, 2022. 2
work page 2022
-
[33]
Xiang Gao, Shuai Yang, and Jiaying Liu. Ptdiffusion: Free lunch for generating optical illusion hidden pictures with phase-transferred diffusion model. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 18240–18249, 2025. 3
work page 2025
-
[34]
Creative sketch generation.arXiv preprint arXiv:2011.10039, 2020
Songwei Ge, Vedanuj Goswami, C Lawrence Zitnick, and Devi Parikh. Creative sketch generation.arXiv preprint arXiv:2011.10039, 2020. 2
-
[35]
Factorized diffusion: Perceptual illusions by noise decomposition
Daniel Geng, Inbum Park, and Andrew Owens. Factorized diffusion: Perceptual illusions by noise decomposition. In European Conference on Computer Vision, pages 366–384. Springer, 2024. 3
work page 2024
-
[36]
Visual ana- grams: Generating multi-view optical illusions with diffu- sion models
Daniel Geng, Inbum Park, and Andrew Owens. Visual ana- grams: Generating multi-view optical illusions with diffu- sion models. InProceedings of the IEEE/CVF Conference 11 on Computer Vision and Pattern Recognition, pages 24154– 24163, 2024. 2, 3, 5
work page 2024
-
[37]
Draw: A recurrent neural network for image generation
Karol Gregor, Ivo Danihelka, Alex Graves, Danilo Rezende, and Daan Wierstra. Draw: A recurrent neural network for image generation. InInternational conference on machine learning, pages 1462–1471. PMLR, 2015. 2
work page 2015
-
[38]
Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shun- ing Chang, Weijia Wu, et al. Mix-of-show: Decentralized low-rank adaptation for multi-concept customization of dif- fusion models.Advances in Neural Information Processing Systems, 36:15890–15902, 2023. 3
work page 2023
-
[39]
A Neural Representation of Sketch Drawings
David Ha and Douglas Eck. A neural representation of sketch drawings.arXiv preprint arXiv:1704.03477, 2017. 2
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[40]
Amir Hertz, Kfir Aberman, and Daniel Cohen-Or. Delta denoising score. InProceedings of the IEEE/CVF Interna- tional Conference on Computer Vision, pages 2328–2337,
-
[41]
Painterly rendering with curved brush strokes of multiple sizes
Aaron Hertzmann. Painterly rendering with curved brush strokes of multiple sizes. InProceedings of the 25th an- nual conference on Computer graphics and interactive tech- niques, pages 453–460, 1998. 2
work page 1998
-
[42]
A survey of stroke-based rendering
Aaron Hertzmann. A survey of stroke-based rendering. In- stitute of Electrical and Electronics Engineers, 2003. 2
work page 2003
-
[43]
Clipscore: A reference-free evaluation met- ric for image captioning, 2022
Jack Hessel, Ari Holtzman, Maxwell Forbes, Ronan Le Bras, and Yejin Choi. Clipscore: A reference-free evaluation met- ric for image captioning, 2022. 5
work page 2022
-
[44]
Optimize & reduce: a top-down approach for image vectorization
Or Hirschorn, Amir Jevnisek, and Shai Avidan. Optimize & reduce: a top-down approach for image vectorization. InPro- ceedings of the AAAI Conference on Artificial Intelligence, pages 2148–2156, 2024. 3
work page 2024
-
[45]
Classifier-Free Diffusion Guidance
Jonathan Ho and Tim Salimans. Classifier-free diffusion guidance.arXiv preprint arXiv:2207.12598, 2022. 3
work page internal anchor Pith review Pith/arXiv arXiv 2022
-
[46]
Multi- view wire art.ACM Trans
Kai-Wen Hsiao, Jia-Bin Huang, and Hung-Kuo Chu. Multi- view wire art.ACM Trans. Graph., 37(6):242, 2018. 3
work page 2018
-
[47]
Stroke- based neural painting and stylization with dynamically pre- dicted painting region
Teng Hu, Ran Yi, Haokun Zhu, Liang Liu, Jinlong Peng, Yabiao Wang, Chengjie Wang, and Lizhuang Ma. Stroke- based neural painting and stylization with dynamically pre- dicted painting region. InProceedings of the 31st ACM International Conference on Multimedia, pages 7470–7480,
-
[48]
Voxify3D: Pixel Art Meets Volumetric Rendering
Yi-Chuan Huang, Jiewen Chan, Hao-Jen Chien, and Yu-Lun Liu. V oxify3d: Pixel art meets volumetric rendering.arXiv preprint arXiv:2512.07834, 2025. 3
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[49]
Learning to paint with model-based deep reinforcement learning
Zhewei Huang, Wen Heng, and Shuchang Zhou. Learning to paint with model-based deep reinforcement learning. In Proceedings of the IEEE/CVF international conference on computer vision, pages 8709–8718, 2019. 3
work page 2019
-
[50]
Francisco Ibarrola, Tomas Lawton, and Kazjon Grace. A collaborative, interactive and context-aware drawing agent for co-creative design.IEEE Transactions on Visualization and Computer Graphics, 30(8):5525–5537, 2023. 3
work page 2023
-
[51]
Word-as-image for semantic typography.ACM Transactions on Graphics (TOG), 42(4): 1–11, 2023
Shir Iluz, Yael Vinker, Amir Hertz, Daniel Berio, Daniel Cohen-Or, and Ariel Shamir. Word-as-image for semantic typography.ACM Transactions on Graphics (TOG), 42(4): 1–11, 2023. 3
work page 2023
-
[52]
Vectorfusion: Text-to-svg by abstracting pixel-based diffusion models
Ajay Jain, Amber Xie, and Pieter Abbeel. Vectorfusion: Text-to-svg by abstracting pixel-based diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1911–1920, 2023. 2, 3
work page 1911
-
[53]
Mcˆ 2: Multi- concept guidance for customized multi-concept generation
Jiaxiu Jiang, Yabo Zhang, Kailai Feng, Xiaohe Wu, Wenbo Li, Renjing Pei, Fan Li, and Wangmeng Zuo. Mcˆ 2: Multi- concept guidance for customized multi-concept generation. InProceedings of the Computer Vision and Pattern Recogni- tion Conference, pages 2802–2812, 2025. 3
work page 2025
-
[54]
Quick, draw! the data.dataset for online game Quick, Draw, 2016
Jonas Jongejan, Henry Rowley, Takashi Kawashima, Jong- min Kim, and Nick Fox-Gieg. Quick, draw! the data.dataset for online game Quick, Draw, 2016. 2
work page 2016
-
[55]
On the temporality for sketch representation learning.arXiv preprint arXiv:2512.04007, 2025
Marcelo Isaias de Moraes Junior and Moacir Antonelli Ponti. On the temporality for sketch representation learning.arXiv preprint arXiv:2512.04007, 2025. 2
-
[56]
Orga- nization in vision: Essays on gestalt perception.(No Title),
Gaetano Kanizsa, Paolo Legrenzi, and Paolo Bozzi. Orga- nization in vision: Essays on gestalt perception.(No Title),
-
[57]
Creative sketching partner: an analysis of human-ai co-creativity
Pegah Karimi, Jeba Rezwana, Safat Siddiqui, Mary Lou Maher, and Nasrin Dehbozorgi. Creative sketching partner: an analysis of human-ai co-creativity. InProceedings of the 25th international conference on intelligent user interfaces, pages 221–230, 2020. 3
work page 2020
-
[58]
Noise-free score distillation.arXiv preprint arXiv:2310.17590, 2023
Oren Katzir, Or Patashnik, Daniel Cohen-Or, and Dani Lischinski. Noise-free score distillation.arXiv preprint arXiv:2310.17590, 2023. 3
-
[59]
Stealthattack: Robust 3d gaussian splatting poisoning via density-guided illusions
Bo-Hsu Ke, You-Zhe Xie, Yu-Lun Liu, and Wei-Chen Chiu. Stealthattack: Robust 3d gaussian splatting poisoning via density-guided illusions. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 27400– 27411, 2025. 3
work page 2025
-
[60]
Subin Kim, Kyungmin Lee, June Suk Choi, Jongheon Jeong, Kihyuk Sohn, and Jinwoo Shin. Collaborative score dis- tillation for consistent visual synthesis.arXiv preprint arXiv:2307.04787, 2023. 3
-
[61]
Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, and Yi-Zhe Song. How to handle sketch-abstraction in sketch-based image retrieval? InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16859–16869, 2024. 3
work page 2024
-
[62]
Posterior dis- tillation sampling
Juil Koo, Chanho Park, and Minhyuk Sung. Posterior dis- tillation sampling. InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition, pages 13352–13361, 2024. 3
work page 2024
-
[63]
Multi-concept customization of text- to-image diffusion
Nupur Kumari, Bingliang Zhang, Richard Zhang, Eli Shecht- man, and Jun-Yan Zhu. Multi-concept customization of text- to-image diffusion. InProceedings of the IEEE/CVF con- ference on computer vision and pattern recognition, pages 1931–1941, 2023. 3
work page 1931
-
[64]
Drawing with reframer: Emergence and control in co-creative ai
Tomas Lawton, Francisco J Ibarrola, Dan Ventura, and Kazjon Grace. Drawing with reframer: Emergence and control in co-creative ai. InProceedings of the 28th Inter- national Conference on Intelligent User Interfaces, pages 264–277, 2023. 3
work page 2023
-
[65]
Skyfall-gs: Synthe- sizing immersive 3d urban scenes from satellite imagery
Jie-Ying Lee, Yi-Ruei Liu, Shr-Ruei Tsai, Wei-Cheng Chang, Chung-Ho Wu, Jiewen Chan, Zhenjun Zhao, 12 Chieh Hubert Lin, and Yu-Lun Liu. Skyfall-gs: Synthe- sizing immersive 3d urban scenes from satellite imagery. arXiv preprint arXiv:2510.15869, 2025. 3
-
[66]
Universal sketch perceptual grouping
Ke Li, Kaiyue Pang, Jifei Song, Yi-Zhe Song, Tao Xiang, Timothy M Hospedales, and Honggang Zhang. Universal sketch perceptual grouping. InProceedings of the european conference on computer vision (ECCV), pages 582–597,
-
[67]
Tzu-Mao Li, Michal Luk´aˇc, Micha¨el Gharbi, and Jonathan Ragan-Kelley. Differentiable vector graphics rasterization for editing and learning.ACM Transactions on Graphics (TOG), 39(6):1–15, 2020. 3
work page 2020
-
[68]
Luciddreamer: Towards high- fidelity text-to-3d generation via interval score matching
Yixun Liang, Xin Yang, Jiantao Lin, Haodong Li, Xiao- gang Xu, and Yingcong Chen. Luciddreamer: Towards high- fidelity text-to-3d generation via interval score matching. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6517–6526, 2024. 3
work page 2024
-
[69]
Magic3d: High-resolution text-to-3d content creation
Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, and Tsung-Yi Lin. Magic3d: High-resolution text-to-3d content creation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 300–309, 2023. 3
work page 2023
-
[70]
Hangyu Lin, Yanwei Fu, Xiangyang Xue, and Yu-Gang Jiang. Sketch-bert: Learning sketch bidirectional encoder representation from transformers by self-supervised learning of sketch gestalt. InProceedings of the IEEE/CVF Confer- ence on Computer Vision and Pattern Recognition, pages 6758–6767, 2020. 2
work page 2020
-
[71]
Sketchgan: Joint sketch com- pletion and recognition with generative adversarial network
Fang Liu, Xiaoming Deng, Yu-Kun Lai, Yong-Jin Liu, Cuixia Ma, and Hongan Wang. Sketchgan: Joint sketch com- pletion and recognition with generative adversarial network. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5830–5839, 2019. 2
work page 2019
-
[72]
Compositional visual generation with composable diffusion models
Nan Liu, Shuang Li, Yilun Du, Antonio Torralba, and Joshua B Tenenbaum. Compositional visual generation with composable diffusion models. InEuropean conference on computer vision, pages 423–439. Springer, 2022. 3
work page 2022
-
[73]
Paint transformer: Feed forward neural painting with stroke prediction
Songhua Liu, Tianwei Lin, Dongliang He, Fu Li, Ruifeng Deng, Xin Li, Errui Ding, and Hao Wang. Paint transformer: Feed forward neural painting with stroke prediction. In Proceedings of the IEEE/CVF international conference on computer vision, pages 6598–6607, 2021. 3
work page 2021
-
[74]
Xi Liu, Chaoyi Zhou, Nanxuan Zhao, and Siyu Huang. B\’ezier splatting for fast and differentiable vector graphics rendering.arXiv preprint arXiv:2503.16424, 2025. 3
-
[75]
SyncDreamer: Generating Multiview-consistent Images from a Single-view Image
Yuan Liu, Cheng Lin, Zijiao Zeng, Xiaoxiao Long, Lingjie Liu, Taku Komura, and Wenping Wang. Syncdreamer: Gen- erating multiview-consistent images from a single-view im- age.arXiv preprint arXiv:2309.03453, 2023. 3
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[76]
Zhiheng Liu, Ruili Feng, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, and Yang Cao. Cones: Concept neurons in diffusion models for customized generation.arXiv preprint arXiv:2303.05125, 2023. 3
-
[77]
A learned representation for scalable vec- tor graphics
Raphael Gontijo Lopes, David Ha, Douglas Eck, and Jonathon Shlens. A learned representation for scalable vec- tor graphics. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 7930–7939, 2019. 3
work page 2019
-
[78]
Artem Lukoianov, Haitz S ´aez de Oc ´ariz Borde, Kristjan Greenewald, Vitor Guizilini, Timur Bagautdinov, Vincent Sitzmann, and Justin M Solomon. Score distillation via reparametrized ddim.Advances in Neural Information Pro- cessing Systems, 37:26011–26044, 2024. 3
work page 2024
-
[79]
Rundong Luo, Noah Snavely, and Wei-Chiu Ma. Shadow- draw: From any object to shadow-drawing compositional art.arXiv preprint arXiv:2512.05110, 2025. 2, 5
-
[80]
Towards layer- wise image vectorization
Xu Ma, Yuqian Zhou, Xingqian Xu, Bin Sun, Valerii Filev, Nikita Orlov, Yun Fu, and Humphrey Shi. Towards layer- wise image vectorization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16314–16323, 2022. 3
work page 2022
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.