pith. sign in

arxiv: 2605.19551 · v1 · pith:SU26B4W4new · submitted 2026-05-19 · 💻 cs.GR · cs.CV

AnchorFlow: Editable SVG Reconstruction via Sparse Anchor Point Fields

Pith reviewed 2026-05-20 02:01 UTC · model grok-4.3

classification 💻 cs.GR cs.CV
keywords image-to-svgvector graphicsbezier curvesanchor pointseditable reconstructionsparse fieldsrendering feedback
0
0 comments X

The pith

AnchorFlow reconstructs SVGs from raster images by predicting sparse anchor point fields that resolve into simple Bezier paths with feedback correction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to resolve the long-standing tension in image-to-vector conversion: methods that stay close to the input raster tend to produce paths with too many anchors or too many separate curves, which makes manual editing cumbersome. AnchorFlow tackles this by shifting focus to how anchors are placed along each path. It first extracts path-like foreground pieces from the raster, then predicts a sparse field of anchor locations conditioned on the image. Each field is turned into an ordered Bezier curve, after which a rendering step supplies corrective signals that prune or adjust local mistakes before the final SVG is assembled. A reader would care because the resulting files keep visual accuracy while offering far fewer control points for artists to move or delete.

Core claim

AnchorFlow models path-level anchor placement with image-conditioned sparse anchor point fields. Given foreground components extracted from a raster image, the method predicts a sparse field for each component, resolves the field into an ordered Bezier path, applies rendering-guided feedback to fix local structural errors, and reassembles the corrected paths into an optimized SVG. Experiments on isolated paths and complete images demonstrate that this yields a favorable fidelity-editability trade-off, with substantially lower editable complexity at competitive raster fidelity.

What carries the argument

Image-conditioned sparse anchor point fields that are resolved into ordered Bezier paths and refined by rendering-guided feedback.

If this is right

  • Output SVGs contain markedly fewer anchor points per path than dense alternatives while matching raster appearance.
  • The same pipeline works for both single isolated paths and full multi-component images.
  • Rendering feedback step removes many of the redundant anchors that boundary-following methods introduce on imperfect raster evidence.
  • Final assembled SVGs are directly usable in standard vector editors with reduced manual cleanup.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If foreground extraction becomes more robust, the same sparse-field idea could extend to natural photographs that contain overlapping objects.
  • Fewer anchors per curve may also reduce file size and speed up rendering or animation of the resulting graphics.
  • A direct user study measuring editing time or number of operations needed would quantify the practical editability gain beyond anchor counts.

Load-bearing premise

The method assumes that path-like foreground components can be reliably extracted from the input raster and that the predicted sparse fields can be turned into Bezier paths whose remaining local errors are fixable by rendering feedback.

What would settle it

Run the pipeline on a set of raster images whose boundaries contain clear noise or gaps; if the output SVGs require more anchors or show larger pixel error than dense baseline methods, the claimed trade-off does not hold.

Figures

Figures reproduced from arXiv: 2605.19551 by Antonio Haas, Christian Franke, Grace Li Zhang, Mengnan Jiang, Michele Franco Adesso.

Figure 1
Figure 1. Figure 1: Fidelity and editability trade-off. This leads to a central challenge in editable SVG reconstruction: how to preserve raster fidelity while keeping the vector structure sparse and ed￾itable. Different method families naturally em￾phasize different parts of this space, including tracing [1, 22, 23], optimization-based vector￾ization [10, 14, 17], adaptive reconstruction [19, 31, 32], and SVG code generation… view at source ↗
Figure 2
Figure 2. Figure 2: Overview of AnchorFlow. The input raster is decomposed into path-like foreground [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Rendering-guided field refinement. Incomplete anchor evidence may lead the initial field to [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Qualitative comparison on full-image SVG reconstruction. AnchorFlow closely preserves [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Qualitative comparison on controlled single-path reconstruction. Red points indicate [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Robustness to boundary perturbations in single-path reconstruction. The top row uses a [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Human evaluation on visual similarity and editability in anonymized pairwise trials. [PITH_FULL_IMAGE:figures/full_fig_p009_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Representative failure cases. (a) Component extraction failure: missed or merged [PITH_FULL_IMAGE:figures/full_fig_p013_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Additional full-image reconstruction examples on the ColorSVG-100K [4] dataset. [PITH_FULL_IMAGE:figures/full_fig_p014_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Additional full-image reconstruction examples on the Noto Emoji [ [PITH_FULL_IMAGE:figures/full_fig_p015_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Additional full-image reconstruction examples on the Noto Emoji [ [PITH_FULL_IMAGE:figures/full_fig_p016_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Additional robustness examples under boundary-perturbed single-path inputs. [PITH_FULL_IMAGE:figures/full_fig_p017_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: Additional assembly-stage examples in full-image reconstruction. From left to right: [PITH_FULL_IMAGE:figures/full_fig_p018_13.png] view at source ↗
read the original abstract

Image-to-SVG reconstruction aims to produce vector graphics that are faithful to raster inputs and easy to edit. Existing methods face a structural trade-off in how vector structure is parameterized, including how many paths represent an image and how many anchor points define each path. High-fidelity methods often rely on many paths or densely parameterized curves, whereas overly compact SVG generation may deviate from the input geometry. This issue becomes more pronounced when local raster evidence is imperfect, where boundary-following reconstruction can introduce redundant anchors and fragmented structures. We argue that this trade-off should be addressed at the level of anchor placement, since anchors on Bezier curves define local path structure and strongly affect both accuracy and editability. We propose AnchorFlow, an editable SVG reconstruction framework that models path-level anchor placement with sparse anchor point fields. Given path-like foreground components extracted from a raster image, AnchorFlow predicts an image-conditioned sparse anchor field for each component and resolves it into an ordered Bezier path. Rendering-guided feedback then corrects local structural errors before re-resolution. The recovered paths are then assembled and optimized into the final SVG. Experiments on isolated paths and full images show that AnchorFlow achieves a favorable fidelity-editability trade-off, substantially reducing editable complexity while preserving competitive raster fidelity.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces AnchorFlow, a framework for image-to-SVG reconstruction that extracts path-like foreground components from a raster input, predicts an image-conditioned sparse anchor point field for each component, resolves the field into an ordered Bezier path, applies rendering-guided feedback to correct local structural errors, and assembles/optimizes the paths into a final SVG. The central claim is that modeling anchor placement via sparse fields yields a favorable fidelity-editability trade-off, substantially reducing editable complexity while preserving competitive raster fidelity, particularly when local raster evidence is imperfect.

Significance. If the quantitative results and robustness claims hold, the work would meaningfully advance vector graphics reconstruction by shifting the parameterization focus to sparse anchor placement and a prediction-resolution-feedback pipeline, offering a concrete alternative to dense curves or boundary-following methods that produce redundant or fragmented structures.

major comments (2)
  1. [Method (resolution and feedback pipeline)] The resolution step from predicted sparse anchor fields to ordered Bezier paths is load-bearing for the fidelity-editability claim (abstract and method overview). The manuscript must demonstrate, via targeted ablations or failure-case analysis, that this step plus the rendering-guided feedback reliably avoids ambiguous ordering, self-intersections, or missing closures when raster evidence is imperfect; otherwise the claimed advantage over boundary-following approaches cannot be verified.
  2. [Experiments] Experiments section: the abstract states that experiments on isolated paths and full images show a favorable trade-off, but the manuscript must report concrete metrics (e.g., raster fidelity error, number of anchors/paths, editability measures) against explicit baselines, including error bars or statistical significance, to substantiate the central claim.
minor comments (2)
  1. [Method] Clarify the exact definition and conditioning of the sparse anchor point field (e.g., how image features are encoded and how sparsity is enforced) to improve reproducibility.
  2. [Figures and tables] Figure captions and text should explicitly link visual results to the quantitative tables so readers can directly assess the fidelity-editability trade-off.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important aspects of the resolution pipeline and experimental reporting that we will address to strengthen the work. We respond to each major comment below.

read point-by-point responses
  1. Referee: [Method (resolution and feedback pipeline)] The resolution step from predicted sparse anchor fields to ordered Bezier paths is load-bearing for the fidelity-editability claim (abstract and method overview). The manuscript must demonstrate, via targeted ablations or failure-case analysis, that this step plus the rendering-guided feedback reliably avoids ambiguous ordering, self-intersections, or missing closures when raster evidence is imperfect; otherwise the claimed advantage over boundary-following approaches cannot be verified.

    Authors: We agree that the resolution and feedback components are central to our claims. The manuscript describes the conversion from sparse anchor fields to ordered Bezier paths and the use of rendering feedback for local corrections in the method section. To better substantiate robustness under imperfect raster evidence, we will add targeted ablations (with and without the feedback loop) and failure-case visualizations showing handling of ordering ambiguities, self-intersections, and closures. These additions will directly compare against boundary-following baselines to verify the claimed advantage. revision: yes

  2. Referee: [Experiments] Experiments section: the abstract states that experiments on isolated paths and full images show a favorable trade-off, but the manuscript must report concrete metrics (e.g., raster fidelity error, number of anchors/paths, editability measures) against explicit baselines, including error bars or statistical significance, to substantiate the central claim.

    Authors: We acknowledge that more granular quantitative details are needed to fully support the trade-off claim. The experiments section currently evaluates on isolated paths and full images, but we will expand it to report explicit metrics including raster fidelity errors, anchor and path counts, and editability measures. Direct comparisons to baselines will be included, along with error bars from multiple runs and statistical significance tests. revision: yes

Circularity Check

0 steps flagged

No circularity in AnchorFlow's proposed modeling framework

full rationale

The paper describes AnchorFlow as a proposed framework that extracts path-like foreground components from a raster image, predicts an image-conditioned sparse anchor field, resolves it into an ordered Bezier path, applies rendering-guided feedback to correct local errors, and assembles the paths into a final SVG. No equations, derivations, or self-referential definitions are present in the provided text; the sparse anchor field prediction and resolution steps are presented as independent architectural choices rather than quantities defined in terms of their own outputs or fitted parameters. The central claim of a favorable fidelity-editability trade-off rests on empirical experiments rather than any reduction to self-citations or ansatzes smuggled via prior work. This makes the approach self-contained with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; the central claim rests on unstated assumptions about component extraction and feedback correction whose details are not supplied.

pith-pipeline@v0.9.0 · 5757 in / 1131 out tokens · 33299 ms · 2026-05-20T02:01:17.658459+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages · 1 internal anchor

  1. [1]

    AutoTrace: Bitmap to vector graphics converter.https://autotrace.sourceforge

    AutoTrace Project. AutoTrace: Bitmap to vector graphics converter.https://autotrace.sourceforge. net/, 2024. Open source software

  2. [2]

    DeepSVG: A hierarchical generative network for vector graphics animation

    Alexandre Carlier, Martin Danelljan, Alexandre Alahi, and Radu Timofte. DeepSVG: A hierarchical generative network for vector graphics animation. InAdvances in Neural Information Processing Systems, volume 33, pages 16351–16361, 2020. URLhttps://proceedings.neurips.cc/paper/2020/hash/ bcf9d6bd14a2095866ce8c950b702341-Abstract.html

  3. [3]

    Image vectorization via gradient reconstruction.Computer Graphics Forum, 44(2):e70055, 2025

    Souymodip Chakraborty, Vineet Batra, Ankit Phogat, Vishwas Jain, Jaswant Singh Ranawat, Sumit Dhingra, Kevin Wampler, and Michal Lukáˇc. Image vectorization via gradient reconstruction.Computer Graphics Forum, 44(2):e70055, 2025. doi: 10.1111/cgf.70055. URL https://onlinelibrary.wiley. com/doi/10.1111/cgf.70055

  4. [4]

    Svgbuilder: Component-based colored svg generation with text-guided autoregressive transformers

    Zehao Chen and Rong Pan. Svgbuilder: Component-based colored svg generation with text-guided autoregressive transformers. InProceedings of the AAAI Conference on Artificial Intelligence, 2025

  5. [5]

    Cloud2curve: Generation and vectorization of parametric sketches

    Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang, and Yi-Zhe Song. Cloud2curve: Generation and vectorization of parametric sketches. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7088–7097, 2021

  6. [6]

    Image vectorization: A review.arXiv preprint arXiv:2306.06441, 2023

    Maria Dziuba, Ivan Jarsky, Valeria Efimova, and Andrey Filchenkov. Image vectorization: A review.arXiv preprint arXiv:2306.06441, 2023

  7. [7]

    Deep vectorization of technical drawings

    Vage Egiazarian, Oleg V oynov, Alexey Artemov, Denis V olkhonskiy, Aleksandr Safin, Maria Taktasheva, Denis Zorin, and Evgeny Burnaev. Deep vectorization of technical drawings. InComputer Vision – ECCV 2020, pages 582–598, 2020. doi: 10.1007/978-3-030-58601-0_35. URL https://www.ecva.net/ papers/eccv_2020/papers_ECCV/html/123580579.php

  8. [8]

    Noto Emoji

    Google Fonts. Noto Emoji. https://github.com/googlefonts/noto-emoji, 2026. Open source emoji library with vector SVG and PNG assets

  9. [9]

    Weld, and Ranjay Krishna

    Qijia He, Xunmei Liu, Hammaad Memon, Ziang Li, Zixian Ma, Jaemin Cho, Jason Ren, Daniel S. Weld, and Ranjay Krishna. Vfig: Vectorizing complex figures in svg with vision-language models.arXiv preprint arXiv:2603.24575, 2026

  10. [10]

    Optimize & reduce: A top-down approach for image vectorization

    Or Hirschorn, Amir Jevnisek, and Shai Avidan. Optimize & reduce: A top-down approach for image vectorization. InProceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 2148–2156, 2024. doi: 10.1609/aaai.v38i3.27987. URL https://ojs.aaai.org/index.php/AAAI/ article/view/27987

  11. [11]

    AmodalSVG: Amodal Image Vectorization via Semantic Layer Peeling

    Juncheng Hu, Ziteng Xue, Guotao Liang, Anran Qi, Buyu Li, Sheng Wang, Dong Xu, and Qian Yu. Amodalsvg: Amodal image vectorization via semantic layer peeling.arXiv preprint arXiv:2604.10940, 2026

  12. [12]

    Rosin, and Yu-Kun Lai

    Teng Hu, Ran Yi, Baihong Qian, Jiangning Zhang, Paul L. Rosin, and Yu-Kun Lai. Supersvg: Superpixel- based scalable vector graphics synthesis.arXiv preprint arXiv:2406.09794, 2024

  13. [13]

    Vectorfusion: Text-to-svg by abstracting pixel-based diffusion models

    Ajay Jain, Amber Xie, and Pieter Abbeel. Vectorfusion: Text-to-svg by abstracting pixel-based diffusion models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1911–1920, 2023

  14. [14]

    Differentiable vector graphics rasterization for editing and learning.ACM Transactions on Graphics, 39(6):193:1–193:15, 2020

    Tzu-Mao Li, Michal Lukáˇc, Michaël Gharbi, and Jonathan Ragan-Kelley. Differentiable vector graphics rasterization for editing and learning.ACM Transactions on Graphics, 39(6):193:1–193:15, 2020. doi: 10.1145/3414685.3417871. URLhttps://people.csail.mit.edu/tzumao/diffvg/

  15. [15]

    End-to-end line drawing vectorization

    Hanyuan Liu, Chengze Li, Xueting Liu, and Tien-Tsin Wong. End-to-end line drawing vectorization. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 4559–4566, 2022. doi: 10.1609/aaai.v36i4.20379. URLhttps://ojs.aaai.org/index.php/AAAI/article/view/20379

  16. [16]

    A learned representation for scalable vector graphics

    Raphael Gontijo Lopes, David Ha, Douglas Eck, and Jonathon Shlens. A learned representation for scalable vector graphics. InProceedings of the IEEE/CVF International Conference on Computer Vision, 2019

  17. [17]

    In: CVPR

    Xu Ma, Yuqian Zhou, Xingqian Xu, Bin Sun, Valerii Filev, Nikita Orlov, Yun Fu, and Humphrey Shi. Towards layer-wise image vectorization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16314–16323, 2022. doi: 10.1109/CVPR52688. 2022.01583. URL https://openaccess.thecvf.com/content/CVPR2022/html/Ma_Towards_ Layer-...

  18. [18]

    Fluent Emoji

    Microsoft. Fluent Emoji. https://github.com/microsoft/fluentui-emoji, 2026. Open source emoji collection, MIT license

  19. [19]

    Pradyumna Reddy, Michaël Gharbi, Michal Luká ˇc, and Niloy J. Mitra. Im2Vec: Synthesizing vector graphics without vector supervision. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 734–743, 2021. URLhttps://arxiv.org/abs/2102.02798

  20. [20]

    Rocket-1: Mastering open-world interaction with visual-temporal context prompting

    Juan A. Rodriguez, Abhay Puri, Shubham Agarwal, Issam H. Laradji, Pau Rodriguez, Sai Rajeswar, David Vazquez, Christopher Pal, and Marco Pedersoli. Starvector: Generating scalable vector graphics code from images and text. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16175–16186, 2025. doi: 10.1109/CVPR52734.2...

  21. [21]

    Juan A. Rodriguez, Haotian Zhang, Abhay Puri, Rishav Pramanik, Aarash Feizi, Pascal Wichmann, Arnab Kumar Mondal, Mohammad Reza Samsami, Rabiul Awal, Perouz Taslakian, Spandana Gella, Sai Rajeswar, David Vazquez, Christopher Pal, and Marco Pedersoli. Rendering-aware reinforcement learning for vector graphics generation. InThe Thirty-Ninth Annual Conferenc...

  22. [22]

    Potrace: A polygon-based tracing algorithm, September 2003

    Peter Selinger. Potrace: A polygon-based tracing algorithm, September 2003. URL https://www. mathstat.dal.ca/~selinger/potrace/potrace.pdf. Technical report

  23. [23]

    VTracer: Raster to vector graphics converter

    Vision Cortex. VTracer: Raster to vector graphics converter. https://github.com/visioncortex/ vtracer, 2024. Open source software

  24. [24]

    Layered image vectorization via semantic simplification

    Zhenyu Wang, Jianxi Huang, Zhida Sun, Yuanhao Gong, Daniel Cohen-Or, and Min Lu. Layered image vectorization via semantic simplification. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7728–7738, 2025. URL https://openaccess.thecvf.com/content/CVPR2025/html/Wang_Layered_Image_ Vectorization_via_Semantic_Simplifi...

  25. [25]

    Bovik, Hamid R

    Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli. Image quality assessment: From error visibility to structural similarity.IEEE Transactions on Image Processing, 13(4):600–612, 2004

  26. [26]

    Iconshop: Text-guided vector icon synthesis with autoregressive transformers.ACM Transactions on Graphics, 42(6):230:1–230:14, 2023

    Ronghuan Wu, Wanchao Su, Kede Ma, and Jing Liao. Iconshop: Text-guided vector icon synthesis with autoregressive transformers.ACM Transactions on Graphics, 42(6):230:1–230:14, 2023

  27. [27]

    Chat2svg: Vector graphics generation with large language models and image diffusion models

    Ronghuan Wu, Wanchao Su, and Jing Liao. Chat2svg: Vector graphics generation with large language models and image diffusion models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025

  28. [28]

    Svgdreamer: Text guided svg generation with diffusion model

    Ximing Xing, Haitao Zhou, Chuang Wang, Jing Zhang, Dong Xu, and Qian Yu. Svgdreamer: Text guided svg generation with diffusion model. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

  29. [29]

    Omnisvg: A unified scalable vector graphics generation model.arXiv preprint arXiv:2504.06263, 2025

    Yiying Yang, Wei Cheng, Sijin Chen, Xianfang Zeng, Jiaxu Zhang, Liao Wang, Gang Yu, Xingjun Ma, and Yu-Gang Jiang. Omnisvg: A unified scalable vector graphics generation model.arXiv preprint arXiv:2504.06263, 2025

  30. [30]

    Efros, Eli Shechtman, and Oliver Wang

    Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018

  31. [31]

    Rocket-1: Mastering open-world interaction with visual-temporal context prompting

    Kaibo Zhao, Liang Bao, Yufei Li, Xu Su, Ke Zhang, and Xiaotian Qiao. Less is more: Efficient image vectorization with adaptive parameterization. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 18166–18175, 2025. doi: 10.1109/CVPR52734.2025.01693. URL https://openaccess.thecvf.com/content/CVPR2025/html/Zhao_Less_i...

  32. [32]

    Segmentation-guided layer-wise image vectorization with gradient fills.arXiv preprint arXiv:2408.15741, 2024

    Hengyu Zhou, Hui Zhang, and Bin Wang. Segmentation-guided layer-wise image vectorization with gradient fills.arXiv preprint arXiv:2408.15741, 2024

  33. [33]

    Win” and “Lose

    Haokun Zhu, Juang Ian Chong, Teng Hu, Ran Yi, Yu-Kun Lai, and Paul L. Rosin. Samvg: A multi-stage image vectorization model with the segment-anything model.arXiv preprint arXiv:2311.05276, 2023. 12 A Supplementary experimental results A.1 Failure cases Figure 8 shows two representative failure modes of AnchorFlow. First, the AdaVec-style decomposi- tion f...

  34. [34]

    Parse the SVG path geometry and obtain the editable anchor positions from the path structure

  35. [35]

    Rasterize the local path or component into a grayscale crop and normalize it to the training resolution

  36. [36]

    Sample the target contour and construct the sparse field target using anchor peaks and weaker contour support, as defined in Eq. 12

  37. [37]

    This construction ensures that the predictor is trained on structural evidence for anchor placement rather than on object occupancy masks or final SVG code

    Store the raster crop, the field target, and metadata needed for reproducible splitting and validation. This construction ensures that the predictor is trained on structural evidence for anchor placement rather than on object occupancy masks or final SVG code. D.2 Training corpus composition We train the anchor field predictor on a lightweight mixed SVG-d...

  38. [38]

    Decompose X into path-like foreground components {Xm, Tm}M m=1 =D(X) , where Xm is a normal- ized crop andT m stores the crop-to-canvas transform

  39. [39]

    , M: 2.1 Encode and decode the crop to obtain an initial sparse anchor field: zm,0 =E ϕ(Xm) and Fm,0 = Gθ(zm,0)

    For each componentm= 1, . . . , M: 2.1 Encode and decode the crop to obtain an initial sparse anchor field: zm,0 =E ϕ(Xm) and Fm,0 = Gθ(zm,0). 2.2 Hard resolve the field into anchors, ordering, connectivity, tangents, and cubic Bézier controls: (C 0 m, s0 m) =H(F m,0, Xm). 2.3 Apply SDF-guided control point pulling under the fixed anchor structure. Only i...

  40. [40]

    Transform all accepted component paths back to the original canvas usingT −1 m

  41. [41]

    Assemble the transformed paths according to the part order, assign appearance from the raster components, and optionally fit simple gradients for non-uniform parts

  42. [42]

    Table 11: Default inference settings used in the implementation

    Optionally apply global pydiffvg polishing to the assembled SVG without adding new paths. Table 11: Default inference settings used in the implementation. Setting Value Maximum field refinement rounds 2 Latent update variable∆zonly Acceptance score tolerance-based strokeF δ Control point refinement SDF-guided control point pulling Final repair optional mi...