Hierarchical Vectorization for Portrait Images

Fei Hou; Linlin Liu; Qian Fu; Ying He

arXiv: 2205.11880 · v1 · submitted 2022-05-24 · cs.CV · cs.GR

Pith reviewed 2026-05-24 11:41 UTC · model grok-4.3

keywords: portrait vectorization · diffusion curves · Poisson regions · hierarchical representation · image editing · generative model · retouching · color transfer

The pith

A three-tier vector representation converts raster portraits into editable diffusion curves, Poisson regions, and generated residuals.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes automatically converting raster portrait images into a three-level hierarchical vector format. The base layer uses sparse diffusion curves to capture geometric features and low-frequency colors for tasks such as color transfer and expression editing. The middle layer encodes lighting with editable Poisson regions that users can adjust for highlights and shadows. The top layer adds pixel-sized Poisson regions and a trained generative model to handle high-frequency details and enable automatic retouching. This structure supports new blending operations based on the Laplace operator and is evaluated with an illumination-sensitive extension of the FLIP metric on the FFHQR dataset.

Core claim

The central claim is that organizing vector primitives into three tiers—sparse diffusion curves for salient features and low-frequency content, large editable Poisson regions for mid-frequency lighting, and pixel-sized Poisson regions plus a generative model for high-frequency residuals—produces a representation that supports intuitive portrait editing operations including color transfer, facial expression changes, highlight and shadow adjustments, and automatic retouching while preserving image information.

What carries the argument

The 3-tier hierarchical representation consisting of sparse diffusion curves, editable Poisson regions, and pixel-sized PRs with a generative model for residuals.
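To make the three tiers concrete, here is a minimal sketch of how such a representation might be held in memory. Every class, field, and method name below is hypothetical and chosen for illustration only; the paper's actual data layout is not described on this page.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]
Color = Tuple[float, float, float]


@dataclass
class DiffusionCurve:
    """Base level (hypothetical layout): a curve whose colors diffuse into the image."""
    points: List[Point]
    left_color: Color
    right_color: Color


@dataclass
class PoissonRegion:
    """Middle or top level (hypothetical layout): a region with a scalar strength."""
    boundary: List[Point]
    strength: float


@dataclass
class HierarchicalPortrait:
    """Container for the three tiers; the field names are assumptions."""
    base: List[DiffusionCurve] = field(default_factory=list)
    lighting: List[PoissonRegion] = field(default_factory=list)
    residual: List[PoissonRegion] = field(default_factory=list)

    def scale_lighting(self, factor: float) -> "HierarchicalPortrait":
        """Return a copy whose middle-level strengths are multiplied by factor."""
        scaled = [PoissonRegion(list(r.boundary), r.strength * factor)
                  for r in self.lighting]
        return HierarchicalPortrait(list(self.base), scaled, list(self.residual))
```

Keeping the tiers in separate lists mirrors the claim that each level can be edited without touching the others: `scale_lighting` changes only the middle tier and leaves the base curves and residuals as they were.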

If this is right

Diffusion curves enable semantic color transfer and facial expression editing.
Adjusting strength or shape of Poisson regions directly modifies illumination.
The generative model produces residuals for automatic retouching of details.
Linearity of the Laplace operator allows alpha blending, linear dodge, and linear burn in vector form for lighting edits.
The IS-FLIP metric evaluates edits by capturing illumination changes more consistently with perception.
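The linearity point can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not the paper's implementation: it uses a 1D signal, a three-point discrete Laplacian, Dirichlet boundary values, and the standard unclamped definitions of linear dodge (a + b) and linear burn (a + b - 1). The function names are hypothetical.

```python
def laplacian(u):
    """Discrete 1D Laplacian at interior samples: u[i-1] - 2*u[i] + u[i+1]."""
    return [u[i - 1] - 2.0 * u[i] + u[i + 1] for i in range(1, len(u) - 1)]


def solve_poisson(f, left, right):
    """Recover u from its interior Laplacian f and its two boundary values
    by solving the tridiagonal system with the Thomas algorithm."""
    n = len(f)
    if n == 0:
        return [left, right]
    rhs = list(f)
    rhs[0] -= left      # move the known boundary values to the right-hand side
    rhs[-1] -= right
    cp = [0.0] * n      # modified super-diagonal coefficients
    dp = [0.0] * n      # modified right-hand side
    cp[0] = 1.0 / -2.0
    dp[0] = rhs[0] / -2.0
    for i in range(1, n):
        denom = -2.0 - cp[i - 1]
        cp[i] = 1.0 / denom
        dp[i] = (rhs[i] - dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return [left] + x + [right]


def blend(a, b, alpha):
    """Alpha blend of two equally long sequences."""
    return [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]


# Two toy "layers" (synthetic values, not data from the paper).
u1 = [0.2, 0.5, 0.9, 0.4, 0.3]
u2 = [0.6, 0.1, 0.3, 0.8, 0.7]
alpha = 0.25

# Alpha blending performed on the Laplacians, then solved back to pixel values.
f_blend = blend(laplacian(u1), laplacian(u2), alpha)
u_alpha = solve_poisson(f_blend,
                        alpha * u1[0] + (1.0 - alpha) * u2[0],
                        alpha * u1[-1] + (1.0 - alpha) * u2[-1])

# Linear dodge and linear burn share the summed Laplacian; burn only shifts
# the boundary values by -1 because the Laplacian of a constant is zero.
f_sum = [a + b for a, b in zip(laplacian(u1), laplacian(u2))]
u_dodge = solve_poisson(f_sum, u1[0] + u2[0], u1[-1] + u2[-1])
u_burn = solve_poisson(f_sum, u1[0] + u2[0] - 1.0, u1[-1] + u2[-1] - 1.0)
```

In each case the result of blending in the Laplacian domain matches the blend computed directly on the signals, which is what lets these raster-style blend modes be applied to Laplacian-based vector primitives. Clamping to the displayable range is not linear and is left out of the sketch.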

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The hierarchy could be applied to other image categories if the primitive extraction generalizes beyond portraits.
Public release of code and models would allow testing on new editing workflows outside the reported tasks.
The approach might combine with existing raster tools to create hybrid editing systems.
Propagating the layers across video frames could extend the method to moving portraits.

Load-bearing premise

The chosen primitives of diffusion curves for low-frequency content, Poisson regions for lighting, and generated residuals for details can be extracted from and recombined into diverse portraits without visible artifacts or loss of essential information.

What would settle it

Recombining the three layers after an edit produces visible artifacts or mismatches on multiple varied portraits from the FFHQR dataset.
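A check of this kind could start from a plain pixel-wise comparison between the input and the recombined image. The sketch below is a generic stand-in, assuming grayscale images stored as lists of rows with values in [0, 1]; it is not the IS-FLIP metric, and the function names and any thresholds a user picks are hypothetical.

```python
import math


def _flatten(image):
    """Flatten a list-of-rows image into a single list of pixel values."""
    return [value for row in image for value in row]


def mean_squared_error(original, reconstructed):
    """Mean squared error between two equally sized grayscale images."""
    a, b = _flatten(original), _flatten(reconstructed)
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and the same size")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(original, reconstructed, peak=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = mean_squared_error(original, reconstructed)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)


def fraction_over(original, reconstructed, threshold):
    """Fraction of pixels whose absolute error exceeds the given threshold."""
    a, b = _flatten(original), _flatten(reconstructed)
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and the same size")
    return sum(1 for x, y in zip(a, b) if abs(x - y) > threshold) / len(a)
```

Running these over many portraits, before and after an edit, would give a first indication of whether recombination errors are rare and small or systematic. A perceptual metric would still be needed to decide whether a given error is visible.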

Figures

Figures reproduced from arXiv: 2205.11880 by Fei Hou, Linlin Liu, Qian Fu, Ying He.

**Figure 2.** The algorithmic pipeline of our hierarchical vectorization. See the text for details.

**Figure 3.** Preprocessing. Given an input image I, we compute a retouched image I_r and a highlight-compensated retouched image I_rh. We highlight the differences in greyed boxes. Extracting diffusion curves (DCs). We adopt a two-step method for computing DCs. First, we apply the probability edge algorithm [18] to extract strong edges I_e in the retouched image I_rh and use the colors on the edges to define the boundary c…

**Figure 4.** Illustration of hierarchical PVG. The vector primitives are shown in the small insets.

**Figure 5.** Applying the linear blending functions to edit highlights and shadows. (b)-(c): We add …

**Figure 6.** An example of DC mask. (a) is the input image …

**Figure 7.** DC extraction. The percentages are the ratio of the number of pixels in DCs to the …

**Figure 8.** Generating residual PRs using deep learning. Clockwise from top left in the last column: …

**Figure 9.** The IS-FLIP-ct metric δE_I^ct is more effective than FLIP δE_F and IS-FLIP δE_I for evaluating color transfer results. The input image I and the color transferred result J have the same facial features but different colors. However, due to significant color change between J and I, both FLIP and IS-FLIP, which take color changes as part of the difference metric, yield large error values. IS-FLIP-ct, in contras…

**Figure 10.** Hair color transfer by modifying colors of sparse diffusion curves. (a) original image. (b)-(c) reference images. (d)-(e) hair color transfer. (f)-(i) hair highlighting effects. Light editing. Since our method separates illuminations from colors, the user can explicitly edit light using the PRs in the middle level. In …

**Figure 11.** Light editing. We apply linear dodge and linear burn to the middle-level PRs to modify …

**Figure 12.** Comparing our face retouching results with the …

**Figure 13.** Expression editing via changing the geometries of DCs in the base level. The first …

**Figure 14.** Comparison with deep sparse, smart contours [11] in image reconstruction. Our method …

**Figure 15.** Facial color transfer on a challenging case with a wide range of brightness. We show the …

**Figure 16.** A failed case. The vector primitives in the base level are not able to capture the facial …

Original abstract

Aiming at developing intuitive and easy-to-use portrait editing tools, we propose a novel vectorization method that can automatically convert raster images into a 3-tier hierarchical representation. The base layer consists of a set of sparse diffusion curves (DC) which characterize salient geometric features and low-frequency colors and provide means for semantic color transfer and facial expression editing. The middle level encodes specular highlights and shadows to large and editable Poisson regions (PR) and allows the user to directly adjust illumination via tuning the strength and/or changing shape of PR. The top level contains two types of pixel-sized PRs for high-frequency residuals and fine details such as pimples and pigmentation. We also train a deep generative model that can produce high-frequency residuals automatically. Thanks to the meaningful organization of vector primitives, editing portraits becomes easy and intuitive. In particular, our method supports color transfer, facial expression editing, highlight and shadow editing and automatic retouching. Thanks to the linearity of the Laplace operator, we introduce alpha blending, linear dodge and linear burn to vector editing and show that they are effective in editing highlights and shadows. To quantitatively evaluate the results, we extend the commonly used FLIP metric (which measures differences between two images) by considering illumination. The new metric, called illumination-sensitive FLIP or IS-FLIP, can effectively capture the salient changes in color transfer results, and is more consistent with human perception than FLIP and other quality measures on portrait images. We evaluate our method on the FFHQR dataset and show that our method is effective for common portrait editing tasks, such as retouching, light editing, color transfer and expression editing. We will make the code and trained models publicly available.
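The base-level idea, colors fixed along sparse curves and diffused smoothly everywhere else, can be illustrated with a small harmonic fill. This is a simplified sketch and not the authors' renderer: it assumes a grayscale grid, single-valued pixel constraints instead of two-sided curve colors, plain Jacobi iteration, and averaging over whichever neighbors exist at the image border. The function name `diffuse` and its parameters are hypothetical.

```python
def diffuse(constraints, height, width, iterations=500):
    """Fill a height x width grid so that every unconstrained pixel equals
    the average of its existing 4-neighbors (a discrete Laplace equation).

    constraints maps (row, col) -> fixed value, standing in for the pixels
    covered by a rasterized curve.
    """
    grid = [[0.0] * width for _ in range(height)]
    for (r, c), value in constraints.items():
        grid[r][c] = value
    for _ in range(iterations):
        new_grid = [row[:] for row in grid]
        for r in range(height):
            for c in range(width):
                if (r, c) in constraints:
                    continue  # constrained pixels keep their color
                neighbors = [grid[rr][cc]
                             for rr, cc in ((r - 1, c), (r + 1, c),
                                            (r, c - 1), (r, c + 1))
                             if 0 <= rr < height and 0 <= cc < width]
                new_grid[r][c] = sum(neighbors) / len(neighbors)
        grid = new_grid
    return grid


# Two vertical "curves": a dark one on the left edge, a bright one on the right.
constraints = {(r, 0): 0.0 for r in range(5)}
constraints.update({(r, 4): 1.0 for r in range(5)})
image = diffuse(constraints, 5, 5)
```

With these two constraints the fill converges to a smooth left-to-right ramp, which is the kind of low-frequency content the base level is meant to carry. Recoloring a curve only requires changing its constraint values and solving again, which is why color transfer maps naturally onto this level.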

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a three-tier vectorization for portraits that separates geometry, lighting, and details in a way that supports targeted edits, but the recombination quality and quantitative support need checking.

The main thing here is a practical split of portrait images into sparse diffusion curves for low-frequency structure and color, larger Poisson regions for highlights and shadows, and pixel-scale PRs plus a generative model for the fine residuals. This lets the method handle color transfer, expression edits, light adjustments, and retouching through direct manipulation of the layers rather than pixel-level work. The use of Laplace linearity to bring alpha blending, dodge, and burn into the vector domain is a straightforward addition that fits the editing goals. Releasing code and models is also useful for anyone wanting to test it.

The IS-FLIP metric is a reasonable extension for catching illumination shifts that standard FLIP misses, and it is presented as an independent check rather than a circular fit. The central claim rests on the layers recombining cleanly after edits without visible artifacts or lost information. The abstract gives no reconstruction errors, no ablation on the layer separation, and no tables comparing against prior vectorization on the FFHQR set, so the strength of that claim is still open. If the full experiments show consistent results across diverse portraits and the generative residuals stay coherent with edited lower layers, the approach holds; otherwise the editing tasks could introduce side effects.

This is aimed at researchers and developers working on vector-based portrait tools who need editable representations rather than pure compression. A reader focused on graphics applications would find the method description and the editing examples worth looking at. The work is coherent enough on its own terms to deserve referee time, even if the experiments turn out to need expansion.

Referee Report

2 major / 1 minor

Summary. The paper proposes a 3-tier hierarchical vectorization for portrait images: a base layer of sparse diffusion curves (DCs) for salient geometry and low-frequency colors, a middle layer of editable Poisson regions (PRs) for specular highlights and shadows, and a top layer of pixel-sized PRs for high-frequency residuals and details, augmented by a deep generative model to synthesize residuals. This structure is claimed to support intuitive editing operations including semantic color transfer, facial expression editing, highlight/shadow adjustment via PR strength/shape, and automatic retouching. Linearity of the Laplace operator is used to introduce alpha blending, linear dodge, and linear burn for vector editing. A new illumination-sensitive FLIP metric (IS-FLIP) is introduced to better capture color-transfer changes, and the method is evaluated on the FFHQR dataset with the claim that it is effective for common portrait editing tasks. Code and models will be released.

Significance. If the extraction and recombination claims hold with low artifact rates across diverse inputs, the work would offer a practically useful advance in vector-based portrait editing by organizing primitives into semantically meaningful, independently editable layers rather than flat vectorizations. The planned public release of code and models is a clear strength that would aid reproducibility. The IS-FLIP extension addresses a relevant gap in evaluating illumination-aware edits, though its added value depends on the missing human-judgment validation.
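The human-judgment validation mentioned here is commonly done by rank-correlating metric scores with human ratings over a set of image pairs. The sketch below shows one such procedure, Spearman rank correlation with average ranks for ties. It is an assumption about how a validation could be run, not a procedure taken from the paper, and it contains no data from the paper.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(metric_scores, human_ratings):
    """Spearman rank correlation between metric scores and human ratings."""
    return pearson(average_ranks(metric_scores), average_ranks(human_ratings))
```

Computing `spearman` once for FLIP and once for IS-FLIP against the same human ratings would show which metric orders image pairs more like people do. A value near 1 or -1 (depending on whether the metric measures error or quality) indicates strong agreement.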

major comments (2)

[Abstract] The central claim that the 3-tier representation (sparse DCs + editable PRs + generated pixel-sized PRs) can be automatically extracted from any portrait and recombined (with or without edits) while preserving salient information and avoiding visible artifacts rests on unshown implementation choices; no reconstruction error metrics, no ablation studies on layer separation, and no consistency checks between edited lower layers and the generative residual model are supplied.
[Abstract] The assertion that IS-FLIP is 'more consistent with human perception than FLIP and other quality measures on portrait images' is load-bearing for the quantitative evaluation of editing tasks, yet the abstract supplies neither the validation procedure against human judgments nor any comparative tables on the FFHQR dataset.

minor comments (1)

[Abstract] The abstract states that the method 'supports color transfer, facial expression editing, highlight and shadow editing and automatic retouching' but does not clarify whether these operations are demonstrated with before/after examples or only described at a high level.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will revise the abstract and related sections to better highlight the supporting evidence from the full paper.

Referee: [Abstract] The central claim that the 3-tier representation (sparse DCs + editable PRs + generated pixel-sized PRs) can be automatically extracted from any portrait and recombined (with or without edits) while preserving salient information and avoiding visible artifacts rests on unshown implementation choices; no reconstruction error metrics, no ablation studies on layer separation, and no consistency checks between edited lower layers and the generative residual model are supplied.

Authors: The full manuscript reports reconstruction error metrics on FFHQR, includes ablation studies on the contribution of each hierarchical layer, and analyzes consistency between edited base/middle layers and the generative residual model in the results and supplementary material. The abstract summarizes these without including specific numbers or figures. We will revise the abstract to briefly reference the quantitative evaluations and key implementation details supporting the extraction and recombination claims. Revision: yes.

Referee: [Abstract] The assertion that IS-FLIP is 'more consistent with human perception than FLIP and other quality measures on portrait images' is load-bearing for the quantitative evaluation of editing tasks, yet the abstract supplies neither the validation procedure against human judgments nor any comparative tables on the FFHQR dataset.

Authors: The manuscript body contains comparative tables on FFHQR and details the IS-FLIP formulation to capture illumination-sensitive differences. The consistency claim with human perception derives from these metric comparisons and visual analysis rather than a formal user study. We will revise the abstract to reference the evaluation tables and procedure, and qualify the wording to reflect the basis of the claim. Revision: partial.

Circularity Check

0 steps flagged

No circularity: method construction is independent of claimed outputs

The abstract and description present a hierarchical decomposition into diffusion curves, Poisson regions, and residuals with a trained generative model, but contain no equations, fitted parameters, or self-citations that reduce the editing capabilities or IS-FLIP metric to re-expressions of their own inputs by construction. The representation is built bottom-up from image primitives, and the metric is described as an explicit extension of FLIP without load-bearing self-reference. This is the common case of a self-contained technical contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no explicit free parameters, mathematical axioms, or newly postulated entities; the approach relies on standard graphics primitives (diffusion curves, Poisson regions) and a trained generative model whose training details are not given.

pith-pipeline@v0.9.0 · 5833 in / 1066 out tokens · 23423 ms · 2026-05-24T11:41:54.181591+00:00 · methodology

Reference graph

Works this paper leans on

43 extracted references · 43 canonical work pages · 1 internal anchor

[1] M. Afifi, M. A. Brubaker, and M. S. Brown. Histogan: Controlling colors of gan-generated and real images via color histograms. In IEEE CVPR, 2021.
[2] P. Andersson, J. Nilsson, T. Akenine-Möller, M. Oskarsson, K. Åström, and M. D. Fairchild. FLIP: A Difference Evaluator for Alternating Images. Proceedings of the ACM on Computer Graphics and Interactive Techniques, 3(2):15:1–15:23, 2020.
[3] D. Bang and H. Shim. Mggan: Solving mode collapse using manifold-guided training. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2347–2356, 2021.
[4] S. Bell, K. Bala, and N. Snavely. Intrinsic images in the wild. ACM TOG, 33(4):1–12, 2014.
[5] S. Bi, X. Han, and Y. Yu. An l1 image transform for edge-preserving smoothing and scene-level intrinsic decomposition. ACM TOG, 34(4):1–12, 2015.
[6] S. Boyé, P. Barla, and G. Guennebaud. A vectorial solver for free-form vector gradients. ACM TOG, 31(6):1–9, 2012.
[7] J. Canny. A computational approach to edge detection. IEEE PAMI, (6):679–698, 1986.
[8] J. F. Canny. A computational approach to edge detection. IEEE PAMI, PAMI-8(6):679–698, 1986.
[9] K.-W. Chen, Y.-S. Luo, Y.-C. Lai, Y.-L. Chen, C.-Y. Yao, H.-K. Chu, and T.-Y. Lee. Image vectorization with real-time thin-plate spline. IEEE Transactions on Multimedia, 22(1):15–29, 2019.
[10] Z. Cheng, Y. Zheng, S. You, and I. Sato. Non-local intrinsic decomposition with near-infrared priors. In IEEE ICCV, pages 2521–2530, 2019.
[11] T. Dekel, C. Gan, D. Krishnan, C. Liu, and W. T. Freeman. Sparse, smart contours to represent and edit images. In IEEE CVPR, pages 3511–3520, 2018.
[12] J.-D. Favreau, F. Lafarge, and A. Bousseau. Photo2clipart: image abstraction and vectorization using layered linear gradients. ACM TOG, 36(6):1–11, 2017.
[13] M. Finch, J. Snyder, and H. Hoppe. Freeform vector graphics with controlled thin-plate splines. ACM TOG, 30(6):1–10, 2011.
[14] Q. Fu, Y. He, F. Hou, J. Zhang, A. Zeng, and Y.-J. Liu. Vectorization based color transfer for portrait images. Computer-Aided Design, 115:111–121, 2019.
[15] F. Hou, Q. Sun, Z. Fang, Y. Liu, S. Hu, H. Qin, A. Hao, and Y. He. Poisson vector graphics (PVG). IEEE TVCG, 26(2):1361–1371, 2020.
[16] Y.-K. Lai, S.-M. Hu, and R. R. Martin. Automatic and topology-preserving gradient mesh generation for image vectorization. ACM Transactions on Graphics (TOG), 28(3):1–8, 2009.
[17] C.-H. Lee, Z. Liu, L. Wu, and P. Luo. Maskgan: Towards diverse and interactive facial image manipulation. In IEEE CVPR, pages 5549–5558, 2020.
[18] M. Leordeanu, R. Sukthankar, and C. Sminchisescu. Efficient closed-form solution to generalized boundary detection. In ECCV, pages 516–529. Springer, 2012.
[19] J. Liao, Y. Yao, L. Yuan, G. Hua, and S. B. Kang. Visual attribute transfer through deep image analogy. arXiv:1705.01088, 2017.
[20] Z. Liao, H. Hoppe, D. Forsyth, and Y. Yu. A subdivision-based representation for vector image editing. IEEE Transactions on Visualization and Computer Graphics, 18(11):1858–1867, 2012.
[21] S. Lu, W. Jiang, X. Ding, C. S. Kaplan, X. Jin, F. Gao, and J. Chen. Depth-aware image vectorization and editing. The Visual Computer, 35(6-8):1027–1039, 2019.
[22] Z. Lu, T. Hu, L. Song, Z. Zhang, and R. He. Conditional expression synthesis with face parsing transformation. In ACM MM, pages 1083–1091, 2018.
[23] A. Orzan, A. Bousseau, H. Winnemöller, P. Barla, J. Thollot, and D. Salesin. Diffusion curves: A vector representation for smooth-shaded images. ACM TOG, 27(3):1–8, 2008.
[24] X. S. Poma, E. Riba, and A. Sappa. Dense extreme inception network: Towards a robust cnn model for edge detection. In IEEE WACV, pages 1923–1932, 2020.
[25] S. Sengupta, A. Kanazawa, C. D. Castillo, and D. W. Jacobs. Sfsnet: Learning shape, reflectance and illuminance of faces 'in the wild'. In IEEE CVPR, pages 6296–6305, 2018.
[26] A. Shafaei, J. J. Little, and M. Schmidt. Autoretouch: Automatic professional face retouching. In IEEE WACV, pages 990–998, January 2021.
[27] H.-L. Shen and Z.-H. Zheng. Real-time highlight removal using intensity ratio. Applied Optics, 52(19):4483–4493, 2013.
[28] L. Sheng, Z. Lin, J. Shao, and X. Wang. Avatar-net: Multi-scale zero-shot style transfer by feature decoration. In IEEE CVPR, pages 8242–8250, 2018.
[29] Y. Shih, S. Paris, C. Barnes, W. T. Freeman, and F. Durand. Style transfer for headshot portraits. ACM TOG, 33(4):148, 2014.
[30] Z. Shu, S. Hadap, E. Shechtman, K. Sunkavalli, S. Paris, and D. Samaras. Portrait lighting transfer using a mass transport approach. ACM TOG, 36(4):1, 2017.
[31] Z. Shu, E. Yumer, S. Hadap, K. Sunkavalli, E. Shechtman, and D. Samaras. Neural face editing with intrinsic image disentangling. In IEEE CVPR, pages 5541–5550, 2017.
[32] J. Sun, L. Liang, F. Wen, and H.-Y. Shum. Image vectorization using optimized gradient meshes. ACM TOG, 26(3):Article 11, 2007.
[33] H. Thanh-Tung and T. Tran. Catastrophic forgetting and mode collapse in gans. In 2020 International Joint Conference on Neural Networks (IJCNN), pages 1–10. IEEE, 2020.
[34] T.-C. Wang, M.-Y. Liu, J.-Y. Zhu, A. Tao, J. Kautz, and B. Catanzaro. High-resolution image synthesis and semantic manipulation with conditional gans. In IEEE CVPR, 2018.
[35] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE TIP, 13(4):600–612, 2004.
[36] G. Xie, X. Sun, X. Tong, and D. Nowrouzezahrai. Hierarchical diffusion curves for accurate automatic image vectorization. ACM TOG, 33(6):1–11, 2014.
[37] Q. Xie, M.-T. Luong, E. Hovy, and Q. V. Le. Self-training with noisy student improves imagenet classification. In IEEE CVPR, pages 10687–10698, 2020.
[38] X. Zhang, S. Fanello, Y.-T. Tsai, T. Sun, T. Xue, R. Pandey, S. Orts-Escolano, P. Davidson, C. Rhemann, P. Debevec, et al. Neural light transport for relighting and view synthesis. arXiv:2008.03806, 2020.
[39] S. Zhao, F. Durand, and C. Zheng. Inverse diffusion curves using shape optimization. IEEE TVCG, 24(7):2153–2166, 2017.
[40] H. Zhou, J. Zheng, and L. Wei. Representing images using curvilinear feature driven subdivision surfaces. IEEE Transactions on Image Processing, 23(8):3268–3280, 2014.
[41] H. Zhou, S. Hadap, K. Sunkavalli, and D. W. Jacobs. Deep single-image portrait relighting. In IEEE ICCV, pages 7194–7202, 2019.
[42] H. Zhou, X. Yu, and D. W. Jacobs. Glosh: Global-local spherical harmonics for intrinsic image decomposition. In IEEE ICCV, pages 7820–7829, 2019.
[43] B. Zoph, G. Ghiasi, T.-Y. Lin, Y. Cui, H. Liu, E. D. Cubuk, and Q. Le. Rethinking pre-training and self-training. NeurIPS, 33, 2020.
