NGL: Natural Garment Language for Training-Free Sewing Pattern Estimation
Pith reviewed 2026-05-21 11:58 UTC · model grok-4.3
The pith
A new Natural Garment Language (NGL) lets vision-language models drive sewing pattern estimation from images without any additional training.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that NGL, a domain-specific language that describes garments in terms suited to vision-language models, enables a fully training-free pipeline: VLMs are queried to produce structured garment specifications in NGL from input images, and deterministic rules then convert those specifications into valid sewing patterns. The paper reports state-of-the-art results on geometry metrics, higher preference in perceptual tests, and recovery of multi-layer outfits from in-the-wild images, whereas prior approaches are trained on synthetic data and focus mostly on single-layer garments.
What carries the argument
NGL, a novel domain-specific language that represents garments as structured specifications aligned with the natural descriptive abilities of vision-language models. It serves as the intermediary between VLM extraction and the deterministic conversion to sewing patterns.
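To make the role of such an intermediary concrete, the sketch below shows the general shape of the idea: a structured specification (standing in for VLM output) is checked against a fixed vocabulary and mapped by fixed rules to numeric pattern parameters. The field names, the vocabulary, and the numeric values are hypothetical; the paper's actual NGL grammar and conversion rules are not reproduced in this summary.

```python
from dataclasses import dataclass

# Hypothetical vocabulary: NGL's real grammar is not given in this summary.
SLEEVE_LENGTH = {"sleeveless": 0.0, "short": 0.3, "long": 1.0}
HEM_LENGTH = {"cropped": 0.4, "hip": 0.6, "knee": 1.0}


@dataclass(frozen=True)
class GarmentSpec:
    """One garment layer described in descriptive, VLM-friendly terms."""
    layer: int   # 0 = innermost layer
    kind: str    # e.g. "shirt", "coat"
    sleeve: str  # key of SLEEVE_LENGTH
    hem: str     # key of HEM_LENGTH


def to_pattern_params(spec: GarmentSpec) -> dict:
    """Deterministically map a spec to numeric pattern parameters.

    Raises ValueError on terms outside the vocabulary, so malformed
    VLM output fails loudly instead of yielding an arbitrary pattern.
    """
    if spec.sleeve not in SLEEVE_LENGTH:
        raise ValueError(f"unknown sleeve term: {spec.sleeve!r}")
    if spec.hem not in HEM_LENGTH:
        raise ValueError(f"unknown hem term: {spec.hem!r}")
    return {
        "layer": spec.layer,
        "kind": spec.kind,
        "sleeve_length": SLEEVE_LENGTH[spec.sleeve],
        "hem_length": HEM_LENGTH[spec.hem],
    }


def convert_outfit(specs: list[GarmentSpec]) -> list[dict]:
    """Convert a multi-layer outfit, ordered from inner to outer layer."""
    return [to_pattern_params(s) for s in sorted(specs, key=lambda s: s.layer)]


# Toy two-layer outfit, listed outer layer first to show the reordering.
outfit = [
    GarmentSpec(layer=1, kind="coat", sleeve="long", hem="knee"),
    GarmentSpec(layer=0, kind="shirt", sleeve="short", hem="hip"),
]
params = convert_outfit(outfit)
```

The point of the design is that the model only has to choose among descriptive terms, while all numeric detail lives in the fixed tables, so the same specification always yields the same parameters.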
Load-bearing premise
Large vision-language models can reliably extract accurate and complete structured garment specifications in NGL from in-the-wild images with occlusions and multi-layer outfits, and the deterministic conversion rules always produce valid sewing patterns matching the visual input.
What would settle it
A counterexample would be an in-the-wild image of a person wearing multiple layered garments where the extracted NGL leads to a sewing pattern whose 3D reconstruction does not align with the visible clothing layers or parts in the original photo.
Original abstract
Estimating sewing patterns from images is a practical approach for creating high-quality 3D garments, but it remains challenging due to the scarcity of paired real-world image and sewing-pattern data. Existing methods address this limitation by training vision-language models (VLMs) to learn low-level sewing-pattern representations from synthetic garments sampled from parametric garment models. However, they often struggle to generalize to in-the-wild images, fail to capture real-world correlations between garment parts, and are restricted to single-layer outfits. In contrast, we observe that VLMs are effective at describing garments in natural language, but mapping these descriptions into valid sewing patterns remains difficult. To bridge this gap, we propose NGL (Natural Garment Language), a novel domain-specific language that represents garments in terms aligned with VLMs' natural descriptive abilities. Leveraging NGL, we introduce a fully training-free pipeline that queries large VLMs to extract structured garment specifications and deterministically converts them into valid sewing patterns. We evaluate our method on the Dress4D, CloSe and a newly collected dataset of 253 in-the-wild fashion images. Our approach achieves state-of-the-art performance on standard geometry metrics and is preferred in both human and GPT-based perceptual evaluations compared to existing baselines. Furthermore, NGL recovers multi-layer outfits whereas competing methods focus mostly on single-layer garments, highlighting its strong generalization to real-world images even with occluded parts. These results demonstrate that an efficient garment representation is critical for sewing pattern estimation with VLMs. Our code and data will be released for research use.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Natural Garment Language (NGL), a domain-specific language for representing garments in terms aligned with VLM descriptive abilities. It describes a fully training-free pipeline that queries large VLMs to extract structured NGL specifications from images and applies deterministic conversion rules to produce valid sewing patterns. The method is evaluated on Dress4D, CloSe, and a new 253-image in-the-wild dataset, claiming state-of-the-art geometry metrics, human and GPT-based perceptual preference over baselines, and improved handling of multi-layer outfits with occlusions.
Significance. If the core assumptions hold, the work would be significant as a training-free alternative that leverages existing VLMs without synthetic paired data, potentially improving generalization to real-world multi-layer garments. The introduction of NGL as an intermediary representation and the release of code/data are positive contributions to reproducibility in garment reconstruction.
major comments (2)
- [Abstract and §4] Abstract and evaluation description: no quantitative breakdown is provided for VLM extraction accuracy, NGL completeness on occluded/multi-layer cases, or the fraction of inputs for which the deterministic conversion produces topologically valid sewing patterns; these metrics are load-bearing for the SOTA geometry and perceptual claims on the 253-image set.
- [§3.2] Method section on conversion rules: the claim that the hand-crafted rules always emit valid, image-faithful patterns is not supported by reported validation, failure-case analysis, or success rates, leaving open whether gaps in rule coverage (e.g., inner-layer attachment or occluded darts) undermine the pipeline on in-the-wild data.
minor comments (1)
- [§2] The NGL grammar and example specifications would benefit from a formal syntax definition or additional illustrative figures to improve clarity for readers unfamiliar with the language.
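The per-component breakdown requested in the first major comment could be tabulated along the following lines. This is a minimal sketch that assumes each evaluated image carries three hypothetical manual annotations (whether the outfit is multi-layer, whether the extracted specification was judged correct, and whether conversion produced a valid pattern); the toy records are illustrative and are not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Per-image audit result; all fields are hypothetical annotations."""
    multi_layer: bool    # outfit has more than one garment layer
    spec_correct: bool   # extracted spec judged correct by a human
    pattern_valid: bool  # conversion produced a valid pattern


def rate(records, predicate):
    """Fraction of records satisfying predicate; None for an empty group."""
    records = list(records)
    if not records:
        return None
    return sum(1 for r in records if predicate(r)) / len(records)


def breakdown(records):
    """Report extraction accuracy and conversion validity per subset."""
    groups = {
        "all": records,
        "single_layer": [r for r in records if not r.multi_layer],
        "multi_layer": [r for r in records if r.multi_layer],
    }
    return {
        name: {
            "n": len(group),
            "extraction_accuracy": rate(group, lambda r: r.spec_correct),
            "valid_pattern_rate": rate(group, lambda r: r.pattern_valid),
        }
        for name, group in groups.items()
    }


# Toy records for illustration only; these are not results from the paper.
toy = [
    Record(multi_layer=False, spec_correct=True, pattern_valid=True),
    Record(multi_layer=False, spec_correct=False, pattern_valid=True),
    Record(multi_layer=True, spec_correct=True, pattern_valid=False),
    Record(multi_layer=True, spec_correct=True, pattern_valid=True),
]
report = breakdown(toy)
```

Reporting the single-layer and multi-layer subsets separately keeps a strong aggregate number from hiding weak performance on the harder cases.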
Simulated Author's Rebuttal
We thank the referee for their thoughtful comments and suggestions. We address the major comments point by point below, providing clarifications and committing to revisions where appropriate to strengthen the manuscript.
- Referee: [Abstract and §4] Abstract and evaluation description: no quantitative breakdown is provided for VLM extraction accuracy, NGL completeness on occluded/multi-layer cases, or the fraction of inputs for which the deterministic conversion produces topologically valid sewing patterns; these metrics are load-bearing for the SOTA geometry and perceptual claims on the 253-image set.
  Authors: We acknowledge that a more granular quantitative breakdown of the pipeline components would enhance the transparency of our results. Our primary focus in the evaluation was on the end-to-end performance metrics, which reflect the practical utility of the method for sewing pattern estimation. That said, we agree this is a valid point and will revise the manuscript to include additional metrics, such as VLM extraction accuracy assessed via manual verification on a subset of the 253-image dataset, NGL completeness for multi-layer and occluded cases, and the success rate of the deterministic conversion in producing valid patterns. This will better support our claims.
  Revision: yes
- Referee: [§3.2] Method section on conversion rules: the claim that the hand-crafted rules always emit valid, image-faithful patterns is not supported by reported validation, failure-case analysis, or success rates, leaving open whether gaps in rule coverage (e.g., inner-layer attachment or occluded darts) undermine the pipeline on in-the-wild data.
  Authors: The hand-crafted rules are intended to deterministically map NGL specifications to valid sewing patterns by construction, leveraging the structured nature of NGL. While we did not include explicit success rates or a dedicated failure-case analysis in the original submission, our internal testing on the evaluation datasets showed consistent production of valid patterns. We recognize the importance of this documentation and will add a validation subsection in §3.2, including success rates, examples of rule applications for occluded and multi-layer garments, and discussion of potential limitations in rule coverage.
  Revision: yes
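A success rate of the kind promised in the second response presupposes a mechanical validity check. The sketch below illustrates one possible check under a deliberately simple, assumed notion of validity: every stitch must reference existing panel edges, no edge may be stitched to itself, and no edge may appear in more than one stitch. The data layout and these criteria are hypothetical and are not taken from the paper.

```python
def validate_pattern(panels, stitches):
    """Return a list of problems; an empty list means the checks passed.

    panels:   dict mapping panel name -> number of edges
    stitches: list of ((panel, edge), (panel, edge)) pairs
    These checks are an illustrative notion of validity, not the paper's.
    """
    problems = []
    used = set()
    for a, b in stitches:
        # Both stitch ends must point at an existing edge of a known panel.
        for panel, edge in (a, b):
            if panel not in panels:
                problems.append(f"unknown panel {panel!r}")
            elif not 0 <= edge < panels[panel]:
                problems.append(f"panel {panel!r} has no edge {edge}")
        if a == b:
            problems.append(f"edge {a} stitched to itself")
        # An edge may take part in at most one stitch.
        for end in {a, b}:
            if end in used:
                problems.append(f"edge {end} used in more than one stitch")
            used.add(end)
    return problems


def success_rate(patterns):
    """Fraction of (panels, stitches) pairs that pass every check."""
    if not patterns:
        raise ValueError("no patterns to evaluate")
    ok = sum(1 for panels, stitches in patterns
             if not validate_pattern(panels, stitches))
    return ok / len(patterns)


# Toy patterns for illustration only.
good = ({"front": 4, "back": 4},
        [(("front", 1), ("back", 3)), (("front", 3), ("back", 1))])
bad = ({"front": 4}, [(("front", 1), ("back", 3))])
rate = success_rate([good, bad])
```

Returning the list of problems rather than a bare boolean also gives the raw material for a failure-case analysis, since recurring messages point at gaps in rule coverage.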
Circularity Check
No significant circularity in the NGL pipeline
The paper proposes NGL as a new domain-specific language and a training-free pipeline that queries external VLMs for structured specifications followed by hand-crafted deterministic conversion rules to sewing patterns. No parameters are fitted to data, no predictions reduce to inputs by construction, and no load-bearing self-citations or uniqueness theorems are invoked to justify the core method. The approach is self-contained as an engineering contribution relying on VLM capabilities and explicit rules, with evaluations on Dress4D, CloSe, and a new 253-image set providing independent empirical support.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Large VLMs can extract accurate structured garment specifications in NGL from in-the-wild images, including occluded and multi-layer cases
invented entities (1)
- NGL (Natural Garment Language): no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We propose NGL (Natural Garment Language), a novel intermediate language that restructures GarmentCode into a representation more understandable to language models... A deterministic parser then maps NGL back into GarmentCode"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "training-free pipeline that queries large VLMs to extract structured garment specifications and deterministically converts them into valid sewing patterns"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.