CAD-Coder: Text-to-CAD Generation with Chain-of-Thought and Geometric Reward

Dong Xu; Jing Zhang; Qian Yu; Xilin Wang; Ximing Xing; Yandong Guan

arxiv: 2505.19713 · v3 · pith:6DM7LPVAnew · submitted 2025-05-26 · 💻 cs.GR

Yandong Guan, Xilin Wang, Ximing Xing, Jing Zhang, Dong Xu, Qian Yu

Pith reviewed 2026-05-19 14:06 UTC · model grok-4.3

classification 💻 cs.GR

keywords: text-to-CAD · CadQuery · large language models · reinforcement learning · chain-of-thought · geometric reward · 3D model generation

The pith

CAD-Coder lets language models generate valid complex CAD models from text by producing optimized CadQuery scripts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CAD-Coder, a framework that turns natural language descriptions into 3D CAD models by generating executable CadQuery Python scripts rather than emitting geometry directly. This representation lets each output be executed and checked geometrically, and gives the model access to a full parametric modeling vocabulary. Training proceeds in two stages: supervised fine-tuning on paired text and script data, then reinforcement learning with GRPO (Group Relative Policy Optimization, which the abstract expands as "Group Reward Policy Optimization"), where each output is scored by Chamfer Distance for shape accuracy plus a format check for code validity. A chain-of-thought planning step is added to improve step-by-step reasoning about the design. The authors assembled a dataset of 110K text-CadQuery-3D model triplets and 1.5K chain-of-thought samples to train the system, and report that it produces diverse, valid, and complex CAD models.
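To make the script representation concrete, the sketch below shows the kind of CadQuery program a model might emit, together with a purely static format check. The CadQuery calls inside the sample (`Workplane`, `box`, `faces`, `workplane`, `hole`) are standard library API, but the part itself, the output variable name `result`, and the criteria in `static_format_check` are illustrative assumptions; the paper's actual format reward is not specified in this summary.

```python
import ast

# Illustrative script text only; it is parsed, not executed, so CadQuery
# does not need to be installed to run this sketch.
SAMPLE_SCRIPT = '''
import cadquery as cq

# 40 x 20 x 5 plate with a 6 mm through-hole centred on the top face
result = (
    cq.Workplane("XY")
    .box(40, 20, 5)
    .faces(">Z")
    .workplane()
    .hole(6)
)
'''


def static_format_check(script: str, output_name: str = "result") -> bool:
    """Hypothetical check: the text parses as Python, imports cadquery at the
    top level, and assigns a top-level variable named `output_name`."""
    try:
        tree = ast.parse(script)
    except SyntaxError:
        return False

    imports_cadquery = False
    assigns_output = False
    for node in tree.body:
        if isinstance(node, ast.Import):
            imports_cadquery |= any(a.name == "cadquery" for a in node.names)
        elif isinstance(node, ast.ImportFrom):
            imports_cadquery |= node.module == "cadquery"
        elif isinstance(node, ast.Assign):
            assigns_output |= any(
                isinstance(t, ast.Name) and t.id == output_name
                for t in node.targets
            )
    return imports_cadquery and assigns_output
```

A static check like this only catches malformed output. Whether the script actually builds a solid can be known only by executing it in a CadQuery environment.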

Core claim

By casting text-to-CAD as the generation of parametric CadQuery scripts and training with a two-stage pipeline of supervised fine-tuning followed by reinforcement learning under a geometric reward that combines Chamfer Distance with format compliance, together with chain-of-thought planning, the approach enables large language models to produce diverse, valid, and complex CAD models directly from natural language.

What carries the argument

The CAD-specific reward, which adds a Chamfer Distance term for geometric fidelity to a format reward for script correctness and is applied inside Group Relative Policy Optimization (GRPO) after supervised fine-tuning.
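A minimal sketch of such a composite reward, in pure Python, may help fix ideas. It uses the squared-distance symmetric form of Chamfer Distance, which is one common variant. The exponential mapping from distance to reward, the weights `w_geo` and `w_fmt`, and the `sharpness` constant are all hypothetical choices made for illustration and are not taken from the paper.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def _mean_nearest_sq(src: Sequence[Point], dst: Sequence[Point]) -> float:
    """Mean squared distance from each point in `src` to its nearest point in `dst`."""
    total = 0.0
    for p in src:
        total += min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in dst)
    return total / len(src)


def chamfer_distance(p: Sequence[Point], q: Sequence[Point]) -> float:
    """Symmetric Chamfer Distance (squared-distance variant), brute force."""
    return _mean_nearest_sq(p, q) + _mean_nearest_sq(q, p)


def cad_reward(
    generated: Optional[Sequence[Point]],
    target: Sequence[Point],
    format_ok: bool,
    w_geo: float = 1.0,      # assumed weight, not from the paper
    w_fmt: float = 0.5,      # assumed weight, not from the paper
    sharpness: float = 10.0  # assumed scale for the distance-to-reward map
) -> float:
    """Hypothetical composite reward: weighted sum of a geometric term in
    (0, 1] and a binary format term. Outputs that fail the format check,
    or that produced no geometry, receive no geometric credit."""
    fmt = 1.0 if format_ok else 0.0
    if generated is None or not format_ok:
        geo = 0.0
    else:
        geo = math.exp(-sharpness * chamfer_distance(generated, target))
    return w_geo * geo + w_fmt * fmt


# Usage: a perfect match, a slightly shifted copy, and a failed generation.
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
shifted = [(x + 0.1, y, z) for x, y, z in target]
r_exact = cad_reward(target, target, format_ok=True)
r_shifted = cad_reward(shifted, target, format_ok=True)
r_failed = cad_reward(None, target, format_ok=False)
```

With these assumed settings the exact match scores highest, the shifted copy scores lower, and the failed generation scores zero, which is the ordering the reinforcement learning stage needs in order to prefer accurate, well-formed scripts.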

If this is right

Language models can output executable code that directly produces geometrically correct CAD parts.
The resulting models integrate immediately with standard CAD tools for further editing or validation.
Chain-of-thought reasoning enables the model to handle more intricate design sequences than direct generation methods.
The same training recipe can scale to larger datasets for continued gains in model complexity.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Designers could describe a mechanical part in plain words and receive a production-ready CAD file without drawing it themselves.
The method could be combined with existing CAD libraries to support assemblies or parametric families of parts.
Reward-driven code generation of this form might transfer to other engineering domains that rely on scripted geometry.

Load-bearing premise

A reward signal built from Chamfer Distance and code format compliance is enough to guarantee that the generated models are both geometrically accurate and ready for practical use.

What would settle it

Running the generated CAD models through manufacturing simulation or expert review and finding frequent cases of non-manufacturable topology or hidden errors despite low Chamfer Distance scores would show the reward does not suffice.

Figures

Figures reproduced from arXiv: 2505.19713 by Dong Xu, Jing Zhang, Qian Yu, Xilin Wang, Ximing Xing, Yandong Guan.

**Figure 2.** CAD-Coder Training Pipeline. Stage 1: Supervised Fine-Tuning for CAD Code Generation. We begin by performing SFT to equip the model with the basic capability to translate natural language descriptions into executable CadQuery code. Unlike generic code generation, CAD code must follow strict syntactic and geometric constraints. This phase serves as a foundation that enables the model to understand the Cad…

**Figure 3.** Qualitative comparison between baseline methods and different model variants under…

**Figure 4.** Overview of our annotation pipeline. Given CAD command sequences and natural language…

**Figure 5.** (a) Chamfer Distance (CD) distributions of generated CAD models trained with different…

Original abstract

In this work, we introduce CAD-Coder, a novel framework that reformulates text-to-CAD as the generation of CadQuery scripts - a Python-based, parametric CAD language. This representation enables direct geometric validation, a richer modeling vocabulary, and seamless integration with existing LLMs. To further enhance code validity and geometric fidelity, we propose a two-stage learning pipeline: (1) supervised fine-tuning on paired text-CadQuery data, and (2) reinforcement learning with Group Reward Policy Optimization (GRPO), guided by a CAD-specific reward comprising both a geometric reward (Chamfer Distance) and a format reward. We also introduce a chain-of-thought (CoT) planning process to improve model reasoning, and construct a large-scale, high-quality dataset of 110K text-CadQuery-3D model triplets and 1.5K CoT samples via an automated pipeline. Extensive experiments demonstrate that CAD-Coder enables LLMs to generate diverse, valid, and complex CAD models directly from natural language, advancing the state of the art of text-to-CAD generation and geometric reasoning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces CAD-Coder, a framework reformulating text-to-CAD generation as the production of CadQuery Python scripts. It uses a two-stage pipeline consisting of supervised fine-tuning on a constructed 110K text-CadQuery-3D model dataset followed by Group Relative Policy Optimization (GRPO) reinforcement learning. The RL stage is guided by a composite reward that includes a geometric component based on Chamfer Distance (computed after point sampling) and a format reward, augmented by chain-of-thought planning. The central claim is that this approach enables LLMs to generate diverse, valid, and complex CAD models directly from natural language, advancing the state of the art in text-to-CAD and geometric reasoning.

Significance. If the empirical claims hold, the work would be significant for the graphics and CAD communities by demonstrating a scalable way to leverage LLMs for parametric CAD script generation, potentially lowering barriers to complex 3D modeling. The automated construction of a large paired dataset and the explicit use of an external geometric metric (Chamfer Distance) rather than self-referential signals are positive features. However, the practical impact hinges on whether the chosen reward produces models that are not only geometrically close but also topologically valid and manufacturable.

major comments (2)

[Abstract and Section 4] Abstract and reward definition (Section 4): The geometric reward relies on Chamfer Distance after point sampling from the generated CadQuery output. Chamfer Distance quantifies surface proximity but is insensitive to self-intersections, non-manifold edges, invalid B-rep topology, or parametric constraints that would cause the model to fail as a solid in a CAD kernel. Because the paper positions this reward (together with the format reward) as the mechanism delivering 'valid' and 'practically usable' models without further human or domain-specific validation, this choice is load-bearing for the central claim; additional topological or manufacturing-validity metrics would be required to substantiate the claim.
[Section 5] Section 5 (Experiments): The abstract asserts 'extensive experiments' showing superiority in diversity, validity, and complexity, yet the provided description contains no quantitative tables, baseline comparisons, ablation results on the GRPO coefficients, or error analysis of failure modes (e.g., rate of topologically invalid outputs). Without these details it is impossible to evaluate whether the reported gains are robust or whether the Chamfer-based reward actually correlates with downstream usability.
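The first major comment can be illustrated with a small, self-contained sketch. Here point lists stand in for surface samples, and a duplicated coincident face stands in for one kind of topological defect; both are simplifying assumptions for illustration and do not reproduce the paper's sampling procedure. The squared-distance symmetric Chamfer Distance is used.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def chamfer_distance(p: Sequence[Point], q: Sequence[Point]) -> float:
    """Symmetric Chamfer Distance (squared-distance variant), brute force."""
    def one_way(src: Sequence[Point], dst: Sequence[Point]) -> float:
        return sum(
            min(sum((a - b) ** 2 for a, b in zip(s, d)) for d in dst)
            for s in src
        ) / len(src)
    return one_way(p, q) + one_way(q, p)


def grid_face(n: int, z: float) -> List[Point]:
    """n x n grid of sample points on the unit square at height z."""
    return [(i / (n - 1), j / (n - 1), z) for i in range(n) for j in range(n)]


# Reference samples: bottom and top faces of a unit slab.
reference = grid_face(5, 0.0) + grid_face(5, 1.0)

# Stand-in for a defective model: the top face appears twice, as if two
# bodies shared a coincident face. The sampled points are indistinguishable.
duplicated = reference + grid_face(5, 1.0)

# A topologically clean model that is merely shifted by 0.05 along x.
shifted = [(x + 0.05, y, z) for x, y, z in reference]

cd_duplicated = chamfer_distance(reference, duplicated)
cd_shifted = chamfer_distance(reference, shifted)
```

In this toy setting the duplicated-face model has zero Chamfer Distance to the reference while the clean but slightly shifted model is penalized, so a reward built only on this metric would rank the defective output higher. A check in the CAD kernel would be needed to tell them apart.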

minor comments (2)

[Section 3] The construction pipeline for the 110K dataset and the 1.5K CoT samples is mentioned but lacks sufficient detail on filtering criteria, quality assurance, or diversity metrics; a dedicated subsection or appendix would improve reproducibility.
[Section 4] Notation for the GRPO objective and the precise weighting between geometric and format rewards should be formalized with an equation rather than prose description.
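One way the formalization requested in the second minor comment could look is sketched below. The group-normalized advantage follows the standard GRPO formulation from DeepSeekMath; the weights $\lambda_{\mathrm{geo}}$ and $\lambda_{\mathrm{fmt}}$ and the map $f$ from Chamfer Distance to a reward are placeholders, since the summary above does not give the paper's actual choices. Here $\hat{S}_i$ denotes points sampled from the $i$-th generated model in a group of $G$ outputs and $S^{*}$ denotes points sampled from the ground-truth model.

```latex
\mathrm{CD}(P, Q) = \frac{1}{|P|} \sum_{p \in P} \min_{q \in Q} \lVert p - q \rVert_2^2
                  + \frac{1}{|Q|} \sum_{q \in Q} \min_{p \in P} \lVert q - p \rVert_2^2

r_i = \lambda_{\mathrm{geo}} \, f\!\big(\mathrm{CD}(\hat{S}_i, S^{*})\big)
    + \lambda_{\mathrm{fmt}} \, r^{\mathrm{fmt}}_i ,
\qquad i = 1, \dots, G

\hat{A}_i = \frac{r_i - \operatorname{mean}(r_1, \dots, r_G)}
                 {\operatorname{std}(r_1, \dots, r_G)}
```

Because $\hat{A}_i$ is normalized within each group, only the relative ordering and spread of rewards matter, which is why the ratio $\lambda_{\mathrm{geo}} / \lambda_{\mathrm{fmt}}$ would be the quantity to report and ablate.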

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below, indicating where revisions have been made to strengthen the manuscript.

Referee: [Abstract and Section 4] Abstract and reward definition (Section 4): The geometric reward relies on Chamfer Distance after point sampling from the generated CadQuery output. Chamfer Distance quantifies surface proximity but is insensitive to self-intersections, non-manifold edges, invalid B-rep topology, or parametric constraints that would cause the model to fail as a solid in a CAD kernel. Because the paper positions this reward (together with the format reward) as the mechanism delivering 'valid' and 'practically usable' models without further human or domain-specific validation, this choice is load-bearing for the central claim; additional topological or manufacturing-validity metrics would be required to substantiate the claim.

Authors: We thank the referee for this important observation. Chamfer Distance indeed measures surface proximity and is insensitive to topological defects such as self-intersections or non-manifold geometry. In our pipeline the format reward requires the CadQuery script to execute successfully before point sampling occurs, providing a basic validity filter. Nevertheless, we agree this does not fully address manufacturability or complex topological validity. In the revised manuscript we have added an explicit limitations paragraph in Section 4, clarified the proxy nature of the current reward, and included supplementary topological validation rates (e.g., successful B-rep solid checks) in the experimental results.

Revision: partial

Referee: [Section 5] Section 5 (Experiments): The abstract asserts 'extensive experiments' showing superiority in diversity, validity, and complexity, yet the provided description contains no quantitative tables, baseline comparisons, ablation results on the GRPO coefficients, or error analysis of failure modes (e.g., rate of topologically invalid outputs). Without these details it is impossible to evaluate whether the reported gains are robust or whether the Chamfer-based reward actually correlates with downstream usability.

Authors: We agree that the experimental presentation must be more comprehensive to support the claims. The full manuscript contains baseline comparisons and quantitative metrics, but we have substantially expanded Section 5 in the revision: new Table 1 reports validity, diversity, and complexity scores against baselines; Table 2 presents ablations on GRPO reward coefficients; and a dedicated error-analysis subsection now quantifies failure modes including the rate of topologically invalid outputs. These additions allow direct assessment of robustness and the relationship between the reward signal and practical usability.

Revision: yes

Circularity Check

0 steps flagged

No circularity: derivation relies on external Chamfer Distance metric and independent dataset construction

full rationale

The paper describes a standard two-stage pipeline of supervised fine-tuning on an externally constructed 110K text-CadQuery dataset followed by GRPO reinforcement learning. The geometric reward is defined using Chamfer Distance computed against ground-truth point samples plus a separate format reward; neither quantity is defined in terms of the model's own outputs or predictions. No self-citations, uniqueness theorems, or ansatzes from prior author work are invoked as load-bearing steps in the provided abstract and description. The central claim that the resulting models are valid and complex is an empirical outcome of optimizing the external proxy rather than a definitional tautology. The derivation chain therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim depends on the assumption that CadQuery can represent the target class of CAD models and that the automatically generated dataset faithfully captures real geometric relationships; these are domain assumptions rather than derived results.

free parameters (1)

GRPO reward coefficients
Weights balancing geometric Chamfer Distance and format rewards are not specified in the abstract but must be chosen to make the RL stage work.
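Why these coefficients count as a free parameter can be seen in a short sketch of group-normalized advantages. The candidate scores and both weight settings below are invented for illustration, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def combine(scores: Sequence[Tuple[float, float]], w_geo: float, w_fmt: float) -> List[float]:
    """Weighted sum of (geometric, format) scores for each candidate."""
    return [w_geo * geo + w_fmt * fmt for geo, fmt in scores]


def group_advantages(rewards: Sequence[float], eps: float = 1e-8) -> List[float]:
    """GRPO-style advantage: reward standardized within its sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical (geometric score, format score) pairs for three sampled outputs.
group = [(0.9, 0.5), (0.5, 1.0), (0.7, 0.75)]

# Two hypothetical weightings of the same scores.
adv_geo_heavy = group_advantages(combine(group, w_geo=1.0, w_fmt=0.2))
adv_fmt_heavy = group_advantages(combine(group, w_geo=0.2, w_fmt=1.0))

best_geo_heavy = max(range(len(group)), key=adv_geo_heavy.__getitem__)
best_fmt_heavy = max(range(len(group)), key=adv_fmt_heavy.__getitem__)
```

Under the geometry-heavy weighting the first candidate receives the largest advantage, while under the format-heavy weighting the second one does. The policy is therefore pushed toward different outputs depending on a choice the abstract leaves unstated.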

axioms (1)

Domain assumption: CadQuery scripts provide a sufficiently expressive and directly executable representation for complex parametric CAD models.
Invoked when the authors choose CadQuery as the output format to enable geometric validation.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "reinforcement learning with Group Reward Policy Optimization (GRPO), guided by a CAD-specific reward comprising both a geometric reward (Chamfer Distance) and a format reward"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "two-stage learning pipeline: (1) supervised fine-tuning ... (2) reinforcement learning with GRPO"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

ArtiCAD: Articulated CAD Assembly Design via Multi-Agent Code Generation
cs.CV · 2026-04 · unverdicted · novelty 7.0

ArtiCAD presents the first training-free multi-agent framework that generates articulated, editable CAD assemblies from text or images by predicting assembly relationships early and using validation with rollback.

InCoder-32B-Thinking: Industrial Code World Model for Thinking
cs.AR · 2026-04 · unverdicted · novelty 6.0

InCoder-32B-Thinking uses error-feedback synthesized thinking traces and a code world model to reach top open-source scores on general and industrial code benchmarks including 81.3% on LiveCodeBench and 84.0% on CAD-Coder.

Pointer-CAD: Unifying B-Rep and Command Sequences via Pointer-based Edges & Faces Selection
cs.CV · 2026-03 · unverdicted · novelty 6.0

Pointer-CAD unifies B-Rep geometry with command sequences via pointer-based entity selection, allowing LLMs to perform complex CAD edits while cutting topological errors from quantization.

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages · cited by 3 Pith papers · 8 internal anchors

[1]

Program Synthesis with Large Language Models

Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, et al. Program synthesis with large language models.arXiv preprint arXiv:2108.07732, 2021

work page internal anchor Pith review Pith/arXiv arXiv 2021
[2]

Query2cad: Generating cad models using natural language queries

Akshay Badagabettu, Sai Sravan Yarlagadda, and Amir Barati Farimani. Query2cad: Generating cad models using natural language queries.arXiv preprint arXiv:2406.00144, 2024

work page arXiv 2024
[3]

Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian...

work page 2021
[4]

Img2cad: Conditioned 3d cad model generation from single image with structured visual geometry.arXiv preprint arXiv:2410.03417, 2024

Tianrun Chen, Chunan Yu, Yuanqi Hu, Jing Li, Tao Xu, Runlong Cao, Lanyun Zhu, Ying Zang, Yong Zhang, Zejian Li, et al. Img2cad: Conditioned 3d cad model generation from single image with structured visual geometry.arXiv preprint arXiv:2410.03417, 2024

work page arXiv 2024
[5]

An investigation on utilizing large language model for industrial computer-aided design automation.Procedia CIRP, 128:221–226, 2024

Haoxuan Deng, Samir Khan, and John Ahmet Erkoyuncu. An investigation on utilizing large language model for industrial computer-aided design automation.Procedia CIRP, 128:221–226, 2024

work page 2024
[6]

What Sets Proficient and Expert Users Apart? Results of a Computer-Aided Design Experiment.Journal of Mechanical Design, 146(1):011401, 10 2023

Yuanzhe Deng, James Chen, and Alison Olechowski. What Sets Proficient and Expert Users Apart? Results of a Computer-Aided Design Experiment.Journal of Mechanical Design, 146(1):011401, 10 2023

work page 2023
[7]

Cadquery: A python parametric cad scripting framework

CADQuery Developers. Cadquery: A python parametric cad scripting framework. https: //cadquery.readthedocs.io/, 2024. Accessed: 2024-10-22

work page 2024
[8]

Complexgen: Cad reconstruction by b-rep chain complex generation.ACM Transactions on Graphics (TOG), 2022

Haoxiang Guo, Shilin Liu, Hao Pan, Yang Liu, Xin Tong, and Baining Guo. Complexgen: Cad reconstruction by b-rep chain complex generation.ACM Transactions on Graphics (TOG), 2022

work page 2022
[9]

Unveiling the mist over 3d vision-language understanding: Object-centric evaluation with chain-of-analysis.arXiv preprint arXiv:2503.22420, 2025

Jiangyong Huang, Baoxiong Jia, Yan Wang, Ziyu Zhu, Xiongkun Linghu, Qing Li, Song- Chun Zhu, and Siyuan Huang. Unveiling the mist over 3d vision-language understanding: Object-centric evaluation with chain-of-analysis.arXiv preprint arXiv:2503.22420, 2025

work page arXiv 2025
[10]

Solidgen: An autoregressive model for direct b-rep synthesis

Pradeep Kumar Jayaraman, Joseph G Lambourne, Nishkrit Desai, Karl DD Willis, Aditya Sanghi, and Nigel JW Morris. Solidgen: An autoregressive model for direct b-rep synthesis. Transaction in Machine Learning Research, 2023

work page 2023
[11]

A survey of reinforcement learning from human feedback, 2024

Timo Kaufmann, Paul Weng, Viktor Bengs, and Eyke Hüllermeier. A survey of reinforcement learning from human feedback, 2024

work page 2024
[12]

Cad-signet: Cad language inference from point clouds using layer-wise sketch instance guided attention

Mohammad Sadil Khan, Elona Dupont, Sk Aziz Ali, Kseniya Cherenkova, Anis Kacem, and Djamila Aouada. Cad-signet: Cad language inference from point clouds using layer-wise sketch instance guided attention. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4713–4722, 2024

work page 2024
[13]

Text2cad: Generating sequential cad designs from beginner-to-expert level text prompts.Advances in Neural Information Processing Systems, 37:7552–7579, 2024

Mohammad Sadil Khan, Sankalp Sinha, Talha Uddin, Didier Stricker, Sk Aziz Ali, and Muham- mad Zeshan Afzal. Text2cad: Generating sequential cad designs from beginner-to-expert level text prompts.Advances in Neural Information Processing Systems, 37:7552–7579, 2024

work page 2024
[14]

Abc: A big cad model dataset for geometric deep learning

Sebastian Koch, Albert Matveev, Zhongshi Jiang, Francis Williams, Alexey Artemov, Evgeny Burnaev, Marc Alexa, Denis Zorin, and Daniele Panozzo. Abc: A big cad model dataset for geometric deep learning. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9601–9611, 2019. 15

work page 2019
[15]

Cad-llama: Leveraging large language models for computer-aided design parametric 3d model generation.arXiv preprint arXiv:2505.04481, 2025

Jiahao Li, Weijian Ma, Xueyang Li, Yunzhong Lou, Guichun Zhou, and Xiangdong Zhou. Cad-llama: Leveraging large language models for computer-aided design parametric 3d model generation.arXiv preprint arXiv:2505.04481, 2025

work page arXiv 2025
[16]

Cad translator: An effective drive for text to 3d parametric computer-aided design generative modeling

Xueyang Li, Yu Song, Yunzhong Lou, and Xiangdong Zhou. Cad translator: An effective drive for text to 3d parametric computer-aided design generative modeling. InProceedings of the 32nd ACM International Conference on Multimedia, pages 8461–8470, 2024

work page 2024
[17]

Hola: B-rep generation using a holistic latent representation.arXiv preprint arXiv:2504.14257, 2025

Yilin Liu, Duoteng Xu, Xingyao Yu, Xiang Xu, Daniel Cohen-Or, Hao Zhang, and Hui Huang. Hola: B-rep generation using a holistic latent representation.arXiv preprint arXiv:2504.14257, 2025

work page arXiv 2025
[18]

Decoupled weight decay regularization

Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. In7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019

work page 2019
[19]

CAD-Assistant: Tool-augmented VLLMs as generic CAD task solvers.arXiv preprint arXiv:2412.13810, 2024

Dimitrios Mallis, Ahmet Serdar Karadeniz, Sebastian Cavada, Danila Rukhovich, Niki Foteinopoulou, Kseniya Cherenkova, Anis Kacem, and Djamila Aouada. Cad-assistant: Tool- augmented vllms as generic cad task solvers?arXiv preprint arXiv:2412.13810, 2024

work page arXiv 2024
[20]

GPT-4 Technical Report

Josh OpenAI, Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Floren- cia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. Gpt-4 technical report.arXiv preprint arXiv:2303.08774, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[21]

Cad system use and engineering performance.IEEE Transactions on Engineering Management, 40(3):274–282, 1993

David Robertson and Thomas J Allen. Cad system use and engineering performance.IEEE Transactions on Engineering Management, 40(3):274–282, 1993

work page 1993
[22]

CAD-Recode: Reverse engineering CAD code from point clouds.arXiv preprint arXiv:2412.14042, 2024

Danila Rukhovich, Elona Dupont, Dimitrios Mallis, Kseniya Cherenkova, Anis Kacem, and Djamila Aouada. Cad-recode: Reverse engineering cad code from point clouds.arXiv preprint arXiv:2412.14042, 2024

work page arXiv 2024
[23]

Proximal Policy Optimization Algorithms

John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms.arXiv preprint arXiv:1707.06347, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[24]

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Xiao Bi, Haowei Zhang, Mingchuan Zhang, YK Li, Y Wu, et al. Deepseekmath: Pushing the limits of mathematical reasoning in open language models.arXiv preprint arXiv:2402.03300, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[25]

HybridFlow: A Flexible and Efficient RLHF Framework

Guangming Sheng, Chi Zhang, Zilingfeng Ye, Xibin Wu, Wang Zhang, Ru Zhang, Yanghua Peng, Haibin Lin, and Chuan Wu. Hybridflow: A flexible and efficient rlhf framework.arXiv preprint arXiv: 2409.19256, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[26]

Large language models for computer-aided design (llm4cad) fine-tuned: Dataset and experiments.Journal of Mechanical Design, pages 1–19, 2025

Yuewan Sun, Xingang Li, and Zhenghui Sha. Large language models for computer-aided design (llm4cad) fine-tuned: Dataset and experiments.Journal of Mechanical Design, pages 1–19, 2025

work page 2025
[27]

LLaMA: Open and Efficient Foundation Language Models

Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timo- thée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, et al. Llama: Open and efficient foundation language models.arXiv preprint arXiv:2302.13971, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[28]

Attention is all you need.Advances in neural information processing systems, 30, 2017

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need.Advances in neural information processing systems, 30, 2017

work page 2017
[29]

arXiv preprint arXiv:2501.19054 , year=

Ruiyu Wang, Yu Yuan, Shizhao Sun, and Jiang Bian. Text-to-cad generation through infusing visual feedback in large language models.arXiv preprint arXiv:2501.19054, 2025

work page arXiv 2025
[30]

Chi, Quoc V Le, and Denny Zhou

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, brian ichter, Fei Xia, Ed H. Chi, Quoc V Le, and Denny Zhou. Chain of thought prompting elicits reasoning in large language models. In Alice H. Oh, Alekh Agarwal, Danielle Belgrave, and Kyunghyun Cho, editors, Advances in Neural Information Processing Systems, 2022. 16

work page 2022
[31]

Fusion 360 gallery: A dataset and environment for programmatic cad construction from human design sequences.ACM Transactions on Graphics (TOG), 40(4):1–24, 2021

Karl DD Willis, Yewen Pu, Jieliang Luo, Hang Chu, Tao Du, Joseph G Lambourne, Armando Solar-Lezama, and Wojciech Matusik. Fusion 360 gallery: A dataset and environment for programmatic cad construction from human design sequences.ACM Transactions on Graphics (TOG), 40(4):1–24, 2021

work page 2021
[32]

Deepcad: A deep generative network for computer- aided design models

Rundi Wu, Chang Xiao, and Changxi Zheng. Deepcad: A deep generative network for computer- aided design models. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 6772–6782, 2021

work page 2021
[33]

Cad-mllm: Unifying multimodality-conditioned cad generation with mllm

Jingwei Xu, Zibo Zhao, Chenyu Wang, Wen Liu, Yi Ma, and Shenghua Gao. Cad-mllm: Unify- ing multimodality-conditioned cad generation with mllm.arXiv preprint arXiv:2411.04954, 2024

work page arXiv 2024
[34]

Brepgen: A b-rep generative diffusion model with structured latent geometry.ACM SIGGRAPH, 2024

Xiang Xu, Joseph G Lambourne, Pradeep Kumar Jayaraman, Zhengqing Wang, Karl DD Willis, and Yasutaka Furukawa. Brepgen: A b-rep generative diffusion model with structured latent geometry.ACM SIGGRAPH, 2024

work page 2024
[35]

Qwen2.5 Technical Report

An Yang, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoran Wei, Huan Lin, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Yang, Jiaxi Yang, Jingren Zhou, Junyang Lin, Kai Dang, Keming Lu, Keqin Bao, Kexin Yang, Le Yu, Mei Li, Mingfeng Xue, Pei Zhang, Qin Zhu, Rui Men, Runji Lin, Tianhao Li, Tingyu X...

work page internal anchor Pith review Pith/arXiv arXiv 2024
[36]

How to enable llm with 3d capacity? a survey of spatial reasoning in llm.arXiv preprint arXiv:2504.05786, 2025

Jirong Zha, Yuxuan Fan, Xiao Yang, Chen Gao, and Xinlei Chen. How to enable llm with 3d capacity? a survey of spatial reasoning in llm.arXiv preprint arXiv:2504.05786, 2025

work page arXiv 2025
[37]

The Point, the Vision and the Text: Does Point Cloud Boost Spatial Reasoning of Large Language Models?, 2025

Weichen Zhang, Ruiying Peng, Chen Gao, Jianjie Fang, Xin Zeng, Kaiyuan Li, Ziyou Wang, Jinqiang Cui, Xin Wang, Xinlei Chen, et al. The point, the vision and the text: Does point cloud boost spatial reasoning of large language models?arXiv preprint arXiv:2504.04540, 2025. 17

work page internal anchor Pith review arXiv 2025

[1] [1]

Program Synthesis with Large Language Models

Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, et al. Program synthesis with large language models.arXiv preprint arXiv:2108.07732, 2021

work page internal anchor Pith review Pith/arXiv arXiv 2021

[2] [2]

Query2cad: Generating cad models using natural language queries

Akshay Badagabettu, Sai Sravan Yarlagadda, and Amir Barati Farimani. Query2cad: Generating cad models using natural language queries.arXiv preprint arXiv:2406.00144, 2024

work page arXiv 2024

[3] [3]

Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian...

work page 2021

[4] [4]

Img2cad: Conditioned 3d cad model generation from single image with structured visual geometry.arXiv preprint arXiv:2410.03417, 2024

Tianrun Chen, Chunan Yu, Yuanqi Hu, Jing Li, Tao Xu, Runlong Cao, Lanyun Zhu, Ying Zang, Yong Zhang, Zejian Li, et al. Img2cad: Conditioned 3d cad model generation from single image with structured visual geometry.arXiv preprint arXiv:2410.03417, 2024

work page arXiv 2024

[5] [5]

An investigation on utilizing large language model for industrial computer-aided design automation.Procedia CIRP, 128:221–226, 2024

Haoxuan Deng, Samir Khan, and John Ahmet Erkoyuncu. An investigation on utilizing large language model for industrial computer-aided design automation.Procedia CIRP, 128:221–226, 2024

work page 2024

[6] [6]

What Sets Proficient and Expert Users Apart? Results of a Computer-Aided Design Experiment.Journal of Mechanical Design, 146(1):011401, 10 2023

Yuanzhe Deng, James Chen, and Alison Olechowski. What Sets Proficient and Expert Users Apart? Results of a Computer-Aided Design Experiment.Journal of Mechanical Design, 146(1):011401, 10 2023

work page 2023

[7] [7]

Cadquery: A python parametric cad scripting framework

CADQuery Developers. Cadquery: A python parametric cad scripting framework. https: //cadquery.readthedocs.io/, 2024. Accessed: 2024-10-22

work page 2024

[8] Haoxiang Guo, Shilin Liu, Hao Pan, Yang Liu, Xin Tong, and Baining Guo. Complexgen: Cad reconstruction by b-rep chain complex generation. ACM Transactions on Graphics (TOG), 2022

[9] Jiangyong Huang, Baoxiong Jia, Yan Wang, Ziyu Zhu, Xiongkun Linghu, Qing Li, Song-Chun Zhu, and Siyuan Huang. Unveiling the mist over 3d vision-language understanding: Object-centric evaluation with chain-of-analysis. arXiv preprint arXiv:2503.22420, 2025

[10] Pradeep Kumar Jayaraman, Joseph G Lambourne, Nishkrit Desai, Karl DD Willis, Aditya Sanghi, and Nigel JW Morris. Solidgen: An autoregressive model for direct b-rep synthesis. Transactions on Machine Learning Research, 2023

[11] Timo Kaufmann, Paul Weng, Viktor Bengs, and Eyke Hüllermeier. A survey of reinforcement learning from human feedback, 2024

[12] Mohammad Sadil Khan, Elona Dupont, Sk Aziz Ali, Kseniya Cherenkova, Anis Kacem, and Djamila Aouada. Cad-signet: Cad language inference from point clouds using layer-wise sketch instance guided attention. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4713–4722, 2024

[13] Mohammad Sadil Khan, Sankalp Sinha, Talha Uddin, Didier Stricker, Sk Aziz Ali, and Muhammad Zeshan Afzal. Text2cad: Generating sequential cad designs from beginner-to-expert level text prompts. Advances in Neural Information Processing Systems, 37:7552–7579, 2024

[14] Sebastian Koch, Albert Matveev, Zhongshi Jiang, Francis Williams, Alexey Artemov, Evgeny Burnaev, Marc Alexa, Denis Zorin, and Daniele Panozzo. Abc: A big cad model dataset for geometric deep learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9601–9611, 2019

[15] Jiahao Li, Weijian Ma, Xueyang Li, Yunzhong Lou, Guichun Zhou, and Xiangdong Zhou. Cad-llama: Leveraging large language models for computer-aided design parametric 3d model generation. arXiv preprint arXiv:2505.04481, 2025

[16] Xueyang Li, Yu Song, Yunzhong Lou, and Xiangdong Zhou. Cad translator: An effective drive for text to 3d parametric computer-aided design generative modeling. In Proceedings of the 32nd ACM International Conference on Multimedia, pages 8461–8470, 2024

[17] Yilin Liu, Duoteng Xu, Xingyao Yu, Xiang Xu, Daniel Cohen-Or, Hao Zhang, and Hui Huang. Hola: B-rep generation using a holistic latent representation. arXiv preprint arXiv:2504.14257, 2025

[18] Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019

[19] Dimitrios Mallis, Ahmet Serdar Karadeniz, Sebastian Cavada, Danila Rukhovich, Niki Foteinopoulou, Kseniya Cherenkova, Anis Kacem, and Djamila Aouada. CAD-Assistant: Tool-augmented VLLMs as generic CAD task solvers? arXiv preprint arXiv:2412.13810, 2024

[20] OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023

[21] David Robertson and Thomas J Allen. Cad system use and engineering performance. IEEE Transactions on Engineering Management, 40(3):274–282, 1993

[22] Danila Rukhovich, Elona Dupont, Dimitrios Mallis, Kseniya Cherenkova, Anis Kacem, and Djamila Aouada. CAD-Recode: Reverse engineering CAD code from point clouds. arXiv preprint arXiv:2412.14042, 2024

[23] John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017

[24] Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Xiao Bi, Haowei Zhang, Mingchuan Zhang, YK Li, Y Wu, et al. DeepSeekMath: Pushing the limits of mathematical reasoning in open language models. arXiv preprint arXiv:2402.03300, 2024

[25] Guangming Sheng, Chi Zhang, Zilingfeng Ye, Xibin Wu, Wang Zhang, Ru Zhang, Yanghua Peng, Haibin Lin, and Chuan Wu. HybridFlow: A flexible and efficient RLHF framework. arXiv preprint arXiv:2409.19256, 2024

[26] Yuewan Sun, Xingang Li, and Zhenghui Sha. Large language models for computer-aided design (llm4cad) fine-tuned: Dataset and experiments. Journal of Mechanical Design, pages 1–19, 2025

[27] Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, et al. LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971, 2023

[28] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017

[29] Ruiyu Wang, Yu Yuan, Shizhao Sun, and Jiang Bian. Text-to-cad generation through infusing visual feedback in large language models. arXiv preprint arXiv:2501.19054, 2025

[30] Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed H. Chi, Quoc V Le, and Denny Zhou. Chain of thought prompting elicits reasoning in large language models. In Alice H. Oh, Alekh Agarwal, Danielle Belgrave, and Kyunghyun Cho, editors, Advances in Neural Information Processing Systems, 2022

[31] Karl DD Willis, Yewen Pu, Jieliang Luo, Hang Chu, Tao Du, Joseph G Lambourne, Armando Solar-Lezama, and Wojciech Matusik. Fusion 360 gallery: A dataset and environment for programmatic cad construction from human design sequences. ACM Transactions on Graphics (TOG), 40(4):1–24, 2021

[32] Rundi Wu, Chang Xiao, and Changxi Zheng. Deepcad: A deep generative network for computer-aided design models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 6772–6782, 2021

[33] Jingwei Xu, Zibo Zhao, Chenyu Wang, Wen Liu, Yi Ma, and Shenghua Gao. Cad-mllm: Unifying multimodality-conditioned cad generation with mllm. arXiv preprint arXiv:2411.04954, 2024

[34] Xiang Xu, Joseph G Lambourne, Pradeep Kumar Jayaraman, Zhengqing Wang, Karl DD Willis, and Yasutaka Furukawa. Brepgen: A b-rep generative diffusion model with structured latent geometry. ACM SIGGRAPH, 2024

[35] Qwen2.5 Technical Report. An Yang, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoran Wei, Huan Lin, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Yang, Jiaxi Yang, Jingren Zhou, Junyang Lin, Kai Dang, Keming Lu, Keqin Bao, Kexin Yang, Le Yu, Mei Li, Mingfeng Xue, Pei Zhang, Qin Zhu, Rui Men, Runji Lin, Tianhao Li, Tingyu X... arXiv, 2024

[36] Jirong Zha, Yuxuan Fan, Xiao Yang, Chen Gao, and Xinlei Chen. How to enable llm with 3d capacity? a survey of spatial reasoning in llm. arXiv preprint arXiv:2504.05786, 2025

[37] Weichen Zhang, Ruiying Peng, Chen Gao, Jianjie Fang, Xin Zeng, Kaiyuan Li, Ziyou Wang, Jinqiang Cui, Xin Wang, Xinlei Chen, et al. The Point, the Vision and the Text: Does Point Cloud Boost Spatial Reasoning of Large Language Models? arXiv preprint arXiv:2504.04540, 2025