Towards High-Fidelity CAD Generation via LLM-Driven Program Generation and Text-Based B-Rep Primitive Grounding
Pith reviewed 2026-05-15 12:08 UTC · model grok-4.3
The pith
FutureCAD generates CAD models by having LLMs write CadQuery scripts that describe B-Rep primitive selections in plain text.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
FutureCAD shows that high-fidelity CAD generation is possible when an LLM produces CadQuery programs containing text queries for B-Rep primitive selection and a dedicated grounding transformer resolves those queries to the actual geometric elements required by subsequent parametric operations.
What carries the argument
The text-based query mechanism paired with the BRepGround transformer, which lets the LLM specify selections via natural language and maps them to the correct B-Rep primitives.
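To make the selection problem concrete, here is a minimal sketch of what "grounding a text query to B-Rep primitives" means on a toy box. Everything in it is an illustrative assumption: the `Edge` class, the `box_edges` helper, the two query strings, and the rule-based `ground` function are hypothetical stand-ins, not the paper's query format or the BRepGround model (which is described as a learned transformer, not a lookup table). In CadQuery itself, a comparable selection on a box is usually written with a string selector such as `"|Z"` for edges parallel to the Z axis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A straight edge of a toy B-Rep, stored as its two endpoints."""
    start: tuple
    end: tuple

    def direction(self):
        return tuple(b - a for a, b in zip(self.start, self.end))

    def is_parallel_to(self, axis):
        # axis index: 0 = X, 1 = Y, 2 = Z
        d = self.direction()
        others_zero = all(c == 0 for i, c in enumerate(d) if i != axis)
        return others_zero and d[axis] != 0


def box_edges(lx, ly, lz):
    """Enumerate the 12 edges of an axis-aligned box with a corner at the origin."""
    size = (lx, ly, lz)
    edges = []
    for axis in range(3):
        u, v = [i for i in range(3) if i != axis]
        for cu in (0, size[u]):
            for cv in (0, size[v]):
                start = [0, 0, 0]
                start[u], start[v] = cu, cv
                end = list(start)
                end[axis] = size[axis]
                edges.append(Edge(tuple(start), tuple(end)))
    return edges


# Hypothetical query vocabulary (assumption): each text query maps to a predicate.
TOY_QUERIES = {
    "vertical edges": lambda e, size: e.is_parallel_to(2),
    "top edges": lambda e, size: e.start[2] == size[2] and e.end[2] == size[2],
}


def ground(query, edges, size):
    """Rule-based stand-in for a grounding model: return the edges a query refers to."""
    predicate = TOY_QUERIES[query]
    return [e for e in edges if predicate(e, size)]


size = (10, 10, 5)
edges = box_edges(*size)
vertical = ground("vertical edges", edges, size)
top = ground("top edges", edges, size)
print(len(edges), len(vertical), len(top))
```

A fillet applied to `vertical` rounds the four upright corners of the box, while the same fillet applied to `top` rounds the rim of the top face. The operation and radius are identical and only the selection differs, which is why a grounding error changes the resulting part.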
If this is right
- Advanced parametric operations such as fillet and chamfer become available inside automatically generated models without manual primitive selection.
- End-to-end text-to-CAD pipelines can now handle the full feature-based workflow used in commercial CAD systems.
- A single trained LLM plus grounding module can be applied to a range of real-world industrial models after fine-tuning on the new dataset.
- Reinforcement learning further improves generalization beyond the supervised fine-tuning stage.
Where Pith is reading between the lines
- The same text-query interface could be reused inside interactive design loops where a user corrects selections by editing the natural-language description rather than the geometry.
- If BRepGround errors remain low, the framework could serve as a backend for multi-step design agents that iteratively refine both the script and the selections.
- Extending the grounding transformer to handle assemblies or multiple bodies would be a direct next step not addressed in the current work.
Load-bearing premise
The LLM reliably produces text queries that correctly identify the intended B-Rep primitives, and BRepGround grounds them accurately enough on complex models that no selection error breaks a later parametric operation.
What would settle it
A generated CAD model in which a fillet or chamfer is applied to the wrong face or edge, producing visibly incorrect or invalid geometry on an industrial test part.
Original abstract
The field of Computer-Aided Design (CAD) generation has made significant progress in recent years. Existing methods typically fall into two separate categories: parametric CAD modeling and direct boundary representation (B-Rep) synthesis. In modern feature-based CAD systems, parametric modeling and B-Rep are inherently intertwined, as advanced parametric operations (e.g., fillet and chamfer) require explicit selection of B-Rep geometric primitives, and the B-Rep itself is derived from parametric operations. Consequently, this paradigm gap remains a critical factor limiting AI-driven CAD modeling for complex industrial product design. This paper presents FutureCAD, a novel text-to-CAD framework that leverages large language models (LLMs) and a B-Rep grounding transformer (BRepGround) for high-fidelity CAD generation. Our method generates executable CadQuery scripts, and introduces a text-based query mechanism that enables the LLM to specify geometric selections via natural language, which BRepGround then grounds to the target primitives. To train our framework, we construct a new dataset comprising real-world CAD models. For the LLM, we apply supervised fine-tuning (SFT) to establish fundamental CAD generation capabilities, followed by reinforcement learning (RL) to improve generalization. Experiments show that FutureCAD achieves state-of-the-art CAD generation performance. Code and dataset are available at: https://github.com/JohanStackk/FutureCAD
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents FutureCAD, a text-to-CAD framework that uses LLMs to generate executable CadQuery scripts for parametric CAD modeling. It introduces a text-based query mechanism for specifying geometric selections on B-Rep primitives, which are then grounded by a dedicated BRepGround transformer. The method is trained via supervised fine-tuning followed by reinforcement learning on a newly constructed dataset of real-world CAD models and claims state-of-the-art CAD generation performance.
Significance. If the grounding accuracy and quantitative performance claims hold, the work could meaningfully close the gap between parametric feature-based modeling and direct B-Rep synthesis, enabling more reliable generation of complex industrial CAD models from natural-language descriptions. The public release of code and dataset would further support reproducibility and follow-on research.
major comments (2)
- [Abstract and Experiments] Abstract and Experiments section: the claim of state-of-the-art performance after SFT and RL is asserted without any quantitative metrics, baseline comparisons, error analysis, dataset statistics, or success rates for script execution. This absence prevents evaluation of the central performance claim.
- [Method (BRepGround)] Method section describing BRepGround: no precision, recall, or downstream execution success rates are reported for multi-primitive grounding on complex models containing dozens of primitives. Without these numbers, it is impossible to assess whether selection errors would invalidate subsequent parametric operations such as fillets and chamfers.
minor comments (1)
- [Introduction] The introduction could benefit from a concrete example illustrating how a single fillet operation depends on explicit B-Rep primitive selection, to clarify the claimed paradigm gap.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our submission. Below we provide point-by-point responses to the major comments and indicate the changes we plan to make in the revised manuscript.
- Referee: [Abstract and Experiments] Abstract and Experiments section: the claim of state-of-the-art performance after SFT and RL is asserted without any quantitative metrics, baseline comparisons, error analysis, dataset statistics, or success rates for script execution. This absence prevents evaluation of the central performance claim.
  Authors: We thank the referee for highlighting this issue. The current version of the manuscript indeed lacks detailed quantitative metrics in both the abstract and the Experiments section to support the state-of-the-art claim. We will revise the paper to include comprehensive quantitative metrics, baseline comparisons, error analysis, dataset statistics, and script execution success rates in the updated abstract and Experiments section.
  Revision: yes
- Referee: [Method (BRepGround)] Method section describing BRepGround: no precision, recall, or downstream execution success rates are reported for multi-primitive grounding on complex models containing dozens of primitives. Without these numbers, it is impossible to assess whether selection errors would invalidate subsequent parametric operations such as fillets and chamfers.
  Authors: We agree with the referee that additional metrics for the BRepGround transformer are necessary, particularly precision and recall for grounding on complex models, as well as downstream success rates for operations like fillets and chamfers. The current manuscript emphasizes overall framework performance, but we will add these specific evaluations in the revised Method and Experiments sections to demonstrate the reliability of the grounding step.
  Revision: yes
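The grounding metrics requested above could be computed per query as a set comparison between the primitives a grounding step returns and the primitives the designer intended. The sketch below is one plausible definition, not the paper's: the function name `grounding_scores`, the string edge identifiers, and the convention of scoring an empty set as 0.0 are all assumptions made for illustration.

```python
def grounding_scores(predicted, target):
    """Set-based precision, recall, and exact match for one grounding query.

    `predicted` and `target` are collections of primitive identifiers.
    Convention (assumption): an empty set on either side scores 0.0.
    """
    predicted, target = set(predicted), set(target)
    hits = len(predicted & target)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(target) if target else 0.0
    exact = predicted == target
    return precision, recall, exact


# Hypothetical example: four intended edges, one of them grounded wrongly.
target = {"e1", "e2", "e3", "e4"}
predicted = {"e1", "e2", "e3", "e7"}
p, r, exact = grounding_scores(predicted, target)
print(p, r, exact)
```

Reporting the exact-match flag alongside precision and recall matters here: a query that is 75% right still applies the fillet or chamfer to one wrong edge, so a downstream operation can fail or produce the wrong shape even when the averaged scores look high.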
Circularity Check
No significant circularity; framework is a new construction trained on external dataset
The paper presents FutureCAD as a text-to-CAD system that generates CadQuery scripts via LLM with a text-based query mechanism grounded by BRepGround. It constructs a new dataset of real-world CAD models, applies SFT followed by RL, and reports experimental SOTA results. No equations, derivations, or self-referential definitions are described that reduce predictions to fitted inputs or prior self-citations. The central claims rest on the new pipeline and external training data rather than any load-bearing self-definition or fitted-input renaming. Self-citations, if present in the full text, are not required for the core construction.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  "FutureCAD employs an LLM to directly generate executable CadQuery scripts... BRepGround takes the transient B-Rep as input and grounds the textual query to the corresponding primitives"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
- HistCAD: A Constraint-Aware Parametric History-Based CAD Representation, Dataset, and Benchmark with Industrial Complexity
  HistCAD provides a constraint-aware parametric CAD representation, a dataset of 170k industrial sequences, and an editability benchmark with metrics ER, cPCSR, and OES to evaluate preservation of design intent.