Computer-Aided Design Generation by Cascaded Discrete Diffusion Model
Recognition: 3 theorem links (Lean theorems; detailed below)
Pith reviewed 2026-05-08 17:48 UTC · model grok-4.3
The pith
A cascaded discrete diffusion model generates valid CAD designs by operating directly on command and parameter tokens with tailored transition matrices.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The model splits CAD generation into a command diffusion stage that uses an absorbing-state transition matrix and a parameter diffusion stage, conditioned on the commands, that uses a Gaussian kernel for coordinates, a scale-invariant kernel for dimensions, and a prior-preserving kernel for booleans. The cascade recovers valid token sequences via a Transformer-based command denoiser and a parameter network with local self-attention and cross-attention, surpassing autoregressive and continuous diffusion baselines on unconditional generation metrics while enabling effective conditional control.
What carries the argument
Cascaded discrete diffusion with an absorbing-state transition matrix for commands and type-specific kernels for parameters, reversed by a Transformer encoder for commands and an attention-equipped parameter network that injects command conditioning.
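To make the command stage concrete, here is a minimal sketch of a D3PM-style absorbing-state forward process, assuming a vocabulary whose last index is a reserved [MASK] symbol and a constant per-step absorption probability; the variable names and hyperparameters are illustrative, not the paper's.

```python
import numpy as np

def absorbing_transition_matrix(K: int, beta_t: float) -> np.ndarray:
    """Q_t[i, j] = prob. of token i becoming token j in one forward step:
    keep the token with prob 1 - beta_t, absorb into [MASK] with prob beta_t."""
    Q = (1.0 - beta_t) * np.eye(K)
    Q[:, K - 1] += beta_t        # all corrupted mass lands on the [MASK] symbol
    Q[K - 1, :] = 0.0
    Q[K - 1, K - 1] = 1.0        # [MASK] is absorbing: it never escapes
    return Q

def forward_step(tokens: np.ndarray, Q: np.ndarray, rng) -> np.ndarray:
    """Sample x_t ~ Cat(Q[x_{t-1}, :]) independently at each position."""
    probs = Q[tokens]                         # (seq_len, K) categorical rows
    cum = probs.cumsum(axis=-1)
    u = rng.random((tokens.shape[0], 1))
    return (u < cum).argmax(axis=-1)          # inverse-CDF sampling

rng = np.random.default_rng(0)
K = 8                                         # hypothetical command vocabulary
x0 = rng.integers(0, K - 1, size=16)          # clean command sequence
xt = x0
for t in range(100):                          # T = 100 steps, as quoted below
    xt = forward_step(xt, absorbing_transition_matrix(K, beta_t=0.03), rng)
print((xt == K - 1).mean())                   # fraction absorbed into [MASK]
```

By the final step nearly every token has collapsed to [MASK]; the reverse network is trained to undo exactly this corruption.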
If this is right
- CAD generation avoids mapping perturbed embeddings to invalid symbols by staying inside the discrete token space throughout the diffusion process.
- Conditional CAD tasks become controllable because parameter diffusion is explicitly conditioned on the recovered command sequence.
- Heterogeneous parameter types in CAD can be handled without a single isotropic noise model by using separate transition kernels for each attribute class.
- Transformer-based and attention-based denoisers suffice to invert the discrete corruption process for both commands and parameters.
Where Pith is reading between the lines
- The same cascaded discrete diffusion pattern could be tested on other token-structured design domains such as floor plans or circuit layouts.
- If the transition matrices prove stable across command vocabularies, the method could reduce reliance on post-generation validation pipelines in production CAD tools.
- Extending the conditioning mechanism to include user-specified constraints like volume or material type would be a direct next experiment.
Load-bearing premise
The chosen transition matrices and the two denoising networks will map diffused states back to semantically valid CAD commands and parameters without extra correction steps.
What would settle it
Generate a large set of unconditional samples on the DeepCAD test split and measure whether the fraction of invalid or low-scoring models exceeds that of the best autoregressive and continuous diffusion baselines.
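A sketch of that settling experiment follows. `is_valid_cad` and `stub_sampler` are hypothetical placeholders, not the paper's pipeline; a real run would substitute the trained cascaded model and each autoregressive or continuous diffusion baseline in place of the stub.

```python
import random

def is_valid_cad(commands, params):
    # Placeholder validity check. A real check would rebuild the B-rep in a
    # CAD kernel and flag mismatched command-parameter pairs or
    # out-of-range values.
    return len(commands) == len(params)

def stub_sampler():
    # Placeholder for a trained generator's unconditional sampling call.
    n = random.randint(1, 8)
    extra = random.randint(0, 1)      # occasionally emit a length mismatch
    return ([random.randrange(6) for _ in range(n)],
            [random.random() for _ in range(n + extra)])

def invalid_rate(sampler, n_samples=10_000):
    bad = sum(not is_valid_cad(*sampler()) for _ in range(n_samples))
    return bad / n_samples

# The claim is settled if the cascaded model's invalid rate (together with
# distributional metrics such as coverage and MMD on the DeepCAD test
# split) beats every baseline's.
print(f"invalid rate: {invalid_rate(stub_sampler):.3f}")
```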
Original abstract
Recent deep learning approaches seek to automate CAD creation by representing a model as a sequence of discrete commands and parameters, and then generating them using autoregressive models or continuous diffusion operating in Euclidean embedding space. However, continuous diffusion perturbs representations in a continuous Euclidean domain that does not reflect the inherently discrete and heterogeneous nature of CAD tokens, often producing perturbed representations that map to semantically invalid symbols. To overcome this limitation, we propose a cascaded discrete diffusion framework for CAD generation, which consists of a command diffusion for generating CAD commands and a parameter diffusion conditioned on CAD commands. Unlike isotropic Gaussian perturbation, the forward process of our approach operates directly over categorical token distributions using delicate transition matrices. For commands, we adopt an absorbing-state transition matrix that progressively corrupts tokens to a designated symbol; for parameters, we introduce specific transition matrices tailored to heterogeneous attributes: a Gaussian kernel for coordinate continuity, a scale-invariant kernel for dimensional values, and a prior-preserving kernel for boolean attributes. The reverse process is achieved by two denoising networks: a Transformer-based encoder for command recovery, and a parameter network with extra local self-attention for command-level interaction and cross-attention for conditional injection. Experiments on the DeepCAD dataset show that the proposed approach surpasses existing autoregressive and continuous diffusion models on unconditional generation metrics, while qualitative results validate effective controllability in conditional generation tasks. Source codes will be released.
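To make the three parameter kernels concrete, a speculative sketch follows. The abstract names the kernels but not their formulas: the Gaussian and prior-preserving forms below are plausible constructions, and the scale-invariant form adapts the expression quoted in the theorem-link section further down, switched to a row-stochastic convention. The values of mu, sigma, and the mixing weight alpha_t are illustrative, not the paper's.

```python
import numpy as np

def gaussian_kernel(K, alpha_t, sigma=2.0):
    """Coordinates: corruption favors nearby bins (discretized Gaussian)."""
    i, j = np.meshgrid(np.arange(K), np.arange(K), indexing="ij")
    W = np.exp(-((i - j) ** 2) / (2.0 * sigma ** 2))
    W /= W.sum(axis=1, keepdims=True)
    return alpha_t * np.eye(K) + (1.0 - alpha_t) * W

def scale_invariant_kernel(K, alpha_t, mu=50.0):
    """Dimensions: corruption magnitude scales with the value, via the
    relative distance (i - j) / (i + j); indices start at 1 to avoid 0/0."""
    i, j = np.meshgrid(np.arange(1, K + 1), np.arange(1, K + 1), indexing="ij")
    W = np.exp(-mu * ((i - j) / (i + j)) ** 2)
    W /= W.sum(axis=1, keepdims=True)  # row-stochastic variant of the quoted form
    return alpha_t * np.eye(K) + (1.0 - alpha_t) * W

def prior_preserving_kernel(prior, alpha_t):
    """Booleans: corruption drifts toward the empirical marginal prior."""
    K = len(prior)
    return alpha_t * np.eye(K) + (1.0 - alpha_t) * np.tile(prior, (K, 1))

for Q in (gaussian_kernel(64, 0.9),
          scale_invariant_kernel(64, 0.9),
          prior_preserving_kernel(np.array([0.7, 0.3]), 0.9)):
    assert np.allclose(Q.sum(axis=1), 1.0)   # every row is a distribution
```

The design point is that each attribute class gets a corruption geometry matched to its semantics, instead of one isotropic noise model for all token types.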
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a cascaded discrete diffusion framework for CAD generation, consisting of a command-level diffusion process that uses an absorbing-state transition matrix and a parameter-level diffusion process conditioned on commands, with specialized forward kernels (Gaussian for coordinates, scale-invariant for dimensions, prior-preserving for booleans) and two denoising networks (a Transformer encoder for commands; a local self-attention plus cross-attention network for parameters). It claims that this approach outperforms autoregressive and continuous diffusion baselines on unconditional generation metrics on the DeepCAD dataset while providing effective controllability in conditional tasks.
Significance. If the empirical results and validity guarantees hold, the work would be significant for CAD automation by aligning the diffusion process more closely with the discrete, heterogeneous structure of CAD tokens, potentially yielding higher rates of semantically valid outputs and improved conditional control compared to continuous embeddings.
Major comments (3)
- §4 (Experiments): The central claim of surpassing existing models on DeepCAD unconditional generation metrics is stated without any numerical scores, specific metrics (e.g., validity, coverage, or MMD), baseline details, dataset splits, ablation studies, or error bars, rendering the superiority assertion unverifiable from the provided text.
- §3.2 (Transition matrices and reverse process): No quantitative evaluation is reported on the rate at which the cascaded reverse networks produce semantically invalid CAD tokens (e.g., mismatched command-parameter pairs or out-of-range values), which directly bears on whether the custom kernels and denoisers map back to valid sequences without implicit post-hoc filtering.
- §3.1 (Cascaded conditioning): The parameter diffusion is conditioned on recovered commands, but the manuscript provides no analysis of error propagation from command-level denoising failures to parameter validity, leaving open whether the pipeline remains robust when command recovery is imperfect.
Minor comments (2)
- The phrase 'delicate transition matrices' in the abstract and §3 is imprecise; explicit matrix definitions or pseudocode for each kernel should appear at first use.
- Notation for the two denoising networks (e.g., variable names for the Transformer encoder versus the parameter network) is introduced inconsistently across sections and would benefit from a unified table of symbols.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and will revise the manuscript to incorporate additional quantitative details and analyses as suggested.
Point-by-point responses
- Referee: §4 (Experiments): The central claim of surpassing existing models on DeepCAD unconditional generation metrics is stated without any numerical scores, specific metrics (e.g., validity, coverage, or MMD), baseline details, dataset splits, ablation studies, or error bars, rendering the superiority assertion unverifiable from the provided text.
  Authors: We agree that the current presentation of results lacks sufficient numerical detail to allow verification. In the revised manuscript, we will expand §4 with a table of specific metrics including validity, coverage, and MMD scores; explicit comparisons to the autoregressive and continuous diffusion baselines; dataset split information; ablation studies; and error bars from multiple runs. Revision: yes.
- Referee: §3.2 (Transition matrices and reverse process): No quantitative evaluation is reported on the rate at which the cascaded reverse networks produce semantically invalid CAD tokens (e.g., mismatched command-parameter pairs or out-of-range values), which directly bears on whether the custom kernels and denoisers map back to valid sequences without implicit post-hoc filtering.
  Authors: We acknowledge the value of reporting explicit validity rates. We will add quantitative results in the revised version measuring the percentage of generated sequences that contain mismatched command-parameter pairs or out-of-range values, thereby demonstrating that the custom kernels produce valid outputs directly. Revision: yes.
- Referee: §3.1 (Cascaded conditioning): The parameter diffusion is conditioned on recovered commands, but the manuscript provides no analysis of error propagation from command-level denoising failures to parameter validity, leaving open whether the pipeline remains robust when command recovery is imperfect.
  Authors: We agree this robustness analysis is missing. In the revision we will include experiments that quantify error propagation, for example by measuring parameter validity when command recovery is intentionally degraded or imperfect. Revision: yes.
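As a sketch of the error-propagation probe promised in that last response: corrupt the recovered command sequence at increasing rates before running the parameter stage, and track how parameter validity degrades. `denoise_params` and `validate` are injected callables standing in for the paper's parameter network and a validity checker; none of these names come from the authors' code.

```python
import random

def corrupt_commands(commands, rate, vocab_size=6):
    """Flip each command token to a random one with probability `rate`,
    simulating imperfect command recovery."""
    return [random.randrange(vocab_size) if random.random() < rate else c
            for c in commands]

def probe_error_propagation(denoise_params, validate, commands,
                            rates=(0.0, 0.1, 0.2, 0.5)):
    """Return {corruption rate: parameter validity score} for the injected
    callables: `denoise_params` runs the command-conditioned parameter
    stage, `validate` scores the resulting command-parameter pairs."""
    results = {}
    for rate in rates:
        noisy = corrupt_commands(commands, rate)
        params = denoise_params(noisy)
        results[rate] = validate(noisy, params)
    return results
```

A flat validity curve across rates would indicate the parameter stage is robust to command-recovery errors; a steep drop would confirm the referee's concern.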
Circularity Check
No circularity: independent modeling choice evaluated on external dataset
Full rationale
The paper proposes a cascaded discrete diffusion framework consisting of command diffusion with an absorbing-state transition matrix and parameter diffusion with tailored kernels (Gaussian for coordinates, scale-invariant for dimensions, prior-preserving for booleans), implemented via two denoising networks. Performance is claimed via experiments on the external DeepCAD dataset comparing against autoregressive and continuous diffusion baselines. No equations, fitted parameters, or self-citations are presented that reduce the reported metrics or validity claims to quantities defined by construction from the inputs. The transition matrices and network architectures are presented as deliberate, independent design choices rather than derived from or equivalent to the evaluation results.
Axiom & Free-Parameter Ledger
Axioms (1)
- domain assumption: Discrete diffusion forward processes defined by custom transition matrices can be reversed by neural networks to recover valid categorical sequences.
Lean theorems connected to this paper
- IndisputableMonolith/Cost (J(x) = ½(x + x⁻¹) − 1, ratio-symmetric cost) · washburn_uniqueness_aczel · tag: unclear
  unclear: Relation between the paper passage and the cited Recognition theorem.
  Paper passage: "a scale-invariant kernel ... (Q^scale_t)_{ij} = (1−α_t) exp[−μ((i−j)/(i+j))²] / Σ_k exp[−μ((k−j)/(k+j))²] + α_t δ_{ij}" (a numeric check of this expression appears after this list)
- IndisputableMonolith (8-tick period from 2^D = 8) · DimensionForcing / 8-tick · tag: unclear
  unclear: Relation between the paper passage and the cited Recognition theorem.
  Paper passage: "the total number of diffusion steps T is fixed to 100"
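A quick numeric check of the quoted scale-invariant expression, under illustrative values of mu, alpha_t, and the vocabulary size (none taken from the paper): because the denominator sums over k, each column of Q^scale_t is a probability distribution.

```python
import numpy as np

K, mu, alpha_t = 64, 50.0, 0.3             # illustrative values only
i, j = np.meshgrid(np.arange(1, K + 1), np.arange(1, K + 1), indexing="ij")
W = np.exp(-mu * ((i - j) / (i + j)) ** 2)
W /= W.sum(axis=0, keepdims=True)          # Σ_k exp[−μ((k−j)/(k+j))²], j fixed
Q = (1.0 - alpha_t) * W + alpha_t * np.eye(K)
assert np.allclose(Q.sum(axis=0), 1.0)     # each column is a distribution
```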
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] X. Xu, J. Lambourne, P. Jayaraman, Z. Wang, K. Willis, and Y. Furukawa, "BrepGen: A B-rep generative diffusion model with structured latent geometry," ACM Transactions on Graphics, vol. 43, no. 4, pp. 1–14, 2024.
- [2] J. Wu, Y. Wang, X. Yue, X. Ma, J. Guo, D. Zhou, W. Ouyang, and S. Tang, "CMT: A cascade MAR with topology predictor for multimodal conditional CAD generation," in IEEE International Conference on Computer Vision, 2025, pp. 7014–7024.
- [3] X. Xu, P. K. Jayaraman, J. G. Lambourne, K. D. Willis, and Y. Furukawa, "Hierarchical neural coding for controllable CAD model generation," in International Conference on Machine Learning. PMLR, 2023, pp. 38443–38461.
- [4] D. Qi, C. Wang, J. Xu, T. Chu, Z. Zhao, W. Liu, W. Ding, Y. Ma, and S. Gao, "Pointer-CAD: Unifying B-rep and command sequences via pointer-based edges & faces selection," arXiv preprint arXiv:2603.04337, 2026.
- [5] T. Chen, C. Yu, Y. Hu, J. Li, T. Xu, R. Cao, L. Zhu, Y. Zang, Y. Zhang, Z. Li et al., "Img2CAD: Conditioned 3-D CAD model generation from single image with structured visual geometry," IEEE Transactions on Industrial Informatics, 2025.
- [6] C. Zhang, G. Zhou, H. Yang, Z. Xiao, and X. Yang, "View-based 3-D CAD model retrieval with deep residual networks," IEEE Transactions on Industrial Informatics, vol. 16, no. 4, pp. 2335–2345, 2019.
- [7] H. Wang, M. Zhao, Y. Wang, W. Quan, and D.-M. Yan, "VQ-CAD: Computer-aided design model generation with vector quantized diffusion," Computer Aided Geometric Design, vol. 111, p. 102327, 2024.
- [8] R. Wu, C. Xiao, and C. Zheng, "DeepCAD: A deep generative network for computer-aided design models," in IEEE International Conference on Computer Vision, 2021, pp. 6772–6782.
- [9] X. Xu, K. D. Willis, J. G. Lambourne, C.-Y. Cheng, P. K. Jayaraman, and Y. Furukawa, "SkexGen: Autoregressive generation of CAD construction sequences with disentangled codebooks," in International Conference on Machine Learning, 2022, pp. 24698–24724.
- [10] A. Zhang, W. Jia, Q. Zou, Y. Feng, X. Wei, and Y. Zhang, "DiffusionCAD: Controllable diffusion model for generating computer-aided design models," IEEE Transactions on Visualization and Computer Graphics, 2025.
- [11] P. Li, W. Zhang, J. Guo, J. Chen, and D.-M. Yan, "Revisiting CAD model generation by learning raster sketch," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 5, 2025, pp. 4869–4877.
- [12] S. Wu, A. H. Khasahmadi, M. Katz, P. K. Jayaraman, Y. Pu, K. Willis, and B. Liu, "CadVLM: Bridging language and vision in the generation of parametric CAD sketches," in European Conference on Computer Vision. Springer, 2024, pp. 368–384.
- [13] X. Wang, L. Wang, H. Wu, G. Xiao, and K. Xu, "Parametric primitive analysis of CAD sketches with vision transformer," IEEE Transactions on Industrial Informatics, vol. 20, no. 10, pp. 12041–12050, 2024.
- [14] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, "High-resolution image synthesis with latent diffusion models," in IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 10684–10695.
- [15] J. Ho, A. Jain, and P. Abbeel, "Denoising diffusion probabilistic models," in Conference on Neural Information Processing Systems, vol. 33, 2020, pp. 6840–6851.
- [16] J. Song, C. Meng, and S. Ermon, "Denoising diffusion implicit models," in International Conference on Learning Representations, 2020.
- [17] H. Pan, W. Pei, X. Li, and Z. He, "Unified conditional image generation for visible-infrared person re-identification," IEEE Transactions on Information Forensics and Security, 2024.
- [18] J. Austin, D. D. Johnson, J. Ho, D. Tarlow, and R. Van Den Berg, "Structured denoising diffusion models in discrete state-spaces," Conference on Neural Information Processing Systems, vol. 34, pp. 17981–17993, 2021.
- [19] S. Gu, D. Chen, J. Bao, F. Wen, B. Zhang, D. Chen, L. Yuan, and B. Guo, "Vector quantized diffusion model for text-to-image synthesis," in IEEE Conference on Computer Vision and Pattern Recognition, 2022, pp. 10696–10706.
- [20] J. Shi, K. Han, Z. Wang, A. Doucet, and M. Titsias, "Simplified and generalized masked diffusion for discrete data," Conference on Neural Information Processing Systems, vol. 37, pp. 103131–103167, 2024.
- [21] S. Sahoo, M. Arriola, Y. Schiff, A. Gokaslan, E. Marroquin, J. Chiu, A. Rush, and V. Kuleshov, "Simple and effective masked diffusion language models," Conference on Neural Information Processing Systems, vol. 37, pp. 130136–130184, 2024.
- [22] A. Lou, C. Meng, and S. Ermon, "Discrete diffusion modeling by estimating the ratios of the data distribution," in International Conference on Machine Learning, 2024, pp. 32819–32848.
- [23] J. Ou, S. Nie, K. Xue, F. Zhu, J. Sun, Z. Li, and C. Li, "Your absorbing discrete diffusion secretly models the conditional distributions of clean data," arXiv preprint arXiv:2406.03736, 2024.
- [24] H. Kong, K. Gong, D. Lian, M. B. Mi, and X. Wang, "Priority-centric human motion generation in discrete latent space," in IEEE International Conference on Computer Vision, 2023, pp. 14806–14816.
- [25] S. Chi, H.-g. Chi, H. Ma, N. Agarwal, F. Siddiqui, K. Ramani, and K. Lee, "M2D2M: Multi-motion generation from text with discrete diffusion models," in Proceedings of the European Conference on Computer Vision. Springer, 2024, pp. 18–36.
- [26] N. Inoue, K. Kikuchi, E. Simo-Serra, M. Otani, and K. Yamaguchi, "LayoutDM: Discrete diffusion model for controllable layout generation," in IEEE Conference on Computer Vision and Pattern Recognition, 2023, pp. 10167–10176.
- [27] J. Zhang, J. Guo, S. Sun, J.-G. Lou, and D. Zhang, "LayoutDiffusion: Improving graphic layout generation by discrete diffusion probabilistic models," in IEEE International Conference on Computer Vision, 2023, pp. 7226–7236.
- [28] Z. He, T. Sun, Q. Tang, K. Wang, X.-J. Huang, and X. Qiu, "DiffusionBERT: Improving generative masked language models with diffusion models," in Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, 2023, pp. 4521–4534.
- [29] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," Advances in Neural Information Processing Systems, vol. 30, 2017.
- [30] H. Guo, S. Liu, H. Pan, Y. Liu, X. Tong, and B. Guo, "ComplexGen: CAD reconstruction by B-rep chain complex generation," ACM Transactions on Graphics, vol. 41, no. 4, pp. 1–18, 2022.
- [31] J. Li, W. Ma, X. Li, Y. Lou, G. Zhou, and X. Zhou, "CAD-Llama: Leveraging large language models for computer-aided design parametric 3D model generation," in IEEE Conference on Computer Vision and Pattern Recognition, 2025, pp. 18563–18573.
- [32] Z. Zhang, S. Sun, W. Wang, D. Cai, and J. Bian, "FlexCAD: Unified and versatile controllable CAD generation with fine-tuned large language models," in International Conference on Learning Representations, 2025.
- [33] A. Dubey, A. Jauhri, A. Pandey, A. Kadian, A. Al-Dahle, A. Letman, A. Mathur, A. Schelten, A. Yang, A. Fan et al., "The Llama 3 herd of models," arXiv preprint arXiv:2407.21783, 2024.
- [34] J. Achiam, S. Adler, S. Agarwal, L. Ahmad, I. Akkaya, F. L. Aleman, D. Almeida, J. Altenschmidt, S. Altman, S. Anadkat et al., "GPT-4 technical report," arXiv preprint arXiv:2303.08774, 2023.
- [35] A. Van Den Oord, O. Vinyals et al., "Neural discrete representation learning," Conference on Neural Information Processing Systems, vol. 30, 2017.
- [36] A. Q. Nichol, P. Dhariwal, A. Ramesh, P. Shyam, P. Mishkin, B. McGrew, I. Sutskever, and M. Chen, "GLIDE: Towards photorealistic image generation and editing with text-guided diffusion models," in International Conference on Machine Learning, 2022, pp. 16784–16804.
- [37] M. Zhang, Z. Cai, L. Pan, F. Hong, X. Guo, L. Yang, and Z. Liu, "MotionDiffuse: Text-driven human motion generation with diffusion model," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 6, pp. 4115–4128, 2024.
- [38] M. S. Khan, E. Dupont, S. A. Ali, K. Cherenkova, A. Kacem, and D. Aouada, "CAD-SIGNet: CAD language inference from point clouds using layer-wise sketch instance guided attention," in IEEE Conference on Computer Vision and Pattern Recognition, 2024, pp. 4713–4722.
- [39] W. Ma, S. Chen, Y. Lou, X. Li, and X. Zhou, "Draw step by step: Reconstructing CAD construction sequences from point clouds via multimodal diffusion," in IEEE Conference on Computer Vision and Pattern Recognition, 2024, pp. 27154–27163.
- [40] E. Dupont, K. Cherenkova, D. Mallis, G. Gusev, A. Kacem, and D. Aouada, "TransCAD: A hierarchical transformer for CAD sequence inference from point clouds," in Proceedings of the European Conference on Computer Vision, 2024, pp. 19–36.
- [41] D. Rukhovich, E. Dupont, D. Mallis, K. Cherenkova, A. Kacem, and D. Aouada, "CAD-Recode: Reverse engineering CAD code from point clouds," in IEEE International Conference on Computer Vision, 2025, pp. 9801–9811.
- [42] C. R. Qi, H. Su, K. Mo, and L. J. Guibas, "PointNet: Deep learning on point sets for 3D classification and segmentation," in IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 652–660.
- [43] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.