arxiv: 2601.22858 · v2 · submitted 2026-01-30 · 💻 cs.GR · cs.AI

Learning to Build Shapes by Extrusion

Pith reviewed 2026-05-16 09:44 UTC · model grok-4.3

classification 💻 cs.GR cs.AI
keywords 3D mesh generation · extrusion sequences · large language models · manifold meshes · shape editing · text-based representation · quadrilateral meshes · face loops

The pith

Extrusion sequences let language models generate and edit manifold 3D meshes with any number of faces.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Text Encoded Extrusions, a text representation that turns 3D mesh construction into ordered sequences of face extrusions, and shows how to train a large language model on those sequences. Training begins by breaking quadrilateral meshes into their face loops, then teaching the model to reassemble each mesh by performing the corresponding extrusions one step at a time. Because the output is assembled through valid extrusions rather than free-form polygon lists, the resulting meshes are always manifold and can contain any number of faces. The same sequences can be applied to existing meshes, turning the method into an editing tool as well as a generator. A reader would care because the approach offers a controllable, structurally guaranteed way to create and modify 3D shapes that aligns with how artists actually work.

Core claim

By decomposing a library of quadrilateral meshes into non-self-intersecting face loops and fine-tuning a language model to reconstruct each mesh as a sequence of extrusions, the method produces meshes that are manifold by construction, supports arbitrary face counts, reconstructs given shapes, synthesizes new ones, and adds new features to existing meshes.
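
To make the extrusion step concrete, here is a minimal sketch in Python. It is not the paper's TEE encoding or implementation: it assumes a mesh stored as a vertex list plus quad index tuples, and `extrude_face`, `edge_face_counts`, and the unit-cube data are hypothetical names introduced only for illustration. The sketch shows why a single well-formed extrusion keeps a closed quad mesh closed: one face is removed, and four side quads plus a cap take its place.

```python
from collections import Counter

# Unit cube with consistently wound quads (illustrative data, not from the paper).
CUBE_VERTICES = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
    (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1),
]
CUBE_FACES = [
    (0, 3, 2, 1),  # bottom
    (4, 5, 6, 7),  # top
    (0, 1, 5, 4),  # front
    (1, 2, 6, 5),  # right
    (2, 3, 7, 6),  # back
    (3, 0, 4, 7),  # left
]

def extrude_face(vertices, faces, face_index, offset):
    """Extrude one quad along `offset`; return new (vertices, faces) lists."""
    base = faces[face_index]
    start = len(vertices)
    # Duplicate the quad's four vertices, displaced by the offset vector.
    new_vertices = vertices + [
        tuple(c + d for c, d in zip(vertices[v], offset)) for v in base
    ]
    cap = tuple(range(start, start + 4))
    sides = []
    for i in range(4):
        j = (i + 1) % 4
        # Each side quad reuses the base edge in its original direction,
        # so it pairs correctly with the neighbour of the removed face.
        sides.append((base[i], base[j], cap[j], cap[i]))
    new_faces = faces[:face_index] + faces[face_index + 1:] + sides + [cap]
    return new_vertices, new_faces

def edge_face_counts(faces):
    """Count how many faces touch each undirected edge."""
    counts = Counter()
    for face in faces:
        for i in range(len(face)):
            counts[frozenset((face[i], face[(i + 1) % len(face)]))] += 1
    return counts

# Extrude the top face of the cube one unit upwards.
verts, quads = extrude_face(CUBE_VERTICES, CUBE_FACES, 1, (0, 0, 1))
```

Extruding the top face of the cube gives 12 vertices and 10 quads, and every edge is still shared by exactly two faces. The sketch says nothing about geometric self-intersection, which depends on the offset and the surrounding shape.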

What carries the argument

Text Encoded Extrusions (TEE), a sequence representation of mesh construction as ordered face extrusions.

Load-bearing premise

That extrusion sequences learned from existing quadrilateral meshes will produce valid manifold meshes when applied to novel or edited shapes without further constraints or post-processing.

What would settle it

A single generated or edited mesh containing a non-manifold vertex, edge, or self-intersecting face after the extrusion sequence is applied.
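
Part of this test can be automated. The following is a minimal sketch, assuming faces are given as tuples of vertex indices with consistent winding; `is_manifold` and the cube data are hypothetical names, not code from the paper. It checks the combinatorial conditions only (each edge shared by exactly two faces, each vertex surrounded by a single fan of faces) and does not detect geometric self-intersection, which would need a separate face-against-face intersection test.

```python
from collections import defaultdict

# Closed unit cube as consistently wound quads (illustrative data).
CUBE_FACES = [
    (0, 3, 2, 1), (4, 5, 6, 7), (0, 1, 5, 4),
    (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7),
]

def is_manifold(faces):
    """Return True if a closed polygon mesh is edge- and vertex-manifold."""
    directed = set()
    edge_faces = defaultdict(list)
    vertex_edges = defaultdict(set)
    vertex_faces = defaultdict(set)
    for f, face in enumerate(faces):
        n = len(face)
        if n < 3 or len(set(face)) != n:
            return False  # degenerate face
        for i in range(n):
            a, b = face[i], face[(i + 1) % n]
            if (a, b) in directed:
                return False  # duplicated face or inconsistent winding
            directed.add((a, b))
            edge = frozenset((a, b))
            edge_faces[edge].append(f)
            vertex_edges[a].add(edge)
            vertex_edges[b].add(edge)
            vertex_faces[a].add(f)
    # Edge condition: every edge is shared by exactly two faces.
    if any(len(fs) != 2 for fs in edge_faces.values()):
        return False
    # Vertex condition: the faces around each vertex form one connected fan.
    for v, incident in vertex_faces.items():
        neighbours = defaultdict(set)
        for edge in vertex_edges[v]:
            f, g = edge_faces[edge]
            neighbours[f].add(g)
            neighbours[g].add(f)
        start = next(iter(incident))
        seen, stack = {start}, [start]
        while stack:
            for g in neighbours[stack.pop()]:
                if g not in seen:
                    seen.add(g)
                    stack.append(g)
        if seen != incident:
            return False
    return True
```

On the closed cube the check passes; removing one face, or joining two closed shapes at a single shared vertex, makes it fail.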

Original abstract

We introduce Text Encoded Extrusions (TEE), a text-based representation that expresses mesh construction as sequences of face extrusions rather than polygon lists, and a method for generating 3D meshes from TEE using a large language model (LLM). By learning extrusion sequences that assemble a mesh, similar to the way artists create meshes, our approach naturally supports arbitrary output face counts and produces manifold meshes by design, in contrast to recent mesh generative transformer based models. The learnt extrusion sequences can also be applied to existing meshes - enabling editing in addition to generation. To train our model, we decompose a library of quadrilateral meshes with non-self-intersecting face loops into constituent loops, which can be viewed as their building blocks, and finetune an LLM on the steps for reassembling the quadrilateral meshes by performing a sequence of extrusions. We demonstrate that our representation enables reconstruction, novel shape synthesis, and the addition of new features to existing meshes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces Text Encoded Extrusions (TEE), a text-based representation that encodes quadrilateral mesh construction as ordered sequences of face-loop extrusions. An LLM is fine-tuned on sequences obtained by decomposing a library of quad meshes into non-self-intersecting loops and then reassembling them via extrusion steps. The central claims are that this representation supports arbitrary output face counts, produces manifold meshes by design (in contrast to mesh generative transformers), and enables both novel synthesis and editing of existing meshes by applying learned extrusion sequences.

Significance. If the manifoldness and generalization claims hold, the approach would provide a controllable, topologically consistent alternative to direct mesh generation methods by mimicking artist-like sequential construction. The explicit decomposition into extrusion steps and the ability to apply sequences to existing meshes are potentially useful strengths for editing tasks.

major comments (2)
  1. [Abstract and §3 (method)] The claim that the method 'produces manifold meshes by design' is load-bearing for the central contribution, yet the manuscript provides no validity filter, constrained decoding, or rejection sampling at inference. Because the model is a fine-tuned LLM using next-token sampling, generated sequences can violate the topological constraints (non-self-intersecting loops, consistent winding, closed manifold boundaries) used in the training decomposition, reducing the guarantee to an unverified distributional hope.
  2. [Abstract and §4 (experiments)] No quantitative results, baselines, error metrics, or ablation studies are reported. Without measured reconstruction accuracy, generalization error on novel prompts, or failure rates on invalid sequence generation, it is impossible to evaluate whether the learned distribution stays inside the valid set or whether the editing capability works reliably.
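
One of the safeguards named in the first major comment, rejection sampling behind a validity filter, could look roughly like the sketch below. Everything here is an assumption for illustration: the sequence format (a list of `(face_index, offset)` pairs applied to a six-faced starting mesh), the function names, and the toy filter, which only checks that each step refers to an existing face and has a nonzero offset. A real filter would also need the manifoldness and self-intersection checks the comment refers to, and `sample_fn` stands in for a call to the fine-tuned model.

```python
def indices_in_range(sequence, initial_face_count=6):
    """Toy validity filter: every step must target a face that exists."""
    face_count = initial_face_count
    for face_index, offset in sequence:
        if not 0 <= face_index < face_count or not any(offset):
            return False
        # One quad is removed; four side quads and a cap are added.
        face_count += 4
    return True

def sample_valid_sequence(sample_fn, is_valid, max_attempts=8):
    """Draw sequences until one passes the filter.

    Returns (sequence, attempts_used), or (None, max_attempts) on failure.
    """
    for attempt in range(1, max_attempts + 1):
        candidate = sample_fn()
        if is_valid(candidate):
            return candidate, attempt
    return None, max_attempts
```

Logging `attempts_used` across many prompts would also give the failure rate that the second major comment asks for.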
minor comments (2)
  1. [§2] Notation for TEE sequences should be formalized with a precise grammar or BNF in §2 to clarify how vertex identifications and loop closures are encoded in text.
  2. [§3.1] The decomposition algorithm in §3.1 should specify the exact criteria used to select non-self-intersecting face loops and how ties are broken when multiple valid decompositions exist.
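
For readers unfamiliar with the term, a face loop in a quad mesh is the strip of quads obtained by repeatedly leaving each quad through the edge opposite the one it was entered by. The sketch below traces one such loop. It illustrates the general notion only; it is not the paper's §3.1 decomposition algorithm and makes no claim about its selection criteria or tie-breaking. `trace_face_loop` and the cube data are hypothetical names.

```python
from collections import defaultdict

# Closed unit cube as consistently wound quads (illustrative data).
CUBE_FACES = [
    (0, 3, 2, 1), (4, 5, 6, 7), (0, 1, 5, 4),
    (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7),
]

def trace_face_loop(faces, start_face, start_edge):
    """Walk across opposite edges of quads.

    The walk leaves `start_face` through its local edge `start_edge` (0..3)
    and returns the face indices visited, in order.
    """
    # Map each undirected edge to the (face, local edge index) pairs using it.
    edge_faces = defaultdict(list)
    for f, face in enumerate(faces):
        for i in range(4):
            edge_faces[frozenset((face[i], face[(i + 1) % 4]))].append((f, i))

    loop, face, edge = [start_face], start_face, start_edge
    while True:
        key = frozenset((faces[face][edge], faces[face][(edge + 1) % 4]))
        others = [(g, j) for g, j in edge_faces[key] if g != face]
        if len(others) != 1:
            return loop  # boundary or non-manifold edge: the strip is open
        face, entry = others[0]
        if face == start_face:
            return loop  # back at the start: the strip is closed
        loop.append(face)
        edge = (entry + 2) % 4  # leave through the edge opposite the entry
```

On the cube, the loop that starts at face 0 and leaves through its first edge visits faces 0, 5, 1, 3. A loop that revisits a face before closing would be self-intersecting in the sense used above; detecting that case would need an extra check on repeated entries in the returned list.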

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We agree that the manifoldness claim requires clarification and that quantitative evaluations are needed to substantiate the claims. We will revise the manuscript accordingly, as detailed in the point-by-point responses below.

  1. Referee: [Abstract and §3 (method)] The claim that the method 'produces manifold meshes by design' is load-bearing for the central contribution, yet the manuscript provides no validity filter, constrained decoding, or rejection sampling at inference. Because the model is a fine-tuned LLM using next-token sampling, generated sequences can violate the topological constraints (non-self-intersecting loops, consistent winding, closed manifold boundaries) used in the training decomposition, reducing the guarantee to an unverified distributional hope.

    Authors: We acknowledge that the manuscript does not include constrained decoding, validity filters, or rejection sampling during inference. The 'by design' phrasing in the abstract and §3 refers to the fact that every training sequence is derived from valid, non-self-intersecting face-loop decompositions of manifold quad meshes, and that a correct extrusion step on a manifold base preserves manifoldness. However, we agree that next-token sampling from the fine-tuned LLM provides only a distributional tendency toward validity rather than a hard guarantee. We will revise the abstract and method section to remove the unqualified 'by design' claim, explicitly state the probabilistic nature of the guarantee, and add a new subsection reporting the empirical rate of valid (manifold, non-self-intersecting) sequences generated on held-out prompts.

    Revision: partial

  2. Referee: [Abstract and §4 (experiments)] No quantitative results, baselines, error metrics, or ablation studies are reported. Without measured reconstruction accuracy, generalization error on novel prompts, or failure rates on invalid sequence generation, it is impossible to evaluate whether the learned distribution stays inside the valid set or whether the editing capability works reliably.

    Authors: The current manuscript presents only qualitative demonstrations of reconstruction, novel synthesis, and editing. We agree that this is insufficient to evaluate the claims. In the revised version we will add a quantitative evaluation section that reports: (i) reconstruction accuracy via Chamfer distance and normal consistency on a held-out test set of quad meshes, (ii) the percentage of generated sequences that produce manifold, non-self-intersecting meshes, (iii) success rates for the editing task when applying learned sequences to new base meshes, and (iv) a simple baseline comparison against a direct mesh-generation transformer trained on the same data. We will also include an ablation on training-set size to show how validity rates scale.

    Revision: yes
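
For reference, the Chamfer distance mentioned in item (i) can be sketched in a few lines. This is a minimal pure-Python version under stated assumptions: both shapes are given as lists of sampled surface points, distances are unsquared Euclidean, and the two one-sided means are summed. Published variants differ (squared distances, averaging the two directions), and the rebuttal does not say which one the authors intend; `chamfer_distance` is a hypothetical name.

```python
import math

def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty point sets.

    For each point, find the nearest point in the other set; average those
    distances per direction and add the two directions together.
    """
    def one_sided(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_sided(points_a, points_b) + one_sided(points_b, points_a)
```

The brute-force nearest-neighbour search is quadratic in the number of points, which is acceptable for a sanity check but would normally be replaced by a spatial index when evaluating a full test set.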

Circularity Check

0 steps flagged

No circularity: manifold claim follows from representation and training data, not self-definition or fitted reduction

The paper defines TEE as sequences of face-loop extrusions derived from explicit decomposition of existing quad meshes, then fine-tunes an LLM to reproduce those sequences. The 'manifold by design' statement is a direct consequence of the representation: any sequence obeying the same topological rules used in decomposition yields a manifold mesh. No equations, parameters, or self-citations reduce the output to the input by construction; the central claim rests on observable training data and the inductive bias of next-token prediction on valid examples. The absence of an explicit validity filter at inference is a correctness or robustness concern, not a circularity in the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The approach rests on the domain assumption that quadrilateral meshes admit clean decomposition into non-self-intersecting face loops usable as extrusion primitives, plus the implicit assumption that LLM sequence modeling will capture generalizable construction rules.

axioms (1)
  • domain assumption: Quadrilateral meshes with non-self-intersecting face loops can be decomposed into constituent loops that serve as building blocks for extrusion sequences
    Described in the training data preparation step of the abstract
invented entities (1)
  • Text Encoded Extrusions (TEE) · no independent evidence
    purpose: Text-based representation expressing mesh construction as sequences of face extrusions
    New representation introduced to enable LLM processing of mesh building steps

pith-pipeline@v0.9.0 · 5483 in / 1261 out tokens · 47343 ms · 2026-05-16T09:44:23.474186+00:00 · methodology
