PartDiffuser: Part-wise 3D Mesh Generation via Discrete Diffusion
Pith reviewed 2026-05-21 18:32 UTC · model grok-4.3
The pith
PartDiffuser generates 3D meshes from point clouds by autoregressing across semantic parts for global structure while diffusing in parallel inside each part for local details.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
PartDiffuser first performs semantic segmentation on the input mesh or point cloud. It then uses autoregression between parts to maintain global topology, while a parallel discrete diffusion process inside each semantic part reconstructs high-frequency geometric features. Both stages run inside a DiT architecture equipped with a part-aware cross-attention layer that conditions on hierarchical point-cloud geometry to decouple the global and local tasks.
What carries the argument
The part-aware cross-attention mechanism inside the DiT backbone that uses hierarchical point-cloud conditioning to dynamically steer generation and separate global topology control from local detail reconstruction.
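The summary does not say how the cross-attention is made part-aware. One plausible reading, sketched here purely as an assumption, is an attention mask in which each mesh token sees coarse whole-shape conditioning tokens plus the fine conditioning tokens of its own part. The function name `part_aware_mask` and the `global_label` convention are hypothetical.

```python
def part_aware_mask(query_parts, cond_parts, global_label=-1):
    """Return mask[i][j] = True when mesh token i may attend to conditioning token j.

    query_parts[i] is the part id of mesh token i.
    cond_parts[j] is the part id of point-cloud token j, or global_label
    for a coarse token that summarizes the whole shape (an assumed convention).
    """
    return [
        [c == global_label or c == q for c in cond_parts]
        for q in query_parts
    ]


# Three mesh tokens (two in part 0, one in part 1) and four conditioning
# tokens (one global, two from part 0, one from part 1).
query_parts = [0, 0, 1]
cond_parts = [-1, 0, 0, 1]
mask = part_aware_mask(query_parts, cond_parts)
```

Under this reading the global token carries topology information to every part, while the per-part tokens keep local detail from leaking across part boundaries.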
If this is right
- Global structural consistency is achieved through autoregressive ordering of semantic parts rather than full-sequence autoregression.
- High-frequency local details are recovered by parallel discrete diffusion performed independently inside each part.
- Error accumulation across the entire object is limited because diffusion steps remain local to each semantic region.
- Meshes exhibit richer surface detail than current state-of-the-art point-cloud-to-mesh generators while preserving overall topology.
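The control flow implied by these points can be sketched as an outer loop over parts and an inner loop of parallel unmasking steps. This is a minimal illustration under stated assumptions: `toy_denoiser` is a hypothetical random stand-in for the network, the masked (absorbing-state) form of discrete diffusion is assumed, and the even unmasking schedule is a choice made for the example, not something the paper specifies.

```python
import random

MASK = -1  # placeholder id for a token that has not been generated yet


def toy_denoiser(tokens, context, rng, vocab_size=8):
    """Hypothetical stand-in for the network: propose a token id for every position."""
    return [rng.randrange(vocab_size) for _ in tokens]


def generate_part(length, context, steps, rng):
    """Masked discrete diffusion inside one part: reveal positions in parallel over a few steps."""
    tokens = [MASK] * length
    for step in range(steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        proposal = toy_denoiser(tokens, context, rng)
        # Reveal an even share of the remaining positions; the last step reveals the rest.
        remaining_steps = steps - step
        k = -(-len(masked) // remaining_steps)  # ceiling division
        for i in rng.sample(masked, k):
            tokens[i] = proposal[i]
    return tokens


def generate_mesh(part_lengths, steps=4, seed=0):
    """Autoregress over parts; each finished part joins the context for the next one."""
    rng = random.Random(seed)
    context = []
    for length in part_lengths:
        part = generate_part(length, context, steps, rng)
        context.extend(part)
    return context


mesh_tokens = generate_mesh([5, 3, 7])
```

The number of sequential network calls scales with the number of parts times the number of diffusion steps rather than with the total token count, which is where the claimed limit on error accumulation would come from.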
Where Pith is reading between the lines
- The part-wise split could be tested on other conditional generation tasks such as texture synthesis or scene layout where global coherence and local fidelity must both be maintained.
- If segmentation can be made lightweight and online, the framework might support interactive 3D modeling tools that accept partial point clouds.
- The same conditioning hierarchy might improve consistency when extending the model to generate textured meshes or animated sequences.
Load-bearing premise
The method assumes that accurate semantic segmentation of the input can be obtained in advance and that the part-aware cross-attention will successfully prevent boundary artifacts when merging the autoregressive inter-part sequence with the parallel intra-part diffusion.
What would settle it
Quantitative results on a held-out test set of complex objects where the method shows no improvement over prior models on detail-sensitive metrics such as normal consistency or edge sharpness, or visual inspection revealing visible seams or loss of geometry at part boundaries in the generated meshes.
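For concreteness, normal consistency is often computed as the mean absolute cosine between corresponding surface normals. The sketch below assumes the correspondence between normals is already given (in practice it usually comes from nearest-neighbor matching of sampled surface points, which is omitted here); the helper names are illustrative and not taken from the paper.

```python
import math


def face_normal(a, b, c):
    """Unit normal of triangle (a, b, c) from the cross product of two edges."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    n = [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]
    length = math.sqrt(sum(x * x for x in n))
    return [x / length for x in n]


def normal_consistency(normals_a, normals_b):
    """Mean absolute cosine between already-matched unit normals (1.0 is a perfect match)."""
    dots = [
        abs(sum(p * q for p, q in zip(na, nb)))
        for na, nb in zip(normals_a, normals_b)
    ]
    return sum(dots) / len(dots)


flat = face_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))    # lies in the xy-plane
tilted = face_normal((0, 0, 0), (1, 0, 0), (0, 1, 1))  # rotated 45 degrees about x
score = normal_consistency([flat, flat], [flat, tilted])
```

Restricting the same average to faces adjacent to a part boundary would give the boundary-specific variant that the referee report asks for.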
Original abstract
Existing autoregressive (AR) methods for generating artist-designed meshes struggle to balance global structural consistency with high-fidelity local details, and are susceptible to error accumulation. To address this, we propose PartDiffuser, a novel semi-autoregressive diffusion framework for point-cloud-to-mesh generation. The method first performs semantic segmentation on the mesh and then operates in a "part-wise" manner: it employs autoregression between parts to ensure global topology, while utilizing a parallel discrete diffusion process within each semantic part to precisely reconstruct high-frequency geometric features. PartDiffuser is based on the DiT architecture and introduces a part-aware cross-attention mechanism, using point clouds as hierarchical geometric conditioning to dynamically control the generation process, thereby effectively decoupling the global and local generation tasks. Experiments demonstrate that this method significantly outperforms state-of-the-art (SOTA) models in generating 3D meshes with rich detail, exhibiting exceptional detail representation suitable for real-world applications.
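One way to read "semi-autoregressive" is as a block-wise factorization in the style of block diffusion [1]. The notation below is an assumption introduced for illustration: $x^{k}$ denotes the mesh tokens of the $k$-th semantic part, $K$ the number of parts, and $c$ the point-cloud conditioning.

```latex
p_\theta(x \mid c) \;=\; \prod_{k=1}^{K} p_\theta\!\left(x^{k} \,\middle|\, x^{<k},\, c\right)
```

Under this reading, the product over $k$ is the autoregressive part, and each factor $p_\theta(x^{k} \mid x^{<k}, c)$ is modeled by a discrete diffusion process that denoises all tokens of part $k$ in parallel.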
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes PartDiffuser, a semi-autoregressive discrete diffusion framework for point-cloud-to-mesh generation. It first performs semantic segmentation on the input, then applies autoregressive generation across parts to maintain global topology while running parallel discrete diffusion within each part to recover high-frequency details. The DiT architecture is extended with a part-aware cross-attention mechanism that conditions generation on hierarchical point clouds, thereby decoupling global and local tasks. The abstract states that experiments show significant outperformance over SOTA models in detail fidelity for real-world applications.
Significance. If the empirical claims are substantiated, the part-wise decomposition could offer a practical way to reconcile global consistency with local geometric fidelity in conditional mesh generation. The combination of autoregressive inter-part modeling and intra-part discrete diffusion, together with cross-attention conditioning, represents an architectural pattern that may influence subsequent work on structured 3D synthesis.
major comments (2)
- [Abstract] Abstract: the central claim that the method 'significantly outperforms state-of-the-art (SOTA) models in generating 3D meshes with rich detail' is unsupported by any quantitative metrics, baseline comparisons, error measures, or experimental protocol; without these the outperformance assertion cannot be evaluated and is load-bearing for the paper's contribution.
- [Method] Method description (abstract and implied §3): the framework presupposes accurate upfront semantic segmentation and artifact-free boundary handling via part-aware cross-attention, yet no robustness analysis, ablation on segmentation noise, or boundary-specific metrics (e.g., normal consistency or edge error across part interfaces) are reported; these assumptions directly determine whether the global-local decoupling succeeds.
minor comments (2)
- [Abstract] Abstract: the term 'semi-autoregressive' is introduced without a concise definition of how the autoregressive inter-part schedule interacts with the parallel intra-part diffusion steps.
- [Abstract] Abstract: 'hierarchical point clouds' are mentioned as conditioning input but the construction of the hierarchy (number of levels, sampling strategy) is not specified.
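Since the hierarchy construction is unspecified, the following is only a sketch of one common option: nested farthest point sampling, where each coarser level subsamples the previous one. The level sizes and function names are hypothetical, and nothing here is claimed to match the authors' pipeline.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Pick k indices, each time taking the point farthest from those already chosen."""
    chosen = [start]
    dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=lambda i: dist[i])
        chosen.append(nxt)
        # Track each point's distance to its nearest chosen point.
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    return chosen


def build_hierarchy(points, sizes):
    """Return nested index lists, finest first; each level subsamples the previous one."""
    levels = []
    current = list(range(len(points)))
    for k in sizes:
        local = farthest_point_sampling([points[i] for i in current], k)
        current = [current[j] for j in local]
        levels.append(current)
    return levels


points = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
    (1.0, 1.0, 0.0), (0.5, 0.5, 0.0), (0.1, 0.0, 0.0),
]
levels = build_hierarchy(points, sizes=[4, 2])
```

Here the coarse level keeps two opposite corners of the square and the finer level keeps all four corners, so coarse tokens describe overall extent while finer tokens add local coverage.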
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive review of our manuscript. We address each of the major comments below and have made revisions to the manuscript where appropriate to strengthen the presentation of our results and analysis.
Point-by-point responses
- Referee: [Abstract] Abstract: the central claim that the method 'significantly outperforms state-of-the-art (SOTA) models in generating 3D meshes with rich detail' is unsupported by any quantitative metrics, baseline comparisons, error measures, or experimental protocol; without these the outperformance assertion cannot be evaluated and is load-bearing for the paper's contribution.
  Authors: The abstract is intended as a concise overview, and the supporting quantitative evidence, including specific metrics, baseline comparisons, and the experimental setup, is provided in detail in Section 4 of the full manuscript. Nevertheless, we agree that incorporating key quantitative results into the abstract would make the claim more immediately verifiable. We have revised the abstract to include brief references to the performance gains observed in our experiments.
  Revision: yes
- Referee: [Method] Method description (abstract and implied §3): the framework presupposes accurate upfront semantic segmentation and artifact-free boundary handling via part-aware cross-attention, yet no robustness analysis, ablation on segmentation noise, or boundary-specific metrics (e.g., normal consistency or edge error across part interfaces) are reported; these assumptions directly determine whether the global-local decoupling succeeds.
  Authors: We recognize the importance of validating the robustness of the part-wise approach to segmentation inaccuracies. The current work assumes high-quality semantic segmentation as input, consistent with many part-based 3D generation methods. To directly address this concern, we have conducted additional experiments and included an ablation study on segmentation noise levels along with boundary-specific metrics in the revised manuscript.
  Revision: yes
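The rebuttal does not describe how segmentation noise is injected. A simple protocol, sketched here as an assumption, is to reassign each part label to a different part with a fixed probability and then sweep that probability. The function `corrupt_labels` and the chosen rates are hypothetical.

```python
import random


def corrupt_labels(labels, num_parts, noise_rate, seed=0):
    """Reassign each label to a different part id with probability noise_rate.

    Assumes num_parts >= 2 so that a different part id always exists.
    """
    rng = random.Random(seed)
    noisy = []
    for label in labels:
        if rng.random() < noise_rate:
            others = [p for p in range(num_parts) if p != label]
            noisy.append(rng.choice(others))
        else:
            noisy.append(label)
    return noisy


labels = [0, 0, 1, 1, 2, 2, 2, 0]
sweep = {
    rate: corrupt_labels(labels, num_parts=3, noise_rate=rate)
    for rate in (0.0, 0.1, 0.3, 1.0)
}
```

Uniform label flips are the easiest case to implement; noise concentrated near part boundaries would be a closer match to real segmentation errors and could be layered on the same interface.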
Circularity Check
No circularity: architectural combination with independent experimental claims
The paper introduces PartDiffuser as a new semi-autoregressive framework that combines upfront semantic segmentation, autoregressive inter-part generation for global topology, parallel discrete diffusion within parts for local details, and part-aware cross-attention on hierarchical point clouds. No equations, fitted parameters, or derivation steps appear that reduce any claimed prediction or result to the inputs by construction. The abstract and description frame the approach as an original architectural synthesis rather than a self-referential fit or renamed prior result. Central performance claims rest on experimental outperformance rather than load-bearing self-citations or uniqueness theorems imported from the authors' prior work. This is the common case of a self-contained engineering contribution.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "employs autoregression between parts to ensure global topology, while utilizing a parallel discrete diffusion process within each semantic part"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, theorem reality_from_one_distinction
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "part-aware cross-attention mechanism, using point clouds as hierarchical geometric conditioning"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Marianne Arriola, Aaron Gokaslan, Justin T Chiu, Zhihan Yang, Zhixuan Qi, Jiaqi Han, Subham Sekhar Sahoo, and Volodymyr Kuleshov. Block diffusion: Interpolating between autoregressive and diffusion language models. arXiv preprint arXiv:2503.09573, 2025.
- [2] Jacob Austin, Daniel D Johnson, Jonathan Ho, Daniel Tarlow, and Rianne Van Den Berg. Structured denoising diffusion models in discrete state-spaces. Advances in neural information processing systems, 34:17981–17993, 2021.
- [3] Minghao Chen, Roman Shapovalov, Iro Laina, Tom Monnier, Jianyuan Wang, David Novotny, and Andrea Vedaldi. Partgen: Part-level 3d generation and reconstruction with multi-view diffusion models. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 5881–5892, 2025.
- [4] Minghao Chen, Jianyuan Wang, Roman Shapovalov, Tom Monnier, Hyunyoung Jung, Dilin Wang, Rakesh Ranjan, Iro Laina, and Andrea Vedaldi. Autopartgen: Autoregressive 3d part generation and discovery. arXiv preprint arXiv:2507.13346, 2025.
- [5] Sijin Chen, Xin Chen, Anqi Pang, Xianfang Zeng, Wei Cheng, Yijun Fu, Fukun Yin, Billzb Wang, Jingyi Yu, Gang Yu, et al. Meshxl: Neural coordinate field for generative 3d foundation models. Advances in Neural Information Processing Systems, 37:97141–97166, 2024.
- [6] Yiwen Chen, Tong He, Di Huang, Weicai Ye, Sijin Chen, Jiaxiang Tang, Xin Chen, Zhongang Cai, Lei Yang, Gang Yu, et al. Meshanything: Artist-created mesh generation with autoregressive transformers. arXiv preprint arXiv:2406.10163,
- [7] Yiwen Chen, Yikai Wang, Yihao Luo, Zhengyi Wang, Zilong Chen, Jun Zhu, Chi Zhang, and Guosheng Lin. Meshanything v2: Artist-created mesh generation with adjacent mesh tokenization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 13922–13931, 2025.
- [8] Matt Deitke, Dustin Schwenk, Jordi Salvador, Luca Weihs, Oscar Michel, Eli VanderBilt, Ludwig Schmidt, Kiana Ehsani, Aniruddha Kembhavi, and Ali Farhadi. Objaverse: A universe of annotated 3d objects. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 13142–13153, 2023.
- [9] Prafulla Dhariwal and Alexander Nichol. Diffusion models beat gans on image synthesis. Advances in neural information processing systems, 34:8780–8794, 2021.
- [10] Huan Fu, Bowen Cai, Lin Gao, Ling-Xiao Zhang, Jiaming Wang, Cao Li, Qixun Zeng, Chengyue Sun, Rongfei Jia, Binqiang Zhao, et al. 3d-front: 3d furnished rooms with layouts and semantics. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 10933–10942,
- [11] Shrey Goel, Vishrut Thoutam, Edgar Mariano Marroquin, Aaron Gokaslan, Arash Firouzbakht, Sophia Vincoff, Volodymyr Kuleshov, Huong T Kratochvil, and Pranam Chatterjee. Memdlm: De novo membrane protein design with masked discrete diffusion protein language models. arXiv preprint arXiv:2410.16735, 2024.
- [12] Shansan Gong, Shivam Agarwal, Yizhe Zhang, Jiacheng Ye, Lin Zheng, Mukai Li, Chenxin An, Peilin Zhao, Wei Bi, Jiawei Han, et al. Scaling diffusion language models via adaptation from autoregressive models. arXiv preprint arXiv:2410.17891, 2024.
- [13] Shansan Gong, Ruixiang Zhang, Huangjie Zheng, Jiatao Gu, Navdeep Jaitly, Lingpeng Kong, and Yizhe Zhang. Diffucoder: Understanding and improving masked diffusion models for code generation. arXiv preprint arXiv:2506.20639,
- [14] Zekun Hao, David W Romero, Tsung-Yi Lin, and Ming-Yu Liu. Meshtron: High-fidelity, artist-like 3d mesh generation at scale. arXiv preprint arXiv:2412.09548, 2024.
- [15] Jonathan Ho and Tim Salimans. Classifier-free diffusion guidance. arXiv preprint arXiv:2207.12598, 2022.
- [16] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in neural information processing systems, 33:6840–6851, 2020.
- [17] Mukul Khanna, Yongsen Mao, Hanxiao Jiang, Sanjay Haresh, Brennan Shacklett, Dhruv Batra, Alexander Clegg, Eric Undersander, Angel X Chang, and Manolis Savva. Habitat synthetic scenes dataset (hssd-200): An analysis of 3d scene scale and realism tradeoffs for objectgoal navigation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern...
- [18] Samar Khanna, Siddhant Kharbanda, Shufan Li, Harshit Varma, Eric Wang, Sawyer Birnbaum, Ziyang Luo, Yanis Miraoui, Akash Palrecha, Stefano Ermon, et al. Mercury: Ultra-fast language models based on diffusion. arXiv preprint arXiv:2506.17298, 2025.
- [19] Xiang Li, John Thickstun, Ishaan Gulrajani, Percy S Liang, and Tatsunori B Hashimoto. Diffusion-lm improves controllable text generation. Advances in neural information processing systems, 35:4328–4343, 2022.
- [20] Yuchen Lin, Chenguo Lin, Panwang Pan, Honglei Yan, Yiqiang Feng, Yadong Mu, and Katerina Fragkiadaki. Partcrafter: Structured 3d mesh generation via compositional latent diffusion transformers. arXiv preprint arXiv:2506.05573, 2025.
- [21] Stefan Lionar, Jiabin Liang, and Gim Hee Lee. Treemeshgpt: Artistic mesh generation with autoregressive tree sequencing. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 26608–26617, 2025.
- [22] Anran Liu, Cheng Lin, Yuan Liu, Xiaoxiao Long, Zhiyang Dou, Hao-Xiang Guo, Ping Luo, and Wenping Wang. Part123: part-aware 3d reconstruction from a single-view image. In ACM SIGGRAPH 2024 Conference Papers, pages 1–12, 2024.
- [23] Minghua Liu, Mikaela Angelina Uy, Donglai Xiang, Hao Su, Sanja Fidler, Nicholas Sharp, and Jun Gao. Partfield: Learning 3d feature fields for part segmentation and beyond. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9704–9715, 2025.
- [24] Xiaoxiao Long, Yuan-Chen Guo, Cheng Lin, Yuan Liu, Zhiyang Dou, Lingjie Liu, Yuexin Ma, Song-Hai Zhang, Marc Habermann, Christian Theobalt, et al. Wonder3d: Single image to 3d using cross-domain diffusion. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9970–9980, 2024.
- [25] William E Lorensen and Harvey E Cline. Marching cubes: A high resolution 3d surface construction algorithm. In Seminal graphics: pioneering efforts that shaped the field, pages 347–353. 1998.
- [26] Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1):99–106, 2021.
- [27] Shen Nie, Fengqi Zhu, Zebin You, Xiaolu Zhang, Jingyang Ou, Jun Hu, Jun Zhou, Yankai Lin, Ji-Rong Wen, and Chongxuan Li. Large language diffusion models. arXiv preprint arXiv:2502.09992, 2025.
- [28] Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. Deepsdf: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 165–174, 2019.
- [29] William Peebles and Saining Xie. Scalable diffusion models with transformers. In Proceedings of the IEEE/CVF international conference on computer vision, pages 4195–4205,
- [30] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 10684–10695, 2022.
- [31] Yawar Siddiqui, Antonio Alliegro, Alexey Artemov, Tatiana Tommasi, Daniele Sirigatti, Vladislav Rosov, Angela Dai, and Matthias Nießner. Meshgpt: Generating triangle meshes with decoder-only transformers. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 19615–19625, 2024.
- [32] Kaiyu Song, Hanjiang Lai, Yaqing Zhang, Chuangjian Cai, Yan Pan Kun Yue, and Jian Yin. Topology sculptor, shape refiner: Discrete diffusion model for high-fidelity 3d meshes generation. arXiv preprint arXiv:2510.21264, 2025.
- [33] Jiaxiang Tang, Ruijie Lu, Zhaoshuo Li, Zekun Hao, Xuan Li, Fangyin Wei, Shuran Song, Gang Zeng, Ming-Yu Liu, and Tsung-Yi Lin. Efficient part-level 3d object generation via dual volume packing. arXiv preprint arXiv:2506.09980,
- [34] Xinyou Wang, Zaixiang Zheng, Fei Ye, Dongyu Xue, Shujian Huang, and Quanquan Gu. Dplm-2: A multimodal diffusion protein language model. arXiv preprint arXiv:2410.13782, 2024.
- [35] Zhengyi Wang, Jonathan Lorraine, Yikai Wang, Hang Su, Jun Zhu, Sanja Fidler, and Xiaohui Zeng. Llama-mesh: Unifying 3d mesh generation with language models. arXiv preprint arXiv:2411.09595, 2024.
- [36] Haohan Weng, Zibo Zhao, Biwen Lei, Xianghui Yang, Jian Liu, Zeqiang Lai, Zhuo Chen, Yuhong Liu, Jie Jiang, Chunchao Guo, et al. Scaling mesh generation via compressive tokenization. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 11093–11103, 2025.
- [37] Chengyue Wu, Hao Zhang, Shuchen Xue, Zhijian Liu, Shizhe Diao, Ligeng Zhu, Ping Luo, Song Han, and Enze Xie. Fast-dllm: Training-free acceleration of diffusion llm by enabling kv cache and parallel decoding. arXiv preprint arXiv:2505.22618, 2025.
- [38] Jianfeng Xiang, Zelong Lv, Sicheng Xu, Yu Deng, Ruicheng Wang, Bowen Zhang, Dong Chen, Xin Tong, and Jiaolong Yang. Structured 3d latents for scalable and versatile 3d generation. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 21469–21480, 2025.
- [39] Han Yan, Yang Li, Zhennan Wu, Shenzhou Chen, Weixuan Sun, Taizhang Shang, Weizhe Liu, Tian Chen, Xiaqiang Dai, Chao Ma, et al. Frankenstein: Generating semantic-compositional 3d scenes in one tri-plane. In SIGGRAPH Asia 2024 Conference Papers, pages 1–11, 2024.
- [40] Han Yan, Mingrui Zhang, Yang Li, Chao Ma, and Pan Ji. Phycage: Physically plausible compositional 3d asset generation from a single image. arXiv preprint arXiv:2411.18548,
- [41] Yunhan Yang, Yuan-Chen Guo, Yukun Huang, Zi-Xin Zou, Zhipeng Yu, Yangguang Li, Yan-Pei Cao, and Xihui Liu. Holopart: Generative 3d part amodal segmentation. arXiv preprint arXiv:2504.07943, 2025.
- [42] Jiacheng Ye, Zhihui Xie, Lin Zheng, Jiahui Gao, Zirui Wu, Xin Jiang, Zhenguo Li, and Lingpeng Kong. Dream 7b: Diffusion large language models. arXiv preprint arXiv:2508.15487, 2025.
- [43] Ruowen Zhao, Junliang Ye, Zhengyi Wang, Guangce Liu, Yiwen Chen, Yikai Wang, and Jun Zhu. Deepmesh: Auto-regressive artist-mesh creation with reinforcement learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 10612–10623, 2025.
- [44] Zibo Zhao, Wen Liu, Xin Chen, Xianfang Zeng, Rui Wang, Pei Cheng, Bin Fu, Tao Chen, Gang Yu, and Shenghua Gao. Michelangelo: Conditional 3d shape generation based on shape-image-text aligned latent representation. Advances in neural information processing systems, 36:73969–73982,
Supplementary Material
A. Dataset Construction
As a supplement to the dataset introduction in the main text, we provide a detailed description of the dataset construction process. We utilize Objaverse [8] and 3D-Front [10] as our primary data sources. The data preprocessing pipeline co...