Recognition: no theorem link
From Visual Synthesis to Interactive Worlds: Toward Production-Ready 3D Asset Generation
Pith reviewed 2026-05-12 01:50 UTC · model grok-4.3
The pith
Organizing 3D generative methods by asset tiers and production stages reveals which outputs meet engine-ready standards for interactive use.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The survey establishes that, despite rapid progress in generative modeling, a persistent gap remains between current outputs and the production-ready standard required by interactive applications, and that a two-dimensional taxonomy organized around asset tiers and the full production lifecycle provides the clearest way to assess which methods produce assets directly usable in downstream engines and simulation platforms.
What carries the argument
The two-dimensional taxonomy that crosses three asset tiers (general objects, characters, scenes) with the vertical production lifecycle stages, from data foundations and geometry synthesis through topology optimization, UV unwrapping, PBR appearance, rigging, and scene assembly.
Load-bearing premise
That organizing the literature around asset tiers and production lifecycle stages accurately identifies which methods produce assets that satisfy engine-level constraints without further processing.
What would settle it
A broad empirical check showing that outputs from a majority of recent generative methods already satisfy topology, UV parameterization, PBR materials, skeletal rigging, and physics-aware scene constraints when imported directly into standard real-time engines.
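The settling criterion above reduces to a simple aggregate: audit each method's outputs against every engine-level constraint, then ask whether a strict majority passes all of them. A minimal sketch, with hypothetical method names and audit outcomes (placeholders, not measurements from the survey):

```python
# Engine-level constraints named in the settling criterion.
CHECKS = ("topology", "uv_param", "pbr_materials", "rigging", "physics_layout")

# Hypothetical per-method audit outcomes -- illustrative placeholders only.
audits = {
    "method_a": dict(topology=True, uv_param=True, pbr_materials=False,
                     rigging=False, physics_layout=False),
    "method_b": dict(topology=True, uv_param=False, pbr_materials=False,
                     rigging=False, physics_layout=False),
    "method_c": dict(topology=True, uv_param=True, pbr_materials=True,
                     rigging=True, physics_layout=True),
}

def majority_engine_ready(audits):
    """True if a strict majority of methods pass every engine-level check."""
    fully_ready = sum(all(a[c] for c in CHECKS) for a in audits.values())
    return 2 * fully_ready > len(audits)
```

Under these placeholder outcomes only one of three methods is fully engine-ready, so the gap claim would stand; a result of True across a broad, real sample is what would overturn it.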
Original abstract
Three-dimensional content generation has progressed from producing isolated, visually plausible shapes to constructing structured assets that can be deployed in real-time interactive environments. This trajectory is driven by converging demands from game development, embodied AI, world simulation, digital twins, and spatial computing, all of which require 3D content that goes beyond surface appearance to satisfy engine-level constraints on topology, UV parameterization, physically based materials, skeletal rigging, and physics-aware scene layout. Despite rapid advances in generative modeling, a persistent gap separates the outputs of current methods from the production-ready standard expected by interactive applications. This survey addresses that gap by organizing the literature around the asset production pipeline rather than algorithmic families. Along the horizontal axis we distinguish three asset tiers, namely general objects, characters, and scenes, while the vertical axis traces each tier through the full production lifecycle from data foundations and geometry synthesis through topology optimization, UV unwrapping, PBR appearance, rigging, and scene assembly. Through this two-dimensional taxonomy we assess not only what current methods can generate but whether their outputs are directly usable in downstream engines and simulation platforms. We further consolidate evaluation metrics and protocols that span geometric fidelity, appearance quality, asset usability, and scene-level physical plausibility. The survey concludes by identifying open challenges in data quality, generation controllability, end-to-end assetization, and physically grounded generation, and by situating production-ready 3D content as foundational infrastructure for emerging interactive world models and embodied intelligent systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a literature survey on 3D asset generation. It organizes prior work via a two-dimensional taxonomy with horizontal axis of asset tiers (general objects, characters, scenes) and vertical axis of the production lifecycle (data foundations, geometry synthesis, topology optimization, UV unwrapping, PBR appearance, rigging, scene assembly). The central claim is that rapid advances in generative modeling have not closed a persistent gap to production-ready assets satisfying engine constraints on topology, UVs, materials, rigging, and physics layout; the survey assesses current methods' direct usability in downstream engines, consolidates metrics spanning geometric fidelity to scene-level physical plausibility, and identifies open challenges in data quality, controllability, end-to-end assetization, and physically grounded generation.
Significance. If the taxonomy accurately reflects usability, the survey supplies a useful organizing framework that shifts emphasis from algorithmic families to pipeline requirements. This can guide research toward assets deployable in games, simulation, digital twins, and embodied AI, while the consolidated metrics and challenge list provide concrete directions for closing the identified production gap.
Major comments (1)
- [Taxonomy description and usability assessment (abstract and main taxonomy sections)] The assessment that current methods fail to produce directly usable assets (central to the gap claim and taxonomy evaluation) rests on qualitative categorization of the cited literature rather than direct empirical validation. No engine-level tests (e.g., importing outputs into Unity/Unreal to measure failure rates from non-manifold edges, invalid UV seams, or missing skeletal hierarchies) are described, which risks overstating or understating the gap by missing constraints not explicit in paper abstracts or results sections.
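Part of the audit the referee describes can be screened without launching an engine at all. The sketch below is our own illustration, not part of the survey: the check names and the toy tetrahedron are assumptions, and a real audit would run on exported assets imported into Unity or Unreal.

```python
from collections import Counter

def audit_mesh(faces, uvs=None, bones=None):
    """Toy engine-readiness screen over a raw triangle list.

    faces: list of (i, j, k) vertex-index triangles.
    uvs:   optional per-vertex UV coordinates.
    bones: optional skeletal hierarchy (any truthy container).
    """
    # In a closed manifold triangle mesh, every edge is shared by exactly
    # two faces; >2 means a non-manifold edge, 1 means an open boundary.
    edge_counts = Counter(
        tuple(sorted(e))
        for (a, b, c) in faces
        for e in ((a, b), (b, c), (c, a))
    )
    non_manifold = [e for e, n in edge_counts.items() if n > 2]
    boundary = [e for e, n in edge_counts.items() if n == 1]
    return {
        "manifold_edges": not non_manifold,  # no edge shared by >2 faces
        "watertight": not boundary,          # no open boundary edges
        "has_uvs": uvs is not None,          # UV parameterization present
        "has_skeleton": bool(bones),         # skeletal hierarchy present
    }

# A tetrahedron: closed and manifold, but with no UVs or rig --
# it would render, yet fail the asset-usability checks.
tet = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
report = audit_mesh(tet)
```

Passing such a screen is necessary but far from sufficient for engine readiness; it only shows the kind of automated evidence that would ground the gap claim empirically rather than by qualitative categorization.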
Simulated Author's Rebuttal
We thank the referee for the constructive review and for recognizing the survey's potential to guide research toward production-ready 3D assets. We address the major comment below, providing an honest account of our methodology as a literature survey and describing the revisions we will make.
Point-by-point responses
-
Referee: The assessment that current methods fail to produce directly usable assets (central to the gap claim and taxonomy evaluation) rests on qualitative categorization of the cited literature rather than direct empirical validation. No engine-level tests (e.g., importing outputs into Unity/Unreal to measure failure rates from non-manifold edges, invalid UV seams, or missing skeletal hierarchies) are described, which risks overstating or understating the gap by missing constraints not explicit in paper abstracts or results sections.
Authors: We agree that the usability assessment in the taxonomy is derived from qualitative synthesis of the capabilities, limitations, and output characteristics reported across the cited papers, rather than from new direct empirical tests such as importing assets into Unity or Unreal to measure specific failure rates. As this is a survey, performing comprehensive engine-level validation on dozens of methods would require re-implementation, standardized testing protocols, and resources that fall outside the scope of a literature review. We believe the central gap claim remains supported by the consistent absence of production features (e.g., guaranteed manifold topology, valid UV parameterization, complete skeletal hierarchies) in the documented outputs. To improve transparency, we will revise the taxonomy sections and add a brief limitations discussion explicitly noting the reliance on published results and the possibility that some engine-specific constraints may not be fully captured in the literature. This will clarify the methodology without altering the survey's conclusions or taxonomy structure.
Revision: yes
Circularity Check
No circularity: survey taxonomy synthesizes external literature without self-referential derivations
full rationale
The paper is a literature survey that proposes a 2D taxonomy (asset tiers × production lifecycle stages) to organize prior work on 3D generation. It makes no new quantitative predictions, fits no parameters to data, and advances no equations or uniqueness theorems. The central claim of a 'persistent gap' to production-ready assets is supported by qualitative mapping of cited external papers rather than any reduction to the authors' own inputs or self-citations. No load-bearing step reduces by construction to a fitted value or prior self-citation; the taxonomy is an organizational framework whose validity rests on the accuracy of the cited literature, not on internal self-definition.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
[1] Ben Poole, Ajay Jain, Jonathan T. Barron, and Ben Mildenhall. Dreamfusion: Text-to-3d using 2d diffusion. In International Conference on Learning Representations (ICLR), 2023.
[2] Chen-Hsuan Lin, Jun Gao, Luming Tang, Towaki Takikawa, Xiaohui Zeng, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, and Tsung-Yi Lin. Magic3d: High-resolution text-to-3d content creation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 300–309, 2023.
[3] Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, and Jun Zhu. ProlificDreamer: High-fidelity and diverse text-to-3d generation with variational score distillation. In Advances in Neural Information Processing Systems (NeurIPS), 2023.
[4] Zibo Zhao, Wen Liu, Xin Chen, Xianfang Zeng, Rui Wang, Pei Cheng, Bin Fu, Tao Chen, Gang Yu, and Shenghua Gao. Michelangelo: Conditional 3d shape generation based on shape-image-text aligned latent representation. In Advances in Neural Information Processing Systems (NeurIPS), 2023.
[5] Longwen Zhang, Ziyu Wang, Qixuan Zhang, Qiwei Qiu, Anqi Pang, Haoran Jiang, Wei Yang, Lan Xu, and Jingyi Yu. Clay: A controllable large-scale generative model for creating high-quality 3d assets. ACM Trans. Graph., 43(4):1–20, 2024.
[6] Zibo Zhao, Zeqiang Lai, Qingxiang Lin, Yunfei Zhao, Haolin Liu, Shuhui Yang, Yifei Feng, Mingxin Yang, Sheng Zhang, Xianghui Yang, et al. Hunyuan3d 2.0: Scaling diffusion models for high resolution textured 3d assets generation. arXiv preprint arXiv:2501.12202, 2025.
[7] Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, and Hao Tan. LRM: Large reconstruction model for single image to 3d. In International Conference on Learning Representations (ICLR), 2024.
[8] Kai Zhang, Sai Bi, Hao Tan, Yuanbo Xiangli, Nanxuan Zhao, Kalyan Sunkavalli, and Zexiang Xu. Gs-lrm: Large reconstruction model for 3d gaussian splatting. In European Conference on Computer Vision (ECCV), pages 1–19, 2024. doi: 10.1007/978-3-031-72670-5_1.
[9] Yangguang Li, Zi-Xin Zou, Zexiang Liu, Dehu Wang, Yuan Liang, Zhipeng Yu, Xingchao Liu, Yuan-Chen Guo, Ding Liang, Wanli Ouyang, and Yan-Pei Cao. Triposg: High-fidelity 3d shape synthesis using large-scale rectified flow models, 2025. URL https://arxiv.org/abs/2502.06608.
[10] Jake Bruce, Michael D. Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, Edward Hughes, Matthew Lai, Aditi Mavalankar, Richie Steigerwald, Chris Apps, Yusuf Aytar, Sarah Bechtle, Feryal M. P. Behbahani, Stephanie C. Y. Chan, Nicolas Heess, Lucy Gonzalez, Simon Osindero, Sherjil Ozair, Scott E. Reed, Jingwei Zhang, Konrad Zolna, Jeff Clune, Nando de Fr..., 2024.
[11] Dani Valevski, Yaniv Leviathan, Moab Arar, and Shlomi Fruchter. Diffusion models are real-time game engines. In International Conference on Learning Representations (ICLR), 2025.
[12] NVIDIA. Cosmos world foundation model platform for physical AI. arXiv preprint arXiv:2501.03575, 2025.
[13] Alexander Raistrick, Lahav Lipson, Zeyu Ma, Lingjie Mei, Mingzhe Wang, Yiming Zuo, Karhan Kayan, Hongyu Wen, Beining Han, Yihan Wang, et al. Infinite photorealistic worlds using procedural generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12630–12641, 2023.
[14] Matt Deitke, Eli VanderBilt, Alvaro Herrasti, Luca Weihs, Kiana Ehsani, Jordi Salvador, Winson Han, Eric Kolve, Aniruddha Kembhavi, and Roozbeh Mottaghi. Procthor: Large-scale embodied AI using procedural generation. In Advances in Neural Information Processing Systems (NeurIPS), volume 35, pages 5982–5994, 2022.
[15] Yue Yang, Fan-Yun Sun, Luca Weihs, Eli VanderBilt, Alvaro Herrasti, Winson Han, Jiajun Wu, Nick Haber, Ranjay Krishna, Lingjie Liu, Chris Callison-Burch, Mark Yatskar, Aniruddha Kembhavi, and Christopher Clark. Holodeck: Language guided generation of 3d embodied AI environments. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[16] Chunyi Sun, Junlin Han, Weijian Deng, Xinlong Wang, Zishan Qin, and Stephen Gould. 3d-gpt: Procedural 3d modeling with large language models. In Proceedings of the International Conference on 3D Vision (3DV), pages 1253–1263, 2025.
[17] Ziniu Hu, Ahmet Iscen, Aashi Jain, Thomas Kipf, Yisong Yue, David A. Ross, Cordelia Schmid, and Alireza Fathi. Scenecraft: An LLM agent for synthesizing 3d scenes as blender code. In International Conference on Machine Learning (ICML), pages 19252–19282, 2024.
[18] Manolis Savva, Jitendra Malik, Devi Parikh, Dhruv Batra, Abhishek Kadian, Oleksandr Maksymets, Yili Zhao, Erik Wijmans, Bhavana Jain, Julian Straub, Jia Liu, and Vladlen Koltun. Habitat: A platform for embodied AI research. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 9338–9346, 2019.
[19] Viktor Makoviychuk, Lukasz Wawrzyniak, Yunrong Guo, Michelle Lu, Kier Storey, Miles Macklin, David Hoeller, Nikita Rudin, Arthur Allshire, Ankur Handa, and Gavriel State. Isaac gym: High performance gpu-based physics simulation for robot learning. In Advances in Neural Information Processing Systems (NeurIPS), 2021.
[20] Alexander Raistrick, Lingjie Mei, Karhan Kayan, David Yan, Yiming Zuo, Beining Han, Hongyu Wen, Meenal Parakh, Stamatis Alexandropoulos, Lahav Lipson, Zeyu Ma, and Jia Deng. Infinigen indoors: Photorealistic indoor scenes using procedural generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 21783–21..., 2024.
[21] Xianfang Zeng, Xin Chen, Zhongqi Qi, Wen Liu, Zibo Zhao, Zhibin Wang, Bin Fu, Yong Liu, and Gang Yu. Paint3d: Paint anything 3d with lighting-less texture diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4252–4262, 2024.
[22] Dmitry Tochilkin, David Pankratz, Zexiang Liu, Zixuan Huang, Adam Letts, Yangguang Li, Ding Liang, Christian Laforte, Varun Jampani, and Yan-Pei Cao. Triposr: Fast 3d object reconstruction from a single image. arXiv preprint arXiv:2403.02151, 2024.
[23] Yawar Siddiqui, Antonio Alliegro, Alexey Artemov, Tatiana Tommasi, Daniele Sirigatti, Vladislav Rosov, Angela Dai, and Matthias Nießner. Meshgpt: Generating triangle meshes with decoder-only transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 19615–19625, 2024.
[24] Yiwen Chen, Tong He, Di Huang, Weicai Ye, Sijin Chen, Jiaxiang Tang, Zhongang Cai, Lei Yang, Gang Yu, Guosheng Lin, and Chi Zhang. Meshanything: Artist-created mesh generation with autoregressive transformers. In International Conference on Learning Representations (ICLR), 2025.
[25] Ruowen Zhao, Junliang Ye, Zhengyi Wang, Guangce Liu, Yiwen Chen, Yikai Wang, and Jun Zhu. Deepmesh: Auto-regressive artist-mesh creation with reinforcement learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 10612–10623, 2025.
[26] Biwen Lei, Yang Li, Xinhai Liu, Shuhui Yang, Lixin Xu, Jingwei Huang, Ruining Tang, Haohan Weng, Jian Liu, Jing Xu, et al. Hunyuan3d studio: End-to-end AI pipeline for game-ready 3d asset generation. arXiv preprint arXiv:2509.12815, 2025.
[27] Jianfeng Xiang, Zelong Lv, Sicheng Xu, Yu Deng, Ruicheng Wang, Bowen Zhang, Dong Chen, Xin Tong, and Jiaolong Yang. Structured 3d latents for scalable and versatile 3d generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 21469–21480, 2025.
[28] Jian Liu, Chunshi Wang, Song Guo, Haohan Weng, Zhen Zhou, Zhiqi Li, Jiaao Yu, Yiling Zhu, Jing Xu, Biwen Lei, et al. Quadgpt: Native quadrilateral mesh generation with autoregressive models. arXiv preprint arXiv:2509.21420, 2025.
[29] Yang Li, Victor Cheung, Xinhai Liu, Yuguang Chen, Zhongjin Luo, Biwen Lei, Haohan Weng, Zibo Zhao, Jingwei Huang, Zhuo Chen, et al. Auto-regressive surface cutting. arXiv preprint arXiv:2506.18017, 2025.
[30] Xin Yu, Ze Yuan, Yuan-Chen Guo, Ying-Tian Liu, Jianhui Liu, Yangguang Li, Yan-Pei Cao, Ding Liang, and Xiaojuan Qi. Texgen: A generative diffusion model for mesh textures. ACM Trans. Graph., 43(6):213:1–213:14, 2024.
[31] Zebin He, Mingxin Yang, Shuhui Yang, Yixuan Tang, Tao Wang, Kaihao Zhang, Guanying Chen, Yuhong Liu, Jie Jiang, Chunchao Guo, et al. Materialmvp: Illumination-invariant material generation via multi-view pbr diffusion. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 26294–26305, 2025.
[32] Zhan Xu, Yang Zhou, Evangelos Kalogerakis, Chris Landreth, and Karan Singh. Rignet: Neural rigging for articulated characters. ACM Trans. Graph., 39(4):58, 2020.
[33] Albert Mosella-Montoro and Javier Ruiz-Hidalgo. Skinningnet: Two-stream graph convolutional neural network for skinning prediction of synthetic characters. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 18593–18602, 2022.
[34] Jianfeng Xiang, Xiaoxue Chen, Sicheng Xu, Ruicheng Wang, Zelong Lv, Yu Deng, Hongyuan Zhu, Yue Dong, Hao Zhao, Nicholas Jing Yuan, and Jiaolong Yang. Native and compact structured latents for 3d generation. arXiv preprint arXiv:2512.14692, 2025.
[35] Yandan Yang, Baoxiong Jia, Peiyuan Zhi, and Siyuan Huang. Physcene: Physically interactable 3d scene synthesis for embodied AI. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 16262–16272, 2024.
[36] Yinyu Nie, Angela Dai, Xiaoguang Han, and Matthias Nießner. Pose2room: Understanding 3d scenes from human activities. In European Conference on Computer Vision (ECCV), pages 425–443, 2022.
[37] Hongwei Yi, Chun-Hao P. Huang, Shashank Tripathi, Lea Hering, Justus Thies, and Michael J. Black. Mime: Human-aware 3d scene generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12965–12976, 2023.
[38] Xiang Tang, Ruotong Li, and Xiaopeng Fan. Recent advances in 3d object and scene generation: A survey. arXiv preprint arXiv:2504.11734, 2025.
[39] Kaisei Fukaya, Damon Daylamani-Zad, and Harry W. Agius. Intelligent generation of graphical game assets: A conceptual framework and systematic review of the state of the art. ACM Comput. Surv., 57(5):118:1–118:38, 2025.
[40] Lin Geng Foo, Hossein Rahmani, and Jun Liu. AI-generated content (AIGC) for various data modalities: A survey. ACM Comput. Surv., 57(9):243:1–243:66, 2025.
[41] Beichen Wen, Haozhe Xie, Zhaoxi Chen, Fangzhou Hong, and Ziwei Liu. 3d scene generation: A survey. arXiv preprint arXiv:2505.05474, 2025.
[42] Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. Shapenet: An information-rich 3d model repository. arXiv preprint arXiv:1512.03012, 2015.
[43] Matt Deitke, Dustin Schwenk, Jordi Salvador, Luca Weihs, Oscar Michel, Eli VanderBilt, Ludwig Schmidt, Kiana Ehsani, Aniruddha Kembhavi, and Ali Farhadi. Objaverse: A universe of annotated 3d objects. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 13142–13153, 2023.
[44] Matt Deitke, Ruoshi Liu, Matthew Wallingford, Huong Ngo, Oscar Michel, Aditya Kusupati, Alan Fan, Christian Laforte, Vikram Voleti, Samir Yitzhak Gadre, et al. Objaverse-xl: A universe of 10m+ 3d objects. Advances in Neural Information Processing Systems, 36:35799–35813, 2023.
[45] Matthew Loper, Naureen Mahmood, Javier Romero, Gerard Pons-Moll, and Michael J. Black. Smpl: A skinned multi-person linear model. ACM Trans. Graph., 34(6), 2015.
[46] Huan Fu, Bowen Cai, Lin Gao, Ling-Xiao Zhang, Jiaming Wang, Cao Li, Qixun Zeng, Chengyue Sun, Rongfei Jia, Binqiang Zhao, et al. 3d-front: 3d furnished rooms with layouts and semantics. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 10933–10942, 2021.
[47] Despoina Paschalidou, Amlan Kar, Maria Shugrina, Karsten Kreis, Andreas Geiger, and Sanja Fidler. ATISS: Autoregressive transformers for indoor scene synthesis. In Advances in Neural Information Processing Systems (NeurIPS), pages 12013–12026, 2021.
[48] Boyuan Chen, Hanxiao Jiang, Shaowei Liu, Saurabh Gupta, Yunzhu Li, Hao Zhao, and Shenlong Wang. Physgen3d: Crafting a miniature interactive world from a single image. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 6178–6189, 2025.
[49] Jie Deng, Wenhao Chai, Junsheng Huang, Zhonghan Zhao, Qixuan Huang, Mingyan Gao, Jianshu Guo, Shengyu Hao, Wenhao Hu, Jenq-Neng Hwang, et al. Citycraft: A real crafter for 3d city generation. arXiv preprint arXiv:2406.04983, 2024.
[50] Song Tang, Kaiyong Zhao, Lei Wang, Yuliang Li, Xuebo Liu, Junyi Zou, Qiang Wang, and Xiaowen Chu. Unrealllm: Towards highly controllable and interactable 3d scene generation by llm-powered procedural content generation. In Findings of the Association for Computational Linguistics: ACL 2025, pages 19417–19435, 2025. doi: 10.18653/v1/2025.findings-acl.994.
[51] Yulan Guo, Hanyun Wang, Qingyong Hu, Hao Liu, Li Liu, and Mohammed Bennamoun. Deep learning for 3d point clouds: A survey. IEEE Trans. Pattern Anal. Mach. Intell., 43(12):4338–4364, 2020.
[52] Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 652–660, 2017.
[53] Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, and Jianxiong Xiao. 3d shapenets: A deep representation for volumetric shapes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1912–1920, 2015.
[54] Daniel Maturana and Sebastian Scherer. Voxnet: A 3d convolutional neural network for real-time object recognition. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 922–928, 2015.
[55] Maxim Tatarchenko, Alexey Dosovitskiy, and Thomas Brox. Octree generating networks: Efficient convolutional architectures for high-resolution 3d outputs. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 2088–2096, 2017.
[56] Gernot Riegler, Ali Osman Ulusoy, and Andreas Geiger. Octnet: Learning deep 3d representations at high resolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3577–3586, 2017.
[57] Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. Instant neural graphics primitives with a multiresolution hash encoding. ACM Trans. Graph., 41(4):1–15, 2022.
[58] Xuanchi Ren, Jiahui Huang, Xiaohui Zeng, Ken Museth, Sanja Fidler, and Francis Williams. Xcube: Large-scale 3d generative modeling using sparse voxel hierarchies. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4209–4219, 2024.
[59] Eman Ahmed, Alexandre Saint, Abd El Rahman Shabayek, Kseniya Cherenkova, Rig Das, Gleb Gusev, Djamila Aouada, and Bjorn Ottersten. A survey on deep learning advances on different 3d data representations. arXiv preprint arXiv:1808.01462, 2018.
[60] Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodola, Jan Svoboda, and Michael M. Bronstein. Geometric deep learning on graphs and manifolds using mixture model cnns. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 5115–5124, 2017.
[61] Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. In European Conference on Computer Vision (ECCV), pages 405–421, 2020.
[62] Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, and Peter Hedman. Mip-nerf 360: Unbounded anti-aliased neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 5470–5479, 2022.
[63] Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, and Peter Hedman. Zip-nerf: Anti-aliased grid-based neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 19697–19705, 2023.
[64] Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. Deepsdf: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 165–174, 2019.
[65] Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. Occupancy networks: Learning 3d reconstruction in function space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4460–4470, 2019.
[66] Peng Wang, Lingjie Liu, Yuan Liu, Christian Theobalt, Taku Komura, and Wenping Wang. Neus: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. In Advances in Neural Information Processing Systems (NeurIPS), pages 27171–27183, 2021.
[67] Lior Yariv, Jiatao Gu, Yoni Kasten, and Yaron Lipman. Volume rendering of neural implicit surfaces. Advances in Neural Information Processing Systems, 34:4805–4815, 2021.
[68] William E. Lorensen and Harvey E. Cline. Marching cubes: A high resolution 3d surface construction algorithm. In ACM SIGGRAPH Conference Proceedings, pages 163–169, 1987.
[69] Zhijing Shao, Zhaolong Wang, Zhuang Li, Duotun Wang, Xiangru Lin, Yu Zhang, Mingming Fan, and Zeyu Wang. Splattingavatar: Realistic real-time human avatars with mesh-embedded gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1606–1616, 2024.
[70] Cong Wang, Di Kang, Heyi Sun, Shenhan Qian, Zixuan Wang, Linchao Bao, and Song-Hai Zhang. Mega: Hybrid mesh-gaussian head avatar for high-fidelity rendering and head editing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 26274–26284, 2025.
[71] Tianchang Shen, Jun Gao, Kangxue Yin, Ming-Yu Liu, and Sanja Fidler. Deep marching tetrahedra: A hybrid representation for high-resolution 3d shape synthesis. Advances in Neural Information Processing Systems, 34:6087–6101, 2021.
[72] Tianchang Shen, Jacob Munkberg, Jon Hasselgren, Kangxue Yin, Zian Wang, Wenzheng Chen, Zan Gojcic, Sanja Fidler, Nicholas Sharp, and Jun Gao. Flexible isosurface extraction for gradient-based mesh optimization. ACM Trans. Graph., 42(4), 2023. doi: 10.1145/3592430.
[73] Jun Gao, Tianchang Shen, Zian Wang, Wenzheng Chen, Kangxue Yin, Daiqing Li, Or Litany, Zan Gojcic, and Sanja Fidler. Get3d: A generative model of high quality 3d textured shapes learned from images. In Advances in Neural Information Processing Systems (NeurIPS), 2022.
[74] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, George Drettakis, et al. 3d gaussian splatting for real-time radiance field rendering. ACM Trans. Graph., 42(4):139–1, 2023.
[75] Eric R. Chan, Connor Z. Lin, Matthew A. Chan, Koki Nagano, Boxiao Pan, Shalini De Mello, Orazio Gallo, Leonidas J. Guibas, Jonathan Tremblay, Sameh Khamis, et al. Efficient geometry-aware 3d generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 16123–16133, 2022.
[76] Diederik P. Kingma and Max Welling. Auto-encoding variational bayes. In International Conference on Learning Representations (ICLR), 2014.
[77] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. Advances in Neural Information Processing Systems, 27, 2014.
[78] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020.
[79] Haoqiang Fan, Hao Su, and Leonidas J. Guibas. A point set generation network for 3d object reconstruction from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 605–613, 2017.
[80] Nanyang Wang, Yinda Zhang, Zhuwen Li, Yanwei Fu, Wei Liu, and Yu-Gang Jiang. Pixel2mesh: Generating 3d mesh models from single rgb images. In European Conference on Computer Vision (ECCV), pages 52–67, 2018.