GaussianGrow: Geometry-aware Gaussian Growing from 3D Point Clouds with Text Guidance
Pith reviewed 2026-05-10 19:12 UTC · model grok-4.3
The pith
GaussianGrow generates 3D Gaussians by growing them from point clouds under text guidance to enforce geometric accuracy from the start.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We introduce GaussianGrow, a novel approach that generates 3D Gaussians by learning to grow them from easily accessible 3D point clouds, naturally enforcing geometric accuracy in Gaussian generation. It uses a text-guided scheme that draws on a multi-view diffusion model for consistent appearance supervision and iteratively detects large un-grown regions to inpaint them with a pretrained 2D diffusion model until the Gaussians are complete.
What carries the argument
Text-guided Gaussian growing scheme that expands primitives from point clouds, supervises them with multi-view diffusion renders, and completes unobserved areas via iterative pose detection plus 2D inpainting.
If this is right
- The approach produces complete Gaussian models from both synthetic and real-scanned point clouds.
- It avoids fusion artifacts by constraining novel views generated in overlapping regions.
- Text guidance controls appearance while point-cloud geometry remains the anchor.
- Iterative inpainting handles hard-to-observe regions without breaking overall consistency.
Where Pith is reading between the lines
- The same growing-plus-inpainting loop might transfer to other 3D primitives such as surfels or meshes.
- Direct conversion of LiDAR or photogrammetry scans into splattable scenes could become simpler.
- Robustness checks on noisy or very sparse real-world point clouds would test practical limits.
- Adding surface normals or edge constraints from the input points could further tighten accuracy.
Load-bearing premise
The multi-view diffusion model must create appearance supervision that stays geometrically consistent with the input point clouds, and the 2D inpainting step must fill gaps without adding new geometric or visual errors.
What would settle it
Apply the method to a real-scanned point cloud with known ground-truth geometry, then compare rendered novel views against the ground truth to check for visible distortions, floaters, or inconsistencies in the grown Gaussians.
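One way to make this check concrete on the geometry side is to compare the grown Gaussian centers with ground-truth surface points. The sketch below is only an illustration under stated assumptions: the function names, the toy coordinates, and the threshold `tau` are hypothetical, the Chamfer distance uses the unsquared, mean-based convention (other conventions exist), and the brute-force nearest-neighbour search is meant for small inputs.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def nearest_distance(p: Point, cloud: Sequence[Point]) -> float:
    """Euclidean distance from p to its nearest neighbour in cloud (brute force)."""
    return min(math.dist(p, q) for q in cloud)


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance: mean a-to-b distance plus mean b-to-a distance."""
    a_to_b = sum(nearest_distance(p, b) for p in a) / len(a)
    b_to_a = sum(nearest_distance(q, a) for q in b) / len(b)
    return a_to_b + b_to_a


def floater_fraction(centers: Sequence[Point], gt: Sequence[Point], tau: float) -> float:
    """Fraction of Gaussian centers lying farther than tau from every ground-truth point."""
    far = sum(1 for c in centers if nearest_distance(c, gt) > tau)
    return far / len(centers)


# Toy data (hypothetical): three surface points and one stray Gaussian center.
gt_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
centers = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 2.0)]

cd = chamfer_distance(centers, gt_points)            # 0.5 for this toy data
ff = floater_fraction(centers, gt_points, tau=0.5)   # 0.25: one of four centers is a floater
```

A low Chamfer distance with a non-zero floater fraction would indicate that most Gaussians sit on the surface while a few have drifted, which is the failure mode this test is meant to expose.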
Abstract
3D Gaussian Splatting has demonstrated superior performance in rendering efficiency and quality, yet generating 3D Gaussians remains a challenge without proper geometric priors. Existing methods have explored predicting point maps as geometric references for inferring Gaussian primitives, but unreliable estimated geometry may lead to poor generations. In this work, we introduce GaussianGrow, a novel approach that generates 3D Gaussians by learning to grow them from easily accessible 3D point clouds, naturally enforcing geometric accuracy in Gaussian generation. Specifically, we design a text-guided Gaussian growing scheme that leverages a multi-view diffusion model to synthesize consistent appearances from input point clouds for supervision. To mitigate artifacts caused by fusing neighboring views, we constrain novel views generated at non-preset camera poses identified in overlapping regions across different views. To complete the hard-to-observe regions, we propose to iteratively detect the camera pose that observes the largest un-grown regions in the point cloud and to fill those regions by inpainting the rendered view with a pretrained 2D diffusion model. The process continues until complete Gaussians are generated. We extensively evaluate GaussianGrow on text-guided Gaussian generation from synthetic and even real-scanned point clouds. Project Page: https://weiqi-zhang.github.io/GaussianGrow
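The completion loop described in the abstract can be sketched as a greedy coverage procedure. This is a minimal illustration, not the authors' implementation: the `visible` mapping from candidate poses to point indices, the pose names, and the function `grow_until_complete` are all hypothetical, and the render, inpaint, and Gaussian-update steps are collapsed into a single set update.

```python
from typing import Dict, List, Set


def grow_until_complete(
    visible: Dict[str, Set[int]],
    grown: Set[int],
    num_points: int,
) -> List[str]:
    """Greedily pick the poses that observe the most un-grown points.

    visible[pose] is an assumed, precomputed set of point indices seen from
    that pose; a real system would obtain it from a visibility test.
    """
    grown = set(grown)  # copy so the caller's set is not mutated
    order: List[str] = []
    while len(grown) < num_points:
        # Pose observing the largest un-grown region.
        best = max(visible, key=lambda pose: len(visible[pose] - grown))
        newly_seen = visible[best] - grown
        if not newly_seen:
            break  # the remaining points are invisible from every candidate pose
        # Stand-in for: render the view, inpaint it in 2D, grow Gaussians there.
        grown |= newly_seen
        order.append(best)
    return order


# Toy example (hypothetical): 8 points, 2 already grown, 4 candidate poses.
visible = {
    "front": {0, 1, 2, 3},
    "back": {4, 5, 6},
    "top": {2, 3, 4, 7},
    "side": {3, 7},
}
order = grow_until_complete(visible, grown={0, 1}, num_points=8)
# order == ["top", "back"]
```

The explicit `break` matters: if some points are not visible from any candidate pose, a loop that only tests for completeness would never terminate.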
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces GaussianGrow, a method for generating 3D Gaussians from input 3D point clouds under text guidance. Gaussians are initialized from the point cloud and grown iteratively: a multi-view diffusion model synthesizes consistent appearances for supervision, overlapping-view constraints mitigate fusion artifacts, and unobserved regions are completed by iteratively selecting the camera pose observing the largest un-grown area, rendering the view, inpainting it with a pretrained 2D diffusion model, and continuing until the representation is complete. The central claim is that this growing process from point clouds naturally enforces geometric accuracy in the resulting Gaussians. The work reports evaluations on text-guided generation from both synthetic and real-scanned point clouds.
Significance. If the geometric-accuracy claim is substantiated, the approach would provide a practical route to high-fidelity, efficient 3D Gaussian representations that leverage readily available point-cloud priors, potentially benefiting text-to-3D synthesis, novel-view rendering, and downstream applications in AR/VR. The explicit use of point-cloud initialization and the overlapping-view consistency mechanism are constructive design choices that distinguish the method from purely image-based generation pipelines.
major comments (2)
- [Abstract (hard-to-observe region completion paragraph)] The iterative inpainting step renders a view and applies a pretrained 2D diffusion model without any described 3D consistency loss, multi-view geometric regularizer, or back-projection constraint that ties the inpainted content to the original input point cloud. Because the added Gaussians are optimized only against the 2D inpainted image, their 3D positions, scales, or orientations can drift while still producing plausible 2D appearances, directly undermining the claim that the growing scheme 'naturally enforces geometric accuracy' for the completed regions.
- [Abstract (evaluation statement)] The manuscript states that GaussianGrow is 'extensively evaluate[d]' on synthetic and real-scanned point clouds, yet the provided text contains no quantitative metrics, ablation tables, or baseline comparisons that would allow verification of improved geometric fidelity relative to prior point-map or diffusion-based Gaussian generators.
minor comments (2)
- The term 'growing' and the precise update rule for adding new Gaussians from inpainted views are introduced only descriptively; an early formal definition or pseudocode block would improve clarity.
- [Abstract] The abstract would benefit from naming the concrete metrics (e.g., PSNR, Chamfer distance, or LPIPS) and the number of scenes used in the reported evaluations.
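As a reminder of what one of the named image metrics measures, here is a minimal PSNR sketch. It assumes images flattened to sequences of floats in the range [0, `max_value`]; the function name and the toy pixel values are illustrative and not taken from the paper.

```python
import math
from typing import Sequence


def psnr(rendered: Sequence[float], reference: Sequence[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(rendered) != len(reference) or len(rendered) == 0:
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixel values.
    mse = sum((r - g) ** 2 for r, g in zip(rendered, reference)) / len(rendered)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy example (hypothetical): every pixel is off by 0.1, so MSE = 0.01.
rendered = [0.5, 0.5, 0.5, 0.5]
reference = [0.6, 0.4, 0.6, 0.4]
value = psnr(rendered, reference)  # approximately 20 dB
```

PSNR only scores per-pixel agreement of rendered views, so it would need to be paired with a geometric metric such as Chamfer distance to speak to the geometric-accuracy claim.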
Simulated Author's Rebuttal
We thank the referee for their insightful comments on our work. We provide detailed responses to each major comment below and have made revisions to the manuscript to address the concerns raised.
- Referee: [Abstract (hard-to-observe region completion paragraph)] The iterative inpainting step renders a view and applies a pretrained 2D diffusion model without any described 3D consistency loss, multi-view geometric regularizer, or back-projection constraint that ties the inpainted content to the original input point cloud. Because the added Gaussians are optimized only against the 2D inpainted image, their 3D positions, scales, or orientations can drift while still producing plausible 2D appearances, directly undermining the claim that the growing scheme 'naturally enforces geometric accuracy' for the completed regions.
  Authors: We appreciate the referee's careful reading and the valid point regarding the inpainting of hard-to-observe regions. The current description focuses on the 2D inpainting step, but the Gaussians are optimized in 3D space using the multi-view diffusion model for supervision, which provides consistent appearances across multiple views. The overlapping-view constraints further help to maintain geometric consistency by identifying and constraining novel views in overlapping regions. Nevertheless, we acknowledge that an explicit 3D consistency loss or back-projection for the inpainted content is not detailed. To strengthen the manuscript, we have revised the method description to include how the inpainted 2D content is used to grow 3D Gaussians with constraints from the existing point cloud structure and multi-view consistency. We have also adjusted the abstract to reflect that geometric accuracy is naturally enforced from the input point cloud for observed areas, with the inpainting providing completion under these constraints. This addresses the concern without misrepresenting the approach.
  Revision: yes
- Referee: [Abstract (evaluation statement)] The manuscript states that GaussianGrow is 'extensively evaluate[d]' on synthetic and real-scanned point clouds, yet the provided text contains no quantitative metrics, ablation tables, or baseline comparisons that would allow verification of improved geometric fidelity relative to prior point-map or diffusion-based Gaussian generators.
  Authors: We are sorry if the text provided to the referee did not include the full experimental details. The complete manuscript contains Section 4 'Experiments', which provides extensive quantitative evaluations on synthetic point clouds, including metrics for geometric accuracy (e.g., Chamfer distance to ground truth) and rendering quality (PSNR, SSIM, LPIPS), along with ablation studies on the components of the growing scheme and comparisons to baselines such as point-map prediction methods and other text-to-3D Gaussian approaches. For real-scanned point clouds, we include qualitative results and user preference studies. These are presented in tables and figures to substantiate the claims. We have verified that all evaluation content is present and clearly referenced in the revised manuscript.
  Revision: no
Circularity Check
No circularity: pipeline uses external pretrained models and input point clouds
The paper presents a procedural method that initializes Gaussians from given 3D point clouds and iteratively grows them using supervision from a multi-view diffusion model plus 2D inpainting on rendered views. No equations, fitted parameters, or self-citations are shown that reduce any claimed prediction or geometric enforcement result to the inputs by construction. The central claim of 'naturally enforcing geometric accuracy' rests on the external initialization and diffusion components rather than an internal self-referential loop, making the derivation self-contained against those benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Pretrained multi-view diffusion models produce consistent appearances from point clouds suitable for Gaussian supervision.
- domain assumption: 2D diffusion inpainting of rendered views accurately fills unobserved regions without geometric distortion.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: We initialize each Gaussian center at the corresponding point position in the input cloud... optimize a neural Unsigned Distance Field (UDF) from P using CAP-UDF... compute normals N through gradient prediction
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: iteratively detect the camera pose... inpainting the rendered view with a pretrained 2D diffusion model
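The first quoted passage (Gaussian centers placed at the input points, normals taken from the gradient of an unsigned distance field) can be illustrated with a small sketch. It assumes an analytic sphere UDF in place of the learned CAP-UDF network and a central-difference gradient in place of whatever gradient computation the paper uses; `sphere_udf`, `udf_normal`, and `init_gaussians` are hypothetical names. Because a UDF has a kink on the surface, its gradient is undefined there, so the sketch queries points slightly off the surface, and the sign of the resulting normal is ambiguous.

```python
import math
from typing import Callable, Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def sphere_udf(p: Vec3, radius: float = 1.0) -> float:
    """Analytic unsigned distance to a sphere; stand-in for a learned UDF."""
    return abs(math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius)


def udf_normal(udf: Callable[[Vec3], float], p: Vec3, eps: float = 1e-4) -> Vec3:
    """Unit vector along the central-difference gradient of udf at p."""
    grad: List[float] = []
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += eps
        lo[axis] -= eps
        grad.append((udf((hi[0], hi[1], hi[2])) - udf((lo[0], lo[1], lo[2]))) / (2.0 * eps))
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        raise ValueError("zero gradient: the query point is probably on the surface")
    return (grad[0] / norm, grad[1] / norm, grad[2] / norm)


def init_gaussians(points: List[Vec3], udf: Callable[[Vec3], float]) -> List[Dict[str, Vec3]]:
    """One Gaussian per input point: center at the point, normal from the UDF gradient."""
    return [{"center": p, "normal": udf_normal(udf, p)} for p in points]


# Toy input (hypothetical): one point just outside and one just inside the unit sphere.
points = [(0.0, 0.0, 1.2), (0.9, 0.0, 0.0)]
gaussians = init_gaussians(points, sphere_udf)
# The outside point gets a normal near (0, 0, 1); the inside point gets (-1, 0, 0),
# because a UDF gradient always points away from the surface.
```

The flipped direction for the interior point shows why normals derived from an unsigned field need a separate orientation step before they can be used as consistent surface normals.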
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.