TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction
Pith reviewed 2026-06-29 22:08 UTC · model grok-4.3
The pith
TriSplat reconstructs scenes as oriented triangle meshes in one forward pass for direct use in physics engines.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
TriSplat is a feed-forward reconstruction network that represents scenes with oriented triangle primitives and directly exports simulation-ready mesh scenes from a single forward pass. Given input images, the network predicts local 3D point maps, triangle attributes, camera poses, and optional intrinsics. Rather than regressing triangle orientation as an unconstrained latent variable, the method constructs geometry normals from the predicted point maps, refines them with an image-conditioned normal head, and converts them into stable local frames for triangle parameterization. A mono-normal bootstrap schedule further stabilizes early training, while opacity and blur scheduling progressively sharpen the learned surface representation for direct mesh extraction.
What carries the argument
Oriented triangle primitives whose local frames are derived from point-map normals refined by an image-conditioned normal head.
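To make the mechanism concrete, here is a minimal NumPy sketch of the two geometric steps named above: estimating normals from a point map, then turning each normal into an orthonormal local frame. It is an illustration under stated assumptions, not the paper's implementation. The function names, the finite-difference normal estimate, and the helper-axis frame construction are all hypothetical choices, and the image-conditioned refinement head is omitted.

```python
import numpy as np

def normals_from_point_map(points):
    """points: (H, W, 3) per-pixel 3D points. Returns (H, W, 3) unit normals.

    Assumption: normals are taken as the cross product of finite differences
    along image columns and rows; the paper may use a different estimator.
    """
    du = np.gradient(points, axis=1)  # change along image columns
    dv = np.gradient(points, axis=0)  # change along image rows
    n = np.cross(du, dv)
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.clip(norm, 1e-12, None)

def frame_from_normal(n):
    """Build a right-handed orthonormal frame (t, b, n) for unit normals (..., 3)."""
    # Use the x axis as helper unless the normal is nearly parallel to it,
    # which keeps the cross product below well away from zero.
    helper = np.where(np.abs(n[..., :1]) < 0.9, [1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
    t = np.cross(helper, n)
    t /= np.linalg.norm(t, axis=-1, keepdims=True)
    b = np.cross(n, t)
    return np.stack([t, b, n], axis=-1)  # columns are t, b, n

# Usage: a flat point map lying in the z = 0 plane.
ys, xs = np.mgrid[0:4, 0:5].astype(float)
flat = np.stack([xs, ys, np.zeros_like(xs)], axis=-1)
frames = frame_from_normal(normals_from_point_map(flat))
```

Deriving the frame from the normal, rather than regressing a free rotation, means the orientation is tied to the predicted geometry by construction, which is the stability argument the page attributes to the method.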
If this is right
- Output meshes can be ingested directly by physics engines, collision detectors, and standard rendering pipelines without conversion.
- Reconstructions are more faithful to scene geometry than those produced by Gaussian feed-forward baselines.
- Novel-view rendering quality stays competitive with existing methods on RealEstate10K and DL3DV.
- The network jointly estimates scene structure and camera parameters from sparse pose-free observations.
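The first consequence above rests on triangles being a format that mesh tools already read. As a hedged illustration, the sketch below writes a list of triangles to Wavefront OBJ text using only `v` and `f` records. The function name `write_obj` and the unindexed "triangle soup" layout (three fresh vertices per triangle) are assumptions made for the example; the paper's actual export path is not described on this page.

```python
import io

def write_obj(triangles, stream):
    """Write triangles as an unindexed OBJ triangle soup.

    triangles: sequence of triangles, each a sequence of three (x, y, z) tuples.
    Returns the number of faces written.
    """
    triangles = list(triangles)
    # Vertex records first, three per triangle, with no vertex sharing.
    for tri in triangles:
        for x, y, z in tri:
            stream.write(f"v {x:.6f} {y:.6f} {z:.6f}\n")
    # Face records reference vertices by 1-based index.
    for i in range(len(triangles)):
        base = 3 * i + 1
        stream.write(f"f {base} {base + 1} {base + 2}\n")
    return len(triangles)

# Usage: one triangle in the z = 0 plane.
buf = io.StringIO()
num_faces = write_obj([((0, 0, 0), (1, 0, 0), (0, 1, 0))], buf)
obj_text = buf.getvalue()
```

Note that a file written this way is format-compatible but carries no connectivity: adjacent triangles do not share vertices, so any tool that needs a connected surface would have to weld vertices first.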
Where Pith is reading between the lines
- The triangle output could let robotics systems run physics-based planning directly on images captured in the field.
- The normal-refinement and scheduling techniques might transfer to other explicit primitive representations to improve surface stability.
- If the method scales to larger scenes, it could shorten the pipeline from casual video capture to interactive simulation.
Load-bearing premise
Deriving and refining normals from point maps produces accurate stable triangles that need no post-hoc fixes for mesh extraction.
What would settle it
Running the exported triangles through a physics engine and finding that they require additional cleanup or produce unstable collisions would show the simulation-readiness claim does not hold.
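Short of a full physics run, a cheap first check is whether the exported faces form a closed, edge-manifold surface. The sketch below counts how many faces share each undirected edge. It is a generic mesh sanity check, not something the paper reports; the function names are hypothetical, and it tests edge-manifoldness only, not vertex-manifoldness or self-intersection.

```python
from collections import Counter

def edge_stats(faces):
    """faces: iterable of (i, j, k) vertex-index triples.

    Returns counts of distinct edges, boundary edges (used by exactly one face),
    and non-manifold edges (used by more than two faces).
    """
    edges = Counter()
    for i, j, k in faces:
        for a, b in ((i, j), (j, k), (k, i)):
            edges[(min(a, b), max(a, b))] += 1  # undirected edge key
    boundary = sum(1 for c in edges.values() if c == 1)
    non_manifold = sum(1 for c in edges.values() if c > 2)
    return {"edges": len(edges), "boundary": boundary, "non_manifold": non_manifold}

def is_closed_edge_manifold(faces):
    """True when every edge is shared by exactly two faces."""
    stats = edge_stats(faces)
    return stats["boundary"] == 0 and stats["non_manifold"] == 0

# Usage: a tetrahedron is closed; a lone triangle is all boundary.
tetrahedron = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
closed = is_closed_edge_manifold(tetrahedron)
lone = edge_stats([(0, 1, 2)])
```

An unindexed triangle soup would report every edge as boundary under this check, so a high boundary count on raw output would indicate that welding or other cleanup is needed before simulation.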
Original abstract
Sparse-view 3D reconstruction is increasingly addressed with feed-forward splatting networks that predict explicit primitives directly from images. Yet most existing methods remain centered on Gaussian primitives and expose surfaces only indirectly: extracting a usable mesh for downstream simulation, physics reasoning, or embodied interaction still requires expensive post-hoc steps that break the feed-forward promise. This limitation is especially pronounced in pose-free settings, where scene structure and camera parameters must be estimated jointly from sparse observations. We present TriSplat, a feed-forward reconstruction network that represents scenes with oriented triangle primitives and directly exports simulation-ready mesh scenes from a single forward pass. Given input images, the network predicts local 3D point maps, triangle attributes, camera poses, and optional intrinsics. Rather than regressing triangle orientation as an unconstrained latent variable, our approach constructs geometry normals from the predicted point maps, refines them with an image-conditioned normal head, and converts them into stable local frames for triangle parameterization. A mono-normal bootstrap schedule further stabilizes early training, while opacity and blur scheduling progressively sharpen the learned surface representation for direct mesh extraction. Experiments on RealEstate10K and DL3DV show that this representation produces more geometry-faithful reconstructions than Gaussian feed-forward baselines while maintaining competitive novel-view rendering quality. Because the rendering primitives are themselves surface triangles, the output can be directly ingested by physics engines, collision detectors, and standard rendering pipelines without any conversion, making it a practical simulation-ready solution for feed-forward 3D scene reconstruction.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces TriSplat, a feed-forward network for sparse-view 3D scene reconstruction that predicts oriented triangle primitives directly from images. It constructs geometry normals from predicted point maps, refines them with an image-conditioned normal head, and uses scheduling for stable training to produce simulation-ready meshes. Experiments on RealEstate10K and DL3DV demonstrate improved geometry faithfulness over Gaussian baselines with competitive rendering quality.
Significance. If the simulation-readiness claim holds, the work would meaningfully advance feed-forward reconstruction by eliminating post-hoc mesh extraction steps and enabling direct integration into physics and collision pipelines.
major comments (2)
- [Abstract] The central claim that 'the output can be directly ingested by physics engines, collision detectors, and standard rendering pipelines without any conversion' is load-bearing for the title and contribution but is unsupported by evidence; all reported experiments are confined to geometry faithfulness and novel-view synthesis quality on RealEstate10K and DL3DV, with no measurements of manifoldness, self-intersection rates, or dynamic stability.
- [Abstract] The statement that the representation 'produces more geometry-faithful reconstructions than Gaussian feed-forward baselines' is presented without any quantitative metrics, baseline names, tables, or error analysis, preventing assessment of whether the geometry improvement is meaningful or statistically reliable.
minor comments (1)
- [Abstract] The mono-normal bootstrap schedule, opacity scheduling, and blur scheduling are mentioned only at a high level; a brief description of their implementation or an ablation would clarify their role in producing stable triangles.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the abstract claims. We address each point below and will revise the manuscript to ensure all statements are appropriately supported by the presented evidence.
Point-by-point responses
- Referee: [Abstract] The central claim that 'the output can be directly ingested by physics engines, collision detectors, and standard rendering pipelines without any conversion' is load-bearing for the title and contribution but is unsupported by evidence; all reported experiments are confined to geometry faithfulness and novel-view synthesis quality on RealEstate10K and DL3DV, with no measurements of manifoldness, self-intersection rates, or dynamic stability.
  Authors: We agree that the simulation-readiness claim would benefit from additional supporting evidence. The representation uses oriented triangles, which are a standard primitive directly compatible with mesh-based pipelines and require no conversion step, but the manuscript provides no quantitative checks on manifoldness, self-intersections, or simulation stability. We will revise the abstract to qualify the claim (emphasizing format compatibility rather than untested downstream performance) and add a short discussion or appendix with basic mesh-quality statistics derived from the existing outputs.
  Revision: yes
- Referee: [Abstract] The statement that the representation 'produces more geometry-faithful reconstructions than Gaussian feed-forward baselines' is presented without any quantitative metrics, baseline names, tables, or error analysis, preventing assessment of whether the geometry improvement is meaningful or statistically reliable.
  Authors: The abstract summarizes results that are quantified in the experiments section (specific baselines, geometry metrics, and tables). To make this immediately verifiable from the abstract, we will incorporate key quantitative comparisons and baseline references directly into the abstract text.
  Revision: yes
Circularity Check
No circularity; empirical method with no self-referential derivations
The paper presents a neural architecture that predicts point maps, normals, and triangle attributes from images, with a training schedule for stabilization. These are design and implementation choices, not mathematical derivations or predictions that reduce to fitted inputs by construction. The simulation-ready claim follows directly from the choice of triangle primitives rather than any equation or parameter fit that is tautological. No self-citation load-bearing steps, uniqueness theorems, or ansatzes are invoked in the provided text. The work is self-contained against external benchmarks via reported experiments on RealEstate10K and DL3DV.
Forward citations
Cited by 1 Pith paper
- Latent Spatial Memory for Video World Models
  Mirage stores and queries 3D scene information in diffusion latent space via depth-guided lifting and warping, yielding 10.57× faster generation and 55× smaller memory than explicit RGB point-cloud baselines while rea...