pith. machine review for the scientific record.

arxiv: 2604.08370 · v1 · submitted 2026-04-09 · 💻 cs.CV

Recognition: 2 Lean theorem links

SurfelSplat: Learning Efficient and Generalizable Gaussian Surfel Representations for Sparse-View Surface Reconstruction

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 17:46 UTC · model grok-4.3

classification 💻 cs.CV
keywords Gaussian surfels · sparse-view reconstruction · feed-forward network · surface reconstruction · 3D Gaussian Splatting · Nyquist sampling · cross-view aggregation · low-pass filters

The pith

A feed-forward network reconstructs accurate 3D surfaces from sparse images by predicting Gaussian surfels after cross-view Nyquist filtering.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to build a fast, generalizable model that turns a handful of input photographs into precise 3D surface geometry represented as Gaussian surfels. It starts from the observation that ordinary feed-forward networks cannot recover fine geometric details because the spatial frequencies of pixel-aligned primitives exceed the sampling limits of the input views. To correct this, the authors insert a cross-view aggregation step that first damps the surfel geometry with low-pass filters scaled to each view's sampling rate, then projects the filtered surfels across all inputs to collect feature correlations, and finally routes those correlations through a fusion network that regresses accurate surfel parameters. The resulting system matches the reconstruction quality of slow per-scene optimization methods on standard benchmarks while finishing in roughly one second and requiring no scene-specific training. This matters because it removes the need for dense camera arrays or lengthy computation, making high-quality surface capture feasible in ordinary settings.
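
To make the filtering step concrete, here is a minimal Python sketch of sampling-rate-guided low-pass filtering, assuming the common variance-dilation implementation (convolving each surfel with a Gaussian kernel sized to the view's pixel footprint, in the spirit of Mip-Splatting's smoothing filters) rather than the paper's exact operator; `pixel_footprint`, `lowpass_filter_scales`, and the kernel factor `k` are illustrative names, not the authors' API.

```python
import numpy as np

def pixel_footprint(depth: np.ndarray, focal: float) -> np.ndarray:
    """World-space size of one pixel at a given depth (pinhole camera).

    This is the per-view sampling interval: a surfel at depth d is
    sampled by the image grid at spacing roughly d / f.
    """
    return depth / focal

def lowpass_filter_scales(scales, depths, focals, k=0.5):
    """Damp surfel scales so their spatial frequency stays below the
    per-view Nyquist limit 1 / (2 * footprint).

    Convolving two Gaussians adds their variances, so adding the filter
    variance lower-bounds each surfel's std at k * footprint.
    """
    # footprint of each surfel in every input view: (num_views, num_surfels)
    fp = np.stack([pixel_footprint(d, f) for d, f in zip(depths, focals)])
    # one possible choice: take the coarsest view's footprint, so every
    # view satisfies its Nyquist limit after filtering
    fp_max = fp.max(axis=0)                     # (num_surfels,)
    filt_var = (k * fp_max)[:, None] ** 2       # isotropic low-pass variance
    return np.sqrt(scales ** 2 + filt_var)      # filtered per-axis stds

# toy usage: 2 views, 3 surfels with anisotropic (x, y) scales
scales = np.array([[0.002, 0.001], [0.010, 0.004], [0.0005, 0.0005]])
depths = [np.array([1.0, 2.0, 0.5]), np.array([1.5, 2.5, 0.8])]
print(lowpass_filter_scales(scales, depths, focals=[1000.0, 800.0]))
```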

Core claim

SurfelSplat generates pixel-aligned Gaussian surfel representations from sparse-view images by adapting the surfels' geometric forms with spatial sampling rate-guided low-pass filters, projecting the filtered surfels across all input views to obtain cross-view feature correlations, and processing those correlations through a feature fusion network to regress Gaussian surfels with precise geometry, yielding efficient and generalizable surface reconstructions.

What carries the argument

The cross-view feature aggregation module, which first damps Gaussian surfel geometry with spatial sampling rate-guided low-pass filters, projects the results across views to extract correlations, and fuses them to regress accurate parameters.
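
A hedged sketch of the aggregation pattern in Python: the pinhole projection and nearest-neighbour feature gather below are generic multi-view machinery, and the plain concatenation at the end stands in for the paper's learned fusion network (every name here is hypothetical, not the authors' code).

```python
import numpy as np

def project(points, K, w2c):
    """Project world-space surfel centers into one view's pixel grid."""
    pts_h = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    cam = (w2c @ pts_h.T).T[:, :3]               # world -> camera frame
    uvw = (K @ cam.T).T                          # camera -> image plane
    return uvw[:, :2] / uvw[:, 2:3], cam[:, 2]   # pixel coords, depth

def aggregate_features(points, feat_maps, Ks, w2cs):
    """Gather per-view features at each surfel's projection.

    Returns one cross-view correlation vector per surfel; a fusion
    network would map it to refined surfel parameters, so the raw
    concatenation here is only a stand-in.
    """
    gathered = []
    for feats, K, w2c in zip(feat_maps, Ks, w2cs):
        uv, depth = project(points, K, w2c)
        h, w, _ = feats.shape
        # nearest-neighbour sampling; a real pipeline would interpolate
        x = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
        y = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
        valid = (depth > 0)[:, None]             # zero out points behind camera
        gathered.append(feats[y, x] * valid)
    return np.concatenate(gathered, axis=1)      # (num_surfels, views * C)
```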

If this is right

  • The model produces reconstruction quality comparable to state-of-the-art optimization pipelines on DTU benchmarks.
  • Gaussian surfels are predicted in approximately one second per scene.
  • No per-scene optimization or training is required, allowing the same weights to be used on new scenes.
  • The output representations remain efficient for both surface extraction and downstream rendering tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same sampling-rate-guided filtering step could be inserted into other feed-forward 3D reconstruction pipelines that currently suffer from high-frequency artifacts.
  • The one-second inference time opens the possibility of near-real-time surface capture on mobile or embedded hardware.
  • Extending the aggregation to handle video sequences rather than static sparse sets could support online 3D modeling without additional changes to the core architecture.
  • If the low-pass filters were made content-adaptive, the method might maintain accuracy even when input views are fewer or more irregularly spaced than those used in current benchmarks.

Load-bearing premise

That the main reason feed-forward networks lose geometric accuracy is the spatial-frequency mismatch between pixel-aligned primitives and the Nyquist limit of sparse inputs, and that low-pass filtering plus cross-view aggregation fully corrects it.

What would settle it

A direct comparison of geometric error on the same DTU scenes run once with the low-pass filters and cross-view fusion enabled and once with both disabled; if the disabled version shows no large increase in surface error, the claimed mechanism does not carry the result.
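
Scoring that ablation is straightforward once both reconstructions are sampled to point clouds; a brute-force Chamfer distance in Python (`reconstruct`, `dtu_scenes`, and `scene.gt` are hypothetical stand-ins for whatever harness runs the two configurations):

```python
import numpy as np

def chamfer_distance(pred: np.ndarray, gt: np.ndarray) -> float:
    """Symmetric Chamfer distance between point sets of shape (N, 3), (M, 3).

    Brute-force O(N * M) time and memory; fine for a sanity check,
    use a KD-tree for full DTU-scale point clouds.
    """
    d2 = ((pred[:, None, :] - gt[None, :, :]) ** 2).sum(-1)  # (N, M)
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

# hypothetical ablation loop: same scenes, aggregation module toggled
# for scene in dtu_scenes:
#     cd_on  = chamfer_distance(reconstruct(scene, nyquist_filter=True),  scene.gt)
#     cd_off = chamfer_distance(reconstruct(scene, nyquist_filter=False), scene.gt)
```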

Figures

Figures reproduced from arXiv: 2604.08370 by Chensheng Dai, Min Chen, Shengjun Zhang, Yueqi Duan.

Figure 1. Our method delivers state-of-the-art surface reconstruction with ultra-fast inference speed.
Figure 2. Experimental Observation. (a) Current feed-forward networks generate geometrically …
Figure 3. Pipeline. Given an image pair, our method first extracts initial image features using a …
Figure 4. Qualitative Comparison of Surface Reconstruction with Sparse Views on DTU Benchmarks.
Figure 5. Visualization of Nyquist Theorem Verification.
Figure 6. Nyquist Theorem Verification: (a) Before adaptation, most surfels exceed the Nyquist …
Figure 7. Given a pair of images, our method exhibits consistent and stable performance across …
Figure 8. Visual comparison of novel view synthesis on DTU dataset.
Original abstract

3D Gaussian Splatting (3DGS) has demonstrated impressive performance in 3D scene reconstruction. Beyond novel view synthesis, it shows great potential for multi-view surface reconstruction. Existing methods employ optimization-based reconstruction pipelines that achieve precise and complete surface extractions. However, these approaches typically require dense input views and high time consumption for per-scene optimization. To address these limitations, we propose SurfelSplat, a feed-forward framework that generates efficient and generalizable pixel-aligned Gaussian surfel representations from sparse-view images. We observe that conventional feed-forward structures struggle to recover accurate geometric attributes of Gaussian surfels because the spatial frequency of pixel-aligned primitives exceeds Nyquist sampling rates. Therefore, we propose a cross-view feature aggregation module based on the Nyquist sampling theorem. Specifically, we first adapt the geometric forms of Gaussian surfels with spatial sampling rate-guided low-pass filters. We then project the filtered surfels across all input views to obtain cross-view feature correlations. By processing these correlations through a specially designed feature fusion network, we can finally regress Gaussian surfels with precise geometry. Extensive experiments on DTU reconstruction benchmarks demonstrate that our model achieves comparable results with state-of-the-art methods and predicts Gaussian surfels within 1 second, offering a 100x speedup without costly per-scene training.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes SurfelSplat, a feed-forward framework for generating pixel-aligned Gaussian surfel representations from sparse-view images for 3D surface reconstruction. It addresses the challenge of high spatial frequencies in pixel-aligned primitives exceeding Nyquist rates by introducing a cross-view feature aggregation module with spatial sampling rate-guided low-pass filters, followed by projection for cross-view correlations and a feature fusion network to regress precise geometry. The method claims to achieve comparable performance to state-of-the-art on DTU benchmarks while providing a 100x speedup and predicting within 1 second without per-scene optimization.

Significance. Should the claims hold under rigorous validation, this work would be significant for enabling efficient and generalizable surface reconstruction in sparse-view settings, extending 3D Gaussian Splatting to practical applications by avoiding costly per-scene training. The feed-forward design and 100x speedup, if substantiated, offer clear practical value for real-time or large-scale deployment.

major comments (2)
  1. §3 (cross-view feature aggregation module): The core assumption that spatial sampling rate-guided low-pass filters followed by cross-view fusion can recover high-frequency geometric attributes filtered out by the Nyquist limit lacks any frequency-domain analysis, pre/post-filter spectra comparison to ground-truth surfaces, or explicit verification that the correlations restore the lost details using only sparse views.
  2. §4 (DTU experiments): The reported 'comparable results' to SOTA methods provide no error bars, no ablation isolating the low-pass filter component, and no details on how geometry accuracy was measured (e.g., Chamfer distance computation or surface normal evaluation), making it impossible to attribute performance gains to the proposed module versus the base network.
minor comments (2)
  1. The abstract would be strengthened by replacing the qualitative phrase 'comparable results' with specific quantitative metrics (e.g., mean Chamfer distance or PSNR values) from the DTU tables.
  2. Notation for the filtered surfel parameters (e.g., how the low-pass cutoff is computed from the sampling rate) should be defined explicitly with an equation in the method section; one illustrative form is sketched below.
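
For illustration only, one conventional way to tie the cutoff to the sampling rate (our assumption, not the paper's definition): with per-view sampling interval s_u, the Nyquist frequency is 1/(2 s_u), and a Gaussian low-pass of width proportional to s_u can be folded directly into the surfel covariance.

```latex
% illustrative Nyquist-guided cutoff, not the paper's equation:
% sampling interval s_u  =>  Nyquist frequency  \nu_{\max} = \frac{1}{2 s_u}
% Gaussian low-pass with width \sigma_f, folded into the covariance:
\Sigma' = \Sigma + \sigma_f^2 I, \qquad \sigma_f = k \, s_u
```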

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major point below and will revise the paper to incorporate additional analysis and experimental details.

Point-by-point responses
  1. Referee: §3 (cross-view feature aggregation module): The core assumption that spatial sampling rate-guided low-pass filters followed by cross-view fusion can recover high-frequency geometric attributes filtered out by the Nyquist limit lacks any frequency-domain analysis, pre/post-filter spectra comparison to ground-truth surfaces, or explicit verification that the correlations restore the lost details using only sparse views.

    Authors: We agree that the manuscript would be strengthened by explicit frequency-domain analysis. Section 3 motivates the low-pass filters from the Nyquist theorem to address high spatial frequencies in pixel-aligned surfels, but does not include spectra plots or direct verification. In the revision we will add a dedicated analysis subsection with pre- and post-filter frequency spectra comparisons on DTU surfaces and quantitative verification that cross-view correlations recover high-frequency details from the sparse input views. revision: yes

  2. Referee: §4 (DTU experiments): The reported 'comparable results' to SOTA methods provide no error bars, no ablation isolating the low-pass filter component, and no details on how geometry accuracy was measured (e.g., Chamfer distance computation or surface normal evaluation), making it impossible to attribute performance gains to the proposed module versus the base network.

    Authors: We acknowledge the need for more rigorous reporting. The current experiments follow the standard DTU protocol for Chamfer distance and normal evaluation, but omit error bars and the requested ablation. The revised manuscript will report mean and standard deviation over multiple runs, add an ablation isolating the low-pass filter, and explicitly describe the metric computation (Chamfer distance on reconstructed meshes and normal consistency as in prior DTU surface reconstruction papers). revision: yes

Circularity Check

0 steps flagged

No circularity: empirical feed-forward network with external validation

full rationale

The paper describes a learned feed-forward architecture that applies Nyquist-inspired low-pass filtering and cross-view fusion to regress Gaussian surfels. Its central result is obtained by training on external data and evaluating on DTU benchmarks, with no equations or steps that reduce the claimed output to a fitted input, self-definition, or self-citation chain by construction. The frequency-mismatch observation motivates the design but does not tautologically determine the network's learned parameters or predictions.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that low-pass filtering guided by spatial sampling rate plus cross-view correlation fusion is sufficient to recover sub-Nyquist geometry; no new physical entities or ad-hoc constants are introduced beyond standard neural-network training.

axioms (2)
  • domain assumption Pixel-aligned Gaussian surfels have spatial frequencies that exceed the Nyquist rate of the input views.
    Invoked in the second paragraph of the abstract to motivate the cross-view module.
  • domain assumption Cross-view feature correlations after low-pass filtering contain sufficient information to regress accurate surfel geometry.
    Core premise of the feature fusion network described in the abstract.

pith-pipeline@v0.9.0 · 5543 in / 1356 out tokens · 24138 ms · 2026-05-10T17:46:44.415186+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

    Relation between the paper passage and the cited Recognition theorem.

    We observe that conventional feed-forward structures struggle to recover accurate geometric attributes of Gaussian surfels because the spatial frequency of pixel-aligned primitives exceeds Nyquist sampling rates. Therefore, we propose a cross-view feature aggregation module based on the Nyquist sampling theorem. Specifically, we first adapt the geometric forms of Gaussian surfels with spatial sampling rate-guided low-pass filters.

  • IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

    Relation between the paper passage and the cited Recognition theorem.

After adaptation, the spatial frequency can be constrained by setting s_u > 2/π: ν_k = (1/(s_u π)) √(1 + 1/ν̂_k²) < ν̂_k / 2

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

69 extracted references · 10 canonical work pages · 1 internal anchor

  [1] Yao Yao, Zixin Luo, Shiwei Li, Tian Fang, and Long Quan. Mvsnet: Depth inference for unstructured multi-view stereo. In Proceedings of the European Conference on Computer Vision (ECCV), pages 767–783, 2018.
  [2] Yikang Ding, Wentao Yuan, Qingtian Zhu, Haotian Zhang, Xiangyue Liu, Yuanjiang Wang, and Xiao Liu. Transmvsnet: Global context-aware multi-view stereo network with transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8585–8594, 2022.
  [3] Peng Wang, Lingjie Liu, Yuan Liu, Christian Theobalt, Taku Komura, and Wenping Wang. Neus: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. arXiv preprint arXiv:2106.10689, 2021.
  [4] Lior Yariv, Jiatao Gu, Yoni Kasten, and Yaron Lipman. Volume rendering of neural implicit surfaces. Advances in Neural Information Processing Systems, 34:4805–4815, 2021.
  [5] Zehao Yu, Songyou Peng, Michael Niemeyer, Torsten Sattler, and Andreas Geiger. Monosdf: Exploring monocular geometric cues for neural implicit surface reconstruction. Advances in Neural Information Processing Systems, 35:25018–25032, 2022.
  [6] Han Huang, Yulun Wu, Junsheng Zhou, Ge Gao, Ming Gu, and Yu-Shen Liu. Neusurf: On-surface priors for neural surface reconstruction from sparse input views. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 2312–2320, 2024.
  [7] Xiaoxiao Long, Cheng Lin, Peng Wang, Taku Komura, and Wenping Wang. Sparseneus: Fast generalizable neural surface reconstruction from sparse views. In European Conference on Computer Vision, pages 210–227. Springer, 2022.
  [8] Youngju Na, Woo Jae Kim, Kyu Beom Han, Suhyeon Ha, and Sung-Eui Yoon. Uforecon: Generalizable sparse-view surface reconstruction from arbitrary and unfavorable sets. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5094–5104, 2024.
  [9] Jiaming Sun, Yiming Xie, Linghao Chen, Xiaowei Zhou, and Hujun Bao. Neuralrecon: Real-time coherent 3d reconstruction from monocular video. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
  [10] Yiming Wang, Qin Han, Marc Habermann, Kostas Daniilidis, Christian Theobalt, and Lingjie Liu. Neus2: Fast learning of neural implicit surfaces for multi-view reconstruction. In 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
  [11] Haoyu Wu, Alexandros Graikos, and Dimitris Samaras. S-volsdf: Sparse multi-view stereo regularization of neural implicit surfaces. In 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023.
  [12] Lior Yariv, Jiatao Gu, Yoni Kasten, and Yaron Lipman. Volume rendering of neural implicit surfaces. In Proceedings of the 35th International Conference on Neural Information Processing Systems, 2021.
  [13] Zhaoshuo Li, Thomas Müller, Alex Evans, Russell H. Taylor, Mathias Unberath, Ming-Yu Liu, and Chen-Hsuan Lin. Neuralangelo: High-fidelity neural surface reconstruction. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
  [14] Luoyuan Xu, Tao Guan, Yuesong Wang, Wenkai Liu, Zhaojie Zeng, Junle Wang, and Wei Yang. C2f2neus: Cascade cost frustum fusion for high fidelity and generalizable neural surface reconstruction. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 18291–18301, 2023.
  [15] Yixun Liang, Hao He, and Yingcong Chen. Retr: Modeling rendering via transformer for generalizable neural surface reconstruction. Advances in Neural Information Processing Systems, 36:62332–62351, 2023.
  [16] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkühler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering. ACM Trans. Graph., 42(4), 2023.
  [17] Antoine Guédon and Vincent Lepetit. Sugar: Surface-aligned gaussian splatting for efficient 3d mesh reconstruction and high-quality mesh rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5354–5363, 2024.
  [18] Binbin Huang, Zehao Yu, Anpei Chen, Andreas Geiger, and Shenghua Gao. 2d gaussian splatting for geometrically accurate radiance fields. In ACM SIGGRAPH 2024 Conference Papers, pages 1–11, 2024.
  [19] Pinxuan Dai, Jiamin Xu, Wenxiang Xie, Xinguo Liu, Huamin Wang, and Weiwei Xu. High-quality surface reconstruction using gaussian surfels. In ACM SIGGRAPH 2024 Conference Papers, pages 1–11, 2024.
  [20] Zehao Yu, Torsten Sattler, and Andreas Geiger. Gaussian opacity fields: Efficient adaptive surface reconstruction in unbounded scenes. ACM Transactions on Graphics (TOG), 43(6):1–13, 2024.
  [21] Mulin Yu, Tao Lu, Linning Xu, Lihan Jiang, Yuanbo Xiangli, and Bo Dai. Gsdf: 3dgs meets sdf for improved rendering and reconstruction. arXiv preprint arXiv:2403.16964, 2024.
  [22] Stanislaw Szymanowicz, Christian Rupprecht, and Andrea Vedaldi. Splatter image: Ultra-fast single-view 3d reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10208–10217, 2024.
  [23] David Charatan, Sizhe Lester Li, Andrea Tagliasacchi, and Vincent Sitzmann. pixelsplat: 3d gaussian splats from image pairs for scalable generalizable 3d reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19457–19467, 2024.
  [24] Yuedong Chen, Haofei Xu, Chuanxia Zheng, Bohan Zhuang, Marc Pollefeys, Andreas Geiger, Tat-Jen Cham, and Jianfei Cai. Mvsplat: Efficient 3d gaussian splatting from sparse multi-view images. In European Conference on Computer Vision, pages 370–386. Springer, 2024.
  [25] Christopher Wewer, Kevin Raj, Eddy Ilg, Bernt Schiele, and Jan Eric Lenssen. latentsplat: Autoencoding variational gaussians for fast generalizable 3d reconstruction. In European Conference on Computer Vision, pages 456–473. Springer, 2024.
  [26] Botao Ye, Sifei Liu, Haofei Xu, Xueting Li, Marc Pollefeys, Ming-Hsuan Yang, and Songyou Peng. No pose, no problem: Surprisingly simple 3d gaussian splats from sparse unposed images. arXiv preprint arXiv:2410.24207, 2024.
  [27] Shengjun Zhang, Xin Fei, Fangfu Liu, Haixu Song, and Yueqi Duan. Gaussian graph network: Learning efficient and generalizable gaussian representations from multi-view images. Advances in Neural Information Processing Systems, 37:50361–50380, 2024.
  [28] Shengji Tang, Weicai Ye, Peng Ye, Weihao Lin, Yang Zhou, Tao Chen, and Wanli Ouyang. Hisplat: Hierarchical 3d gaussian splatting for generalizable sparse-view reconstruction. arXiv preprint arXiv:2410.06245, 2024.
  [29] Chuanrui Zhang, Yingshuang Zou, Zhuoling Li, Minmin Yi, and Haoqian Wang. Transplat: Generalizable 3d gaussian splatting from sparse multi-view images with transformers. AAAI, 2025.
  [30] Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM, 65(1):99–106, 2021.
  [31] Jiawei Yang, Marco Pavone, and Yue Wang. Freenerf: Improving few-shot neural rendering with free frequency regularization. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
  [32] Anpei Chen, Zexiang Xu, Fuqiang Zhao, Xiaoshuai Zhang, Fanbo Xiang, Jingyi Yu, and Hao Su. Mvsnerf: Fast generalizable radiance field reconstruction from multi-view stereo. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021.
  [33] Jonathan T Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, and Pratul P Srinivasan. Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields. In ICCV, pages 5855–5864, 2021.
  [34] Christian Reiser, Songyou Peng, Yiyi Liao, and Andreas Geiger. Kilonerf: Speeding up neural radiance fields with thousands of tiny mlps. In ICCV, pages 14335–14345, 2021.
  [35] Qiangeng Xu, Zexiang Xu, Julien Philip, Sai Bi, Zhixin Shu, Kalyan Sunkavalli, and Ulrich Neumann. Point-nerf: Point-based neural radiance fields. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
  [36] Stephan J Garbin, Marek Kowalski, Matthew Johnson, Jamie Shotton, and Julien Valentin. Fastnerf: High-fidelity neural rendering at 200fps. In ICCV, pages 14346–14355, 2021.
  [37] Tao Hu, Shu Liu, Yilun Chen, Tiancheng Shen, and Jiaya Jia. Efficientnerf: Efficient neural radiance fields. In CVPR, pages 12902–12911, 2022.
  [38] Mijeong Kim, Seonguk Seo, and Bohyung Han. Infonerf: Ray entropy minimization for few-shot neural volume rendering. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
  [39] Congyue Deng, Chiyu Max Jiang, Charles R. Qi, Xinchen Yan, Yin Zhou, Leonidas Guibas, and Dragomir Anguelov. Nerdi: Single-view nerf synthesis with language-guided diffusion as general image priors. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
  [40] Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, and Christian Theobalt. Neural sparse voxel fields. NeurIPS, 33:15651–15663, 2020.
  [41] Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. Occupancy networks: Learning 3d reconstruction in function space. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
  [42] Albert Pumarola, Enric Corona, Gerard Pons-Moll, and Francesc Moreno-Noguer. D-nerf: Neural radiance fields for dynamic scenes. In CVPR, pages 10318–10327, 2021.
  [43] Linning Xu, Yuanbo Xiangli, Sida Peng, Xingang Pan, Nanxuan Zhao, Christian Theobalt, Bo Dai, and Dahua Lin. Grid-guided neural radiance fields for large urban scenes. In CVPR, pages 8296–8306, 2023.
  [44] Haithem Turki, Deva Ramanan, and Mahadev Satyanarayanan. Mega-nerf: Scalable construction of large-scale nerfs for virtual fly-throughs. In CVPR, pages 12922–12931, 2022.
  [45] Matthew Tancik, Vincent Casser, Xinchen Yan, Sabeek Pradhan, Ben Mildenhall, Pratul P Srinivasan, Jonathan T Barron, and Henrik Kretzschmar. Block-nerf: Scalable large scene neural view synthesis. In CVPR, pages 8248–8258, 2022.
  [46] Haotong Lin, Sida Peng, Zhen Xu, Yunzhi Yan, Qing Shuai, Hujun Bao, and Xiaowei Zhou. Efficient neural radiance fields for interactive free-viewpoint video. In SIGGRAPH Asia, pages 1–9, 2022.
  [47] Alex Yu, Vickie Ye, Matthew Tancik, and Angjoo Kanazawa. pixelnerf: Neural radiance fields from one or few images. In CVPR, pages 4578–4587, 2021.
  [48] Yuanbo Xiangli, Linning Xu, Xingang Pan, Nanxuan Zhao, Anyi Rao, Christian Theobalt, Bo Dai, and Dahua Lin. Bungeenerf: Progressive neural radiance field for extreme multi-scale scene rendering. In ECCV, pages 106–122. Springer, 2022.
  [49] Chen Gao, Ayush Saraf, Johannes Kopf, and Jia-Bin Huang. Dynamic view synthesis from dynamic monocular video. In ICCV, 2021.
  [50] Zak Murez, Tarrence Van As, James Bartolozzi, Ayan Sinha, Vijay Badrinarayanan, and Andrew Rabinovich. Atlas: End-to-end 3d scene reconstruction from posed images. In European Conference on Computer Vision, pages 414–431. Springer, 2020.
  [51] Yufan Ren, Fangjinhua Wang, Tong Zhang, Marc Pollefeys, and Sabine Süsstrunk. Volrecon: Volume rendering of signed ray distance functions for generalizable multi-view reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16685–16695, 2023.
  [52] Jonathon Luiten, Georgios Kopanas, Bastian Leibe, and Deva Ramanan. Dynamic 3d gaussians: Tracking by persistent dynamic view synthesis. In 2024 International Conference on 3D Vision (3DV), 2024.
  [53] Shunyuan Zheng, Boyao Zhou, Ruizhi Shao, Boning Liu, Shengping Zhang, Liqiang Nie, and Yebin Liu. Gps-gaussian: Generalizable pixel-wise 3d gaussian splatting for real-time human novel view synthesis. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
  [54] Zehao Yu, Anpei Chen, Binbin Huang, Torsten Sattler, and Andreas Geiger. Mip-splatting: Alias-free 3d gaussian splatting. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19447–19456, 2024.
  [55] Johannes L Schonberger and Jan-Michael Frahm. Structure-from-motion revisited. In CVPR, pages 4104–4113, 2016.
  [56] Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, and Zhangyang Wang. Lightgaussian: Unbounded 3d gaussian compression with 15x reduction and 200+ fps. arXiv preprint arXiv:2311.17245, 2023.
  [57] Tao Lu, Mulin Yu, Linning Xu, Yuanbo Xiangli, Limin Wang, Dahua Lin, and Bo Dai. Scaffold-gs: Structured 3d gaussians for view-adaptive rendering. arXiv preprint arXiv:2312.00109, 2023.
  [58] Kai Katsumata, Duc Minh Vo, and Hideki Nakayama. An efficient 3d gaussian representation for monocular/multi-view dynamic scenes. arXiv preprint arXiv:2311.12897, 2023.
  [59] Hanlin Chen, Chen Li, and Gim Hee Lee. Neusg: Neural implicit surface reconstruction with 3d gaussian splatting guidance. arXiv preprint arXiv:2312.00846, 2023.
  [60] Haixu Song, Xiaoke Yang, Shengjun Zhang, Jiwen Lu, and Yueqi Duan. Uniquesplat: View-conditioned 3d gaussian splatting for generalizable 3d reconstruction. IEEE Transactions on Image Processing, 34:8376–8389, 2025.
  [61] Yifan Liu, Shengjun Zhang, Chensheng Dai, Yang Chen, Hao Liu, Chen Li, and Yueqi Duan. Learning efficient and generalizable human representation with human gaussian model. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 11797–11806, 2025.
  [62] Harry Nyquist. Certain topics in telegraph transmission theory. Transactions of the American Institute of Electrical Engineers, 47(2):617–644, 2009.
  [63] Henrik Aanæs, Rasmus Ramsbøl Jensen, George Vogiatzis, Engin Tola, and Anders Bjorholm Dahl. Large-scale data for multiple-view stereopsis. International Journal of Computer Vision, 120:153–168, 2016.
  [64] Tinghui Zhou, Richard Tucker, John Flynn, Graham Fyffe, and Noah Snavely. Stereo magnification: Learning view synthesis using multiplane images. arXiv preprint arXiv:1805.09817, 2018.
  [65] Yao Yao, Zixin Luo, Shiwei Li, Jingyang Zhang, Yufan Ren, Lei Zhou, Tian Fang, and Long Quan. Blendedmvs: A large-scale dataset for generalized multi-view stereo networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1790–1799, 2020.
  [66] Han Huang, Yulun Wu, Chao Deng, Ge Gao, Ming Gu, and Yu-Shen Liu. Fatesgs: Fast and accurate sparse-view surface reconstruction using gaussian splatting with depth-feature consistency. arXiv preprint arXiv:2501.04628, 2025.
  [67] Hanspeter Pfister, Matthias Zwicker, Jeroen Van Baar, and Markus Gross. Surfels: Surface elements as rendering primitives. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, pages 335–342, 2000.
  [68] Jianyuan Wang, Minghao Chen, Nikita Karaev, Andrea Vedaldi, Christian Rupprecht, and David Novotny. Vggt: Visual geometry grounded transformer. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 5294–5306, 2025.
  [69] Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, and Jerome Revaud. Dust3r: Geometric 3d vision made easy. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20697–20709, 2024.