Geometry-Aware Scene Configurations for Novel View Synthesis

Changwoon Choi; Minkwan Kim; Young Min Kim

arxiv: 2510.09880 · v2 · submitted 2025-10-10 · 💻 cs.CV

Geometry-Aware Scene Configurations for Novel View Synthesis

Minkwan Kim , Changwoon Choi , Young Min Kim This is my paper

Pith reviewed 2026-05-18 07:30 UTC · model grok-4.3

classification 💻 cs.CV

keywords novel view synthesisneural radiance fieldsindoor scene reconstructiongeometry-aware configurationscalable NeRFbasis placementvirtual viewpointsgeometric priors

0 comments

The pith

Geometric priors guide adaptive base placement in scalable NeRFs to improve indoor novel view synthesis over uniform arrangements.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes scene-adaptive strategies that allocate representation capacity more efficiently for novel view synthesis in indoor environments from incomplete observations. It records observation statistics on an estimated geometric scaffold to determine optimal placement of bases, replacing the uniform arrangements used in prior scalable NeRF methods. The approach also introduces scene-adaptive virtual viewpoints that impose regularization to address deficiencies in the input view trajectory. These choices are shown to deliver better rendering quality and lower memory use across several large-scale indoor scenes with clutter, occlusion, and irregular layouts.

Core claim

The central claim is that recording observation statistics on the estimated geometric scaffold and using them to guide optimal placement of bases in scalable NeRF representations greatly improves upon uniform basis arrangements, while scene-adaptive virtual viewpoints compensate for geometric deficiencies in the input trajectory; comprehensive analysis in large indoor scenes demonstrates significant enhancements in rendering quality and memory requirements compared to regular-placement baselines.

What carries the argument

Observation statistics recorded on an estimated geometric scaffold that guide adaptive placement of representation bases and selection of virtual viewpoints.

If this is right

Adaptive base placement utilizes limited resources more effectively in irregular multi-room layouts with varying complexity.
Scene-adaptive virtual viewpoints impose regularization that compensates for deficiencies in the original input trajectory.
Overall memory requirements decrease while rendering quality rises compared with regular-placement baselines.
The method handles clutter, occlusion, and flat walls more robustly than uniform configurations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same observation-statistic principle could be tested on outdoor or dynamic scenes once reliable geometric scaffolds become available.
Integration with denser input capture or multi-view stereo priors might further reduce the number of required bases.
Memory savings could enable deployment on resource-constrained devices for immersive VR walkthroughs of real indoor spaces.

Load-bearing premise

Geometric priors estimated after pre-processing stages are accurate and reliable enough to guide optimal basis placement and virtual viewpoint selection without introducing errors in cluttered or occluded indoor scenes.

What would settle it

A direct side-by-side rendering comparison on a cluttered indoor scene with heavy occlusion where the geometry-guided adaptive placement produces lower PSNR or visible artifacts relative to a uniform baseline of the same memory budget.

Figures

Figures reproduced from arXiv: 2510.09880 by Changwoon Choi, Minkwan Kim, Young Min Kim.

**Figure 1.** Figure 1: Geometry-aware basis configuration and novel view synthesis results. After recording measurement statistics as coverage weights on geometry scaffold, we find optimal basis positions and training-view configurations in indoor environments. Abstract We propose scene-adaptive strategies to efficiently allocate representation capacity for generating immersive experiences of indoor environments from incomplet… view at source ↗

**Figure 2.** Figure 2: Scene-splitting strategies. (a) The original NeRF represents the entire scene with a single block. Scalable scene representations set multiple bases (b) evenly along the camera trajectory or (c) uniformly dividing the scene’s spatial extent. (d) We propose an adaptive approach based on scene configurations. Red curves denote camera trajectories and green dots mark the bases’ centers. 2. Related Work Scen… view at source ↗

**Figure 3.** Figure 3: Method overview. (a) We first extract the geometric scaffold from multi-view image observation. (b) Then we define coverage weights wi for surface points xi on the obtained mesh surface which consider both scene geometry and observation statistics. Starting from bases sampled along the camera trajectory using the FPS algorithm, we optimize their positions by minimizing energy function Lcov. (c) Finally, we… view at source ↗

**Figure 4.** Figure 4: Basis placement pipeline. (a) After accumulating coverage weights wi along surface points xi, (b) bases positions are updated using these weights and their distance from the surface. NeRF blocks We modify the block locations to evaluate our placement strategy for multiple NeRF blocks. We set centroids of NeRF blocks at scene-adaptive basis positions while LocalRF [25] sequentially assigns NeRF blocks alon… view at source ↗

**Figure 5.** Figure 5: Novel view geometric regularization. (a) The illustration of selected virtual views (yellow) with rendered depth images on the right. (b) Test viewpoint images (red) show the effect of each component, added sequentially from left to right. the cluster centers after applying K-means clustering on the basis probe locations. When rendering, NeLF-pro aggregates features from a set of probes near the rendering… view at source ↗

**Figure 6.** Figure 6: Qualitative results of novel view synthesis on ScanNet++ [48] dataset. Best viewed on screen. (3) Depth-NeRFacto extends NeRFacto with depth supervision. We present results using SparseNeRF [13] depth supervision which showed the best performance among the available options. (4) 3DGS [15] is a state-of-the-art explicit scene representation method. Gaussians are initialized from the same feature points th… view at source ↗

**Figure 7.** Figure 7: Qualitative results of novel view synthesis on Zip-NeRF [2] dataset. Best viewed on screen. Method PSNR ↑ SSIM ↑ LPIPS ↓ Memory Time MonoSDF [52] 20.61 0.733 0.459 61 MB 6 h NeRFacto [39] 19.78 0.692 0.491 65 MB 14 min NeRFacto [39]-Big 17.31 0.626 0.556 197 MB 1 h NeRFacto [39]-Huge 17.41 0.627 0.514 211 MB 1.5 h Depth-NeRFacto [39] 20.64 0.710 0.486 65 MB 17 min ZipNeRF [2] 23.53 0.758 0.415 1.2 GB 42 mi… view at source ↗

**Figure 8.** Figure 8: View extrapolation on Zip-NeRF [2] datasets. LocalRF variants require more memory due to the TensoRF framework, Mega-NeRF uses less memory with the original NeRF implementation, taking over 2 days to train for up to 200k iterations. Meanwhile, Hierarchical-3DGS enables very fast rendering using 3D Gaussians within multiple chunks, leading to excessive memory consumption as the scene gets larger. To test… view at source ↗

**Figure 9.** Figure 9: PSNR comparison on a different number of bases. and (128,16), where the first value indicates the total number of bases and the second denotes the number of nearby bases. The number of core probes is fixed at 3. To adjust the number of NeRF blocks in the original LocalRF, we set different values for the maximum allowable drift before adding a new block. We examined PSNRs of the test views with the changi… view at source ↗

**Figure 1.** Figure 1: 2D toy examples of basis placement. We evaluate basis placement using Lcov (a) with and (b) without the denominator. Initial basis (circle outlines), optimized basis (solid green circles), and observation views (blue frustums) are visualized. 12 [PITH_FULL_IMAGE:figures/full_fig_p012_1.png] view at source ↗

**Figure 2.** Figure 2: Coverage weights and optimized basis positions. (a) Given training (blue) and test (red) cameras, initial bases (yellow spheres) are obtained using FPS on cameras. (b) Optimized bases are distributed at balanced positions regarding the coverage weight. Initial bases locations are indicated with sphere outlines, while the optimized bases are solid spheres in a bright green color. age closeness in Euclidean … view at source ↗

**Figure 3.** Figure 3: Shifting optimized basis positions. (a) PSNR comparisons with different levels of perturbation noise scales. (b) Enlargements of rendered images for visual comparison of sampled noise levels. (a) Performance comparisons (b) Memory comparisons [PITH_FULL_IMAGE:figures/full_fig_p014_3.png] view at source ↗

**Figure 4.** Figure 4: Experiments for different number of bases. (a) Performance comparisons and (b) memory comparisons are plotted. set. To clearly analyze the effects of basis perturbation, we used neural lightfield probes, reducing the number of basis probes to 16, with 4 nearby probes contributing to rendering. We have tested on ScanNet++ [48] e91722b5a3 scene for this experiment. As the bases are perturbed, they deviate … view at source ↗

**Figure 5.** Figure 5: Examples of inaccurate geometries. (a) Rendered RGBD from GT mesh and (b) example depth inputs with different noise levels. Blur PSNR Noise PSNR Reference PSNR σ = 0.25 27.61 σ = 0.25 27.61 NeLF-pro 26.83 σ = 0.5 27.49 σ = 0.5 27.52 Ours w/o depth 26.96 σ = 1.0 27.40 σ = 1.0 27.27 Ours w/ GT depth (σ = 0) 27.67 σ = 1.5 27.21 σ = 1.5 27.23 Ours w/ MonoSDF depth 27.20 [PITH_FULL_IMAGE:figures/full_fig_p015_5.png] view at source ↗

**Figure 6.** Figure 6: Qualitative comparisons in Scuol dataset. We compared our framework to NeLF-pro and Depth-NeRFacto. Tanks (Barn) Scuol PSNR↑ SSIM↑ LPIPS↓ PSNR↑ SSIM↑ LPIPS↓ Depth-NeRFacto 17.04 0.634 0.571 18.21 0.676 0.471 NeLF-pro 21.12 0.652 0.556 25.44 0.779 0.3 NeLF-pro+Ours 21.88 0.654 0.523 25.73 0.781 0.236 LocalRF 21.33 0.706 0.566 22.54 0.776 0.418 LocalRF+Ours 22.20 0.711 0.511 23.01 0.780 0.359 [PITH_FULL_IM… view at source ↗

**Figure 7.** Figure 7: Comparisons to SOTA 3DGS algorithms. We compared our frameworks using both (a) ScanNet++ and (b) Zip-NeRF datasets. F. Additional Comparisons Comparison to SOTA 3DGS algorithms We additionally conducted visual comparisons of our methods with SOTA 3DGS baselines. We compared our methods to FSGS [55] and Hierarchical-3DGS [16] as in the main manuscript. For NeLF-pro [49] and LocalRF [25] variants, we evalu… view at source ↗

**Figure 8.** Figure 8: Examples of limitation. (a) Baseline comparison PSNR ± std (b) Ablation study PSNR ± std [PITH_FULL_IMAGE:figures/full_fig_p018_8.png] view at source ↗

**Figure 9.** Figure 9: Error bars for our experiments. We compared (a) baselines and (b) ablation studies on ScanNet++ (left) and ZipNeRF (right) dataset. tested NeLF-pro variants on ScanNet++ dataset and LocalRF variants on Zip-NeRF dataset, respectively. As illustrated in both figures, while optimal basis placement helps reduce blurry artifacts, our geometric regularization with virtual viewpoints further enhances stability… view at source ↗

**Figure 10.** Figure 10: Sampled virtual depths from ScanNet++ [48] dataset Full GT Full NeLF-pro +② +②,③ +① +①,② +①,②,④ [PITH_FULL_IMAGE:figures/full_fig_p019_10.png] view at source ↗

**Figure 11.** Figure 11: Qualitative results for ablation study on ScanNet++ [48] dataset. Full GT Full LocalRF +② +②,③ +① +①,② +①,②,④ [PITH_FULL_IMAGE:figures/full_fig_p019_11.png] view at source ↗

**Figure 12.** Figure 12: Qualitative results for ablation study on Zip-NeRF [2] dataset. 19 [PITH_FULL_IMAGE:figures/full_fig_p019_12.png] view at source ↗

**Figure 13.** Figure 13: Additional visualizations of novel view synthesis results on ScanNet++ [48] dataset. 20 [PITH_FULL_IMAGE:figures/full_fig_p020_13.png] view at source ↗

**Figure 14.** Figure 14: Additional visualizations of novel view synthesis results on Zip-NeRF [2] dataset. 21 [PITH_FULL_IMAGE:figures/full_fig_p021_14.png] view at source ↗

read the original abstract

We propose scene-adaptive strategies to efficiently allocate representation capacity for generating immersive experiences of indoor environments from incomplete observations. Indoor scenes with multiple rooms often exhibit irregular layouts with varying complexity, containing clutter, occlusion, and flat walls. We maximize the utilization of limited resources with guidance from geometric priors, which are often readily available after pre-processing stages. We record observation statistics on the estimated geometric scaffold and guide the optimal placement of bases, which greatly improves upon the uniform basis arrangements adopted by previous scalable Neural Radiance Field (NeRF) representations. We also suggest scene-adaptive virtual viewpoints to compensate for geometric deficiencies inherent in view configurations in the input trajectory and impose the necessary regularization. We present a comprehensive analysis and discussion regarding rendering quality and memory requirements in several large-scale indoor scenes, demonstrating significant enhancements compared to baselines that employ regular placements. Project page is available at: https://mkjjang3598.github.io/Geo-Scene-Config.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper adapts scalable NeRF basis placement to indoor geometry using pre-processed priors and observation stats, but the gains depend on scaffold accuracy that gets little robustness checking.

read the letter

The core idea is to move away from uniform basis grids in large indoor NeRFs by recording how often each part of an estimated geometric scaffold is observed, then placing more capacity where the data actually demands it. They also generate extra virtual viewpoints tuned to the scene layout to add regularization where the input trajectory leaves gaps. This is a direct, practical response to the fact that indoor rooms vary a lot in complexity and often contain flat walls or heavy occlusion.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes scene-adaptive strategies for novel view synthesis in indoor environments using scalable NeRF representations. It records observation statistics on an estimated geometric scaffold derived from pre-processing to guide non-uniform optimal placement of bases, improving upon uniform grid arrangements in prior work. The approach also introduces scene-adaptive virtual viewpoints to compensate for deficiencies in input trajectories and provides regularization. The authors claim significant gains in rendering quality and reduced memory requirements, supported by analysis on several large-scale indoor scenes with clutter, occlusion, and irregular layouts.

Significance. If the central claims hold with proper validation, the work could meaningfully advance efficient NeRF deployment for complex indoor scenes by showing how readily available geometric priors enable better capacity allocation than fixed regular placements. This addresses practical challenges in resource-limited immersive rendering and could influence follow-on methods that incorporate scene geometry for adaptive representations.

major comments (2)

Abstract: The central claim of 'significant enhancements' and 'greatly improves upon the uniform basis arrangements' is load-bearing but unsupported, as the abstract (and visible description) contains no quantitative results, error metrics, ablation studies, or validation procedures to demonstrate the improvement magnitude or reliability.
Method (geometric scaffold and basis placement): The improvement over prior scalable NeRFs rests on the assumption that observation statistics from the pre-processed geometric scaffold produce superior non-uniform base placement; however, no robustness analysis or failure-case experiments are described for the cluttered, occluded, or flat-wall indoor scenes explicitly flagged in the introduction, leaving open the risk that estimation errors propagate directly into misplaced bases.

minor comments (2)

Introduction: The description of how geometric priors transition to virtual viewpoint selection could be expanded with a short diagram or pseudocode for clarity.
Project page reference: The link is provided but the manuscript should explicitly state which supplementary materials (e.g., additional quantitative tables) are hosted there to aid reviewers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and the recommendation for major revision. We address each major comment below, indicating where revisions will be made to strengthen the presentation of our results and method.

read point-by-point responses

Referee: [—] Abstract: The central claim of 'significant enhancements' and 'greatly improves upon the uniform basis arrangements' is load-bearing but unsupported, as the abstract (and visible description) contains no quantitative results, error metrics, ablation studies, or validation procedures to demonstrate the improvement magnitude or reliability.

Authors: We agree that the abstract would benefit from explicit quantitative support for the claimed improvements. The full manuscript reports quantitative comparisons of rendering quality (PSNR, SSIM) and memory usage against uniform grid baselines across multiple large indoor scenes. In the revised version we will update the abstract to include representative numerical results from these experiments. revision: yes
Referee: [—] Method (geometric scaffold and basis placement): The improvement over prior scalable NeRFs rests on the assumption that observation statistics from the pre-processed geometric scaffold produce superior non-uniform base placement; however, no robustness analysis or failure-case experiments are described for the cluttered, occluded, or flat-wall indoor scenes explicitly flagged in the introduction, leaving open the risk that estimation errors propagate directly into misplaced bases.

Authors: The experiments section evaluates the method on large-scale indoor scenes that contain clutter, occlusion, and irregular layouts, with results showing consistent gains over uniform placements. We acknowledge that the current manuscript does not provide dedicated robustness analysis or explicit failure-case studies for errors in the geometric scaffold estimation. We will add a discussion of potential limitations and sensitivity to scaffold accuracy in the revised manuscript. revision: partial

Circularity Check

0 steps flagged

No circularity: geometric priors are external pre-processing input, placement heuristic is independent design choice

full rationale

The paper's core proposal records observation statistics from a separately estimated geometric scaffold (obtained via pre-processing) to inform non-uniform base placement and scene-adaptive viewpoints. This is a methodological heuristic applied to an external input rather than a derivation that reduces the final rendering claim or capacity allocation back to fitted parameters by construction. No equations or self-citations in the provided text create a self-definitional loop, fitted-input prediction, or load-bearing uniqueness theorem. The improvement over uniform grids is presented as an empirical outcome evaluated on indoor scenes, not a tautological renaming or ansatz smuggled via prior author work. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review based on abstract only; no explicit free parameters, axioms, or invented entities are stated. The approach assumes geometric priors are available from standard pre-processing without detailing how they are computed or any new entities introduced.

pith-pipeline@v0.9.0 · 5687 in / 1044 out tokens · 48857 ms · 2026-05-18T07:30:37.078289+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

55 extracted references · 55 canonical work pages · 3 internal anchors

[1]

Barron, Ben Mildenhall, Dor Verbin, Pratul P

Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, and Peter Hedman. Mip-nerf 360: Unbounded anti-aliased neural radiance fields.CVPR, 2022. 1, 4, 5, 6

work page 2022
[2]

Barron, Ben Mildenhall, Dor Verbin, Pratul P

Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, and Peter Hedman. Zip-nerf: Anti-aliased grid- based neural radiance fields.ICCV, 2023. 2, 5, 6, 7, 8, 12, 14, 15, 16, 17, 18, 19, 21

work page 2023
[3]

ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth

Shariq Farooq Bhat, Reiner Birkl, Diana Wofk, Peter Wonka, and Matthias M ¨uller. Zoedepth: Zero-shot trans- fer by combining relative and metric depth.arXiv preprint arXiv:2302.12288, 2023. 3

work page internal anchor Pith review Pith/arXiv arXiv 2023
[4]

Nope-nerf: Optimising neu- ral radiance field with no pose prior

Wenjing Bian, Zirui Wang, Kejie Li, Jia-Wang Bian, and Victor Adrian Prisacariu. Nope-nerf: Optimising neu- ral radiance field with no pose prior. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4160–4169, 2023. 3

work page 2023
[5]

The opencv library.Dr

Gary Bradski. The opencv library.Dr. Dobb’s Journal: Soft- ware Tools for the Professional Programmer, 25(11):120– 123, 2000. 15

work page 2000
[6]

Tensorf: Tensorial radiance fields

Anpei Chen, Zexiang Xu, Andreas Geiger, Jingyi Yu, and Hao Su. Tensorf: Tensorial radiance fields. InEuropean Conference on Computer Vision (ECCV), 2022. 4

work page 2022
[7]

Bal- anced spherical grid for egocentric view synthesis

Changwoon Choi, Sang Min Kim, and Young Min Kim. Bal- anced spherical grid for egocentric view synthesis. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 16590–16599, 2023. 1

work page 2023
[8]

Dongyoung Choi, Hyeonjoong Jang, and Min H. Kim. Om- nilocalrf: Omnidirectional local radiance fields from dy- namic videos. InIEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 2, 4

work page 2024
[9]

Depth-supervised NeRF: Fewer views and faster training for free

Kangle Deng, Andrew Liu, Jun-Yan Zhu, and Deva Ra- manan. Depth-supervised NeRF: Fewer views and faster training for free. InProceedings of the IEEE/CVF Confer- ence on Computer Vision and Pattern Recognition (CVPR),

work page
[10]

Omnidata: A scalable pipeline for making multi- task mid-level vision datasets from 3d scans

Ainaz Eftekhar, Alexander Sax, Jitendra Malik, and Amir Zamir. Omnidata: A scalable pipeline for making multi- task mid-level vision datasets from 3d scans. InProceedings of the IEEE/CVF International Conference on Computer Vi- sion, pages 10786–10796, 2021. 15, 16

work page 2021
[11]

A generic and flexible regularization framework for nerfs

Thibaud Ehret, Roger Mar ´ı, and Gabriele Facciolo. A generic and flexible regularization framework for nerfs. In Proceedings of the IEEE/CVF Winter Conference on Ap- plications of Computer Vision (WACV), pages 3088–3097,

work page
[12]

Plenoxels: Radiance fields without neural networks

Sara Fridovich-Keil, Alex Yu, Matthew Tancik, Qinhong Chen, Benjamin Recht, and Angjoo Kanazawa. Plenoxels: Radiance fields without neural networks. InCVPR, 2022. 1, 2, 16

work page 2022
[13]

Sparsenerf: Distilling depth ranking for few-shot novel view synthesis.IEEE/CVF International Conference on Computer Vision (ICCV), 2023

Guangcong, Zhaoxi Chen, Chen Change Loy, and Ziwei Liu. Sparsenerf: Distilling depth ranking for few-shot novel view synthesis.IEEE/CVF International Conference on Computer Vision (ICCV), 2023. 2, 3, 6, 15

work page 2023
[14]

Neural 3d scene reconstruction with the manhattan-world assumption

Haoyu Guo, Sida Peng, Haotong Lin, Qianqian Wang, Guofeng Zhang, Hujun Bao, and Xiaowei Zhou. Neural 3d scene reconstruction with the manhattan-world assumption. InCVPR, 2022. 3

work page 2022
[15]

3d gaussian splatting for real-time radiance field rendering.ACM Transactions on Graphics, 42 (4), 2023

Bernhard Kerbl, Georgios Kopanas, Thomas Leimk ¨uhler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering.ACM Transactions on Graphics, 42 (4), 2023. 6, 7, 16, 20

work page 2023
[16]

A hierarchical 3d gaussian representation for real-time ren- dering of very large datasets.ACM Transactions on Graph- ics, 43(4), 2024

Bernhard Kerbl, Andreas Meuleman, Georgios Kopanas, Michael Wimmer, Alexandre Lanvin, and George Drettakis. A hierarchical 3d gaussian representation for real-time ren- dering of very large datasets.ACM Transactions on Graph- ics, 43(4), 2024. 6, 7, 16, 17

work page 2024
[17]

Tanks and temples: Benchmarking large-scale scene reconstruction.ACM Transactions on Graphics, 36(4), 2017

Arno Knapitsch, Jaesik Park, Qian-Yi Zhou, and Vladlen Koltun. Tanks and temples: Benchmarking large-scale scene reconstruction.ACM Transactions on Graphics, 36(4), 2017. 16

work page 2017
[18]

Improving NeRF Quality by Progressive Camera Placement for Free- Viewpoint Navigation

Georgios Kopanas and George Drettakis. Improving NeRF Quality by Progressive Camera Placement for Free- Viewpoint Navigation. InVision, Modeling, and Visualiza- tion. The Eurographics Association, 2023. 2, 14

work page 2023
[19]

Dngaussian: Optimizing sparse-view 3d gaussian radiance fields with global-local depth normaliza- tion.arXiv preprint arXiv:2403.06912, 2024

Jiahe Li, Jiawei Zhang, Xiao Bai, Jin Zheng, Xin Ning, Jun Zhou, and Lin Gu. Dngaussian: Optimizing sparse-view 3d gaussian radiance fields with global-local depth normaliza- tion.arXiv preprint arXiv:2403.06912, 2024. 6, 7, 16

work page arXiv 2024
[20]

NeRF-XL: Scaling nerfs with multiple GPUs

Ruilong Li, Sanja Fidler, Angjoo Kanazawa, and Francis Williams. NeRF-XL: Scaling nerfs with multiple GPUs. In European Conference on Computer Vision (ECCV), 2024. 2, 4

work page 2024
[21]

KITTI-360: A novel dataset and benchmarks for urban scene understanding in 2d and 3d.Pattern Analysis and Machine Intelligence (PAMI),

Yiyi Liao, Jun Xie, and Andreas Geiger. KITTI-360: A novel dataset and benchmarks for urban scene understanding in 2d and 3d.Pattern Analysis and Machine Intelligence (PAMI),

work page
[22]

Neural sparse voxel fields.NeurIPS,

Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, and Christian Theobalt. Neural sparse voxel fields.NeurIPS,

work page
[23]

Lorensen and Harvey E

William E. Lorensen and Harvey E. Cline. Marching cubes: A high resolution 3d surface construction algorithm. In Proceedings of the 14th Annual Conference on Computer Graphics and Interactive Techniques, page 163–169, New York, NY , USA, 1987. Association for Computing Machin- ery. 4

work page 1987
[24]

Julien N. P. Martel, David B. Lindell, Connor Z. Lin, Eric R. Chan, Marco Monteiro, and Gordon Wetzstein. Acorn: Adaptive coordinate networks for neural scene representa- tion.ACM Trans. Graph. (SIGGRAPH), 40(4), 2021. 2

work page 2021
[25]

Kim, and Johannes Kopf

Andreas Meuleman, Yu-Lun Liu, Chen Gao, Jia-Bin Huang, Changil Kim, Min H. Kim, and Johannes Kopf. Progres- sively optimized local radiance fields for robust view synthe- sis. InCVPR, 2023. 2, 4, 6, 7, 8, 16, 17, 21

work page 2023
[26]

Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, and Abhishek Kar

Ben Mildenhall, Pratul P. Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, and Abhishek Kar. Local light field fusion: Practical view syn- thesis with prescriptive sampling guidelines.ACM Transac- tions on Graphics (TOG), 2019. 6

work page 2019
[27]

Srinivasan, Matthew Tancik, Jonathan T

Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: 9 Representing scenes as neural radiance fields for view syn- thesis. InECCV, 2020. 1, 6

work page 2020
[28]

Instant neural graphics primitives with a multires- olution hash encoding.ACM Trans

Thomas M ¨uller, Alex Evans, Christoph Schied, and Alexan- der Keller. Instant neural graphics primitives with a multires- olution hash encoding.ACM Trans. Graph., 41(4):102:1– 102:15, 2022. 1, 5

work page 2022
[29]

Barron, Ben Mildenhall, Mehdi S

Michael Niemeyer, Jonathan T. Barron, Ben Mildenhall, Mehdi S. M. Sajjadi, Andreas Geiger, and Noha Radwan. Regnerf: Regularizing neural radiance fields for view syn- thesis from sparse inputs. InProc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2022. 2, 8, 15

work page 2022
[30]

Ac- tivenerf: Learning where to see with uncertainty estimation

Xuran Pan, Zihang Lai, Shiji Song, and Gao Huang. Ac- tivenerf: Learning where to see with uncertainty estimation. InComputer Vision–ECCV 2022: 17th European Confer- ence, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXXIII, pages 230–246. Springer, 2022. 2, 14

work page 2022
[31]

Vi- sion transformers for dense prediction.ICCV, 2021

Ren ´e Ranftl, Alexey Bochkovskiy, and Vladlen Koltun. Vi- sion transformers for dense prediction.ICCV, 2021. 3, 8, 16

work page 2021
[32]

Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer.IEEE Transactions on Pattern Analysis and Ma- chine Intelligence, 44(3), 2022

Ren ´e Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, and Vladlen Koltun. Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer.IEEE Transactions on Pattern Analysis and Ma- chine Intelligence, 44(3), 2022. 3, 16

work page 2022
[33]

Barron, Ben Mildenhall, Pratul P

Barbara Roessle, Jonathan T. Barron, Ben Mildenhall, Pratul P. Srinivasan, and Matthias Nießner. Dense depth pri- ors for neural radiance fields from sparse input views. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition (CVPR), 2022. 2, 3

work page 2022
[34]

K-planes: Explicit radiance fields in space, time, and appearance

Sara Fridovich-Keil and Giacomo Meanti, Frederik Rahbæk Warburg, Benjamin Recht, and Angjoo Kanazawa. K-planes: Explicit radiance fields in space, time, and appearance. In CVPR, 2023. 1

work page 2023
[35]

Structure-from-motion revisited

Johannes Lutz Sch ¨onberger and Jan-Michael Frahm. Structure-from-motion revisited. InConference on Com- puter Vision and Pattern Recognition (CVPR), 2016. 1, 3, 15

work page 2016
[36]

ViP-NeRF: Visibility prior for sparse input neural radiance fields.ACM Trans

Nagabhushan Somraj and Rajiv Soundararajan. ViP-NeRF: Visibility prior for sparse input neural radiance fields.ACM Trans. Graph., 2023. 2

work page 2023
[37]

Direct voxel grid optimization: Super-fast convergence for radiance fields reconstruction

Cheng Sun, Min Sun, and Hwann-Tzong Chen. Direct voxel grid optimization: Super-fast convergence for radiance fields reconstruction. InCVPR, 2022. 1

work page 2022
[38]

Srinivasan, Jonathan T

Matthew Tancik, Vincent Casser, Xinchen Yan, Sabeek Prad- han, Ben Mildenhall, Pratul P. Srinivasan, Jonathan T. Bar- ron, and Henrik Kretzschmar. Block-nerf: Scalable large scene neural view synthesis.arXiv, 2022. 2, 4

work page 2022
[39]

Nerfstudio: A modular framework for neural radiance field development

Matthew Tancik, Ethan Weber, Evonne Ng, Ruilong Li, Brent Yi, Justin Kerr, Terrance Wang, Alexander Kristof- fersen, Jake Austin, Kamyar Salahi, Abhik Ahuja, David McAllister, and Angjoo Kanazawa. Nerfstudio: A modular framework for neural radiance field development. InACM SIGGRAPH 2023 Conference Proceedings, 2023. 5, 6, 7, 16, 20

work page 2023
[40]

Mega-nerf: Scalable construction of large- scale nerfs for virtual fly-throughs

Haithem Turki, Deva Ramanan, and Mahadev Satya- narayanan. Mega-nerf: Scalable construction of large- scale nerfs for virtual fly-throughs. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12922–12931, 2022. 2, 4, 6, 7, 16, 21

work page 2022
[41]

Scade: Nerfs from space carving with ambiguity-aware depth estimates

Mikaela Angelina Uy, Ricardo Martin-Brualla, Leonidas Guibas, and Ke Li. Scade: Nerfs from space carving with ambiguity-aware depth estimates. InConference on Com- puter Vision and Pattern Recognition (CVPR), 2023. 2, 3

work page 2023
[42]

NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction

Peng Wang, Lingjie Liu, Yuan Liu, Christian Theobalt, Taku Komura, and Wenping Wang. Neus: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. arXiv preprint arXiv:2106.10689, 2021. 3

work page internal anchor Pith review Pith/arXiv arXiv 2021
[43]

DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffu- sion Models

Jamie Wynn and Daniyar Turmukhambetov. DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffu- sion Models. InCVPR, 2023. 8

work page 2023
[44]

Nerf director: Revisiting view selection in neural vol- ume rendering

Wenhui Xiao, Rodrigo Santa Cruz, David Ahmedt- Aristizabal, Olivier Salvado, Clinton Fookes, and Leo Le- brat. Nerf director: Revisiting view selection in neural vol- ume rendering. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 2, 5, 14

work page 2024
[45]

Neural visibility field for uncertainty-driven active mapping

Shangjie Xue, Jesse Dill, Pranay Mathur, Frank Dellaert, Panagiotis Tsiotra, and Danfei Xu. Neural visibility field for uncertainty-driven active mapping. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 18122–18132, 2024. 2

work page 2024
[46]

Ner- fvs: Neural radiance fields for free view synthesis via geom- etry scaffolds

Chen Yang, Peihao Li, Zanwei Zhou, Shanxin Yuan, Bing- bing Liu, Xiaokang Yang, Weichao Qiu, and Wei Shen. Ner- fvs: Neural radiance fields for free view synthesis via geom- etry scaffolds. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16549– 16558, 2023. 3, 5

work page 2023
[47]

V olume rendering of neural implicit surfaces

Lior Yariv, Jiatao Gu, Yoni Kasten, and Yaron Lipman. V olume rendering of neural implicit surfaces. InThirty- Fifth Conference on Neural Information Processing Systems,

work page
[48]

Scannet++: A high-fidelity dataset of 3d indoor scenes

Chandan Yeshwanth, Yueh-Cheng Liu, Matthias Nießner, and Angela Dai. Scannet++: A high-fidelity dataset of 3d indoor scenes. InProceedings of the International Confer- ence on Computer Vision (ICCV), 2023. 2, 5, 6, 8, 12, 13, 14, 15, 17, 18, 19, 20

work page 2023
[49]

Nelf-pro: Neural light field probes for multi-scale novel view synthe- sis

Zinuo You, Andreas Geiger, and Anpei Chen. Nelf-pro: Neural light field probes for multi-scale novel view synthe- sis. InConference on Computer Vision and Pattern Recog- nition (CVPR), 2024. 2, 4, 5, 6, 7, 8, 12, 14, 16, 17, 20, 21

work page 2024
[50]

PlenOctrees for real-time rendering of neural radiance fields

Alex Yu, Ruilong Li, Matthew Tancik, Hao Li, Ren Ng, and Angjoo Kanazawa. PlenOctrees for real-time rendering of neural radiance fields. InICCV, 2021. 1, 2

work page 2021
[51]

Sdfstudio: A unified framework for surface reconstruction, 2022

Zehao Yu, Anpei Chen, Bozidar Antic, Songyou Peng, Apra- tim Bhattacharyya, Michael Niemeyer, Siyu Tang, Torsten Sattler, and Andreas Geiger. Sdfstudio: A unified framework for surface reconstruction, 2022. 5, 15

work page 2022
[52]

Monosdf: Exploring monocu- lar geometric cues for neural implicit surface reconstruc- tion.Advances in Neural Information Processing Systems (NeurIPS), 2022

Zehao Yu, Songyou Peng, Michael Niemeyer, Torsten Sat- tler, and Andreas Geiger. Monosdf: Exploring monocu- lar geometric cues for neural implicit surface reconstruc- tion.Advances in Neural Information Processing Systems (NeurIPS), 2022. 3, 5, 6, 7, 8, 15, 16, 17 10

work page 2022
[53]

Nerf++: Analyzing and improving neural radiance fields.arXiv preprint arXiv:2010.07492, 2020

Kai Zhang, Gernot Riegler, Noah Snavely, and Vladlen Koltun. Nerf++: Analyzing and improving neural radiance fields.arXiv preprint arXiv:2010.07492, 2020. 1

work page arXiv 2010
[54]

Open3D: A Modern Library for 3D Data Processing

Qian-Yi Zhou, Jaesik Park, and Vladlen Koltun. Open3D: A modern library for 3D data processing.arXiv:1801.09847,

work page internal anchor Pith review Pith/arXiv arXiv
[55]

Fsgs: Real-time few-shot view synthesis using gaussian splatting

Zehao Zhu, Zhiwen Fan, Yifan Jiang, and Zhangyang Wang. Fsgs: Real-time few-shot view synthesis using gaussian splatting. InEuropean conference on computer vision, pages 145–163. Springer, 2024. 6, 7, 16, 17 11 Geometry-Aware Scene Configurations for Novel View Synthesis Supplementary Material A. Basis Placement Optimization A.1. Algorithms and Optimizati...

work page 2024

[1] [1]

Barron, Ben Mildenhall, Dor Verbin, Pratul P

Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, and Peter Hedman. Mip-nerf 360: Unbounded anti-aliased neural radiance fields.CVPR, 2022. 1, 4, 5, 6

work page 2022

[2] [2]

Barron, Ben Mildenhall, Dor Verbin, Pratul P

Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, and Peter Hedman. Zip-nerf: Anti-aliased grid- based neural radiance fields.ICCV, 2023. 2, 5, 6, 7, 8, 12, 14, 15, 16, 17, 18, 19, 21

work page 2023

[3] [3]

ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth

Shariq Farooq Bhat, Reiner Birkl, Diana Wofk, Peter Wonka, and Matthias M ¨uller. Zoedepth: Zero-shot trans- fer by combining relative and metric depth.arXiv preprint arXiv:2302.12288, 2023. 3

work page internal anchor Pith review Pith/arXiv arXiv 2023

[4] [4]

Nope-nerf: Optimising neu- ral radiance field with no pose prior

Wenjing Bian, Zirui Wang, Kejie Li, Jia-Wang Bian, and Victor Adrian Prisacariu. Nope-nerf: Optimising neu- ral radiance field with no pose prior. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4160–4169, 2023. 3

work page 2023

[5] [5]

The opencv library.Dr

Gary Bradski. The opencv library.Dr. Dobb’s Journal: Soft- ware Tools for the Professional Programmer, 25(11):120– 123, 2000. 15

work page 2000

[6] [6]

Tensorf: Tensorial radiance fields

Anpei Chen, Zexiang Xu, Andreas Geiger, Jingyi Yu, and Hao Su. Tensorf: Tensorial radiance fields. InEuropean Conference on Computer Vision (ECCV), 2022. 4

work page 2022

[7] [7]

Bal- anced spherical grid for egocentric view synthesis

Changwoon Choi, Sang Min Kim, and Young Min Kim. Bal- anced spherical grid for egocentric view synthesis. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 16590–16599, 2023. 1

work page 2023

[8] [8]

Dongyoung Choi, Hyeonjoong Jang, and Min H. Kim. Om- nilocalrf: Omnidirectional local radiance fields from dy- namic videos. InIEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 2, 4

work page 2024

[9] [9]

Depth-supervised NeRF: Fewer views and faster training for free

Kangle Deng, Andrew Liu, Jun-Yan Zhu, and Deva Ra- manan. Depth-supervised NeRF: Fewer views and faster training for free. InProceedings of the IEEE/CVF Confer- ence on Computer Vision and Pattern Recognition (CVPR),

work page

[10] [10]

Omnidata: A scalable pipeline for making multi- task mid-level vision datasets from 3d scans

Ainaz Eftekhar, Alexander Sax, Jitendra Malik, and Amir Zamir. Omnidata: A scalable pipeline for making multi- task mid-level vision datasets from 3d scans. InProceedings of the IEEE/CVF International Conference on Computer Vi- sion, pages 10786–10796, 2021. 15, 16

work page 2021

[11] [11]

A generic and flexible regularization framework for nerfs

Thibaud Ehret, Roger Mar ´ı, and Gabriele Facciolo. A generic and flexible regularization framework for nerfs. In Proceedings of the IEEE/CVF Winter Conference on Ap- plications of Computer Vision (WACV), pages 3088–3097,

work page

[12] [12]

Plenoxels: Radiance fields without neural networks

Sara Fridovich-Keil, Alex Yu, Matthew Tancik, Qinhong Chen, Benjamin Recht, and Angjoo Kanazawa. Plenoxels: Radiance fields without neural networks. InCVPR, 2022. 1, 2, 16

work page 2022

[13] [13]

Sparsenerf: Distilling depth ranking for few-shot novel view synthesis.IEEE/CVF International Conference on Computer Vision (ICCV), 2023

Guangcong, Zhaoxi Chen, Chen Change Loy, and Ziwei Liu. Sparsenerf: Distilling depth ranking for few-shot novel view synthesis.IEEE/CVF International Conference on Computer Vision (ICCV), 2023. 2, 3, 6, 15

work page 2023

[14] [14]

Neural 3d scene reconstruction with the manhattan-world assumption

Haoyu Guo, Sida Peng, Haotong Lin, Qianqian Wang, Guofeng Zhang, Hujun Bao, and Xiaowei Zhou. Neural 3d scene reconstruction with the manhattan-world assumption. InCVPR, 2022. 3

work page 2022

[15] [15]

3d gaussian splatting for real-time radiance field rendering.ACM Transactions on Graphics, 42 (4), 2023

Bernhard Kerbl, Georgios Kopanas, Thomas Leimk ¨uhler, and George Drettakis. 3d gaussian splatting for real-time radiance field rendering.ACM Transactions on Graphics, 42 (4), 2023. 6, 7, 16, 20

work page 2023

[16] [16]

A hierarchical 3d gaussian representation for real-time ren- dering of very large datasets.ACM Transactions on Graph- ics, 43(4), 2024

Bernhard Kerbl, Andreas Meuleman, Georgios Kopanas, Michael Wimmer, Alexandre Lanvin, and George Drettakis. A hierarchical 3d gaussian representation for real-time ren- dering of very large datasets.ACM Transactions on Graph- ics, 43(4), 2024. 6, 7, 16, 17

work page 2024

[17] [17]

Tanks and temples: Benchmarking large-scale scene reconstruction.ACM Transactions on Graphics, 36(4), 2017

Arno Knapitsch, Jaesik Park, Qian-Yi Zhou, and Vladlen Koltun. Tanks and temples: Benchmarking large-scale scene reconstruction.ACM Transactions on Graphics, 36(4), 2017. 16

work page 2017

[18] [18]

Improving NeRF Quality by Progressive Camera Placement for Free- Viewpoint Navigation

Georgios Kopanas and George Drettakis. Improving NeRF Quality by Progressive Camera Placement for Free- Viewpoint Navigation. InVision, Modeling, and Visualiza- tion. The Eurographics Association, 2023. 2, 14

work page 2023

[19] [19]

Dngaussian: Optimizing sparse-view 3d gaussian radiance fields with global-local depth normaliza- tion.arXiv preprint arXiv:2403.06912, 2024

Jiahe Li, Jiawei Zhang, Xiao Bai, Jin Zheng, Xin Ning, Jun Zhou, and Lin Gu. Dngaussian: Optimizing sparse-view 3d gaussian radiance fields with global-local depth normaliza- tion.arXiv preprint arXiv:2403.06912, 2024. 6, 7, 16

work page arXiv 2024

[20] [20]

NeRF-XL: Scaling nerfs with multiple GPUs

Ruilong Li, Sanja Fidler, Angjoo Kanazawa, and Francis Williams. NeRF-XL: Scaling nerfs with multiple GPUs. In European Conference on Computer Vision (ECCV), 2024. 2, 4

work page 2024

[21] [21]

KITTI-360: A novel dataset and benchmarks for urban scene understanding in 2d and 3d.Pattern Analysis and Machine Intelligence (PAMI),

Yiyi Liao, Jun Xie, and Andreas Geiger. KITTI-360: A novel dataset and benchmarks for urban scene understanding in 2d and 3d.Pattern Analysis and Machine Intelligence (PAMI),

work page

[22] [22]

Neural sparse voxel fields.NeurIPS,

Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, and Christian Theobalt. Neural sparse voxel fields.NeurIPS,

work page

[23] [23]

Lorensen and Harvey E

William E. Lorensen and Harvey E. Cline. Marching cubes: A high resolution 3d surface construction algorithm. In Proceedings of the 14th Annual Conference on Computer Graphics and Interactive Techniques, page 163–169, New York, NY , USA, 1987. Association for Computing Machin- ery. 4

work page 1987

[24] [24]

Julien N. P. Martel, David B. Lindell, Connor Z. Lin, Eric R. Chan, Marco Monteiro, and Gordon Wetzstein. Acorn: Adaptive coordinate networks for neural scene representa- tion.ACM Trans. Graph. (SIGGRAPH), 40(4), 2021. 2

work page 2021

[25] [25]

Kim, and Johannes Kopf

Andreas Meuleman, Yu-Lun Liu, Chen Gao, Jia-Bin Huang, Changil Kim, Min H. Kim, and Johannes Kopf. Progres- sively optimized local radiance fields for robust view synthe- sis. InCVPR, 2023. 2, 4, 6, 7, 8, 16, 17, 21

work page 2023

[26] [26]

Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, and Abhishek Kar

Ben Mildenhall, Pratul P. Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, and Abhishek Kar. Local light field fusion: Practical view syn- thesis with prescriptive sampling guidelines.ACM Transac- tions on Graphics (TOG), 2019. 6

work page 2019

[27] [27]

Srinivasan, Matthew Tancik, Jonathan T

Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: 9 Representing scenes as neural radiance fields for view syn- thesis. InECCV, 2020. 1, 6

work page 2020

[28] [28]

Instant neural graphics primitives with a multires- olution hash encoding.ACM Trans

Thomas M ¨uller, Alex Evans, Christoph Schied, and Alexan- der Keller. Instant neural graphics primitives with a multires- olution hash encoding.ACM Trans. Graph., 41(4):102:1– 102:15, 2022. 1, 5

work page 2022

[29] [29]

Barron, Ben Mildenhall, Mehdi S

Michael Niemeyer, Jonathan T. Barron, Ben Mildenhall, Mehdi S. M. Sajjadi, Andreas Geiger, and Noha Radwan. Regnerf: Regularizing neural radiance fields for view syn- thesis from sparse inputs. InProc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2022. 2, 8, 15

work page 2022

[30] [30]

Ac- tivenerf: Learning where to see with uncertainty estimation

Xuran Pan, Zihang Lai, Shiji Song, and Gao Huang. Ac- tivenerf: Learning where to see with uncertainty estimation. InComputer Vision–ECCV 2022: 17th European Confer- ence, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXXIII, pages 230–246. Springer, 2022. 2, 14

work page 2022

[31] [31]

Vi- sion transformers for dense prediction.ICCV, 2021

Ren ´e Ranftl, Alexey Bochkovskiy, and Vladlen Koltun. Vi- sion transformers for dense prediction.ICCV, 2021. 3, 8, 16

work page 2021

[32] [32]

Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer.IEEE Transactions on Pattern Analysis and Ma- chine Intelligence, 44(3), 2022

Ren ´e Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, and Vladlen Koltun. Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer.IEEE Transactions on Pattern Analysis and Ma- chine Intelligence, 44(3), 2022. 3, 16

work page 2022

[33] [33]

Barron, Ben Mildenhall, Pratul P

Barbara Roessle, Jonathan T. Barron, Ben Mildenhall, Pratul P. Srinivasan, and Matthias Nießner. Dense depth pri- ors for neural radiance fields from sparse input views. In Proceedings of the IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition (CVPR), 2022. 2, 3

work page 2022

[34] [34]

K-planes: Explicit radiance fields in space, time, and appearance

Sara Fridovich-Keil and Giacomo Meanti, Frederik Rahbæk Warburg, Benjamin Recht, and Angjoo Kanazawa. K-planes: Explicit radiance fields in space, time, and appearance. In CVPR, 2023. 1

work page 2023

[35] [35]

Structure-from-motion revisited

Johannes Lutz Sch ¨onberger and Jan-Michael Frahm. Structure-from-motion revisited. InConference on Com- puter Vision and Pattern Recognition (CVPR), 2016. 1, 3, 15

work page 2016

[36] [36]

ViP-NeRF: Visibility prior for sparse input neural radiance fields.ACM Trans

Nagabhushan Somraj and Rajiv Soundararajan. ViP-NeRF: Visibility prior for sparse input neural radiance fields.ACM Trans. Graph., 2023. 2

work page 2023

[37] [37]

Direct voxel grid optimization: Super-fast convergence for radiance fields reconstruction

Cheng Sun, Min Sun, and Hwann-Tzong Chen. Direct voxel grid optimization: Super-fast convergence for radiance fields reconstruction. InCVPR, 2022. 1

work page 2022

[38] [38]

Srinivasan, Jonathan T

Matthew Tancik, Vincent Casser, Xinchen Yan, Sabeek Prad- han, Ben Mildenhall, Pratul P. Srinivasan, Jonathan T. Bar- ron, and Henrik Kretzschmar. Block-nerf: Scalable large scene neural view synthesis.arXiv, 2022. 2, 4

work page 2022

[39] [39]

Nerfstudio: A modular framework for neural radiance field development

Matthew Tancik, Ethan Weber, Evonne Ng, Ruilong Li, Brent Yi, Justin Kerr, Terrance Wang, Alexander Kristof- fersen, Jake Austin, Kamyar Salahi, Abhik Ahuja, David McAllister, and Angjoo Kanazawa. Nerfstudio: A modular framework for neural radiance field development. InACM SIGGRAPH 2023 Conference Proceedings, 2023. 5, 6, 7, 16, 20

work page 2023

[40] [40]

Mega-nerf: Scalable construction of large- scale nerfs for virtual fly-throughs

Haithem Turki, Deva Ramanan, and Mahadev Satya- narayanan. Mega-nerf: Scalable construction of large- scale nerfs for virtual fly-throughs. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12922–12931, 2022. 2, 4, 6, 7, 16, 21

work page 2022

[41] [41]

Scade: Nerfs from space carving with ambiguity-aware depth estimates

Mikaela Angelina Uy, Ricardo Martin-Brualla, Leonidas Guibas, and Ke Li. Scade: Nerfs from space carving with ambiguity-aware depth estimates. InConference on Com- puter Vision and Pattern Recognition (CVPR), 2023. 2, 3

work page 2023

[42] [42]

NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction

Peng Wang, Lingjie Liu, Yuan Liu, Christian Theobalt, Taku Komura, and Wenping Wang. Neus: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. arXiv preprint arXiv:2106.10689, 2021. 3

work page internal anchor Pith review Pith/arXiv arXiv 2021

[43] [43]

DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffu- sion Models

Jamie Wynn and Daniyar Turmukhambetov. DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffu- sion Models. InCVPR, 2023. 8

work page 2023

[44] [44]

Nerf director: Revisiting view selection in neural vol- ume rendering

Wenhui Xiao, Rodrigo Santa Cruz, David Ahmedt- Aristizabal, Olivier Salvado, Clinton Fookes, and Leo Le- brat. Nerf director: Revisiting view selection in neural vol- ume rendering. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024. 2, 5, 14

work page 2024

[45] [45]

Neural visibility field for uncertainty-driven active mapping

Shangjie Xue, Jesse Dill, Pranay Mathur, Frank Dellaert, Panagiotis Tsiotra, and Danfei Xu. Neural visibility field for uncertainty-driven active mapping. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 18122–18132, 2024. 2

work page 2024

[46] [46]

Ner- fvs: Neural radiance fields for free view synthesis via geom- etry scaffolds

Chen Yang, Peihao Li, Zanwei Zhou, Shanxin Yuan, Bing- bing Liu, Xiaokang Yang, Weichao Qiu, and Wei Shen. Ner- fvs: Neural radiance fields for free view synthesis via geom- etry scaffolds. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16549– 16558, 2023. 3, 5

work page 2023

[47] [47]

V olume rendering of neural implicit surfaces

Lior Yariv, Jiatao Gu, Yoni Kasten, and Yaron Lipman. V olume rendering of neural implicit surfaces. InThirty- Fifth Conference on Neural Information Processing Systems,

work page

[48] [48]

Scannet++: A high-fidelity dataset of 3d indoor scenes

Chandan Yeshwanth, Yueh-Cheng Liu, Matthias Nießner, and Angela Dai. Scannet++: A high-fidelity dataset of 3d indoor scenes. InProceedings of the International Confer- ence on Computer Vision (ICCV), 2023. 2, 5, 6, 8, 12, 13, 14, 15, 17, 18, 19, 20

work page 2023

[49] [49]

Nelf-pro: Neural light field probes for multi-scale novel view synthe- sis

Zinuo You, Andreas Geiger, and Anpei Chen. Nelf-pro: Neural light field probes for multi-scale novel view synthe- sis. InConference on Computer Vision and Pattern Recog- nition (CVPR), 2024. 2, 4, 5, 6, 7, 8, 12, 14, 16, 17, 20, 21

work page 2024

[50] [50]

PlenOctrees for real-time rendering of neural radiance fields

Alex Yu, Ruilong Li, Matthew Tancik, Hao Li, Ren Ng, and Angjoo Kanazawa. PlenOctrees for real-time rendering of neural radiance fields. InICCV, 2021. 1, 2

work page 2021

[51] [51]

Sdfstudio: A unified framework for surface reconstruction, 2022

Zehao Yu, Anpei Chen, Bozidar Antic, Songyou Peng, Apra- tim Bhattacharyya, Michael Niemeyer, Siyu Tang, Torsten Sattler, and Andreas Geiger. Sdfstudio: A unified framework for surface reconstruction, 2022. 5, 15

work page 2022

[52] [52]

Monosdf: Exploring monocu- lar geometric cues for neural implicit surface reconstruc- tion.Advances in Neural Information Processing Systems (NeurIPS), 2022

Zehao Yu, Songyou Peng, Michael Niemeyer, Torsten Sat- tler, and Andreas Geiger. Monosdf: Exploring monocu- lar geometric cues for neural implicit surface reconstruc- tion.Advances in Neural Information Processing Systems (NeurIPS), 2022. 3, 5, 6, 7, 8, 15, 16, 17 10

work page 2022

[53] [53]

Nerf++: Analyzing and improving neural radiance fields.arXiv preprint arXiv:2010.07492, 2020

Kai Zhang, Gernot Riegler, Noah Snavely, and Vladlen Koltun. Nerf++: Analyzing and improving neural radiance fields.arXiv preprint arXiv:2010.07492, 2020. 1

work page arXiv 2010

[54] [54]

Open3D: A Modern Library for 3D Data Processing

Qian-Yi Zhou, Jaesik Park, and Vladlen Koltun. Open3D: A modern library for 3D data processing.arXiv:1801.09847,

work page internal anchor Pith review Pith/arXiv arXiv

[55] [55]

Fsgs: Real-time few-shot view synthesis using gaussian splatting

Zehao Zhu, Zhiwen Fan, Yifan Jiang, and Zhangyang Wang. Fsgs: Real-time few-shot view synthesis using gaussian splatting. InEuropean conference on computer vision, pages 145–163. Springer, 2024. 6, 7, 16, 17 11 Geometry-Aware Scene Configurations for Novel View Synthesis Supplementary Material A. Basis Placement Optimization A.1. Algorithms and Optimizati...

work page 2024