Neural Surface Reconstruction from Sparse Views Using Epipolar Geometry

Kaichen Zhou; Xinhai Chang

arxiv: 2406.04301 · v4 · submitted 2024-06-06 · 💻 cs.CV

Pith reviewed 2026-05-24 00:09 UTC · model grok-4.3

classification 💻 cs.CV

keywords: neural surface reconstruction, epipolar geometry, sparse views, generalizable reconstruction, cost volume, SDF, monocular depth regularization, feature aggregation

The pith

EpiS reconstructs surfaces from sparse multi-view images by guiding fine-grained epipolar feature aggregation with coarse cost-volume features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that explicitly incorporating epipolar geometry into a neural surface reconstruction pipeline overcomes the geometric ambiguity and information loss that plague cost-volume methods when inputs are limited to a few views. It replaces reliance on simple statistics like mean and variance with guided sampling of features along epipolar lines, fusion through an epipolar transformer, ray-wise aggregation into SDF-aware features, and scale-invariant regularization drawn from a pretrained monocular depth model. A sympathetic reader would care because sparse-view capture is the practical norm for many real scenes, yet existing generalizable approaches produce over-smoothed or incomplete surfaces; the new design promises accurate reconstruction without dense imagery or per-scene optimization. If correct, the approach would make high-fidelity surface modeling feasible from ordinary limited photo sets.

Core claim

The authors present EpiS as a generalizable framework that uses coarse cost-volume features to guide aggregation of fine-grained epipolar features sampled along corresponding epipolar lines across source views. An epipolar transformer fuses the multi-view information, followed by ray-wise aggregation to produce SDF-aware features for surface estimation. A geometry regularization strategy that leverages a pretrained monocular depth model through scale-invariant global and local constraints further mitigates information loss under sparse views.

What carries the argument

Epipolar feature aggregation guided by cost-volume features, which samples and fuses view-dependent geometry along epipolar lines before producing SDF-aware outputs.
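As a rough illustration of what sampling along an epipolar line involves (a minimal sketch, not the authors' implementation): points taken at several depths along one target-view ray are projected into a source view with a pinhole model, and the resulting pixels fall on that ray's epipolar line. The intrinsics, the relative pose, and the helper names `project` and `epipolar_samples` are assumptions made for this example.

```python
def project(K, R, t, X):
    """Pinhole projection of a 3D point X (target-camera frame) into a source view.

    K: 3x3 intrinsics, R: 3x3 rotation, t: translation (target -> source).
    Returns the pixel (u, v) in the source image.
    """
    # Rigid transform into the source camera frame.
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    # Apply intrinsics, then perspective division.
    x = [sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    return (x[0] / x[2], x[1] / x[2])


def epipolar_samples(K, R, t, ray_dir, depths):
    """Project points at the given depths along a target ray.

    The ray starts at the target camera center (the origin of the target frame).
    """
    return [project(K, R, t, [d * c for c in ray_dir]) for d in depths]


# Hypothetical setup: shared intrinsics, source camera shifted along x only.
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [-0.2, 0.0, 0.0]
ray_dir = [0.1, -0.05, 1.0]  # unnormalized viewing direction, z forward
pixels = epipolar_samples(K, R, t, ray_dir, depths=[1.0, 2.0, 4.0, 8.0])
# With a pure x-translation the epipolar line is horizontal,
# so all projected samples share the same v coordinate.
```

In a feature-sampling pipeline, pixel locations like these would typically be used to interpolate the source-view feature maps.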

If this is right

Outperforms state-of-the-art generalizable surface reconstruction methods on DTU and BlendedMVS under sparse-view settings.
Maintains strong generalization without per-scene optimization.
Reduces over-smoothing by preserving view-dependent geometric structure that simple cost-volume statistics discard.
Handles occlusions and geometric ambiguity more effectively through explicit epipolar sampling and depth-based regularization.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The hybrid cost-volume plus epipolar strategy could transfer to other sparse multi-view tasks such as depth estimation or novel-view synthesis.
Similar monocular priors might regularize reconstruction in dynamic or non-rigid scenes where epipolar consistency still holds across frames.
The design implies that learned priors aligned with epipolar geometry can substitute for additional views in extremely sparse regimes.

Load-bearing premise

Coarse cost-volume features can reliably guide fine-grained epipolar feature aggregation while a pretrained monocular depth model supplies unbiased scale-invariant constraints that align with multi-view epipolar geometry.

What would settle it

On the DTU dataset using three input views, EpiS produces higher Chamfer distance or lower F-score than prior generalizable cost-volume baselines.
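For readers unfamiliar with the two metrics named here, the following is a brute-force sketch of a symmetric Chamfer distance and an F-score between two point sets. It is meant only to show what is being compared; the benchmark's official evaluation protocol may differ in details such as masking and point-cloud sampling, and the point sets and threshold `tau` below are made up.

```python
import math


def nearest_dists(src, dst):
    """For each point in src, distance to its nearest neighbor in dst (brute force)."""
    return [min(math.dist(p, q) for q in dst) for p in src]


def chamfer_distance(pred, gt):
    """Symmetric Chamfer distance: mean of accuracy (pred->gt) and completeness (gt->pred)."""
    accuracy = sum(nearest_dists(pred, gt)) / len(pred)
    completeness = sum(nearest_dists(gt, pred)) / len(gt)
    return 0.5 * (accuracy + completeness)


def f_score(pred, gt, tau):
    """Harmonic mean of precision and recall at distance threshold tau."""
    precision = sum(d <= tau for d in nearest_dists(pred, gt)) / len(pred)
    recall = sum(d <= tau for d in nearest_dists(gt, pred)) / len(gt)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical point sets: one predicted point is slightly off, one is far off.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
pred = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
cd = chamfer_distance(pred, gt)
fs = f_score(pred, gt, tau=0.2)
```

Lower Chamfer distance and higher F-score indicate a better reconstruction, which is why the falsifying outcome is stated as a higher distance or a lower score.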

Figures

Figures reproduced from arXiv: 2406.04301 by Kaichen Zhou, Xinhai Chang.

**Figure 2.** Illustration of the Pipeline. Given a ray in the target view, it is projected onto source views to extract the epipolar feature and distribution feature (variance and mean) using a cost volume. Subsequently, the distribution features are utilized as queries, while the epipolar features serve as keys and values for cross-attention transformers, facilitating cross-view epipolar feature fusion. This fused fe…

**Figure 3.** Visualization of Our Fine-Tuning Strategy Designs.

**Figure 4.** Visualization results on the DTU dataset.

**Figure 5.** Reconstruction results on the BlendedMVS dataset.

Original abstract

Reconstructing accurate surfaces from sparse multi-view images remains challenging due to severe geometric ambiguity and occlusions. Existing generalizable neural surface reconstruction methods primarily rely on cost volumes that summarize multi-view features using simple statistics (e.g., mean and variance), which discard critical view-dependent geometric structure and often lead to over-smoothed reconstructions. We propose EpiS, a generalizable neural surface reconstruction framework that explicitly leverages epipolar geometry for sparse-view inputs. Instead of directly regressing geometry from cost-volume statistics, EpiS uses coarse cost-volume features to guide the aggregation of fine-grained epipolar features sampled along corresponding epipolar lines across source views. An epipolar transformer fuses multi-view information, followed by ray-wise aggregation to produce SDF-aware features for surface estimation. To further mitigate information loss under sparse views, we introduce a geometry regularization strategy that leverages a pretrained monocular depth model through scale-invariant global and local constraints. Extensive experiments on DTU and BlendedMVS demonstrate that EpiS significantly outperforms state-of-the-art generalizable surface reconstruction methods under sparse-view settings, while maintaining strong generalization without per-scene optimization.
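The abstract's point about mean and variance can be made concrete with a small sketch. This is an illustration with made-up feature values, not the paper's cost-volume code. It computes per-channel mean and variance over the source views and shows that the result does not depend on the order of the views, so the statistics cannot record which view observed which feature.

```python
def mean_variance(view_features):
    """Per-channel mean and (population) variance across views.

    view_features: list of per-view feature vectors for one 3D sample.
    """
    n = len(view_features)
    channels = len(view_features[0])
    mean = [sum(f[c] for f in view_features) / n for c in range(channels)]
    var = [sum((f[c] - mean[c]) ** 2 for f in view_features) / n
           for c in range(channels)]
    return mean, var


# Hypothetical 2-channel features of one 3D point seen from three source views.
views = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.3]]
stats = mean_variance(views)
# Reordering the views leaves the statistics unchanged (up to rounding),
# so the link between a feature and its source view is lost.
stats_permuted = mean_variance([views[2], views[0], views[1]])
```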

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

EpiS uses cost-volume-guided epipolar sampling plus monocular depth regularization to target over-smoothing in sparse-view SDF reconstruction, but the monocular step is the part that needs checking.

The main thing here is EpiS, which samples fine epipolar features along lines in source views, guided by coarse cost-volume features, then fuses them with an epipolar transformer and adds scale-invariant constraints from a pretrained monocular depth model. This is meant to keep more geometric structure than the mean/variance summaries common in cost-volume baselines for generalizable surface reconstruction from sparse inputs like three views on DTU or BlendedMVS.
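The fusion step can be pictured as cross-attention. Following the Figure 2 caption, the distribution (cost-volume) feature acts as the query and the epipolar features act as keys and values. The sketch below uses standard scaled dot-product attention with a single head and no learned projections. These simplifications, the feature values, and the function name `cross_attention` are assumptions for illustration, and the paper's own attention variant may differ.

```python
import math


def cross_attention(query, keys, values):
    """Single-head scaled dot-product attention for one query vector.

    query: list of d floats (here: a coarse cost-volume feature).
    keys, values: one vector per epipolar sample (here: fine-grained features).
    Returns (fused_feature, attention_weights).
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    # Numerically stable softmax over the epipolar samples.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the value vectors, channel by channel.
    fused = [sum(w * v[c] for w, v in zip(weights, values))
             for c in range(len(values[0]))]
    return fused, weights


# Hypothetical features: the query is most similar to the second epipolar sample.
query = [1.0, 0.0]
keys = [[0.0, 1.0], [2.0, 0.0], [0.5, 0.5]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
fused, weights = cross_attention(query, keys, values)
```

Because the coarse feature supplies the query, it decides how strongly each fine-grained sample contributes to the fused feature, which is one way to read "guided" aggregation.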

Referee Report

2 major / 2 minor

Summary. The paper proposes EpiS, a generalizable neural surface reconstruction framework for sparse multi-view inputs. It replaces direct regression from cost-volume statistics with coarse cost-volume features guiding aggregation of fine-grained epipolar features sampled along epipolar lines, fused via an epipolar transformer and ray-wise aggregation to produce SDF-aware features. A geometry regularization strategy adds scale-invariant global and local constraints from a pretrained monocular depth model. Experiments on DTU and BlendedMVS report significant outperformance over prior generalizable methods under sparse views without per-scene optimization.

Significance. If the reported gains hold after verification of implementation details and ablations, the explicit use of epipolar geometry for feature aggregation combined with monocular regularization could advance sparse-view surface reconstruction by preserving view-dependent structure that simple cost-volume statistics discard.

major comments (2)

[Abstract] Abstract (geometry regularization strategy paragraph): The central performance claim depends on the monocular depth constraints supplying unbiased signals that align with multi-view epipolar geometry after scale normalization. No analysis or test is described showing that systematic errors in the pretrained model (e.g., in low-texture or view-dependent regions under 3-view DTU/BlendedMVS protocols) do not pull ray-wise SDF features toward inconsistent surfaces, which directly risks undermining the reported gains over cost-volume baselines.
[Method] Method description (epipolar feature aggregation): The claim that coarse cost-volume features reliably guide fine-grained epipolar aggregation is load-bearing for the outperformance result, yet the manuscript provides no quantitative measure (e.g., alignment error or ablation removing the guidance) of how well this guidance functions when the cost volume itself is severely under-constrained by only three views.

minor comments (2)

[Abstract] The abstract and method sections use 'SDF-aware features' without an explicit definition or equation linking the ray-wise aggregation output to the signed distance function used for surface extraction.
[Experiments] Dataset splits, number of views (e.g., exact 3-view protocol), and whether error bars or multiple runs are reported are not mentioned in the provided abstract; these details are needed for reproducibility of the 'significantly outperforms' claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback highlighting two important aspects of our method that warrant further clarification. We address each major comment below and indicate where revisions will be made.

Referee: [Abstract] Abstract (geometry regularization strategy paragraph): The central performance claim depends on the monocular depth constraints supplying unbiased signals that align with multi-view epipolar geometry after scale normalization. No analysis or test is described showing that systematic errors in the pretrained model (e.g., in low-texture or view-dependent regions under 3-view DTU/BlendedMVS protocols) do not pull ray-wise SDF features toward inconsistent surfaces, which directly risks undermining the reported gains over cost-volume baselines.

Authors: We agree that an explicit analysis of potential systematic biases in the pretrained monocular depth model under the 3-view protocols would strengthen the paper. The scale-invariant global and local constraints are intended to reduce sensitivity to absolute scale and local inconsistencies, and the reported gains over pure cost-volume baselines on both DTU and BlendedMVS provide indirect evidence that any residual biases do not dominate. Nevertheless, we will add a dedicated paragraph in the revised manuscript discussing known limitations of monocular depth estimators in low-texture and view-dependent regions, together with qualitative visualizations of the depth predictions used during training on the evaluation scenes. revision: partial
Referee: [Method] Method description (epipolar feature aggregation): The claim that coarse cost-volume features reliably guide fine-grained epipolar aggregation is load-bearing for the outperformance result, yet the manuscript provides no quantitative measure (e.g., alignment error or ablation removing the guidance) of how well this guidance functions when the cost volume itself is severely under-constrained by only three views.

Authors: The guidance mechanism is indeed central. While the current manuscript does not report a direct alignment-error metric between coarse cost-volume features and the sampled epipolar features, the ablation studies already isolate the contribution of the epipolar transformer and ray-wise aggregation. To directly quantify the guidance quality under three-view sparsity, we will add a new ablation that replaces the learned guidance with uniform or random sampling along epipolar lines and report the resulting surface reconstruction metrics on DTU. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation uses external epipolar geometry and pretrained monocular model without self-referential reduction

The paper's central claims rest on standard external components (epipolar geometry for feature aggregation along lines, coarse cost-volume guidance, and scale-invariant constraints from a pretrained monocular depth model) that are not defined in terms of the method's outputs or fitted parameters. No equations, self-citations, or uniqueness theorems are presented that reduce the performance gains or regularization strategy to a fit or renaming of the inputs themselves. The abstract and description treat these as independent priors, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated or derivable from the provided text.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · relation to the paper passage: unclear

EpiS uses coarse cost-volume features to guide the aggregation of fine-grained epipolar features sampled along corresponding epipolar lines... epipolar transformer fuses multi-view information, followed by ray-wise aggregation to produce SDF-aware features... geometry regularization strategy that leverages a pretrained monocular depth model through scale-invariant global and local constraints (global triplet loss, local gradient loss).
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · relation to the paper passage: unclear

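Epipolar & Ray Information Aggregation... Linearized Attention mechanism... Geometry Decoder & Weights Decoder... $L_{\text{global}} = \big((\hat{d}_1 - \hat{d}_s) \times (\tilde{d}_2 - \tilde{d}_s) - (\hat{d}_2 - \hat{d}_s) \times (\tilde{d}_1 - \tilde{d}_s)\big)^2$, $L_{\text{local}} = \left(1 - \frac{\hat{v} \cdot \tilde{v}}{\lVert \hat{v} \rVert \cdot \lVert \tilde{v} \rVert}\right)^2$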
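The two regularization terms in the passage above can be written out directly. In this sketch the hatted quantities are assumed to be depths and local gradient vectors predicted by the reconstruction network, and the tilded quantities their counterparts from the monocular depth model. That reading of the notation, the treatment of × as ordinary scalar multiplication, the choice of sample indices, and the function names are all assumptions, since the passage does not define them.

```python
import math


def global_triplet_loss(d_hat, d_tilde, i, j, s):
    """Squared cross term for a triplet of pixels (i, j, s).

    It vanishes whenever d_hat is an affine function of d_tilde
    (d_hat = a * d_tilde + b), so it ignores global scale and shift.
    """
    term = ((d_hat[i] - d_hat[s]) * (d_tilde[j] - d_tilde[s])
            - (d_hat[j] - d_hat[s]) * (d_tilde[i] - d_tilde[s]))
    return term ** 2


def local_gradient_loss(v_hat, v_tilde):
    """Squared (1 - cosine similarity) between two 2D gradient vectors."""
    dot = sum(a * b for a, b in zip(v_hat, v_tilde))
    norm = math.hypot(*v_hat) * math.hypot(*v_tilde)
    return (1.0 - dot / norm) ** 2


# Hypothetical monocular depths and a prediction that differs by scale and shift.
d_tilde = [0.2, 0.5, 0.9]
d_hat = [3.0 * d + 1.0 for d in d_tilde]
loss_affine = global_triplet_loss(d_hat, d_tilde, 0, 1, 2)     # close to zero
loss_parallel = local_gradient_loss([1.0, 2.0], [2.0, 4.0])    # close to zero
loss_opposite = local_gradient_loss([1.0, 0.0], [-1.0, 0.0])   # (1 - (-1))^2
```

Under this reading, the global term penalizes only deviations from an affine relation between the two depth maps, and the local term penalizes only disagreement in gradient direction, which matches the description of both constraints as scale-invariant.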

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] Aanæs, H., Jensen, R.R., Vogiatzis, G., Tola, E., Dahl, A.B.: Large-scale data for multiple-view stereopsis. International Journal of Computer Vision 120, 153–168 (2016)

[2] Bhat, S.F., Birkl, R., Wofk, D., Wonka, P., Müller, M.: Zoedepth: Zero-shot transfer by combining relative and metric depth. arXiv preprint arXiv:2302.12288 (2023)

[3] Campbell, N.D., Vogiatzis, G., Hernández, C., Cipolla, R.: Using multiple hypotheses to improve depth-maps for multi-view stereo. In: Computer Vision–ECCV 2008: 10th European Conference on Computer Vision, Marseille, France, October 12-18, 2008, Proceedings, Part I 10. pp. 766–779. Springer (2008)

[4] Chen, A., Xu, Z., Zhao, F., Zhang, X., Xiang, F., Yu, J., Su, H.: Mvsnerf: Fast generalizable radiance field reconstruction from multi-view stereo. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 14124–14133 (2021)

[5] Ding, Y., Yuan, W., Zhu, Q., Zhang, H., Liu, X., Wang, Y., Liu, X.: Transmvsnet: Global context-aware multi-view stereo network with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 8585–8594 (2022)

[6] Genova, K., Cole, F., Vlasic, D., Sarna, A., Freeman, W.T., Funkhouser, T.: Learning shape templates with structured implicit functions. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 7154–7164 (2019)

[7] Gropp, A., Yariv, L., Haim, N., Atzmon, M., Lipman, Y.: Implicit geometric regularization for learning shapes. arXiv preprint arXiv:2002.10099 (2020)

[8] Gu, X., Fan, Z., Zhu, S., Dai, Z., Tan, F., Tan, P.: Cascade cost volume for high-resolution multi-view stereo and stereo matching. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 2495–2504 (2020)

[9] Ji, M., Gall, J., Zheng, H., Liu, Y., Fang, L.: Surfacenet: An end-to-end 3d neural network for multiview stereopsis. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 2307–2315 (2017)

[10] Ji, M., Zhang, J., Dai, Q., Fang, L.: Surfacenet+: An end-to-end 3d neural network for very sparse multi-view stereopsis. IEEE Transactions on Pattern Analysis and Machine Intelligence 43(11), 4078–4093 (2020)

[11] Johari, M.M., Lepoittevin, Y., Fleuret, F.: Geonerf: Generalizing nerf with geometry priors. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 18365–18375 (2022)

[12] Kar, A., Häne, C., Malik, J.: Learning a multi-view stereo machine. Advances in Neural Information Processing Systems 30 (2017)

[13] Katharopoulos, A., Vyas, A., Pappas, N., Fleuret, F.: Transformers are rnns: Fast autoregressive transformers with linear attention. In: International Conference on Machine Learning. pp. 5156–5165. PMLR (2020)

[14] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

[15] Kutulakos, K.N., Seitz, S.M.: A theory of shape by space carving. International Journal of Computer Vision 38, 199–218 (2000)

[16] Lhuillier, M., Quan, L.: A quasi-dense approach to surface reconstruction from uncalibrated images. IEEE Transactions on Pattern Analysis and Machine Intelligence 27(3), 418–433 (2005)

[17] Liang, Y., He, H., Chen, Y.: Retr: Modeling rendering via transformer for generalizable neural surface reconstruction. Advances in Neural Information Processing Systems 36 (2024)

[18] Liu, L., Gu, J., Zaw Lin, K., Chua, T.S., Theobalt, C.: Neural sparse voxel fields. Advances in Neural Information Processing Systems 33, 15651–15663 (2020)

[19] Liu, S., Zhang, Y., Peng, S., Shi, B., Pollefeys, M., Cui, Z.: Dist: Rendering deep implicit signed distance function with differentiable sphere tracing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 2019–2028 (2020)

[20] Long, X., Lin, C., Wang, P., Komura, T., Wang, W.: Sparseneus: Fast generalizable neural surface reconstruction from sparse views. In: European Conference on Computer Vision. pp. 210–227. Springer (2022)

[21] Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.: Occupancy networks: Learning 3d reconstruction in function space. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 4460–4470 (2019)

[22] Michalkiewicz, M., Pontes, J.K., Jack, D., Baktashmotlagh, M., Eriksson, A.: Implicit surface representations as layers in neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 4743–4752 (2019)

[23] Middelberg, S., Sattler, T., Untzelmann, O., Kobbelt, L.: Scalable 6-dof localization on mobile devices. In: Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part II 13. pp. 268–283. Springer (2014)

[24] Mildenhall, B., Srinivasan, P.P., Tancik, M., Barron, J.T., Ramamoorthi, R., Ng, R.: Nerf: Representing scenes as neural radiance fields for view synthesis. Communications of the ACM 65(1), 99–106 (2021)

[25] Müller, T., Evans, A., Schied, C., Keller, A.: Instant neural graphics primitives with a multiresolution hash encoding. ACM Transactions on Graphics (ToG) 41(4), 1–15 (2022)

[26] Niemeyer, M., Mescheder, L., Oechsle, M., Geiger, A.: Occupancy flow: 4d reconstruction by learning particle dynamics. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 5379–5389 (2019)

[27] Niemeyer, M., Mescheder, L., Oechsle, M., Geiger, A.: Differentiable volumetric rendering: Learning implicit 3d representations without 3d supervision. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 3504–3515 (2020)

[28] Oechsle, M., Mescheder, L., Niemeyer, M., Strauss, T., Geiger, A.: Texture fields: Learning texture representations in function space. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 4531–4540 (2019)

[29] Oechsle, M., Peng, S., Geiger, A.: Unisurf: Unifying neural implicit surfaces and radiance fields for multi-view reconstruction. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 5589–5599 (2021)

[30] Park, J.J., Florence, P., Straub, J., Newcombe, R., Lovegrove, S.: Deepsdf: Learning continuous signed distance functions for shape representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 165–174 (2019)

[31] Peng, R., Gu, X., Tang, L., Shen, S., Yu, F., Wang, R.: Gens: Generalizable neural surface reconstruction from multi-view images. Advances in Neural Information Processing Systems 36 (2024)

[32] Peng, S., Niemeyer, M., Mescheder, L., Pollefeys, M., Geiger, A.: Convolutional occupancy networks. In: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part III 16. pp. 523–540. Springer (2020)

[33] Pumarola, A., Corona, E., Pons-Moll, G., Moreno-Noguer, F.: D-nerf: Neural radiance fields for dynamic scenes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 10318–10327 (2021)

[34] Pytorch, A.D.I.: Pytorch (2018)

[35] Ranftl, R., Lasinger, K., Hafner, D., Schindler, K., Koltun, V.: Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer. IEEE Transactions on Pattern Analysis and Machine Intelligence 44(3), 1623–1637 (2020)

[36] Ren, Y., Zhang, T., Pollefeys, M., Süsstrunk, S., Wang, F.: Volrecon: Volume rendering of signed ray distance functions for generalizable multi-view reconstruction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 16685–16695 (2023)

[37] Schönberger, J.L., Frahm, J.M.: Structure-from-motion revisited. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 4104–4113 (2016)

[38] Schönberger, J.L., Zheng, E., Frahm, J.M., Pollefeys, M.: Pixelwise view selection for unstructured multi-view stereo. In: Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part III 14. pp. 501–518. Springer (2016)

[39] Furukawa, Y., Ponce, J.: Accurate, dense, and robust multiview stereopsis. IEEE Transactions on Pattern Analysis and Machine Intelligence 32(8) (2010)

[40] Sun, C., Sun, M., Chen, H.T.: Direct voxel grid optimization: Super-fast convergence for radiance fields reconstruction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 5459–5469 (2022)

[41] Tola, E., Strecha, C., Fua, P.: Efficient large-scale multi-view stereo for ultra high-resolution image sets. Machine Vision and Applications 23, 903–920 (2012)

[42] Wang, P., Liu, L., Liu, Y., Theobalt, C., Komura, T., Wang, W.: Neus: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. arXiv preprint arXiv:2106.10689 (2021)

[43] Wang, Q., Wang, Z., Genova, K., Srinivasan, P.P., Zhou, H., Barron, J.T., Martin-Brualla, R., Snavely, N., Funkhouser, T.: Ibrnet: Learning multi-view image-based rendering. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 4690–4699 (2021)

[44] Wang, Y., Han, Q., Habermann, M., Daniilidis, K., Theobalt, C., Liu, L.: Neus2: Fast learning of neural implicit surfaces for multi-view reconstruction. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 3295–3306 (2023)

[45] Wu, T., Wang, J., Pan, X., Xu, X., Theobalt, C., Liu, Z., Lin, D.: Voxurf: Voxel-based efficient and accurate neural surface reconstruction. In: International Conference on Learning Representations (ICLR) (2023)

[46] Yao, Y., Luo, Z., Li, S., Fang, T., Quan, L.: Mvsnet: Depth inference for unstructured multi-view stereo. In: Proceedings of the European Conference on Computer Vision (ECCV). pp. 767–783 (2018)

[47] Yao, Y., Luo, Z., Li, S., Zhang, J., Ren, Y., Zhou, L., Fang, T., Quan, L.: Blendedmvs: A large-scale dataset for generalized multi-view stereo networks. Computer Vision and Pattern Recognition (CVPR) (2020)

[48] Yariv, L., Gu, J., Kasten, Y., Lipman, Y.: Volume rendering of neural implicit surfaces. Advances in Neural Information Processing Systems 34, 4805–4815 (2021)

[49] Yariv, L., Kasten, Y., Moran, D., Galun, M., Atzmon, M., Ronen, B., Lipman, Y.: Multiview neural surface reconstruction by disentangling geometry and appearance. Advances in Neural Information Processing Systems 33 (2020)

[50] Yu, A., Ye, V., Tancik, M., Kanazawa, A.: pixelnerf: Neural radiance fields from one or few images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 4578–4587 (2021)

[51] Yu, Z., Peng, S., Niemeyer, M., Sattler, T., Geiger, A.: Monosdf: Exploring monocular geometric cues for neural implicit surface reconstruction. Advances in Neural Information Processing Systems 35, 25018–25032 (2022)

[52] Zhang, J., Yao, Y., Quan, L.: Learning signed distance field for multi-view surface reconstruction. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 6525–6534 (2021)

[53] Zhou, K., Chen, C., Wang, B., Saputra, M.R.U., Trigoni, N., Markham, A.: Vmloc: Variational fusion for learning-based multimodal camera localization. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 35, pp. 6165–6173 (2021)

[54] Zhou, K., Hong, L., Chen, C., Xu, H., Ye, C., Hu, Q., Li, Z.: Devnet: Self-supervised monocular depth learning via density volume construction. In: European Conference on Computer Vision. pp. 125–142. Springer (2022)

[55] Zhou, K., Zhong, J.X., Shin, S., Lu, K., Yang, Y., Markham, A., Trigoni, N.: Dynpoint: Dynamic neural point for view synthesis. Advances in Neural Information Processing Systems 36 (2024)