pith. machine review for the scientific record.

arxiv: 2106.10689 · v3 · submitted 2021-06-20 · 💻 cs.CV · cs.GR

Recognition: 2 Lean theorem links

NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction

Authors on Pith · no claims yet

Pith reviewed 2026-05-16 20:14 UTC · model grok-4.3

classification 💻 cs.CV cs.GR
keywords neural implicit surfaces · volume rendering · signed distance function · multi-view reconstruction · surface reconstruction · NeRF · SDF

The pith

NeuS learns high-fidelity surfaces as neural signed distance functions by using a volume rendering formulation that removes first-order geometric bias.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a neural method that represents object surfaces as the zero level set of a learned signed distance function. It replaces standard volume rendering with a new formulation that avoids the geometric bias that arises when converting volume densities into surface locations. This change lets the network optimize directly from multi-view images without foreground masks, and it produces more accurate geometry on objects that have thin parts or heavy self-occlusion. The method is evaluated on the DTU and BlendedMVS benchmarks and is shown to outperform earlier surface and radiance-field approaches on those data.

Core claim

We represent a surface as the zero-level set of a signed distance function and develop a new volume rendering method to train a neural SDF representation. Conventional volume rendering introduces inherent geometric errors for surface reconstruction; the proposed formulation is free of bias in the first order of approximation and therefore yields more accurate surfaces even without mask supervision.

What carries the argument

A bias-free volume rendering integral for neural signed distance functions that approximates the surface integral to first order without geometric offset.
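In symbols (reproducing the construction from §3 of the paper, with $f$ the SDF, $p(t)$ the point at depth $t$ along the ray, and $\Phi_s(x) = (1 + e^{-sx})^{-1}$ a logistic CDF whose sharpness $s$ is learned), the opaque density and its discrete opacity are:

$$
\rho(t) = \max\!\left( \frac{-\frac{\mathrm{d}}{\mathrm{d}t}\,\Phi_s\big(f(p(t))\big)}{\Phi_s\big(f(p(t))\big)},\ 0 \right),
\qquad
\alpha_i = \max\!\left( \frac{\Phi_s\big(f(p(t_i))\big) - \Phi_s\big(f(p(t_{i+1}))\big)}{\Phi_s\big(f(p(t_i))\big)},\ 0 \right).
$$

For a ray crossing a planar surface head-on, the resulting weight $w(t) = T(t)\,\rho(t)$ peaks exactly where $f(p(t)) = 0$, which is the first-order unbiased property the argument rests on.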

If this is right

  • Objects with severe self-occlusion or thin structures can be reconstructed to higher geometric accuracy without foreground masks.
  • Surface extraction from the learned implicit field becomes reliable enough to replace post-processing steps required by radiance-field methods.
  • The same network can be trained end-to-end on raw image collections that previously needed manual masking.
  • Reconstruction quality on standard multi-view benchmarks improves measurably over both DVR/IDR and NeRF-based baselines.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same bias-correction idea could be applied to other implicit representations that combine volume rendering with explicit surface constraints.
  • If the first-order unbiased property holds under noisy camera poses, the method may tolerate less precise calibration than mask-dependent alternatives.
  • Extending the formulation to time-varying scenes would require only adding a temporal dimension to the SDF network while preserving the unbiased rendering integral.

Load-bearing premise

The new rendering equation removes first-order geometric bias without creating new systematic errors that would require extra constraints or mask data to correct.

What would settle it

A controlled experiment on synthetic spheres or planes: with the proposed rendering, the reconstructed zero-level set should coincide with ground-truth geometry; when the proposed rendering is swapped for standard NeRF-style rendering, the deviation should grow in proportion to the predicted first-order bias term.
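That settling experiment can be sketched numerically. The snippet below is a minimal, hypothetical check on an analytic plane SDF, not code from the paper: it renders depth once with NeuS-style discrete opacities and once with a conventional density set proportional to the logistic density of the SDF (the "naive" baseline the paper argues is biased). The sharpness `s`, sample count, and density `scale` are illustrative choices.

```python
import numpy as np

def phi(x, s):
    """Logistic CDF Phi_s that NeuS applies to SDF values; s controls sharpness."""
    return 1.0 / (1.0 + np.exp(-s * x))

def depth_neus(t, sdf, s=64.0):
    """Rendered depth using NeuS-style discrete opacity: the normalized drop
    of Phi_s(sdf) over each ray segment, clamped at zero."""
    p = phi(sdf, s)
    alpha = np.clip((p[:-1] - p[1:]) / (p[:-1] + 1e-12), 0.0, 1.0)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    w = trans * alpha                        # rendering weights T_i * alpha_i
    t_mid = 0.5 * (t[:-1] + t[1:])
    return float((w * t_mid).sum() / w.sum())

def depth_naive(t, sdf, s=64.0, scale=20.0):
    """Rendered depth using a conventional NeRF-style density proportional to
    the logistic density of the SDF (an illustrative biased baseline)."""
    p = phi(sdf, s)
    sigma = scale * s * p * (1.0 - p)        # logistic pdf of the SDF, scaled
    alpha = 1.0 - np.exp(-sigma[:-1] * np.diff(t))
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    w = trans * alpha
    t_mid = 0.5 * (t[:-1] + t[1:])
    return float((w * t_mid).sum() / w.sum())

# Camera ray hitting a plane head-on at depth 1; along the ray the SDF is 1 - t.
t = np.linspace(0.0, 2.0, 257)
sdf = 1.0 - t
d_neus, d_naive = depth_neus(t, sdf), depth_naive(t, sdf)
# d_neus lands on the true surface; d_naive is pulled toward the camera
```

On this symmetric setup the NeuS estimate sits at the surface while the naive density shifts the depth forward by roughly the width of the logistic transition band, which is the first-order bias the paper names.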

read the original abstract

We present a novel neural surface reconstruction method, called NeuS, for reconstructing objects and scenes with high fidelity from 2D image inputs. Existing neural surface reconstruction approaches, such as DVR and IDR, require foreground mask as supervision, easily get trapped in local minima, and therefore struggle with the reconstruction of objects with severe self-occlusion or thin structures. Meanwhile, recent neural methods for novel view synthesis, such as NeRF and its variants, use volume rendering to produce a neural scene representation with robustness of optimization, even for highly complex objects. However, extracting high-quality surfaces from this learned implicit representation is difficult because there are not sufficient surface constraints in the representation. In NeuS, we propose to represent a surface as the zero-level set of a signed distance function (SDF) and develop a new volume rendering method to train a neural SDF representation. We observe that the conventional volume rendering method causes inherent geometric errors (i.e. bias) for surface reconstruction, and therefore propose a new formulation that is free of bias in the first order of approximation, thus leading to more accurate surface reconstruction even without the mask supervision. Experiments on the DTU dataset and the BlendedMVS dataset show that NeuS outperforms the state-of-the-arts in high-quality surface reconstruction, especially for objects and scenes with complex structures and self-occlusion.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents NeuS, a neural implicit surface reconstruction method that represents object surfaces as the zero-level set of a signed distance function (SDF) and introduces a new volume rendering formulation derived from the SDF. The key technical contribution is a rendering equation claimed to be free of geometric bias to first order in the approximation, enabling high-fidelity reconstruction from multi-view images without foreground mask supervision. Experiments on the DTU and BlendedMVS datasets report superior quantitative and qualitative results compared to prior methods such as DVR, IDR, and NeRF variants, particularly for scenes with self-occlusion and thin structures.

Significance. If the first-order bias-free property of the proposed volume rendering holds under the discretization and network approximation used in practice, the work would meaningfully advance multi-view 3D reconstruction by combining the optimization robustness of volume rendering with accurate surface constraints. The reported gains on standard benchmarks for challenging geometry suggest practical utility in computer vision and graphics applications.

major comments (2)
  1. [Abstract / §3] Abstract and the derivation of the rendering formulation (presumably §3): the central claim that the new volume rendering is 'free of bias in the first order of approximation' is load-bearing for the contribution and the no-mask result. However, no error bound, remainder term analysis, or empirical check is supplied for the neglected higher-order terms in the Taylor expansion of transmittance/opacity around the zero level set, nor for interactions with quadrature discretization, curvature, or sampling density. Without this, it remains possible that residual systematic offsets persist and that accuracy gains arise from the SDF representation rather than bias elimination.
  2. [§4] Experimental section (§4, Tables 1-2): the reported outperformance on DTU and BlendedMVS is presented as evidence for the bias-free formulation, yet the paper lacks ablations that hold the SDF fixed while swapping the conventional versus proposed renderer (or vice versa). This makes it difficult to isolate the contribution of the first-order unbiased property from other modeling choices.
minor comments (2)
  1. [§2 / §3] Notation for the SDF, opacity, and transmittance functions should be introduced once with explicit definitions and then used consistently; occasional re-use of symbols for related but distinct quantities reduces readability.
  2. [Figures 3-5] Figure captions and axis labels in the qualitative results could more explicitly indicate which method corresponds to each column to aid quick comparison.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and positive assessment of our work. We address each major comment below and will revise the manuscript to strengthen the presentation of the bias-free property and experimental validation.

read point-by-point responses
  1. Referee: [Abstract / §3] Abstract and the derivation of the rendering formulation (presumably §3): the central claim that the new volume rendering is 'free of bias in the first order of approximation' is load-bearing for the contribution and the no-mask result. However, no error bound, remainder term analysis, or empirical check is supplied for the neglected higher-order terms in the Taylor expansion of transmittance/opacity around the zero level set, nor for interactions with quadrature discretization, curvature, or sampling density. Without this, it remains possible that residual systematic offsets persist and that accuracy gains arise from the SDF representation rather than bias elimination.

    Authors: We thank the referee for this insightful comment. In §3, we derive the new volume rendering by performing a first-order Taylor expansion of the transmittance around the zero-level set of the SDF. This eliminates the geometric bias to first order, as the standard formulation has a bias proportional to the distance from the surface. While a complete remainder term analysis is not included in the original paper, the approximation is justified by the fact that near the surface the higher-order terms become negligible for small step sizes. We will add a new subsection in the revised manuscript providing a more detailed discussion of the approximation error, including a brief analysis of the remainder and its dependence on sampling density and curvature. Additionally, we will include an empirical study varying the number of quadrature points to verify robustness. revision: partial

  2. Referee: [§4] Experimental section (§4, Tables 1-2): the reported outperformance on DTU and BlendedMVS is presented as evidence for the bias-free formulation, yet the paper lacks ablations that hold the SDF fixed while swapping the conventional versus proposed renderer (or vice versa). This makes it difficult to isolate the contribution of the first-order unbiased property from other modeling choices.

    Authors: We agree that such an ablation would better isolate the contribution of our rendering formulation. The current comparisons involve different scene representations across methods, which confounds direct attribution. In the revised version, we will add an ablation experiment on the DTU dataset where we use the identical SDF network architecture and training setup but replace our proposed renderer with the conventional volume rendering formulation from NeRF. We will report the Chamfer distance and other metrics to demonstrate the specific improvement due to the bias-free rendering. revision: yes
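One of the checks promised in the responses, varying quadrature density, can be sketched on an analytic SDF. The snippet below is a hypothetical best-case version for a head-on plane (SDF along the ray is `1 - t`, with symmetric sample grids); the sharpness `s` and sample counts are illustrative, not values from the paper.

```python
import numpy as np

def rendered_depth(n_samples, s=64.0):
    """NeuS-style rendered depth for a ray hitting a plane at t = 1
    (analytic SDF along the ray: 1 - t), at a given quadrature density."""
    t = np.linspace(0.0, 2.0, n_samples)
    p = 1.0 / (1.0 + np.exp(-s * (1.0 - t)))              # Phi_s of the SDF
    alpha = np.clip((p[:-1] - p[1:]) / (p[:-1] + 1e-12), 0.0, 1.0)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    w = trans * alpha                                     # rendering weights
    t_mid = 0.5 * (t[:-1] + t[1:])
    return float((w * t_mid).sum() / w.sum())

# the depth estimate should stay pinned at the surface as sampling varies
depths = [rendered_depth(n) for n in (33, 65, 129, 257)]
```

A stress test closer to the referee's concern would repeat this with asymmetric sample grids, oblique incidence, and curved surfaces, where higher-order terms can no longer cancel by symmetry.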

Circularity Check

0 steps flagged

No significant circularity; new volume rendering formulation derived independently from SDF without reduction to fitted inputs

full rationale

The paper's central contribution is a new volume rendering formulation for neural SDF representations, presented as free of geometric bias to first order. The abstract and description show this as a mathematical derivation from the SDF zero-level set and transmittance, not a re-expression of prior fitted parameters. No load-bearing step reduces by construction to inputs (e.g., no fitted bias term renamed as prediction). External evaluations on DTU and BlendedMVS provide independent validation. Citations to NeRF/IDR are present but not load-bearing for the bias-free claim, which rests on the first-order approximation analysis. This is a normal non-finding for a derivation that remains self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on the standard assumption that surfaces can be represented as zero-level sets of SDFs and that volume rendering can be adapted to enforce surface constraints without masks.

axioms (1)
  • domain assumption A surface can be represented as the zero-level set of a signed distance function
    Invoked in the abstract as the core representation for NeuS.
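The axiom can be made concrete with a toy SDF. The sketch below is illustrative, not from the paper: it defines a unit sphere's SDF and recovers a point on its zero-level set by sphere tracing, the root-finding approach used by surface renderers such as DIST [23].

```python
import numpy as np

def sdf_sphere(p, radius=1.0):
    # Signed distance to a sphere centred at the origin:
    # negative inside, zero on the surface, positive outside.
    return np.linalg.norm(p) - radius

def sphere_trace(origin, direction, sdf, max_steps=100, eps=1e-6):
    """March along a unit-direction ray by the SDF value itself; converges
    to the zero-level set (the surface) when the ray hits it."""
    t = 0.0
    for _ in range(max_steps):
        d = sdf(origin + t * direction)
        if d < eps:
            return t
        t += d
    return None  # ray missed the surface

t_hit = sphere_trace(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]), sdf_sphere)
# the ray from z = -3 toward +z hits the unit sphere at t = 2
```

NeuS keeps this representation but replaces the hard root-finding step with the volume rendering formulation above, which is what makes optimization robust without masks.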

pith-pipeline@v0.9.0 · 5555 in / 1042 out tokens · 19403 ms · 2026-05-16T20:14:24.628814+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • Cost.FunctionalEquation washburn_uniqueness_aczel · unclear

    Relation between the paper passage and the cited Recognition theorem.

    We observe that the conventional volume rendering method causes inherent geometric errors (i.e. bias) for surface reconstruction, and therefore propose a new formulation that is free of bias in the first order of approximation, thus leading to more accurate surface reconstruction even without the mask supervision.

  • DAlembert.Inevitability bilinear_family_forced · unclear

    Relation between the paper passage and the cited Recognition theorem.

    In NeuS, we propose to represent a surface as the zero-level set of a signed distance function (SDF) and develop a new volume rendering method to train a neural SDF representation.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 19 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. PAGaS: Pixel-Aligned 1DoF Gaussian Splatting for Depth Refinement

    cs.CV 2026-04 unverdicted novelty 7.0

    PAGaS refines multi-view stereo depths by optimizing 1DoF Gaussians whose positions and sizes are fixed by back-projected pixel volumes, producing detailed depth maps that outperform reference baselines on 3D reconstr...

  2. SpUDD: Superpower Contouring of Unsigned Distance Data

    cs.GR 2026-04 unverdicted novelty 7.0

    SpUDD defines superpower contours on power diagrams of unsigned distance samples, proves their convergence to the true surface, and uses them to generate approximating meshes that outperform other strategies for this ...

  3. SpUDD: Superpower Contouring of Unsigned Distance Data

    cs.GR 2026-04 unverdicted novelty 7.0

    SpUDD defines superpower contours from power diagrams of unsigned distance samples, proves convergence to the true surface, and uses them to generate approximating polygonal meshes that outperform prior strategies.

  4. THOM: Generating Physically Plausible Hand-Object Meshes From Text

    cs.CV 2026-04 unverdicted novelty 7.0

    THOM is a training-free two-stage framework that generates physically plausible hand-object 3D meshes directly from text by combining text-guided Gaussians with contact-aware physics optimization and VLM refinement.

  5. TOPOS: High-Fidelity and Efficient Industry-Grade 3D Head Generation

    cs.CV 2026-05 unverdicted novelty 6.0

    TOPOS creates high-fidelity 3D heads with fixed industry topology from single images via a specialized VAE with Perceiver Resampler and a rectified flow transformer.

  6. Attention Itself Could Retrieve. RetrieveVGGT: Training-Free Long Context Streaming 3D Reconstruction via Query-Key Similarity Retrieval

    cs.CV 2026-05 unverdicted novelty 6.0

    RetrieveVGGT enables constant-memory long-context streaming 3D reconstruction by retrieving relevant frames via query-key similarities in VGGT's first attention layer, outperforming StreamVGGT and others.

  7. LagrangianSplats: Divergence-Free Transport of Gaussian Primitives for Fluid Reconstruction

    cs.GR 2026-05 unverdicted novelty 6.0

    A framework that structurally enforces divergence-free velocity and long-range transport coherence in 3D fluid reconstruction from 2D videos via divergence-free kernels advecting Lagrangian Gaussian splats.

  8. Sat3R: Satellite DSM Reconstruction via RPC-Aware Depth Fine-tuning

    cs.CV 2026-05 unverdicted novelty 6.0

    Sat3R adapts Depth Anything V2 via RPC-aware metric depth fine-tuning to deliver satellite DSM reconstruction with 38% lower MAE than zero-shot baselines and over 300x speedup versus optimization methods.

  9. Scalable GPU Construction of 3D Voronoi and Power Diagrams

    cs.CG 2026-05 unverdicted novelty 6.0

    A new GPU clipping algorithm with directional culling and hierarchical traversal constructs scalable 3D Voronoi and power diagrams for arbitrary point distributions.

  10. High-Fidelity Single-Image Head Modeling with Industry-Grade Topology

    cs.CV 2026-05 unverdicted novelty 6.0

    A single-image head reconstruction method uses coarse-to-fine optimization with normal consistency, landmarks, and geometry-aware constraints on curvature and conformality to produce meshes with industry-grade topolog...

  11. Greed for the Spheres: A Signed Distance Interpolation Method

    cs.GR 2026-05 unverdicted novelty 6.0

    A greedy algorithm interpolates consistent signed distance functions from discrete samples by treating SDF geometric properties as hard constraints.

  12. SurfelSplat: Learning Efficient and Generalizable Gaussian Surfel Representations for Sparse-View Surface Reconstruction

    cs.CV 2026-04 unverdicted novelty 6.0

    A feed-forward model regresses accurate Gaussian surfel geometry from sparse views using Nyquist-guided cross-view feature aggregation, achieving 100x speedup over optimization-based 3DGS surface methods on DTU benchmarks.

  13. Neural Harmonic Textures for High-Quality Primitive Based Neural Reconstruction

    cs.CV 2026-04 unverdicted novelty 6.0

    Neural Harmonic Textures add periodic feature interpolation and deferred neural decoding to primitive representations, achieving state-of-the-art real-time novel-view synthesis and bridging primitive and neural-field methods.

  14. First Shape, Then Meaning: Efficient Geometry and Semantics Learning for Indoor Reconstruction

    cs.CV 2026-05 unverdicted novelty 5.0

    FSTM improves indoor reconstruction by training geometry first without semantic supervision, then adding semantics, achieving 2.3x faster training and higher object surface recall than joint optimization.

  15. A Geometric Algebra-informed NeRF Framework for Generalizable Wireless Channel Prediction

    cs.NI 2026-04 unverdicted novelty 5.0

    GAI-NeRF combines geometric algebra attention and an adaptive ray tracing module inside a NeRF model to deliver more accurate and generalizable wireless channel predictions across varied indoor environments.

  16. Hitem3D 2.0: Multi-View Guided Native 3D Texture Generation

    cs.CV 2026-04 unverdicted novelty 5.0

    Hitem3D 2.0 combines multi-view image synthesis with native 3D texture projection to improve completeness, cross-view consistency, and geometry alignment over prior methods.

  17. MetroGS: Efficient and Stable Reconstruction of Geometrically Accurate High-Fidelity Large-Scale Scenes

    cs.CV 2025-11 unverdicted novelty 5.0

    MetroGS combines distributed 2D Gaussian Splatting with structured dense enhancement, progressive hybrid optimization, and depth-guided appearance modeling to deliver higher geometric accuracy and stability in large-s...

  18. MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion

    cs.CV 2024-10 unverdicted novelty 5.0

    By fine-tuning DUST3R to output per-timestep pointmaps on scarce dynamic video datasets, MonST3R achieves stronger video depth and pose estimation without explicit motion modeling.

  19. Attention Is not Everything: Efficient Alternatives for Vision

    cs.CV 2026-04 unverdicted novelty 3.0

    A survey that taxonomizes non-Transformer vision models and evaluates their practical trade-offs across efficiency, scalability, and robustness.

Reference graph

Works this paper leans on

54 extracted references · 54 canonical work pages · cited by 18 Pith papers · 3 internal anchors

  1. [1]

    Sal: Sign agnostic learning of shapes from raw data

    Matan Atzmon and Yaron Lipman. Sal: Sign agnostic learning of shapes from raw data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2565–2574, 2020

  2. [2]

    Patchmatch: A randomized correspondence algorithm for structural image editing

    Connelly Barnes, Eli Shechtman, Adam Finkelstein, and Dan B Goldman. Patchmatch: A randomized correspondence algorithm for structural image editing. ACM Trans. Graph., 28(3):24, 2009

  3. [3]

    A probabilistic framework for space carving

    Adrian Broadhurst, Tom W Drummond, and Roberto Cipolla. A probabilistic framework for space carving. In Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001, volume 1, pages 388–393. IEEE, 2001

  4. [4]

    Learning implicit fields for generative shape modeling

    Z. Chen and H. Zhang. Learning implicit fields for generative shape modeling. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 5932–5941, 2019

  5. [5]

    3d-r2n2: A unified approach for single and multi-view 3d object reconstruction

    Christopher B Choy, Danfei Xu, JunYoung Gwak, Kevin Chen, and Silvio Savarese. 3d-r2n2: A unified approach for single and multi-view 3d object reconstruction. In European conference on computer vision, pages 628–644. Springer, 2016

  6. [6]

    Poxels: Probabilistic voxelized volume reconstruction

    Jeremy S De Bonet and Paul Viola. Poxels: Probabilistic voxelized volume reconstruction. In Proceedings of International Conference on Computer Vision (ICCV), pages 418–425, 1999

  7. [7]

    A point set generation network for 3d object reconstruction from a single image

    Haoqiang Fan, Hao Su, and Leonidas J Guibas. A point set generation network for 3d object reconstruction from a single image. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 605–613, 2017

  8. [8]

    Accurate, dense, and robust multiview stereopsis

    Yasutaka Furukawa and Jean Ponce. Accurate, dense, and robust multiview stereopsis. IEEE transactions on pattern analysis and machine intelligence, 32(8):1362–1376, 2009

  9. [9]

    Gipuma: Massively parallel multi-view stereo reconstruction

    Silvano Galliani, Katrin Lasinger, and Konrad Schindler. Gipuma: Massively parallel multi-view stereo reconstruction. Publikationen der Deutschen Gesellschaft für Photogrammetrie, Fernerkundung und Geoinformation e. V, 25(361-369):2, 2016

  10. [10]

    Implicit geometric regularization for learning shapes

    Amos Gropp, Lior Yariv, Niv Haim, Matan Atzmon, and Yaron Lipman. Implicit geometric regularization for learning shapes. arXiv preprint arXiv:2002.10099, 2020

  11. [11]

    Large scale multi-view stereopsis evaluation

    Rasmus Jensen, Anders Dahl, George Vogiatzis, Engin Tola, and Henrik Aanæs. Large scale multi-view stereopsis evaluation. In 2014 IEEE Conference on Computer Vision and Pattern Recognition, pages 406–413, 2014

  12. [12]

    Sdfdiff: Differentiable rendering of signed distance fields for 3d shape optimization

    Yue Jiang, Dantong Ji, Zhizhong Han, and Matthias Zwicker. Sdfdiff: Differentiable rendering of signed distance fields for 3d shape optimization. In The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020

  13. [13]

    Learning a Multi-View Stereo Machine

    Abhishek Kar, Christian Häne, and Jitendra Malik. Learning a multi-view stereo machine. arXiv preprint arXiv:1708.05375, 2017

  14. [14]

    Neural 3d mesh renderer

    Hiroharu Kato, Yoshitaka Ushiku, and Tatsuya Harada. Neural 3d mesh renderer. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3907–3916, 2018

  15. [15]

    Differentiable volume rendering using signed distance functions

    Srinivas Kaza et al. Differentiable volume rendering using signed distance functions. PhD thesis, Massachusetts Institute of Technology, 2019

  16. [16]

    Screened poisson surface reconstruction

    Michael Kazhdan and Hugues Hoppe. Screened poisson surface reconstruction. ACM Trans. Graph., 32(3), July 2013

  17. [17]

    Neural lumigraph rendering

    Petr Kellnhofer, Lars Jebe, Andrew Jones, Ryan Spicer, Kari Pulli, and Gordon Wetzstein. Neural lumigraph rendering. arXiv preprint arXiv:2103.11571, 2021

  18. [18]

    Adam: A Method for Stochastic Optimization

    Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014

  19. [19]

    Learning efficient point cloud generation for dense 3d object reconstruction

    Chen-Hsuan Lin, Chen Kong, and Simon Lucey. Learning efficient point cloud generation for dense 3d object reconstruction. In proceedings of the AAAI Conference on Artificial Intelligence, volume 32, 2018

  20. [20]

    Image-based reconstruction of wire art

    Lingjie Liu, Duygu Ceylan, Cheng Lin, Wenping Wang, and Niloy J. Mitra. Image-based reconstruction of wire art. 36(4):63:1–63:11, 2017

  21. [21]

    Curvefusion: Reconstructing thin structures from rgbd sequences

    Lingjie Liu, Nenglun Chen, Duygu Ceylan, Christian Theobalt, Wenping Wang, and Niloy J. Mitra. Curvefusion: Reconstructing thin structures from rgbd sequences. 37(6), 2018

  22. [22]

    Neural sparse voxel fields

    Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, and Christian Theobalt. Neural sparse voxel fields. Advances in Neural Information Processing Systems, 33, 2020

  23. [23]

    Dist: Rendering deep implicit signed distance function with differentiable sphere tracing

    Shaohui Liu, Yinda Zhang, Songyou Peng, Boxin Shi, Marc Pollefeys, and Zhaopeng Cui. Dist: Rendering deep implicit signed distance function with differentiable sphere tracing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2019–2028, 2020

  24. [24]

    Neural volumes: Learning dynamic renderable volumes from images

    Stephen Lombardi, Tomas Simon, Jason Saragih, Gabriel Schwartz, Andreas Lehrmann, and Yaser Sheikh. Neural volumes: Learning dynamic renderable volumes from images. ACM Transactions on Graphics (TOG), 38(4):65, 2019

  25. [25]

    3D-LMNet: Latent embedding matching for accurate and diverse 3d point cloud reconstruction from a single image

    Priyanka Mandikal, K L Navaneet, Mayank Agarwal, and R Venkatesh Babu. 3D-LMNet: Latent embedding matching for accurate and diverse 3d point cloud reconstruction from a single image. In Proceedings of the British Machine Vision Conference (BMVC), 2018

  26. [26]

    Real-time visibility-based fusion of depth maps

    Paul Merrell, Amir Akbarzadeh, Liang Wang, Philippos Mordohai, Jan-Michael Frahm, Ruigang Yang, David Nistér, and Marc Pollefeys. Real-time visibility-based fusion of depth maps. pages 1–8, 01 2007

  27. [27]

    Occupancy networks: Learning 3d reconstruction in function space

    Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. Occupancy networks: Learning 3d reconstruction in function space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4460–4470, 2019

  28. [28]

    Implicit surface representations as layers in neural networks

    Mateusz Michalkiewicz, Jhony K. Pontes, Dominic Jack, Mahsa Baktashmotlagh, and Anders Eriksson. Implicit surface representations as layers in neural networks. In The IEEE International Conference on Computer Vision (ICCV), October 2019

  29. [29]

    Nerf: Representing scenes as neural radiance fields for view synthesis

    Ben Mildenhall, Pratul P Srinivasan, Matthew Tancik, Jonathan T Barron, Ravi Ramamoorthi, and Ren Ng. Nerf: Representing scenes as neural radiance fields for view synthesis. In European Conference on Computer Vision, pages 405–421. Springer, 2020

  30. [30]

    Differentiable volumetric rendering: Learning implicit 3d representations without 3d supervision

    Michael Niemeyer, Lars Mescheder, Michael Oechsle, and Andreas Geiger. Differentiable volumetric rendering: Learning implicit 3d representations without 3d supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3504–3515, 2020

  31. [31]

    Unisurf: Unifying neural implicit surfaces and radiance fields for multi-view reconstruction

    Michael Oechsle, Songyou Peng, and Andreas Geiger. Unisurf: Unifying neural implicit surfaces and radiance fields for multi-view reconstruction. arXiv preprint arXiv:2104.10078, 2021

  32. [32]

    Deepsdf: Learning continuous signed distance functions for shape representation

    Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. Deepsdf: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 165–174, 2019

  33. [33]

    Convolutional occupancy networks

    Songyou Peng, Michael Niemeyer, Lars M. Mescheder, Marc Pollefeys, and Andreas Geiger. Convolutional occupancy networks. ArXiv, abs/2003.04618, 2020

  34. [34]

    Pifu: Pixel-aligned implicit function for high-resolution clothed human digitization

    Shunsuke Saito, Zeng Huang, Ryota Natsume, Shigeo Morishima, Angjoo Kanazawa, and Hao Li. Pifu: Pixel-aligned implicit function for high-resolution clothed human digitization. ICCV, 2019

  35. [35]

    Pifuhd: Multi-level pixel-aligned implicit function for high-resolution 3d human digitization

    Shunsuke Saito, Tomas Simon, Jason Saragih, and Hanbyul Joo. Pifuhd: Multi-level pixel-aligned implicit function for high-resolution 3d human digitization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 84–93, 2020

  36. [36]

    Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

    Tim Salimans and Diederik P Kingma. Weight normalization: A simple reparameterization to accelerate training of deep neural networks. arXiv preprint arXiv:1602.07868, 2016

  37. [37]

    Pixelwise view selection for unstructured multi-view stereo

    Johannes L Schönberger, Enliang Zheng, Jan-Michael Frahm, and Marc Pollefeys. Pixelwise view selection for unstructured multi-view stereo. In European Conference on Computer Vision, pages 501–518. Springer, 2016

  38. [38]

    Photorealistic scene reconstruction by voxel coloring

    Steven M Seitz and Charles R Dyer. Photorealistic scene reconstruction by voxel coloring. International Journal of Computer Vision, 35(2):151–173, 1999

  39. [39]

    Deepvoxels: Learning persistent 3d feature embeddings

    Vincent Sitzmann, Justus Thies, Felix Heide, Matthias Nießner, Gordon Wetzstein, and Michael Zollhofer. Deepvoxels: Learning persistent 3d feature embeddings. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2437–2446, 2019

  40. [40]

    Scene representation networks: Continuous 3d-structure-aware neural scene representations

    Vincent Sitzmann, Michael Zollhöfer, and Gordon Wetzstein. Scene representation networks: Continuous 3d-structure-aware neural scene representations. In Advances in Neural Information Processing Systems, pages 1119–1130, 2019

  41. [41]

    Shape from silhouette probability maps: Reconstruction of thin objects in the presence of silhouette extraction and calibration error

    Amy Tabb. Shape from silhouette probability maps: Reconstruction of thin objects in the presence of silhouette extraction and calibration error. pages 161–168, June 2013

  42. [42]

    Fourier features let networks learn high frequency functions in low dimensional domains

    Matthew Tancik, Pratul P Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan T Barron, and Ren Ng. Fourier features let networks learn high frequency functions in low dimensional domains. arXiv preprint arXiv:2006.10739, 2020

  43. [43]

    Grf: Learning a general radiance field for 3d scene representation and rendering

    Alex Trevithick and Bo Yang. Grf: Learning a general radiance field for 3d scene representation and rendering. arXiv preprint arXiv:2010.04595, 2020

  44. [44]

    Pixel2mesh: Generating 3d mesh models from single rgb images

    Nanyang Wang, Yinda Zhang, Zhuwen Li, Yanwei Fu, Wei Liu, and Yu-Gang Jiang. Pixel2mesh: Generating 3d mesh models from single rgb images. In Proceedings of the European Conference on Computer Vision (ECCV), pages 52–67, 2018

  45. [45]

    Vid2curve: Simultaneous camera motion estimation and thin structure reconstruction from an rgb video

    Peng Wang, Lingjie Liu, Nenglun Chen, Hung-Kuo Chu, Christian Theobalt, and Wenping Wang. Vid2curve: Simultaneous camera motion estimation and thin structure reconstruction from an rgb video. ACM Trans. Graph., 39(4), July 2020

  46. [46]

    Pixel2mesh++: Multi-view 3d mesh generation via deformation

    Chao Wen, Yinda Zhang, Zhuwen Li, and Yanwei Fu. Pixel2mesh++: Multi-view 3d mesh generation via deformation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1042–1051, 2019

  47. [47]

    Pix2vox: Context-aware 3d reconstruction from single and multi-view images

    Haozhe Xie, Hongxun Yao, Xiaoshuai Sun, Shangchen Zhou, and Shengping Zhang. Pix2vox: Context-aware 3d reconstruction from single and multi-view images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2690–2698, 2019

  48. [48]

    Blendedmvs: A large-scale dataset for generalized multi-view stereo networks

    Yao Yao, Zixin Luo, Shiwei Li, Jingyang Zhang, Yufan Ren, Lei Zhou, Tian Fang, and Long Quan. Blendedmvs: A large-scale dataset for generalized multi-view stereo networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1790–1799, 2020

  49. [49]

    Multiview neural surface reconstruction by disentangling geometry and appearance

    Lior Yariv, Yoni Kasten, Dror Moran, Meirav Galun, Matan Atzmon, Ronen Basri, and Yaron Lipman. Multiview neural surface reconstruction by disentangling geometry and appearance. Advances in Neural Information Processing Systems, 33, 2020

  50. [50]

    Iso-points: Optimizing neural implicit surfaces with hybrid representations

    Wang Yifan, Shihao Wu, Cengiz Oztireli, and Olga Sorkine-Hornung. Iso-points: Optimizing neural implicit surfaces with hybrid representations. arXiv preprint arXiv:2012.06434, 2020

  51. [51]

    A globally optimal algorithm for robust tv-l1 range image integration

    Christopher Zach, Thomas Pock, and Horst Bischof. A globally optimal algorithm for robust tv-l1 range image integration. In 2007 IEEE 11th International Conference on Computer Vision, pages 1–8, 2007

  52. [52]

    Physg: Inverse rendering with spherical gaussians for physics-based material editing and relighting

    Kai Zhang, Fujun Luan, Qianqian Wang, Kavita Bala, and Noah Snavely. Physg: Inverse rendering with spherical gaussians for physics-based material editing and relighting. arXiv preprint arXiv:2104.00674, 2021

  53. [53]

    Nerf++: Analyzing and improving neural radiance fields

    Kai Zhang, Gernot Riegler, Noah Snavely, and Vladlen Koltun. Nerf++: Analyzing and improving neural radiance fields. arXiv preprint arXiv:2010.07492, 2020

Supplementary

A Derivation for Computing Opacity $\alpha_i$

In this section we will derive the formula in Eqn. 13 of the paper for computing the discrete opacity $\alpha_i$. Recall that the opaque density functi...

After organizing, we have
$$\frac{\mathrm{d}^2 f}{\mathrm{d}t^2}(\mathbf{p}(\bar t))\,\phi_s\big(f(\mathbf{p}(\bar t))\big) + \left(\frac{\mathrm{d}f}{\mathrm{d}t}(\mathbf{p}(\bar t))\right)^{2} \phi_s'\big(f(\mathbf{p}(\bar t))\big) = 0.$$
Here we perform a local analysis at $\bar t$ near the surface intersection $t^*$, where $f(\mathbf{p}(t^*)) = 0$ and $\bar t = t^* + \Delta t$. We let $\frac{\mathrm{d}f}{\mathrm{d}t}(\mathbf{p}(t^*)) = \mu$ and $\frac{\mathrm{d}^2 f}{\mathrm{d}t^2}(\mathbf{p}(t^*)) = \tau$. As a second-order analysis, we assume that $\frac{\mathrm{d}^2 f}{\mathrm{d}t^2}(\mathbf{p}(t))$ is fixed in this local interval $t \in [t_l, t_r]$. After...
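The derivation above targets the paper's Eqn. 13 for the discrete opacity. A minimal sketch of that formula as commonly stated for NeuS, assuming $\alpha_i = \max\!\big((\Phi_s(f(\mathbf{p}(t_i))) - \Phi_s(f(\mathbf{p}(t_{i+1}))))/\Phi_s(f(\mathbf{p}(t_i))),\, 0\big)$ with $\Phi_s$ the sigmoid of inverse standard deviation $s$; the function and variable names here (`neus_alpha`, `sdf_vals`, `s`) are illustrative, not from the source:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neus_alpha(sdf_vals, s):
    """Discrete opacity along a ray from SDF samples (hedged sketch of Eqn. 13).

    Phi_s = sigmoid(s * f) is compared at consecutive samples; the ratio is
    clamped at zero so only transitions toward the surface contribute opacity.
    """
    alphas = []
    for f_i, f_next in zip(sdf_vals, sdf_vals[1:]):
        phi_i = sigmoid(s * f_i)
        phi_next = sigmoid(s * f_next)
        alphas.append(max((phi_i - phi_next) / phi_i, 0.0))
    return alphas

# A ray crossing the surface (SDF positive -> negative) gets positive
# opacity near the crossing; a ray moving away from it gets zero.
print(neus_alpha([1.0, 0.5, -0.5], s=1.0))
print(neus_alpha([-0.5, 0.0, 0.5], s=1.0))
```

The clamp at zero is what makes the opacity well-defined when the SDF is non-monotonic along the ray: intervals where the SDF increases (the ray leaving the surface) contribute no opacity rather than a negative value.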