
arxiv: 2605.10174 · v1 · submitted 2026-05-11 · 💻 cs.CV

BathyFacto: Refraction-Aware Two-Media Neural Radiance Fields for Bathymetry

Pith reviewed 2026-05-12 03:38 UTC · model grok-4.3

classification 💻 cs.CV
keywords bathymetry · neural radiance fields · refraction · photogrammetry · UAV imagery · two-media reconstruction · underwater depth

The pith

BathyFacto traces each camera ray through air and then along its refracted path in water to remove the systematic depth bias in UAV through-water bathymetry.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a neural radiance field model that explicitly accounts for light bending at the air-water interface. Conventional structure-from-motion and NeRF methods treat rays as straight lines, which systematically biases underwater depth estimates. BathyFacto splits each ray into an air segment and a water segment, applies Snell's law at a known planar surface, and shares a single density field across both media. On one simulated scene with ground truth, the refraction-aware model reaches a Cloud-to-Mesh mean distance of 0.06 m and 87% completeness, against 0.52 m / 29% for the Nerfacto baseline and 0.36 m / 21% for conventional MVS without refraction correction. The method also supplies refraction-corrected back-projection and reversible coordinate transforms for exporting metrically usable bathymetric data.
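The direction and rough size of the straight-ray bias can be illustrated with a textbook small-angle argument. This is a sketch for a flat surface and a symmetric pair of rays, not a derivation taken from the paper, and the value n_w ≈ 1.33 is an assumed typical index for water that the text above does not quote. Let a ray hit the surface at incidence angle θ_a and continue in water at angle θ_w to a bottom point at true depth h. Extending the air segment as a straight line down to the point's horizontal position gives the apparent depth h_app:

```latex
n_a \sin\theta_a = n_w \sin\theta_w ,
\qquad
h_{\mathrm{app}} = h \, \frac{\tan\theta_w}{\tan\theta_a}
\;\approx\; h \, \frac{n_a}{n_w}
\quad (\theta_a \to 0).
```

With n_a = 1 and n_w = 1.33 the straight-ray model places the bottom at about 0.75 h near nadir, so the water appears roughly a quarter too shallow. The ratio tan θ_w / tan θ_a shrinks further at oblique angles, which is why a single global scale factor does not fully correct the bias.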

Core claim

BathyFacto extends a hash-grid NeRF with a medium-conditioned color head and a refraction-aware ray marcher: each ray travels straight in air to a planar water surface, then follows the refracted direction inside water according to known refractive indices; a single proposal sampler operates on a virtual straight ray while a kinked density wrapper evaluates density at the physically correct water positions.
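The refraction step can be sketched with the vector form of Snell's law. The function name `refract`, the NumPy implementation, and the index 1.33 in the usage lines are illustrative assumptions; the source only states that known refractive indices are used at a planar surface.

```python
import numpy as np

def refract(direction, normal, n1, n2):
    """Vector form of Snell's law (illustrative helper, not the paper's code).

    direction: unit vector of the incoming ray.
    normal:    unit surface normal pointing back toward the incoming side.
    n1, n2:    refractive indices of the incoming and outgoing medium.
    Returns the unit refracted direction, or None on total internal reflection.
    """
    d = np.asarray(direction, dtype=float)
    n = np.asarray(normal, dtype=float)
    eta = n1 / n2
    cos_i = -np.dot(n, d)                    # cosine of the incidence angle
    sin2_t = eta**2 * (1.0 - cos_i**2)       # squared sine of the refracted angle
    if sin2_t > 1.0:
        return None                          # total internal reflection
    cos_t = np.sqrt(1.0 - sin2_t)
    return eta * d + (eta * cos_i - cos_t) * n

# Usage: a ray travelling downward at 45 degrees enters water (assumed n = 1.33).
theta = np.radians(45.0)
d_air = np.array([np.sin(theta), 0.0, -np.cos(theta)])
d_water = refract(d_air, [0.0, 0.0, 1.0], 1.0, 1.33)
print(d_water)
```

The refracted direction stays in the plane spanned by the incoming ray and the normal, so a planar surface needs only one normal for all rays.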

What carries the argument

The kinked density wrapper combined with a virtual straight-ray proposal sampler that transparently corrects sample positions along the refracted water segment before density lookup.
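A minimal sketch of the position correction such a wrapper could apply is given here, assuming a horizontal water plane at height `water_z`, a camera above the water looking down, and unit ray directions. The function `kinked_positions` and its parameters are hypothetical names, not the paper's or Nerfstudio's API. Distances `t` are those a sampler would propose on the virtual straight ray; samples past the surface hit are moved onto the refracted segment at the same path length.

```python
import numpy as np

def kinked_positions(origin, direction, t, water_z, n_air=1.0, n_water=1.33):
    """Map distances on a virtual straight ray to points on the kinked ray.

    Hypothetical helper: assumes a horizontal plane z = water_z and a camera
    above it looking downward. Returns (positions, in_water_mask).
    """
    o = np.asarray(origin, dtype=float)
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    t = np.asarray(t, dtype=float)
    if d[2] >= 0.0 or o[2] <= water_z:
        raise ValueError("sketch assumes a camera above the water looking down")

    t_surface = (water_z - o[2]) / d[2]      # distance to the surface hit
    hit = o + t_surface * d

    # Snell's law in vector form for the upward normal (0, 0, 1).
    normal = np.array([0.0, 0.0, 1.0])
    eta = n_air / n_water
    cos_i = -d[2]
    cos_t = np.sqrt(1.0 - eta**2 * (1.0 - cos_i**2))
    d_water = eta * d + (eta * cos_i - cos_t) * normal

    in_water = t > t_surface
    straight = o + t[:, None] * d                       # air segment
    bent = hit + (t[:, None] - t_surface) * d_water     # water segment
    return np.where(in_water[:, None], bent, straight), in_water

# Usage: one sample in air and one below the surface.
pos, mask = kinked_positions([0.0, 0.0, 10.0], [1.0, 0.0, -1.0], [5.0, 16.0], water_z=0.0)
print(pos, mask)
```

Because both directions are unit vectors, a step of length dt on the virtual ray corresponds to a step of the same length along the kinked path, which keeps the sampler's distances meaningful after the correction.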

If this is right

  • Point clouds exported with refraction-corrected back-projection become metrically usable for shallow-water mapping.
  • A single density field serves both the air and the water segment of every ray, so no separate per-medium model is needed.
  • Data pipelines that supply per-pixel medium masks and water-plane estimates become prerequisites for accurate through-water NeRFs.
  • Without refraction correction, the baselines stay at low completeness on the simulated two-media scene: 21% for conventional multi-view stereo and 29% for Nerfacto.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The planar-surface assumption could be relaxed by estimating a low-order surface model jointly with the density field.
  • The same two-segment ray construction applies to other refractive boundaries such as air-glass or water-glass interfaces.
  • Because the proposal sampler operates on a virtual straight ray, the method can reuse existing NeRF sampling code with only a wrapper change.

Load-bearing premise

The water surface is perfectly planar and its location can be measured accurately from boundary markers, with constant known refractive indices for air and water.

What would settle it

Reconstruct the same scene with the method and with independent sonar or laser bathymetry on a visibly non-planar water surface; a large persistent depth offset would falsify the claim.

Figures

Figures reproduced from arXiv: 2605.10174 by Anatol Günthner, Boris Jutzi, Frederik Schulte, Gottfried Mandlburger, Lukas Winiwarter, Markus Brezovsky.

Figure 1: BathyFacto workflow from photogrammetric re…
Figure 2: Acquisition geometry. Camera positions and view…
Figure 3: Example image and corresponding mask used to…
Figure 4: Two-segment ray geometry in the BathyFacto…
Figure 5: Qualitative 2D rendering comparison for three representative evaluation views. Columns: ground-truth RGB, BathyFacto without refraction (ablation), BathyFacto with refraction, and Nerfacto baseline. For each view, the top row shows RGB renderings and the bottom row shows predicted depth maps (no ground-truth depth available). Views span frontal (0021), steep-angle (0080), and oblique (0004) camera geometri…
Figure 6: 3D point-cloud evaluation across all four configurations. (a) Cloud-to-Mesh (C2M) signed distance maps. BathyFacto with refraction enabled achieves the lowest C2M standard deviation (0.17 m). The color scale corresponds to the histograms shown in (b). (b) C2M signed distance histograms. (Page footer: Brezovsky et al., preprint submitted to Elsevier.)
Figure 7: Cross-section through the ship hull. The refraction-enabled BathyFacto variant (blue) closely follows the reference geometry, whereas non-refracted variants exhibit a systematic depth error.
  … consistent two-media reconstruction. (ii) Nerfstudio integration ensures reproducibility and access to ongoing framework improvements. (iii) The refraction-corrected point cloud export uses the original photogrammetr…
Figure 8: Completeness maps for all four configurations. Colors indicate the fraction of reconstructed points that lie within ±0.3 m of the reference mesh in the common reference frame. Below the point cloud visualizations, a shared color legend is shown, mapping colors to local completeness values across all configurations. Magenta colored points, labelled nr for not reconstructed, do not lie within the ±0.3 m tole…
Original abstract

Through-water photogrammetry based on UAV imagery enables shallow-water bathymetry, but refraction at the air-water interface violates the straight-ray assumption of Structure-from-Motion and causes systematic depth bias. We present BathyFacto, a refraction-aware two-media extension of Nerfacto integrated into Nerfstudio that targets metrically precise underwater point clouds. BathyFacto uses a shared hash-grid-based density field with a medium-conditioned color head that receives a one-bit medium flag (air or water) and traces each camera ray as two segments: a straight segment in air up to a planar water surface and a refracted segment in water computed via Snell's law with known refractive indices. To allocate samples efficiently across the air-water boundary, we employ a single proposal-network sampler that operates on a virtual straight ray spanning both media, combined with a kinked density wrapper that transparently corrects water-segment positions along the refracted direction before density evaluation. A data adaptation pipeline converts photogrammetric reconstructions to a Nerfstudio-compatible format, estimates the water plane from boundary markers, and provides per-pixel medium masks to gate refraction. We also extend the point cloud export with refraction-corrected backprojection and reversible coordinate transforms to world and global frames. On a simulated two-media scene with known ground truth, BathyFacto with refraction achieves a Cloud-to-Mesh mean distance of 0.06 m and 87 % completeness, compared to 0.52 m / 29 % for the Nerfacto baseline and 0.36 m / 21% for conventional MVS without refraction correction.
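The abstract does not say how the water plane is estimated from the boundary markers. One common choice is a least-squares plane fit via singular value decomposition, sketched here under that assumption; `fit_plane` and the example marker coordinates are hypothetical.

```python
import numpy as np

def fit_plane(markers):
    """Least-squares plane through 3D marker points (assumed approach).

    Returns (normal, d) with a unit normal and offset such that
    normal @ x + d = 0 for points x on the plane.
    """
    pts = np.asarray(markers, dtype=float)
    if pts.ndim != 2 or pts.shape[1] != 3 or pts.shape[0] < 3:
        raise ValueError("need at least three 3D points")
    centroid = pts.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, which is the plane normal.
    _, _, vt = np.linalg.svd(pts - centroid)
    normal = vt[-1]
    if normal[2] < 0:
        normal = -normal            # orient the normal upward for consistency
    d = -float(normal @ centroid)
    return normal, d

# Usage: four made-up markers on the shoreline of a flat water body at z = 2.
markers = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0), (1.0, 1.0, 2.0)]
normal, d = fit_plane(markers)
print(normal, d)
```

The residuals `pts @ normal + d` give a quick check of how planar the marker set really is, which matters because the method depends on the planar-surface assumption.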

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to introduce BathyFacto, a refraction-aware two-media extension of Nerfacto for bathymetric reconstruction from UAV imagery. It models a planar air-water interface, traces kinked rays via Snell's law with known refractive indices, employs a shared hash-grid density field with a medium flag, a virtual straight-ray proposal sampler, and a kinked density wrapper for sampling. A data pipeline estimates the water plane from markers and provides medium masks. On a simulated two-media scene with ground truth, it reports Cloud-to-Mesh mean distance of 0.06 m and 87% completeness, outperforming Nerfacto (0.52 m / 29%) and refraction-ignorant MVS (0.36 m / 21%).

Significance. If the kinked-ray volume rendering is correctly formulated, the approach could provide a practical NeRF-based solution for metrically accurate shallow-water bathymetry that corrects systematic refraction bias without additional sensors. The quantitative improvement on simulated data with known ground truth demonstrates clear gains in both accuracy and completeness over strong baselines, which is a strength of the evaluation.

major comments (2)
  1. [Method (kinked density wrapper and volume rendering)] The description of the kinked density wrapper (method section) states that it 'transparently corrects water-segment positions along the refracted direction before density evaluation' and re-uses a single proposal sampler on a virtual straight ray. However, it does not specify how the quadrature weights are computed: volume rendering requires ds to be the true Euclidean distance along the bent path in water after applying Snell's law. If sample increments are taken directly from the virtual ray without rescaling by the actual segment length or the cosine of the refracted angle, the transmittance and accumulated weights are incorrect, which directly affects density optimization and undermines the reported 0.06 m / 87% Cloud-to-Mesh metrics on the simulated scene.
  2. [Experiments and results] The experimental evaluation (results section) is confined to a single simulated scene. No real-world UAV datasets, ablation studies isolating the refraction correction, sensitivity analysis to water-plane estimation errors, or quantitative error propagation from the assumed constant refractive indices are presented. This limits confidence that the metrically precise claim generalizes beyond the controlled simulation where ground truth is known by construction.
minor comments (2)
  1. [Abstract] In the abstract, the completeness figure is written as '87 %' with a space before the percent sign; conventional notation is 87%.
  2. [Method (point cloud export)] The paper mentions 'reversible coordinate transforms to world and global frames' in the point-cloud export but does not provide the explicit transformation equations or implementation details.
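For the second minor comment, a reversible transform of the kind the referee asks about can be sketched as a similarity transform (rotation, translation, uniform scale) with its closed-form inverse. The paper's actual equations are not given in this review, so the form x_m = s (R x_w + t) and the names `to_model` and `to_world` are assumptions for illustration only.

```python
import numpy as np

def to_model(points_world, rotation, translation, scale):
    """Map world points to a normalized model frame: x_m = s * (R x_w + t)."""
    p = np.asarray(points_world, dtype=float)
    return scale * (p @ np.asarray(rotation, dtype=float).T + translation)

def to_world(points_model, rotation, translation, scale):
    """Exact inverse of to_model: x_w = R^T (x_m / s - t)."""
    p = np.asarray(points_model, dtype=float)
    return (p / scale - translation) @ np.asarray(rotation, dtype=float)

# Usage: round trip with a made-up rotation about z, offset, and scale.
angle = np.radians(30.0)
R = np.array([[np.cos(angle), -np.sin(angle), 0.0],
              [np.sin(angle),  np.cos(angle), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([-100.0, 250.0, -5.0])
s = 0.01
world = np.array([[120.0, -240.0, 3.5], [98.0, -251.0, 1.2]])
recovered = to_world(to_model(world, R, t, s), R, t, s)
print(np.abs(recovered - world).max())
```

Storing R, t, and s alongside the exported point cloud is what makes the export reversible; large global coordinates are usually best handled in double precision.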

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below with clarifications and indicate where revisions will be made to the manuscript.

  1. Referee: [Method (kinked density wrapper and volume rendering)] The description of the kinked density wrapper (method section) states that it 'transparently corrects water-segment positions along the refracted direction before density evaluation' and re-uses a single proposal sampler on a virtual straight ray. However, it does not specify how the quadrature weights are computed: volume rendering requires ds to be the true Euclidean distance along the bent path in water after applying Snell's law. If sample increments are taken directly from the virtual ray without rescaling by the actual segment length or the cosine of the refracted angle, the transmittance and accumulated weights are incorrect, which directly affects density optimization and undermines the reported 0.06 m / 87% Cloud-to-Mesh metrics on the simulated scene.

    Authors: We thank the referee for identifying this ambiguity in the volume rendering formulation. The manuscript description of the kinked density wrapper emphasizes position correction for density queries but does not explicitly detail how quadrature weights and transmittance are computed from the corrected samples. We agree that ds must reflect the true Euclidean distances along the refracted path segments. In the revised manuscript, we will add a precise derivation showing that sample positions are mapped to the actual kinked geometry and that ds values are computed directly from the Euclidean distances between consecutive corrected points (rather than virtual-ray increments). This will include the full weight and transmittance equations to confirm correctness of the optimization and reported metrics. revision: yes

  2. Referee: [Experiments and results] The experimental evaluation (results section) is confined to a single simulated scene. No real-world UAV datasets, ablation studies isolating the refraction correction, sensitivity analysis to water-plane estimation errors, or quantitative error propagation from the assumed constant refractive indices are presented. This limits confidence that the metrically precise claim generalizes beyond the controlled simulation where ground truth is known by construction.

    Authors: We acknowledge that the current evaluation is restricted to one simulated scene with ground-truth bathymetry, which was chosen to enable reliable quantitative Cloud-to-Mesh assessment. We agree that this limits generalizability claims and that additional studies are needed. In the revision we will incorporate ablation studies isolating the refraction correction, sensitivity analysis to water-plane estimation errors, and quantitative propagation of refractive-index uncertainty. We are also preparing to include results from a real-world shallow-water UAV dataset with the associated data pipeline. revision: partial
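The quadrature at issue in the first exchange can be sketched with the standard NeRF weights, w_i = T_i (1 − exp(−σ_i δ_i)) with T_i = exp(−Σ_{j<i} σ_j δ_j), where δ_i is the length of bin i along the actual path. The function below is a hypothetical illustration, not the authors' implementation; it takes bin boundaries that have already been mapped onto the kinked ray and measures δ_i as the Euclidean distance between consecutive boundaries, as the rebuttal describes.

```python
import numpy as np

def render_weights(edges, sigma):
    """Quadrature weights for bins along a piecewise-linear ray (sketch).

    edges: (N+1, 3) bin boundaries already placed on the kinked path.
    sigma: (N,) density per bin.
    Returns (weights, deltas).
    """
    edges = np.asarray(edges, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    deltas = np.linalg.norm(np.diff(edges, axis=0), axis=1)   # true bin lengths
    optical_depth = sigma * deltas
    # Transmittance before each bin: exp(-sum of optical depth of earlier bins).
    before = np.concatenate([[0.0], np.cumsum(optical_depth)[:-1]])
    transmittance = np.exp(-before)
    alpha = 1.0 - np.exp(-optical_depth)
    return transmittance * alpha, deltas

# Usage: four unit-length bins straight down with constant density 0.5.
edges = np.array([[0.0, 0.0, -float(k)] for k in range(5)])
weights, deltas = render_weights(edges, np.full(4, 0.5))
print(weights, weights.sum())
```

One caveat follows from geometry alone: for a bin whose two boundaries lie on opposite sides of the interface, the straight chord between them is shorter than the air-plus-water path, so an implementation may want to place a bin boundary exactly on the surface hit.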

Circularity Check

0 steps flagged

No circularity: the refraction extension relies on Snell's law as an external physical model and on independent simulation ground truth


The paper's derivation consists of a technical modification to Nerfacto: ray tracing split into an air segment and a refracted water segment via Snell's law (standard geometric optics), a shared hash-grid density field, a medium flag for the color head, and a kinked wrapper around a virtual straight-ray proposal sampler. These steps are defined from first principles and prior NeRF literature rather than self-referential definitions or fitted parameters renamed as predictions. The reported Cloud-to-Mesh metrics are obtained by direct comparison against known ground-truth geometry in a simulated scene, which is independent of the model's internal parameterization. No load-bearing claim reduces to a self-citation chain or by-construction equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The approach relies on standard optics and existing NeRF components with domain assumptions about scene geometry; no new free parameters or invented entities are introduced beyond the base model.

axioms (2)
  • standard math: Snell's law governs the direction change at the air-water interface
    Invoked to compute the refracted segment of each ray in water.
  • domain assumption: The water surface is planar
    Required for straight-ray tracing in air and water-plane estimation from markers.

pith-pipeline@v0.9.0 · 5619 in / 1565 out tokens · 47708 ms · 2026-05-12T03:38:38.243224+00:00 · methodology

