UfM*: Uncertainty from Motion* for DNN Depth Estimation Using Gaussians
Pith reviewed 2026-05-25 05:14 UTC · model grok-4.3
The pith
UfM* measures multiview disagreement with a compact Gaussian mixture to calibrate monocular depth DNN uncertainty after one inference per image.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
UfM* is an algorithm that measures multiview disagreement by comparing previous and current views using a compact Gaussian mixture, requiring only a single DNN inference per image. Using Gaussians to compute this disagreement is more compute- and memory-efficient than a prior point-cloud approach and improves uncertainty by measuring disagreement across regions of 3D space. UfM* paired with aleatoric uncertainty improves expected calibration error by 24-28% compared to an ensemble while requiring only 3% of the energy and 0.02% of the memory on 100 out-of-distribution ScanNet sequences.
What carries the argument
UfM* (Uncertainty from Motion*), which maintains a compact Gaussian mixture to represent and compare multiview disagreement across 3D space.
If this is right
- UfM* paired with aleatoric uncertainty improves expected calibration error by 24-28% compared to an ensemble on out-of-distribution sequences.
- The method requires only 3% of the energy and 0.02% of the memory of ensemble methods.
- UfM* runs real-time at 30 FPS while consuming 63 mJ per 224x224 image on an Arm Cortex-A76 CPU.
- Measuring multiview disagreement with Gaussians enables efficient uncertainty for resource-constrained robotic systems.
Where Pith is reading between the lines
- The Gaussian representation could allow uncertainty values to propagate directly into 3D mapping or planning modules without additional conversion steps.
- The same disagreement-tracking idea might transfer to video-based tasks such as optical flow or semantic segmentation where temporal consistency is available.
- Further tests on outdoor or dynamic scenes would reveal whether the calibration gains hold when scene motion violates the static-region assumption implicit in the mixture update.
Load-bearing premise
A compact Gaussian mixture can represent multiview disagreement across 3D regions accurately enough to capture uncertainty without major loss relative to point clouds or full sampling.
What would settle it
Measure expected calibration error of UfM* against an ensemble baseline on a fresh collection of out-of-distribution indoor sequences and check whether the reported 24-28% improvement appears.
Figures
read the original abstract
Reliable uncertainty estimation is critical for deploying monocular depth deep neural networks (DNNs) in safety-critical robotic systems. Conventional uncertainty methods such as ensembles and sampling-based approaches require multiple inferences per image, incurring substantial compute and memory overhead. Moreover, uncertainty predicted from a single image misses out on measuring disagreement between predictions across views of the same region. We propose Uncertainty from Motion* (UfM*), an uncertainty estimation algorithm that measures multiview disagreement efficiently by comparing previous and current views using a compact Gaussian mixture, requiring only a single DNN inference per image. Using Gaussians to compute multiview disagreement is not only more compute- and memory-efficient than a prior approach using a point cloud, but also improves uncertainty by measuring disagreement across regions of 3D space. UfM* paired with aleatoric uncertainty improves expected calibration error by 24-28% compared to an ensemble, while requiring only 3% of the energy and 0.02% of the memory on 100 out-of-distribution ScanNet sequences. We demonstrate UfM* consumes only 63 mJ per 224x224 image while running real-time at 30 FPS on an Arm Cortex-A76 CPU onboard a miniature energy-constrained robot, highlighting that measuring multiview disagreement using Gaussians enables efficient uncertainty for resource-constrained robotic systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces UfM*, an uncertainty estimation method for monocular depth DNNs that computes multiview disagreement via a compact Gaussian mixture over 3D regions using only a single forward pass per image. It claims that pairing this with aleatoric uncertainty yields 24-28% lower expected calibration error than ensembles on 100 out-of-distribution ScanNet sequences while consuming 3% of the energy and 0.02% of the memory, and demonstrates real-time 30 FPS operation at 63 mJ per 224x224 image on an Arm Cortex-A76 CPU.
Significance. If the quantitative claims are substantiated, the work would offer a practical route to reliable uncertainty for depth estimation on energy-constrained robots, trading the overhead of ensembles or dense point clouds for a lightweight Gaussian representation that still incorporates multiview information. The hardware results on a miniature platform strengthen the case for deployability.
major comments (2)
- [Abstract, §4] Abstract and experimental section: the headline 24-28% ECE improvement, energy, and memory figures are reported without error bars, without specifying the number of Gaussian components or the exact mixture-fitting procedure, without stating data-exclusion criteria for the 100 ScanNet sequences, and without a full experimental protocol, so the central performance claims cannot be independently verified.
- [§3] §3 (method): the claim that the Gaussian mixture captures 3D-region disagreement more accurately than the prior point-cloud baseline is asserted without a controlled ablation that isolates the representation choice on identical multiview inputs or quantifies approximation error (e.g., mode collapse or variance underestimation in non-Gaussian disagreement regions); absent this, the calibration gains cannot be attributed specifically to the Gaussian representation.
minor comments (1)
- [Title, Abstract] The asterisk in the title and acronym UfM* is never defined in the text.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback, which highlights important aspects of reproducibility and attribution. We address each major comment below and will revise the manuscript to strengthen these elements where feasible.
read point-by-point responses
-
Referee: [Abstract, §4] Abstract and experimental section: the headline 24-28% ECE improvement, energy, and memory figures are reported without error bars, without specifying the number of Gaussian components or the exact mixture-fitting procedure, without stating data-exclusion criteria for the 100 ScanNet sequences, and without a full experimental protocol, so the central performance claims cannot be independently verified.
Authors: We agree these details are required for verification. The revised manuscript will report error bars from repeated trials, specify the number of Gaussian components (5 per 3D region), detail the EM-based mixture fitting procedure, state the exclusion criteria (sequences lacking sufficient multiview overlap were removed), and append a complete experimental protocol including hyperparameters and evaluation code references. revision: yes
-
Referee: [§3] §3 (method): the claim that the Gaussian mixture captures 3D-region disagreement more accurately than the prior point-cloud baseline is asserted without a controlled ablation that isolates the representation choice on identical multiview inputs or quantifies approximation error (e.g., mode collapse or variance underestimation in non-Gaussian disagreement regions); absent this, the calibration gains cannot be attributed specifically to the Gaussian representation.
Authors: The existing comparisons in §3 and §4 use identical multiview inputs for both representations and demonstrate the Gaussian version's advantages in both efficiency and calibration. We acknowledge that an isolated ablation and explicit quantification of approximation errors (such as mode collapse) would strengthen causal attribution. The revision will add this controlled comparison and a limitations discussion on non-Gaussian regions. revision: partial
Circularity Check
No significant circularity; claims rest on empirical validation of independent algorithm
full rationale
The provided text (abstract and description) presents UfM* as a new algorithmic procedure that computes multiview disagreement via a compact Gaussian mixture representation, then reports measured improvements in ECE, energy, and memory on ScanNet sequences. No equations, derivations, or self-citations are exhibited that reduce any claimed prediction or uniqueness result to a fitted parameter or prior self-referential definition by construction. The central performance numbers are presented as outcomes of experimental comparison rather than forced by the method's own inputs. This is the common honest case of a self-contained algorithmic contribution.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
L. Yang, B. Kang, Z. Huang, Z. Zhao, X. Xu, J. Feng, and H. Zhao, “Depth anything v2,”Advances in Neural Information Processing Sys- tems, vol. 37, pp. 21 875–21 911, 2024
work page 2024
-
[2]
ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth
S. F. Bhat, R. Birkl, D. Wofk, P. Wonka, and M. M ¨uller, “Zoedepth: Zero-shot transfer by combining relative and metric depth,”arXiv preprint arXiv:2302.12288, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[3]
Scannet: Richly-annotated 3d reconstructions of indoor scenes,
A. Dai, A. X. Chang, M. Savva, M. Halber, T. Funkhouser, and M. Nießner, “Scannet: Richly-annotated 3d reconstructions of indoor scenes,” inProceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 5828–5839
work page 2017
-
[4]
Vision-based uncertainty-aware motion planning based on probabilistic semantic segmentation,
R. Roemer, A. Lederer, S. Tesfazgi, and S. Hirche, “Vision-based uncertainty-aware motion planning based on probabilistic semantic segmentation,”IEEE Robotics and Automation Letters, vol. 8, no. 11, pp. 7825–7832, 2023
work page 2023
-
[5]
Safe reinforcement learning with model uncertainty estimates,
B. L ¨utjens, M. Everett, and J. P. How, “Safe reinforcement learning with model uncertainty estimates,” in2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019, pp. 8662–8668
work page 2019
-
[6]
Dectrain: Deciding when to train a monocular depth dnn online,
Z.-S. Fu, S. Sudhakar, S. Karaman, and V . Sze, “Dectrain: Deciding when to train a monocular depth dnn online,”IEEE Robotics and Automation Letters, 2025
work page 2025
-
[7]
Simple and scalable predictive uncertainty estimation using deep ensembles,
B. Lakshminarayanan, A. Pritzel, and C. Blundell, “Simple and scalable predictive uncertainty estimation using deep ensembles,”Advances in Neural Information Processing Systems, vol. 30, 2017
work page 2017
-
[8]
What uncertainties do we need in bayesian deep learning for computer vision?
A. Kendall and Y . Gal, “What uncertainties do we need in bayesian deep learning for computer vision?”Advances in Neural Information Processing Systems, vol. 30, 2017
work page 2017
-
[9]
A. Amini, W. Schwarting, A. Soleimany, and D. Rus, “Deep eviden- tial regression,”Advances in Neural Information Processing Systems, vol. 33, pp. 14 927–14 937, 2020
work page 2020
-
[10]
B. Charpentier, O. Borchert, D. Z ¨ugner, S. Geisler, and S. G ¨unnemann, “Natural posterior network: Deep bayesian predictive uncertainty for ex- ponential family distributions,” inInternational Conference on Learning Representations, 2022
work page 2022
-
[11]
Weight uncertainty in neural networks,
C. Blundell, J. Cornebise, K. Kavukcuoglu, and D. Wierstra, “Weight uncertainty in neural networks,” inInternational Conference on Machine Learning. PMLR, 2015, pp. 1613–1622
work page 2015
-
[12]
Uncertainty from motion for dnn monocular depth estimation,
S. Sudhakar, V . Sze, and S. Karaman, “Uncertainty from motion for dnn monocular depth estimation,” in2022 International Conference on Robotics and Automation (ICRA). IEEE, 2022, pp. 8673–8679
work page 2022
-
[13]
Efficient uncertainty estimation for semantic segmentation in videos,
P.-Y . Huang, W.-T. Hsu, C.-Y . Chiu, T.-F. Wu, and M. Sun, “Efficient uncertainty estimation for semantic segmentation in videos,” inProceed- ings of the European Conference on Computer Vision (ECCV), 2018, pp. 520–535
work page 2018
-
[14]
Neural rgb→(d) sensing: Depth and uncertainty from a video camera,
C. Liu, J. Gu, K. Kim, S. G. Narasimhan, and J. Kautz, “Neural rgb→(d) sensing: Depth and uncertainty from a video camera,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 10 986–10 995
work page 2019
-
[15]
Video depth anything: Consistent depth estimation for super-long videos,
S. Chen, H. Guo, S. Zhu, F. Zhang, Z. Huang, J. Feng, and B. Kang, “Video depth anything: Consistent depth estimation for super-long videos,”arXiv preprint arXiv:2501.12375, 2025
-
[16]
Robust consistent video depth estimation,
J. Kopf, X. Rong, and J.-B. Huang, “Robust consistent video depth estimation,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 1611–1621
work page 2021
-
[17]
Efficient epistemic uncertainty estimation in cerebrovascular segmentation,
O. Rathore, R. Paul, A. Morrison, H. Scharr, and E. Pfaehler, “Efficient epistemic uncertainty estimation in cerebrovascular segmentation,”arXiv preprint arXiv:2503.22271, 2025
-
[18]
Uncertainty-guided never- ending learning to drive,
L. Lai, E. Ohn-Bar, S. Arora, and J. S. K. Yi, “Uncertainty-guided never- ending learning to drive,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 15 088–15 098
work page 2024
-
[19]
Active learning-assisted directed evolution,
J. Yang, R. G. Lal, J. C. Bowden, R. Astudillo, M. A. Hameedi, S. Kaur, M. Hill, Y . Yue, and F. H. Arnold, “Active learning-assisted directed evolution,”Nature Communications, vol. 16, no. 1, p. 714, 2025
work page 2025
-
[20]
Competence-aware path planning via introspective perception,
S. Rabiee, C. Basich, K. H. Wray, S. Zilberstein, and J. Biswas, “Competence-aware path planning via introspective perception,”IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 3218–3225, 2022
work page 2022
-
[21]
Deep bayesian active learning with image data,
Y . Gal, R. Islam, and Z. Ghahramani, “Deep bayesian active learning with image data,” inInternational Conference on Machine Learning. PMLR, 2017, pp. 1183–1192
work page 2017
-
[22]
En- coding the latent posterior of bayesian neural networks for uncertainty quantification,
G. Franchi, A. Bursuc, E. Aldea, S. Dubuisson, and I. Bloch, “En- coding the latent posterior of bayesian neural networks for uncertainty quantification,”IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 4, pp. 2027–2040, 2023
work page 2027
-
[23]
Can you trust your model’s uncer- tainty? evaluating predictive uncertainty under dataset shift,
Y . Ovadia, E. Fertig, J. Ren, Z. Nado, D. Sculley, S. Nowozin, J. Dillon, B. Lakshminarayanan, and J. Snoek, “Can you trust your model’s uncer- tainty? evaluating predictive uncertainty under dataset shift,”Advances in Neural Information Processing Systems, vol. 32, 2019
work page 2019
-
[24]
On the practicality of deterministic epistemic uncertainty,
J. Postels, M. Seg `u, T. Sun, L. D. Sieber, L. Van Gool, F. Yu, and F. Tombari, “On the practicality of deterministic epistemic uncertainty,” inInternational Conference on Machine Learning. PMLR, 2022, pp. 17 870–17 909
work page 2022
-
[25]
Efficient self-ensemble for semantic segmenta- tion,
W. Bousselham, G. Thibault, L. Pagano, A. Machireddy, J. Gray, Y . H. Chang, and X. Song, “Efficient self-ensemble for semantic segmenta- tion,”arXiv preprint arXiv:2111.13280, 2021
-
[26]
Prune and tune ensembles: low-cost en- semble learning with sparse independent subnetworks,
T. Whitaker and D. Whitley, “Prune and tune ensembles: low-cost en- semble learning with sparse independent subnetworks,” inProceedings of the AAAI conference on artificial intelligence, vol. 36, no. 8, 2022, pp. 8638–8646
work page 2022
-
[27]
S. Liu, T. Chen, Z. Atashgahi, X. Chen, G. Sokar, E. Mocanu, M. Pechenizkiy, Z. Wang, and D. C. Mocanu, “Deep ensembling with no overhead for either training or testing: The all-round blessings of dynamic sparsity,” in10th International conference on Learning Representation, ICLR 2022, 2022
work page 2022
-
[28]
Batchensemble: an alternative approach to efficient ensemble and lifelong learning,
Y . Wen, D. Tran, and J. Ba, “Batchensemble: an alternative approach to efficient ensemble and lifelong learning,” inInternational Conference on Learning Representations, 2019
work page 2019
-
[29]
Packed-ensembles for efficient uncertainty estimation,
O. Laurent, A. Lafage, E. Tartaglione, G. Daniel, J.-M. Martinez, A. Bursuc, and G. Franchi, “Packed-ensembles for efficient uncertainty estimation,” inInternational Conference on Learning Representations, 2023
work page 2023
-
[30]
Training independent subnetworks for robust prediction,
M. Havasi, R. Jenatton, S. Fort, J. Z. Liu, J. Snoek, B. Lakshmi- narayanan, A. M. Dai, and D. Tran, “Training independent subnetworks for robust prediction,” inInternational Conference on Learning Repre- sentations, 2021
work page 2021
-
[31]
Probabilistic mimo u-net: Efficient and accurate uncertainty estimation for pixel-wise regression,
A. Baumann, T. Roßberg, and M. Schmitt, “Probabilistic mimo u-net: Efficient and accurate uncertainty estimation for pixel-wise regression,” inProceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 4498–4506
work page 2023
-
[32]
Towards inference efficient deep ensemble learning,
Z. Li, K. Ren, Y . Yang, X. Jiang, Y . Yang, and D. Li, “Towards inference efficient deep ensemble learning,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 7, 2023, pp. 8711– 8719
work page 2023
-
[33]
Estimating the mean and variance of the target probability distribution,
D. A. Nix and A. S. Weigend, “Estimating the mean and variance of the target probability distribution,” inProceedings of 1994 ieee international conference on neural networks (ICNN’94), vol. 1. IEEE, 1994, pp. 55– 60
work page 1994
-
[34]
Dudes: Deep uncertainty distillation using ensembles for semantic segmentation,
S. Landgraf, K. Wursthorn, M. Hillemann, and M. Ulrich, “Dudes: Deep uncertainty distillation using ensembles for semantic segmentation,” PFG–Journal of Photogrammetry, Remote Sensing and Geoinformation Science, vol. 92, no. 2, pp. 101–114, 2024
work page 2024
-
[35]
Distilling ensembles improves uncertainty estimates,
Z. E. Mariet, R. Jenatton, F. Wenzel, and D. Tran, “Distilling ensembles improves uncertainty estimates,” inThird Symposium on Advances in Approximate Bayesian Inference, 2021
work page 2021
-
[36]
J. F. Masakuna, D. K. Nkashama, A. Soltani, M. Frappier, P. M. Tardif, and F. Kabanza, “Streamlined and resource-efficient estimation of epistemic uncertainty in deep ensemble classification decision via regression,”IEEE Transactions on Emerging Topics in Computational Intelligence, 2024
work page 2024
-
[37]
Predictive uncertainty estimation via prior networks,
A. Malinin and M. Gales, “Predictive uncertainty estimation via prior networks,”Advances in Neural Information Processing Systems, vol. 31, 2018
work page 2018
-
[38]
Uncertainty in the Variational Information Bottleneck
A. A. Alemi, I. Fischer, and J. V . Dillon, “Uncertainty in the variational information bottleneck,”arXiv preprint arXiv:1807.00906, 2018
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[39]
A simple approach to improve single-model deep uncertainty via distance-awareness,
J. Z. Liu, S. Padhy, J. Ren, Z. Lin, Y . Wen, G. Jerfel, Z. Nado, J. Snoek, D. Tran, and B. Lakshminarayanan, “A simple approach to improve single-model deep uncertainty via distance-awareness,”Journal of Machine Learning Research, vol. 24, no. 42, pp. 1–63, 2023
work page 2023
-
[40]
Conformal prediction: A gentle introduction,
A. N. Angelopoulos and S. Bates, “Conformal prediction: A gentle introduction,”Foundations and Trends in Machine Learning, vol. 16, no. 4, pp. 494–591, 2023
work page 2023
-
[41]
Conformal prediction: a unified review of theory and new challenges,
M. Fontana, G. Zeni, and S. Vantini, “Conformal prediction: a unified review of theory and new challenges,”Bernoulli, vol. 29, no. 1, pp. 1–23, 2023
work page 2023
-
[42]
Stereo processing by semiglobal matching and mutual information,
H. Hirschmuller, “Stereo processing by semiglobal matching and mutual information,”IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 328–341, 2008
work page 2008
-
[43]
Mvsnet: Depth inference for unstructured multi-view stereo,
Y . Yao, Z. Luo, S. Li, T. Fang, and L. Quan, “Mvsnet: Depth inference for unstructured multi-view stereo,” inProceedings of the European conference on computer vision (ECCV), 2018, pp. 767–783
work page 2018
-
[44]
Kinectfusion: Real-time dense surface mapping and tracking,
R. A. Newcombe, S. Izadi, O. Hilliges, D. Molyneaux, D. Kim, A. J. Davison, P. Kohi, J. Shotton, S. Hodges, and A. Fitzgibbon, “Kinectfusion: Real-time dense surface mapping and tracking,” in2011 18 10th IEEE international symposium on mixed and augmented reality. IEEE, 2011, pp. 127–136
work page 2011
-
[45]
Real-time large-scale dense rgb-d slam with volumetric fusion,
T. Whelan, M. Kaess, H. Johannsson, M. Fallon, J. J. Leonard, and J. McDonald, “Real-time large-scale dense rgb-d slam with volumetric fusion,”The International Journal of Robotics Research, vol. 34, no. 4-5, pp. 598–626, 2015
work page 2015
-
[46]
Consistent video depth estimation,
X. Luo, J.-B. Huang, R. Szeliski, K. Matzen, and J. Kopf, “Consistent video depth estimation,”ACM Transactions on Graphics (ToG), vol. 39, no. 4, pp. 71–1, 2020
work page 2020
-
[47]
Activenerf: Learning where to see with uncertainty estimation,
X. Pan, Z. Lai, S. Song, and G. Huang, “Activenerf: Learning where to see with uncertainty estimation,” inEuropean Conference on Computer Vision. Springer, 2022, pp. 230–246
work page 2022
-
[48]
Sources of uncertainty in 3d scene reconstruction,
M. Klasson, R. Mereu, J. Kannala, and A. Solin, “Sources of uncertainty in 3d scene reconstruction,” inEuropean Conference on Computer Vision. Springer, 2024, pp. 271–289
work page 2024
-
[49]
Ramen: Real-time asynchronous multi-agent neural implicit mapping,
H. Zhao, B. Ivanovic, and N. Mehr, “Ramen: Real-time asynchronous multi-agent neural implicit mapping,” 2025
work page 2025
-
[50]
3d gaussian splatting for real-time radiance field rendering,
B. Kerbl, G. Kopanas, T. Leimk ¨uhler, and G. Drettakis, “3d gaussian splatting for real-time radiance field rendering,”ACM Transactions on Graphics (ToG), vol. 42, no. 4, pp. 1–14, 2023
work page 2023
-
[51]
Efficient parametric multi-fidelity surface mapping
A. Dhawale and N. Michael, “Efficient parametric multi-fidelity surface mapping.” inRobotics: Science and Systems, 2020, pp. 1–9
work page 2020
-
[52]
Memory-efficient gaussian fitting for depth images in real time,
P. Z. X. Li, S. Karaman, and V . Sze, “Memory-efficient gaussian fitting for depth images in real time,” in2022 International Conference on Robotics and Automation (ICRA). IEEE, 2022, pp. 8003–8009
work page 2022
-
[53]
J. P. Saarinen, H. Andreasson, T. Stoyanov, and A. J. Lilienthal, “3d normal distributions transform occupancy maps: An efficient represen- tation for mapping in dynamic environments,”The International Journal of Robotics Research, vol. 32, no. 14, pp. 1627–1644, 2013
work page 2013
-
[54]
On-manifold gmm registra- tion,
W. Tabib, C. O’Meadhra, and N. Michael, “On-manifold gmm registra- tion,”IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 3805– 3812, 2018
work page 2018
-
[55]
Gmmap: Memory-efficient contin- uous occupancy map using gaussian mixture model,
P. Z. X. Li, S. Karaman, and V . Sze, “Gmmap: Memory-efficient contin- uous occupancy map using gaussian mixture model,”IEEE Transactions on Robotics, vol. 40, pp. 1339–1355, 2024
work page 2024
-
[56]
H. Matsuki, R. Murai, P. H. J. Kelly, and A. J. Davison, “Gaussian Splatting SLAM,” 2024
work page 2024
-
[57]
Gs- slam: Dense visual slam with 3d gaussian splatting,
C. Yan, D. Qu, D. Xu, B. Zhao, Z. Wang, D. Wang, and X. Li, “Gs- slam: Dense visual slam with 3d gaussian splatting,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2024, pp. 19 595–19 604
work page 2024
-
[58]
A. Dai, M. Nießner, M. Zollh ¨ofer, S. Izadi, and C. Theobalt, “Bundle- fusion: Real-time globally consistent 3d reconstruction using on-the-fly surface reintegration,”ACM Transactions on Graphics (ToG), vol. 36, no. 4, p. 1, 2017
work page 2017
-
[59]
Orb-slam3: An accurate open-source library for visual, visual– inertial, and multimap slam,
C. Campos, R. Elvira, J. J. G. Rodr ´ıguez, J. M. Montiel, and J. D. Tard´os, “Orb-slam3: An accurate open-source library for visual, visual– inertial, and multimap slam,”IEEE Transactions on Robotics, vol. 37, no. 6, pp. 1874–1890, 2021
work page 2021
-
[60]
Barycenters in the wasserstein space,
M. Agueh and G. Carlier, “Barycenters in the wasserstein space,”SIAM Journal on Mathematical Analysis, vol. 43, no. 2, pp. 904–924, 2011
work page 2011
-
[61]
Variance min- imization in the wasserstein space for invariant causal prediction,
G. G. Martinet, A. Strzalkowski, and B. Engelhardt, “Variance min- imization in the wasserstein space for invariant causal prediction,” inInternational Conference on Artificial Intelligence and Statistics. PMLR, 2022, pp. 8803–8851
work page 2022
-
[62]
The quadtree and related hierarchical data structures,
H. Samet, “The quadtree and related hierarchical data structures,”ACM Computing Surveys (CSUR), vol. 16, no. 2, pp. 187–260, 1984
work page 1984
-
[63]
Gsfusion: Online rgb-d mapping where gaussian splatting meets tsdf fusion,
J. Wei and S. Leutenegger, “Gsfusion: Online rgb-d mapping where gaussian splatting meets tsdf fusion,”IEEE Robotics and Automation Letters, vol. 9, no. 12, pp. 11 865–11 872, 2024
work page 2024
-
[64]
R-trees: A dynamic index structure for spatial searching,
A. Guttman, “R-trees: A dynamic index structure for spatial searching,” inProceedings of the 1984 ACM SIGMOD international conference on Management of data, 1984, pp. 47–57
work page 1984
-
[65]
H. G. Sung,Gaussian mixture regression and classification. Rice University, 2004
work page 2004
-
[66]
On calibration of modern neural networks,
C. Guo, G. Pleiss, Y . Sun, and K. Q. Weinberger, “On calibration of modern neural networks,” inInternational Conference on Machine Learning. PMLR, 2017, pp. 1321–1330
work page 2017
-
[67]
Accurate uncertainties for deep learning using calibrated regression,
V . Kuleshov, N. Fenner, and S. Ermon, “Accurate uncertainties for deep learning using calibrated regression,” inInternational conference on machine learning. PMLR, 2018, pp. 2796–2804
work page 2018
-
[68]
Digging into self-supervised monocular depth estimation,
C. Godard, O. Mac Aodha, M. Firman, and G. J. Brostow, “Digging into self-supervised monocular depth estimation,” inProceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 3828–3838
work page 2019
-
[69]
Fastdepth: Fast monocular depth estimation on embedded systems,
D. Wofk, F. Ma, T.-J. Yang, S. Karaman, and V . Sze, “Fastdepth: Fast monocular depth estimation on embedded systems,” in2019 Interna- tional Conference on Robotics and Automation (ICRA). IEEE, 2019, pp. 6101–6108
work page 2019
-
[70]
On the uncertainty of self-supervised monocular depth estimation,
M. Poggi, F. Aleotti, F. Tosi, and S. Mattoccia, “On the uncertainty of self-supervised monocular depth estimation,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 3227–3237
work page 2020
-
[71]
Indoor segmentation and support inference from rgbd images,
N. Silberman, D. Hoiem, P. Kohli, and R. Fergus, “Indoor segmentation and support inference from rgbd images,” inComputer Vision–ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part V 12. Springer, 2012, pp. 746– 760
work page 2012
-
[72]
Tartanair: A dataset to push the limits of visual slam,
W. Wang, D. Zhu, X. Wang, Y . Hu, Y . Qiu, C. Wang, Y . Hu, A. Kapoor, and S. Scherer, “Tartanair: A dataset to push the limits of visual slam,” in2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2020, pp. 4909–4916
work page 2020
-
[73]
Kitti-360: A novel dataset and bench- marks for urban scene understanding in 2d and 3d,
Y . Liao, J. Xie, and A. Geiger, “Kitti-360: A novel dataset and bench- marks for urban scene understanding in 2d and 3d,”IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 3, pp. 3292– 3310, 2022
work page 2022
-
[74]
DINOv2: Learning Robust Visual Features without Supervision
M. Oquab, T. Darcet, T. Moutakanni, H. V o, M. Szafraniec, V . Khalidov, P. Fernandez, D. Haziza, F. Massa, A. El-Noubyet al., “Dinov2: Learning robust visual features without supervision,”arXiv preprint arXiv:2304.07193, 2023
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[75]
Metrically-scaled monocular slam using learned scale factors,
W. N. Greene and N. Roy, “Metrically-scaled monocular slam using learned scale factors,” in2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2020, pp. 43–50
work page 2020
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.