Belief Consistency Between Foundation-Model Evidence and Geometric Perception in Persistent Robotic Maps

Brendan Crowe; Christoffer Heckman; Harel Biggie; Nicholas Roy

arxiv: 2606.00318 · v1 · pith:7FNL4JQPnew · submitted 2026-05-29 · 💻 cs.RO · cs.CV


Pith reviewed 2026-06-28 21:48 UTC · model grok-4.3


keywords: persistent maps · foundation models · geometric perception · belief consistency · robotic mapping · semantic segmentation · conflict resolution · calibrated commit


The pith

A commit gate and conflict-drop window let robots keep only the foundation-model semantic claims that match their geometric perception in persistent maps.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an update operator for persistent robotic maps that fuses a geometric perception stack with foundation-model semantic claims. It adds a per-class calibrated commit gate plus a per-event conflict-drop window that refuses to commit any foundation-model claim contradicted by geometry at the moment it appears. This produces committed maps with substantially higher accuracy, such as 99.7 percent car commit precision versus 43.9 percent for the calibration-only baseline and mean per-class IoU of 0.522 versus 0.180. A sympathetic reader would care because current fusion methods treat uncalibrated foundation-model outputs as equal voters and lack any mechanism to detect direct contradictions between channels.

Core claim

The operator with a per-class calibrated commit gate and a per-event conflict-drop window refuses to commit foundation-model claims contradicted by the geometric channel at the moment of the claim. On KITTI-360 and ScanNet, with both oracle and off-the-shelf geometric channels, the operator produces substantially more accurate committed maps, retains more compositional true positives at higher precision than a monolithic compositional VLM prompt, operates at deployment quality across geometric channels, and remains invariant under foundation-model substitution.

What carries the argument

The update operator consisting of a per-class calibrated commit gate and a per-event conflict-drop window that enforces moment-of-claim consistency before any label is added to the persistent map.
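The two mechanisms can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Claim`, `PersistentMap`, `verdict_for`, and `update` are hypothetical, the per-class thresholds are assumed to come from a separate calibration step, and the conflict-drop window is reduced to a single per-event comparison against the geometric label. The three verdicts mirror the compatible / contradicted / silent outcomes named in the Figure 2 caption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Verdict(Enum):
    COMPATIBLE = "compatible"      # geometry asserts the same class
    CONTRADICTED = "contradicted"  # geometry asserts a different class
    SILENT = "silent"              # geometry makes no assertion here


@dataclass
class Claim:
    element_id: int    # hypothetical map-element index (e.g. a voxel)
    label: str         # class asserted by the foundation model
    confidence: float  # foundation-model confidence in [0, 1]


@dataclass
class PersistentMap:
    committed: Dict[int, str] = field(default_factory=dict)
    refused: List[Claim] = field(default_factory=list)


def verdict_for(claim: Claim, geometric_label: Optional[str]) -> Verdict:
    """Compare a claim with the geometric assertion at the moment of the claim."""
    if geometric_label is None:
        return Verdict.SILENT
    if geometric_label == claim.label:
        return Verdict.COMPATIBLE
    return Verdict.CONTRADICTED


def update(map_: PersistentMap, claim: Claim,
           geometric_label: Optional[str],
           thresholds: Dict[str, float]) -> bool:
    """Return True if the claim was committed to the map."""
    # Conflict drop (assumed form): refuse any claim geometry contradicts.
    if verdict_for(claim, geometric_label) is Verdict.CONTRADICTED:
        map_.refused.append(claim)
        return False
    # Commit gate (assumed form): per-class confidence threshold.
    # Classes without a threshold are never committed.
    if claim.confidence < thresholds.get(claim.label, float("inf")):
        return False
    map_.committed[claim.element_id] = claim.label
    return True
```

In this sketch a high-confidence claim is still refused when geometry disagrees, which is the behavior the paper attributes to the conflict-drop window; a calibration-only baseline would correspond to skipping the first check.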

If this is right

The operator achieves car commit precision of 99.7 percent on KITTI-360, versus 43.9 percent for the calibration-only operator, which lacks the conflict-drop window.
Mean per-class IoU rises from 0.180 to 0.522 on the same data.
The framework retains more compositional true positives at higher precision than monolithic VLM prompting.
Performance holds at deployment quality for both oracle ground-truth geometry and an off-the-shelf online segmenter.
The operator remains invariant when the foundation model is swapped for another.
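For readers who want to see how the two headline metrics relate, here is a small sketch. The paper's exact metric definitions are not given on this page, so the functions below are assumptions: `commit_precision` is the fraction of elements committed as a class that truly belong to it, and `mean_per_class_iou` averages intersection-over-union across classes, counting uncommitted ground-truth elements as misses. The toy dictionaries are illustrative and are not data from the paper.

```python
from typing import Dict, Iterable, Optional


def commit_precision(committed: Dict[int, str],
                     truth: Dict[int, str],
                     cls: str) -> Optional[float]:
    """Fraction of elements committed as `cls` whose true class is `cls`."""
    hits = [e for e, label in committed.items() if label == cls]
    if not hits:
        return None  # nothing committed for this class
    correct = sum(1 for e in hits if truth.get(e) == cls)
    return correct / len(hits)


def mean_per_class_iou(committed: Dict[int, str],
                       truth: Dict[int, str],
                       classes: Iterable[str]) -> float:
    """Average IoU over classes present in the prediction or ground truth."""
    ious = []
    for cls in classes:
        pred = {e for e, label in committed.items() if label == cls}
        gt = {e for e, label in truth.items() if label == cls}
        union = pred | gt
        if not union:
            continue  # class absent from both; skip it
        ious.append(len(pred & gt) / len(union))
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: element 3 is wrongly committed as "car", element 2 is never committed.
truth = {1: "car", 2: "car", 3: "road", 4: "road"}
committed = {1: "car", 3: "car", 4: "road"}
car_precision = commit_precision(committed, truth, "car")        # 1 of 2 correct
mean_iou = mean_per_class_iou(committed, truth, ["car", "road"])  # (1/3 + 1/2) / 2
```

Under these definitions a refusal policy can raise precision while lowering IoU if it drops too many valid claims, so reporting both together is informative.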

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same consistency mechanism could filter foundation-model outputs in other robotic tasks that already possess a reliable geometric or proprioceptive channel.
Longer-duration mapping runs would show whether repeated conflict drops accumulate into measurable map sparsity over time.
Feeding the refused claims back as negative training signals might improve future foundation-model reliability on geometric scenes.

Load-bearing premise

The geometric perception channel supplies assertions reliable enough to detect and refuse foundation-model claims that contradict it at the moment of the claim.

What would settle it

A test set in which the geometric channel systematically errs on objects where the foundation model is correct, such that dropping the conflicting foundation-model claims lowers final map accuracy below the calibration-only baseline.

Figures

Figures reproduced from arXiv: 2606.00318 by Brendan Crowe, Christoffer Heckman, Harel Biggie, Nicholas Roy.

**Figure 1.** Failure modes and their detection on a single 51-event trace (18 geometric-channel contradictions). (a) Per-element posterior mean and 90% CI: the calibration-only operator absorbs every event and drifts away from the correct estimate; the compatibility-checking operator refuses contradicting events and holds near ground truth. (b) Reliability bins from a single 50/50 held-out test split (n = 3699): raw fo…

**Figure 2.** Evidence-fusion pipelines. Top: classical (each channel commits in its own scope; the geometric channel's assertion never reaches the foundation-model commit gate). Bottom: proposed operator with calibration Λ̂, compatibility verdict ⊤/⊥/∅, revision quarantine Q, and disagreement sink Q⊥; verification (dashed) feeds the calibration buffer when ϕ_g is decisive. Symbols defined in Defs. 1–3.

**Figure 3.** Ucomod refusing foundation-model claims that contradict the geometric channel in a 60-second window of KITTI-360 drive 0010 sync. Left: top-down voxel scatter colored by ground-truth class labels; ego trajectory in blue (▲ start, ■ end); four red circles mark refused-claim sites (A–D). Right: image crops with the foundation-model query region; banners report the model's claim, its confidence, and the geome…

Abstract

Persistent maps used by autonomous robots increasingly fuse a geometric perception stack whose assertions are well-characterized with a foundation-model channel that produces semantic claims without calibrated reliability about the same scene. Contemporary mapping systems integrate the two channels by treating the foundation-model channel as an additional voter into a per-element posterior, uncalibrated for its own per-class reliability and without machinery to flag when the two channels contradict each other at a given moment. We propose an update operator with two cooperating mechanisms: a per-class calibrated commit gate, and a per-event conflict-drop window that refuses to commit foundation-model claims contradicted by the geometric channel at the moment of the claim. We evaluate on KITTI-360 and ScanNet, with an oracle geometric channel (panoptic ground truth) and an off-the-shelf online semantic segmenter (Mask2Former) to demonstrate real-world performance. The operator produces substantially more accurate committed maps (KITTI car commit precision 99.7% vs. 43.9% for the calibration-only operator; mean per-class IoU 0.522 vs. 0.180) and retains more compositional true positives at higher precision than a monolithic compositional VLM prompt. The framework operates at deployment quality across both oracle and off-the-shelf-segmenter geometric channels, and is invariant under foundation-model substitution.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper's operator lifts committed map precision by refusing foundation-model (FM) claims that clash with geometry at claim time, but the reported gains lack a direct check on whether the conflict window is accurate rather than just aggressive.


The main thing to know is that this work adds two specific pieces to map updates: a per-class calibrated commit gate and a per-event conflict-drop window that blocks foundation-model claims contradicted by the geometric channel at the moment of the claim. That pairing is presented as new for handling uncalibrated semantic inputs in persistent robotic maps.

It does a solid job showing the effect on real data. On KITTI-360 the car commit precision rises from 43.9% with calibration alone to 99.7% with the full operator, and mean per-class IoU moves from 0.180 to 0.522. The same pattern holds on ScanNet. The results stay strong when the geometric channel is an off-the-shelf Mask2Former instead of oracle ground truth, and the framework does not depend on which foundation model supplies the semantics. It also keeps more compositional true positives at higher precision than a monolithic VLM prompt.

The soft spot is exactly the one the stress-test flags. The precision jump depends on the geometric channel correctly spotting contradictions; any mismatch in timing, viewpoint, or label detail between channels can create false drops or missed ones. The paper gives end-to-end map metrics but no auxiliary numbers on conflict-detection precision or recall against known contradictions. Without that, it is hard to tell how much of the improvement is genuine consistency enforcement versus simply refusing more claims.
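The missing auxiliary numbers could be computed along the following lines. This is a hedged sketch, not something the paper reports: it assumes each foundation-model event can be logged as a pair `(refused, fm_correct)`, where `fm_correct` comes from ground truth, and it treats a refusal as a true positive only when the refused claim was actually wrong. The function name `conflict_drop_stats` and the event encoding are hypothetical.

```python
from typing import Iterable, NamedTuple, Tuple


class DropStats(NamedTuple):
    precision: float  # share of refusals that removed a wrong claim
    recall: float     # share of wrong claims that were refused
    f1: float


def conflict_drop_stats(events: Iterable[Tuple[bool, bool]]) -> DropStats:
    """Score refusals against ground truth.

    Each event is (refused, fm_correct):
      refused and wrong        -> true positive
      refused and correct      -> false positive (over-refusal)
      not refused and wrong    -> false negative (missed contradiction)
    """
    tp = fp = fn = 0
    for refused, fm_correct in events:
        if refused and not fm_correct:
            tp += 1
        elif refused and fm_correct:
            fp += 1
        elif not refused and not fm_correct:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return DropStats(precision, recall, f1)


# Illustrative log: 3 justified refusals, 1 over-refusal,
# 2 missed wrong claims, 4 correct claims let through.
log = ([(True, False)] * 3 + [(True, True)]
       + [(False, False)] * 2 + [(False, True)] * 4)
stats = conflict_drop_stats(log)
```

A low refusal precision in such a table would indicate that the end-to-end gains come partly from over-refusal, which is the ambiguity raised above.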

This is aimed at people building semantic maps that combine reliable geometry with foundation-model outputs. The concrete operator, public datasets, and side-by-side numbers make it worth a serious referee even if the conflict validation needs tightening.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes an update operator for persistent robotic maps fusing foundation-model semantic claims with geometric perception assertions. The operator uses a per-class calibrated commit gate and a per-event conflict-drop window to refuse foundation-model claims contradicted by the geometric channel at claim time. Evaluations on KITTI-360 and ScanNet (with both oracle panoptic ground truth and off-the-shelf Mask2Former) report substantially higher committed-map accuracy than calibration-only or monolithic VLM baselines (e.g., KITTI car commit precision 99.7% vs. 43.9%; mean per-class IoU 0.522 vs. 0.180), while retaining more compositional true positives and remaining invariant under foundation-model substitution.

Significance. If the central mechanisms function as described, the work supplies a concrete, deployment-oriented solution to the problem of uncalibrated fusion between reliable geometric perception and unreliable foundation-model evidence. The dual-channel evaluation (oracle and real segmenter) and the explicit comparison against both calibration-only and monolithic prompting baselines are strengths; the use of public datasets and the parameter-free character of the conflict logic further support potential impact in robotic mapping.

major comments (2)

[Evaluation section (KITTI-360 and ScanNet results)] The headline precision and IoU gains on KITTI-360 rest on the conflict-drop window correctly refusing only when the geometric channel is right. No auxiliary table or subsection reports conflict-detection precision/recall (or false-positive/false-negative rates) against oracle contradictions, so it is impossible to determine whether the reported improvements arise from accurate consistency enforcement or from systematic over-refusal of valid foundation-model claims.
[Methods (commit gate definition)] The per-class calibration of the commit gate is presented as feasible without post-hoc selection effects, yet the manuscript supplies no description of the calibration procedure, the data splits used for calibration versus test, or any cross-validation that would confirm the calibration remains valid under the same distribution shift that affects the geometric channel.

minor comments (2)

[Abstract] The abstract states concrete numerical results but does not cite the corresponding tables or figures that contain those numbers.
[Methods] Notation for the conflict-drop window and commit gate should be introduced with explicit equations rather than prose descriptions alone.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and for recognizing the potential deployment impact of the proposed update operator. We address each major comment below.


Referee: [Evaluation section (KITTI-360 and ScanNet results)] The headline precision and IoU gains on KITTI-360 rest on the conflict-drop window correctly refusing only when the geometric channel is right. No auxiliary table or subsection reports conflict-detection precision/recall (or false-positive/false-negative rates) against oracle contradictions, so it is impossible to determine whether the reported improvements arise from accurate consistency enforcement or from systematic over-refusal of valid foundation-model claims.

Authors: We agree that reporting conflict-detection performance metrics is necessary to isolate the contribution of the conflict-drop window. In the revised manuscript we will add a dedicated subsection (and accompanying table) that computes precision, recall, and F1 of conflict detection against oracle contradictions on both KITTI-360 and ScanNet, for both the oracle geometric channel and the Mask2Former channel. This analysis will confirm that the observed accuracy gains derive from accurate refusal of contradicted claims rather than indiscriminate dropping.

Revision: yes

Referee: [Methods (commit gate definition)] The per-class calibration of the commit gate is presented as feasible without post-hoc selection effects, yet the manuscript supplies no description of the calibration procedure, the data splits used for calibration versus test, or any cross-validation that would confirm the calibration remains valid under the same distribution shift that affects the geometric channel.

Authors: The referee is correct that the current manuscript lacks an explicit description of the calibration protocol. We will revise the Methods section to detail the full calibration procedure: per-class threshold selection on a held-out calibration subset drawn from the training sequences (disjoint from all test sequences), the exact optimization criterion used, and the absence of any post-hoc selection on test data. We will also add a short discussion of robustness under distribution shift, supported by the existing dual-channel (oracle vs. Mask2Former) results already present in the evaluation.

Revision: yes
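One way per-class threshold selection on a held-out calibration subset might look is sketched below. The optimization criterion is not stated on this page, so the rule used here (the lowest confidence threshold whose retained calibration claims reach a target precision) is purely an assumption, as are the name `fit_thresholds` and the `target_precision` parameter. The calibration tuples are illustrative, not data from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def fit_thresholds(calibration: Iterable[Tuple[str, float, bool]],
                   target_precision: float = 0.9) -> Dict[str, float]:
    """Pick one confidence threshold per class from held-out claims.

    `calibration` holds (label, confidence, correct) tuples drawn from
    sequences disjoint from the test set. Assumed criterion: the lowest
    observed confidence t such that claims with confidence >= t reach
    `target_precision`. Classes that never reach it get +inf, meaning
    their claims are never committed.
    """
    by_class: Dict[str, List[Tuple[float, bool]]] = defaultdict(list)
    for label, confidence, correct in calibration:
        by_class[label].append((confidence, correct))

    thresholds: Dict[str, float] = {}
    for label, items in by_class.items():
        chosen = float("inf")
        for t in sorted({conf for conf, _ in items}):
            kept = [ok for conf, ok in items if conf >= t]
            # `kept` is never empty because t is an observed confidence.
            if sum(kept) / len(kept) >= target_precision:
                chosen = t
                break
        thresholds[label] = chosen
    return thresholds


calib = [
    ("car", 0.9, True), ("car", 0.8, True),
    ("car", 0.6, False), ("car", 0.5, True),
    ("pole", 0.7, False), ("pole", 0.4, False),
]
fitted = fit_thresholds(calib, target_precision=0.9)
```

Because the thresholds are fitted only on the calibration tuples, applying them to test sequences involves no post-hoc selection, which is the property the referee asks the authors to document.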

Circularity Check

0 steps flagged

No circularity; empirical claims rest on external public datasets and off-the-shelf components


The paper defines an update operator (commit gate + conflict-drop window) and reports end-to-end metrics (commit precision, IoU) on KITTI-360 and ScanNet against ground truth, using both oracle and Mask2Former channels. No equations or steps reduce by construction to fitted parameters, self-citations, or renamed inputs; the framework is tested for invariance under FM substitution without internal redefinition of the target quantities.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available, so no concrete free parameters, axioms, or invented entities can be extracted.



Reference graph

Works this paper leans on

40 extracted references · 1 canonical work page

[1] Alchourrón, C.E., Gärdenfors, P., Makinson, D.: On the logic of theory change: Partial meet contraction and revision functions. Journal of Symbolic Logic 50(2), 510–530 (1985)
[2] Angelopoulos, A.N., Bates, S.: A gentle introduction to conformal prediction and distribution-free uncertainty quantification. Foundations and Trends in Machine Learning 16(4), 494–591 (2023)
[3] Asgharivaskasi, A., Atanasov, N.: Active Bayesian multi-class mapping from range and semantic segmentation observations. In: ICRA (2021)
[4] Asgharivaskasi, A., Atanasov, N.: Semantic OcTree mapping and Shannon mutual information computation for robot exploration. IEEE Trans. Robotics 39(3), 1910–1928 (2023)
[5] Bavle, H., Sanchez-Lopez, J.L., Shaheer, M., Civera, J., Voos, H.: S-Graphs 2.0 – a hierarchical-semantic optimization and loop closure for SLAM. IEEE Robot. Autom. Lett. (2025)
[6] Chen, X., Milioto, A., Palazzolo, E., Giguère, P., Behley, J., Stachniss, C.: SuMa++: Efficient LiDAR-based semantic SLAM. In: IROS (2019)
[7] Cheng, B., Misra, I., Schwing, A.G., Kirillov, A., Girdhar, R.: Masked-attention mask transformer for universal image segmentation. In: CVPR (2022)
[8] Dai, A., Chang, A.X., Savva, M., Halber, M., Funkhouser, T., Nießner, M.: ScanNet: Richly-annotated 3D reconstructions of indoor scenes. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
[9] Doherty, K., Shan, T., Wang, J., Englot, B.: Learning-aided 3-D occupancy mapping with Bayesian generalized kernel inference. IEEE Trans. Robotics 35(4), 953–966 (2019)
[10] Doherty, K., Wang, J., Englot, B.: Bayesian generalized kernel inference for occupancy map prediction. In: ICRA (2017)
[11] Gan, L., Zhang, R., Grizzle, J.W., Eustice, R.M., Ghaffari, M.: Bayesian spatial kernel smoothing for scalable dense semantic mapping. IEEE Robot. Autom. Lett. 5(2), 790–797 (2020)
[12] Gärdenfors, P.: Knowledge in Flux: Modeling the Dynamics of Epistemic States. MIT Press (1988)
[13] Gibbs, I., Candès, E.: Adaptive conformal inference under distribution shift. In: NeurIPS (2021)
[14] Gorlo, N., Schmid, L., Carlone, L.: Describe anything anywhere at any moment: Hierarchical 4D scene graphs with open-vocabulary language. In: CVPR (2026)
[15] Grinvald, M., Furrer, F., Novkovic, T., Chung, J.J., Cadena, C., Siegwart, R., Nieto, J.: Volumetric instance-aware semantic mapping and 3D object discovery. IEEE Robot. Autom. Lett. 4(3), 3037–3044 (2019)
[16] Gu, Q., Kuwajerwala, A., Morin, S., Jatavallabhula, K.M., Sen, B., et al.: ConceptGraphs: Open-vocabulary 3D scene graphs for perception and planning. In: ICRA (2024)
[17] Hornung, A., Wurm, K.M., Bennewitz, M., Stachniss, C., Burgard, W.: OctoMap: An efficient probabilistic 3D mapping framework based on octrees. Autonomous Robots 34(3), 189–206 (2013)
[18] Hughes, N., Chang, Y., Carlone, L.: Hydra: A real-time spatial perception system for 3D scene graph construction and optimization. In: Robotics: Science and Systems (RSS) (2022)
[19] Hughes, N., Chang, Y., Hu, S., Talak, R., Abdulhai, R., Strader, J., Carlone, L.: Foundations of spatial perception for robotics: Hierarchical representations and real-time systems. The International Journal of Robotics Research 43(10) (2024)
[20] Jatavallabhula, K.M., Kuwajerwala, A., Gu, Q., Omama, M., Chen, T., et al.: ConceptFusion: Open-set multimodal 3D mapping. In: Robotics: Science and Systems (RSS) (2023)
[21] Jocher, G., Qiu, J., Chaurasia, A.: Ultralytics YOLO (2023), https://github.com/ultralytics/ultralytics
[22] Kerr, J., Kim, C.M., Goldberg, K., Kanazawa, A., Tancik, M.: LERF: Language embedded radiance fields. In: ICCV (2023)
[23] Lakemeyer, G., Levesque, H.J.: Cognitive robotics. In: Handbook of Knowledge Representation. Elsevier (2007)
[24] Li, B., Cai, Z., Li, Y.F., Reid, I., Rezatofighi, H.: Hier-SLAM: Scaling-up semantics in SLAM with a hierarchically categorical Gaussian splatting. In: ICRA (2025)
[25] Liao, Y., Xie, J., Geiger, A.: KITTI-360: A novel dataset and benchmarks for urban scene understanding in 2D and 3D. IEEE Trans. Pattern Anal. Mach. Intell. 45(3), 3292–3310 (2023)
[26] Lu, F., Milios, E.: Globally consistent range scan alignment for environment mapping. Autonomous Robots 4(4), 333–349 (1997)
[27] Maggio, D., Carlone, L.: Bayesian fields: Task-driven open-set semantic Gaussian splatting (2025), arXiv:2503.05949
[28] Maggio, D., Chang, Y., Hughes, N., Trang, M., Griffith, D., Dougherty, C., Cristofalo, E., Schmid, L., Carlone, L.: Clio: Real-time task-driven open-set 3D scene graphs. IEEE Robotics and Automation Letters 9(10), 8921–8928 (2024)
[29] McCormac, J., Handa, A., Davison, A.J., Leutenegger, S.: SemanticFusion: Dense 3D semantic mapping with convolutional neural networks. In: ICRA (2017)
[30] Narita, G., Seno, T., Ishikawa, T., Kaji, Y.: PanopticFusion: Online volumetric semantic mapping at the level of stuff and things. In: IROS (2019)
[31] Peng, S., Genova, K., Jiang, C., Tagliasacchi, A., Pollefeys, M., Funkhouser, T.: OpenScene: 3D scene understanding with open vocabularies. In: CVPR (2023)
[32] Quach, V., Fisch, A., Schuster, T., Yala, A., Sohn, J.H., Jaakkola, T.S., Barzilay, R.: Conformal language modeling. In: ICLR (2024)
[33] Reiter, R.: Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems. MIT Press (2001)
[34] Ren, A.Z., Dixit, A., Bodrova, A., Singh, S., Tu, S., Brown, N., Xu, P., Takayama, L., Xia, F., Varley, J., et al.: Robots that ask for help: Uncertainty alignment for large language model planners. In: CoRL (2023)
[35] Rosinol, A., Abate, M., Chang, Y., Carlone, L.: Kimera: An open-source library for real-time metric-semantic localization and mapping. In: ICRA (2020)
[36] Rosinol, A., Violette, A., Abate, M., Hughes, N., Chang, Y., Shi, J., Gupta, A., Carlone, L.: Kimera: From SLAM to spatial perception with 3D dynamic scene graphs. The International Journal of Robotics Research 40(12-14) (2021)
[37] Schmid, L., Abate, M., Chang, Y., Carlone, L.: Khronos: A unified approach for spatio-temporal metric-semantic SLAM in dynamic environments. In: Robotics: Science and Systems (RSS) (2024)
[38] Sünderhauf, N., Pham, T.T., Latif, Y., Milford, M., Reid, I.: Meaningful maps with object-oriented semantic mapping. In: IROS (2017)
[39] Vovk, V., Gammerman, A., Shafer, G.: Algorithmic Learning in a Random World. Springer (2005)
[40] Wilson, J., Fu, Y., Friesen, J., Ewen, P., Capodieci, A., Jayakumar, P., Barton, K., Ghaffari, M.: ConvBKI: Real-time probabilistic semantic mapping network with quantifiable uncertainty. IEEE Trans. Robotics (2024)
