Change-Robust Online Spatial-Semantic Topological Mapping
Pith reviewed 2026-05-08 18:56 UTC · model grok-4.3
The pith
Robots can navigate reliably amid lighting changes and rearranged furniture by using an online topological graph of RGB-D keyframes instead of a globally consistent metric map.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that an online, pose-aware topological graph of RGB-D keyframes, together with sequential hypothesis testing in continuous SE(3), supplies sufficient spatial-semantic information for navigation without requiring a globally consistent metric substrate. The estimator maintains a bounded Gaussian-mixture belief over poses, which supports principled loop-closure handling and recovery from kidnapped-robot events. Experiments with real-robot object-goal navigation under lighting shifts and furniture rearrangement show improved robustness over SLAM-based and standard topological baselines while preserving safety under perceptual aliasing.
What carries the argument
An online pose-aware topological graph of RGB-D keyframes combined with sequential hypothesis testing in continuous three-dimensional pose space, which supplies the spatial-semantic information and bounded pose beliefs needed for navigation decisions.
If this is right
- Object-goal navigation stays safe and accurate even when lighting conditions change or furniture is moved.
- The system handles perceptual aliasing without catastrophic failure where multiple locations appear similar.
- Loop closures and sudden robot displacements are managed through the bounded mixture belief over poses.
- Navigation decisions remain more reliable than those from methods that depend on global metric consistency.
Where Pith is reading between the lines
- The bounded pose belief could support incremental updates over very long periods without map rebuilding.
- Sharing such topological graphs among multiple robots might avoid the alignment problems that metric maps create.
- Pairing the graph with object-level semantic labels could enable planning that reasons directly about reachable places rather than coordinates.
Load-bearing premise
An online pose-aware topological graph of RGB-D keyframes plus sequential hypothesis testing in continuous three-dimensional pose space can supply enough spatial-semantic information for reliable navigation decisions without a globally consistent metric substrate.
What would settle it
A real-robot trial in which the topological-graph method produces unsafe paths or loses localization accuracy during combined lighting shifts and furniture rearrangement, performing no better than the SLAM or topological baselines under perceptual aliasing, would falsify the claim.
Original abstract
Autonomous robots require change-robust spatial-semantic reasoning: using spatial and semantic knowledge to decide where to go, how to get there, and where the robot is despite environmental change. Existing approaches typically attach semantics to SLAM-built metric maps, but these pipelines are brittle under appearance shifts and scene dynamics, where data association and relocalization degrade. We propose a Change-Robust Online Spatial-Semantic (CROSS) representation that replaces a globally consistent metric substrate with an online, pose-aware topological graph of RGB-D keyframes. The system explicitly reasons over perceptual ambiguity using sequential hypothesis testing in continuous SE(3). Our estimator maintains a bounded Gaussian-mixture belief over poses, enabling principled handling of loop closures and kidnapped-robot events. Experiments under severe appearance change, including real-robot object-goal navigation with lighting shifts and furniture rearrangement, demonstrate improved robustness over SLAM-based and topological baselines while remaining safe under perceptual aliasing.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes the Change-Robust Online Spatial-Semantic (CROSS) representation for autonomous robots, which replaces a globally consistent metric substrate with an online pose-aware topological graph of RGB-D keyframes. The system uses sequential hypothesis testing in continuous SE(3) and maintains a bounded Gaussian-mixture belief over poses to handle perceptual ambiguities, loop closures, and kidnapped-robot events. Experiments with real-robot object-goal navigation under severe appearance changes (lighting shifts and furniture rearrangement) are reported to show improved robustness over SLAM-based and topological baselines while remaining safe under perceptual aliasing.
Significance. If the central claims hold, the work would be significant for robot mapping and navigation in dynamic environments, as it provides a principled topological alternative to brittle metric SLAM pipelines under appearance and structural change. The explicit handling of ambiguity via SE(3) hypothesis testing and bounded mixture beliefs, combined with real-robot trials, offers a concrete advance over purely metric or purely topological baselines. Credit is due for focusing on safety under aliasing and for grounding the evaluation in object-goal navigation tasks.
major comments (2)
- [the proposed CROSS representation and estimator] The load-bearing claim that relative pose estimates between keyframes plus the multi-hypothesis SE(3) belief suffice for reliable navigation decisions without global metric consistency is not accompanied by an explicit analysis of residual pose uncertainty (particularly when furniture rearrangement alters keyframe visibility). This assumption underpins the safety and robustness assertions but lacks a concrete bound or failure-mode characterization in the method description.
- [Experiments] The abstract states that experiments demonstrate improved robustness, yet provides no quantitative metrics, error bars, or statistical comparison details. Without these, it is impossible to assess whether the topological approach actually outperforms baselines by a margin that justifies replacing metric substrates.
minor comments (2)
- The abstract would be strengthened by including at least one key quantitative result (e.g., success rate or navigation time under change) to support the robustness claim.
- Notation for the bounded Gaussian-mixture belief and the sequential hypothesis test should be introduced with explicit equations or pseudocode for reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. We address each major comment below and have revised the manuscript to incorporate the suggested improvements where they strengthen the work.
- Referee: [the proposed CROSS representation and estimator] The load-bearing claim that relative pose estimates between keyframes plus the multi-hypothesis SE(3) belief suffice for reliable navigation decisions without global metric consistency is not accompanied by an explicit analysis of residual pose uncertainty (particularly when furniture rearrangement alters keyframe visibility). This assumption underpins the safety and robustness assertions but lacks a concrete bound or failure-mode characterization in the method description.
  Authors: We thank the referee for identifying this point. The CROSS estimator maintains a bounded Gaussian-mixture belief over SE(3) poses via sequential hypothesis testing precisely to represent residual uncertainty and perceptual ambiguity without relying on global metric consistency; navigation decisions are conditioned on the full belief support to preserve safety. We agree, however, that an explicit characterization of how this uncertainty evolves under furniture rearrangement (which can reduce keyframe visibility) would make the safety claims more concrete. We have added a dedicated paragraph in the method section providing a bound on residual pose uncertainty and discussing associated failure modes.
  Revision: yes
- Referee: [Experiments] The abstract states that experiments demonstrate improved robustness, yet provides no quantitative metrics, error bars, or statistical comparison details. Without these, it is impossible to assess whether the topological approach actually outperforms baselines by a margin that justifies replacing metric substrates.
  Authors: We appreciate the referee's emphasis on quantitative rigor. The manuscript's experiments section already reports success rates, navigation times, and direct comparisons against SLAM-based and topological baselines under lighting shifts and furniture rearrangement. To address the concern about the abstract and presentation, we have revised the abstract to include key quantitative metrics with error bars and have added explicit statistical significance tests in the experiments section.
  Revision: yes
Circularity Check
No circularity detected in derivation chain
The paper introduces a CROSS representation based on an online pose-aware topological graph of RGB-D keyframes with sequential SE(3) hypothesis testing and a bounded Gaussian-mixture pose belief. No equations, predictions, or first-principles results are shown that reduce by construction to the inputs (e.g., no fitted parameters renamed as predictions, no self-definitional loops, and no load-bearing self-citations). The central claims rest on experimental validation under appearance change rather than tautological redefinitions or imported uniqueness theorems. This is a standard non-circular proposal of a new mapping method.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: A topological graph of RGB-D keyframes can sufficiently capture spatial-semantic knowledge for navigation despite environmental changes.
Lean theorems connected to this paper
- reality_from_one_distinction (spacetime emergence): unclear
  Paper passage: Σ⁻ = Ad_{ΔT⁻¹} Σ Ad^T_{ΔT⁻¹} + Q_t (first-order pushforward via the adjoint)
  Relation between the paper passage and the cited Recognition theorem: Classical SE(3) covariance transport; RS's emergent Lorentzian (1,3) signature is conceptually unrelated to estimator covariance propagation.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
APPENDIX A EXPERIMENT DETAILS
A. Topological Localization Baselines
This appendix describes the topological...
-
[43]
Greedy Matching (GM): The greedy matching baseline localizes by selecting the node with the highest similarity score to the current observation. If the maximum similarity exceeds a fixed threshold $\tau$, the corresponding node is selected as the localization result; otherwise, the localization estimate remains unchanged. This baseline reflects a common retrieva...
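The greedy rule described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function name `greedy_localize`, the similarity vector, and the threshold values are all hypothetical.

```python
import numpy as np

def greedy_localize(similarities, current_node, tau):
    """Pick the most similar node if it clears the threshold tau,
    otherwise keep the previous localization estimate."""
    best = int(np.argmax(similarities))
    if similarities[best] > tau:
        return best
    return current_node

# Hypothetical similarity scores between the current observation and 3 nodes.
sims = np.array([0.21, 0.83, 0.40])
accepted = greedy_localize(sims, current_node=0, tau=0.6)  # node 1 clears tau
rejected = greedy_localize(sims, current_node=0, tau=0.9)  # nothing clears tau
```

Because the decision uses a single frame, one spuriously high score is enough to move the estimate, which is the weakness the sequence-based and probabilistic baselines try to address.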
-
[44]
Sequence Matching (SM): Instead of matching a single observation, sequence matching aggregates similarity scores over a short temporal window to improve robustness against perceptual aliasing and viewpoint variation. A candidate match between nodes $(v_i, v_j)$ is accepted if the aggregated similarity over a window of size $2h+1$ satisfies f sim(z_{v_{i−h}}, z_{v_{j−h}}), . ...
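A small sketch of windowed matching follows. The source excerpt is cut off before it states the aggregation function and acceptance test, so the mean aggregation and the `>= tau` test below are assumptions, as are the names `sequence_match` and `sim`.

```python
import numpy as np

def sequence_match(sim, i, j, h, tau):
    """Accept the pair (i, j) if the mean similarity along the diagonal window
    (i-h, j-h) ... (i+h, j+h) reaches tau. Mean aggregation is an assumption."""
    offsets = np.arange(-h, h + 1)
    rows, cols = i + offsets, j + offsets
    inside = (rows.min() >= 0 and cols.min() >= 0
              and rows.max() < sim.shape[0] and cols.max() < sim.shape[1])
    if not inside:
        return False  # the window falls off the sequence
    return bool(sim[rows, cols].mean() >= tau)

# Hypothetical similarity matrix: true matches on the diagonal, plus one
# aliased single-frame match at (2, 4).
sim = np.full((6, 6), 0.2)
np.fill_diagonal(sim, 0.9)
sim[2, 4] = 0.95

true_match = sequence_match(sim, 2, 2, h=1, tau=0.7)
aliased_match = sequence_match(sim, 2, 4, h=1, tau=0.7)
```

The aliased pair has the highest single-frame score in the matrix, yet its neighbours along the window do not agree, so the aggregated score stays below the threshold.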
-
[45]
Probabilistic Belief Update (PBU): The probabilistic belief update baseline maintains a discrete posterior belief $b_t(v) = P(v_t = v \mid z_{1:t})$ over the topological nodes $v \in V$ at time $t$. Given the belief at the previous timestep, the state is first propagated via a motion model $P(v_t \mid v_{t-1})$ that constrains allowable transitions based on the graph topology: P(v_t | v_{t−...
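The predict-then-correct structure of this baseline is that of a discrete Bayes filter over graph nodes. The sketch below assumes a uniform "stay or move to a neighbour" transition model and a hand-written likelihood vector; both are hypothetical, since the excerpt is truncated before the paper's own motion and observation models appear.

```python
import numpy as np

def transition_matrix(adjacency):
    """Row-stochastic P(v_t | v_{t-1}): stay put or move to a graph neighbour
    with equal probability (a hypothetical choice)."""
    A = adjacency + np.eye(len(adjacency))
    return A / A.sum(axis=1, keepdims=True)

def belief_update(belief, T, likelihood):
    predicted = T.T @ belief            # sum over u of P(v | u) * b_{t-1}(u)
    posterior = likelihood * predicted  # weight by P(z_t | v)
    return posterior / posterior.sum()

# Chain graph 0 - 1 - 2, robot known to start at node 0.
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])
T = transition_matrix(adjacency)
b = belief_update(np.array([1.0, 0.0, 0.0]), T, np.array([0.1, 0.8, 0.8]))
```

Nodes 1 and 2 look equally like the observation, but node 2 is not reachable from node 0 in one step, so the topology-constrained prediction assigns it zero mass.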
-
[46]
Preliminaries and Notation: Let $x_t \in \mathbb{R}^n$ be the (locally Euclidean) state with motion and measurement models
$$x_t = f_t(x_{t-1}, u_{t-1}) + q_t, \quad q_t \sim \mathcal{N}(0, Q_t), \qquad z_t = h_t(x_t) + r_t, \quad r_t \sim \mathcal{N}(0, R_t), \tag{10}$$
where $Q_t, R_t \succ 0$. The GSF represents the filtering density as a finite mixture
$$p(x_{t-1} \mid z_{1:t-1}, u_{1:t-2}) = \sum_{k=1}^{K_{t-1}} w^{(k)}_{t-1}\, \mathcal{N}\big(x_{t-1}; \mu^{(k)}_{t-1}, \Sigma^{(k)}_{t-1}\big),$$
with $w^{(k)}_{t-1}$ ...
-
[47]
Exact Linear–Gaussian GSF: Assume linear–Gaussian models:
$$x_t = F_t x_{t-1} + B_t u_{t-1} + q_t, \qquad z_t = H_t x_t + r_t.$$
a) Prediction (per component): For each $k = 1, \ldots, K_{t-1}$,
$$\mu^{(k)}_{t|t-1} = F_t \mu^{(k)}_{t-1} + B_t u_{t-1}, \quad \Sigma^{(k)}_{t|t-1} = F_t \Sigma^{(k)}_{t-1} F_t^\top + Q_t, \quad w^{(k)}_{t|t-1} = w^{(k)}_{t-1}. \tag{11}$$
Thus $p(x_t \mid z_{1:t-1}, u_{1:t-1}) = \sum_k w^{(k)}_{t|t-1}\, \mathcal{N}\big(x_t; \mu^{(k)}_{t|t-1}, \Sigma^{(k)}_{t|t-1}\big)$.
b) Update (per component ...
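The per-component prediction in (11) translates directly into code. The update step is cut off in this excerpt, so the sketch below fills it with the textbook Kalman update and reweights each component by its Gaussian innovation evidence; that choice, the function names, and the toy numbers are assumptions and may differ from the paper's equations (12)–(14).

```python
import numpy as np

def gsf_predict(w, mus, Sigmas, F, B, u, Q):
    """Eq. (11): propagate every component, weights unchanged."""
    mus_p = [F @ m + B @ u for m in mus]
    Sigmas_p = [F @ S @ F.T + Q for S in Sigmas]
    return w.copy(), mus_p, Sigmas_p

def gsf_update(w, mus, Sigmas, H, R, z):
    """Assumed textbook form: Kalman update per component, then reweight
    by the innovation evidence N(y; 0, S_y) and renormalize."""
    new_w, new_mus, new_Sigmas = [], [], []
    for wk, m, S in zip(w, mus, Sigmas):
        y = z - H @ m                       # innovation
        Sy = H @ S @ H.T + R                # innovation covariance
        K = S @ H.T @ np.linalg.inv(Sy)     # Kalman gain
        new_mus.append(m + K @ y)
        new_Sigmas.append((np.eye(len(m)) - K @ H) @ S)
        evidence = (np.exp(-0.5 * y @ np.linalg.solve(Sy, y))
                    / np.sqrt(np.linalg.det(2 * np.pi * Sy)))
        new_w.append(wk * evidence)
    new_w = np.array(new_w)
    return new_w / new_w.sum(), new_mus, new_Sigmas

# Two equally weighted 1-D hypotheses at -2 and +2; a measurement near +2.
I1 = np.eye(1)
w, mus, Sigmas = gsf_predict(np.array([0.5, 0.5]),
                             [np.array([-2.0]), np.array([2.0])], [I1, I1],
                             F=I1, B=np.zeros((1, 1)), u=np.zeros(1), Q=0.1 * I1)
w, mus, Sigmas = gsf_update(w, mus, Sigmas, H=I1, R=0.5 * I1, z=np.array([1.8]))
```

After one measurement almost all weight sits on the hypothesis near +2, which shows how the mixture resolves ambiguity without discarding the alternative outright.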
-
[48]
Nonlinear GSF via Local Gaussianization: For nonlinear (10), GSF applies a local Gaussian filter to each component.
a) EKF-style (per component): Linearize around the current component mean:
$$f_t(x, u) \approx f_t(\mu^{(k)}_{t-1}, u) + F^{(k)}_t \big(x - \mu^{(k)}_{t-1}\big), \qquad h_t(x) \approx h_t(\mu^{(k)}_{t|t-1}) + H^{(k)}_t \big(x - \mu^{(k)}_{t|t-1}\big),$$
where $F^{(k)}_t, H^{(k)}_t$ are Jacobians. Then apply (11)–(14) with (F...
-
[49]
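Mixture Identities: Let $\mathcal{N}_i(x) = \mathcal{N}(x; m_i, S_i)$ for $i \in \{1, 2\}$.
a) Product of Gaussians:
$$\mathcal{N}_1(x)\,\mathcal{N}_2(x) = \mathcal{N}(m_1; m_2, S_1 + S_2)\,\mathcal{N}(x; m, S), \tag{15}$$
where $S = (S_1^{-1} + S_2^{-1})^{-1}$ and $m = S\,(S_1^{-1} m_1 + S_2^{-1} m_2)$.
b) Innovation evidence: For predicted $(\mu^-, \Sigma^-)$ and measurement $z = Hx + r$, $r \sim \mathcal{N}(0, R)$, the innovation $y = z - H\mu^-$ satisfies $y \sim \mathcal{N}(0, S)$ with $S = H \Sigma^- H^\top + R$, yielding the evidence term in (14).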
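Identity (15) is easy to check numerically, which can help when debugging a mixture update. The following sketch evaluates both sides at one point; the means, covariances, and evaluation point are arbitrary illustrative values, and `gauss` is a hypothetical helper.

```python
import numpy as np

def gauss(x, m, S):
    """Multivariate normal density N(x; m, S)."""
    d = x - m
    return (np.exp(-0.5 * d @ np.linalg.solve(S, d))
            / np.sqrt(np.linalg.det(2 * np.pi * S)))

m1, S1 = np.array([0.0, 1.0]), np.array([[2.0, 0.3], [0.3, 1.0]])
m2, S2 = np.array([1.0, -1.0]), np.array([[1.0, -0.2], [-0.2, 0.5]])

# Fused covariance and mean as defined below Eq. (15).
S = np.linalg.inv(np.linalg.inv(S1) + np.linalg.inv(S2))
m = S @ (np.linalg.solve(S1, m1) + np.linalg.solve(S2, m2))

x = np.array([0.4, 0.1])
lhs = gauss(x, m1, S1) * gauss(x, m2, S2)
rhs = gauss(m1, m2, S1 + S2) * gauss(x, m, S)
```

The scalar factor $\mathcal{N}(m_1; m_2, S_1 + S_2)$ does not depend on $x$; it is the quantity that rescales component weights when two Gaussian factors are multiplied.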
-
[50]
Mixture Growth Control: To prevent unbounded mixture growth, GSF typically uses:
a) Pruning: Remove components with $w^{(k)}_t < \varepsilon$.
b) Reduction / merging: Iteratively merge nearby components (e.g., using a KL-based criterion) until $K_t \le K_{\max}$. Merging two components with weights $a, b$ by moment matching gives
$$\mu = \frac{a\,\mu_a + b\,\mu_b}{a + b}, \tag{16}$$
Σ = a Σ_a + (μ_a − μ)(μ_a − μ) ...
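Pruning and pairwise merging can be sketched as below. The merged mean follows (16); the covariance expression is truncated in this excerpt, so the code uses the standard moment-matching form, which is an assumption here. Renormalizing the weights after pruning is likewise an assumption, and the function names are hypothetical.

```python
import numpy as np

def prune(w, mus, Sigmas, eps):
    """Drop components with weight below eps, then renormalize (assumed)."""
    keep = [k for k, wk in enumerate(w) if wk >= eps]
    w_kept = np.array([w[k] for k in keep])
    return w_kept / w_kept.sum(), [mus[k] for k in keep], [Sigmas[k] for k in keep]

def merge_pair(a, mu_a, Sig_a, b, mu_b, Sig_b):
    """Moment-matched merge of two weighted Gaussian components."""
    mu = (a * mu_a + b * mu_b) / (a + b)                      # Eq. (16)
    da, db = mu_a - mu, mu_b - mu
    Sigma = (a * (Sig_a + np.outer(da, da))
             + b * (Sig_b + np.outer(db, db))) / (a + b)      # standard form
    return a + b, mu, Sigma

w, mus, Sigmas = prune(np.array([0.6, 0.399, 0.001]),
                       [np.zeros(1)] * 3, [np.eye(1)] * 3, eps=0.01)
w_m, mu_m, Sig_m = merge_pair(0.5, np.array([-1.0]), np.eye(1),
                              0.5, np.array([1.0]), np.eye(1))
```

Merging two unit-variance components at -1 and +1 yields a single component at 0 with variance 2: the spread between the means is absorbed into the covariance rather than lost.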
-
[51]
In practice, gating and sparsification reduce the $K \times C_t$ expansion.
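Mixture–Mixture Update (Optional): If the measurement factor is approximated by a mixture
$$Q_t(x) = \sum_{c=1}^{C_t} \pi^{(c)}_t\, \mathcal{N}\big(x; \nu^{(c)}_t, \Lambda^{(c)}_t\big),$$
then the update is a mixture–mixture product: $p(x_t \mid z_{1:t}) \propto p^-(x_t)\, Q_t(x_t)$, with
$$p^-(x_t) = \sum_k w^{(k)}_{t|t-1}\, \mathcal{N}\big(x_t; \mu^{(k)}_{t|t-1}, \Sigma^{(k)}_{t|t-1}\big),$$
and
$$p(x_t \mid z_{1:t}) = \sum_{k=1}^{K_{t-1}} \sum_{c=1}^{C_t} \tilde{w}_{k,c}\, \mathcal{N}(x_t; m_{k,c}, S_{k,c}). \tag{17}$$
Here $(m_{k,c}, S_{k,c})$ fol...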
-
[52]
Manifold Adaptation (Lie Groups): On a Lie group $X$ (e.g., $SE(3)$), represent each mixture component as a Gaussian in a consistent tangent chart $\phi(\cdot)$, apply the Euclidean GSF updates to $\xi_t = \phi(x_t)$, and reconstruct means via $\exp(\cdot)$. During prediction, covariances are transported through group composition using the appropriate adjoint (first-order), yielding the...
-
[53]
One-Step GSF Summary: Given $\{w^{(k)}_{t-1}, \mu^{(k)}_{t-1}, \Sigma^{(k)}_{t-1}\}_{k=1}^{K_{t-1}}$:
1) Predict: propagate each component (linear (11), or EKF/UKF/CKF per component).
2) Update: apply (12)–(14) (or (17) for mixture likelihoods).
3) Control: prune/reduce (and optionally split) to enforce $K_t \le K_{\max}$.
On manifolds, perform all steps in the chosen chart...
-
[54]
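Setup: Let $X_{t-1} \in SE(3)$ be distributed as $X_{t-1} = \mu \exp(\varepsilon)$ with $\varepsilon \sim \mathcal{N}(0, \Sigma) \subset \mathfrak{se}(3)$. The (right-invariant) stochastic motion model is
$$X_t = X_{t-1}\, \Delta T_t \exp(\nu_t), \qquad \nu_t \sim \mathcal{N}(0, Q_t) \subset \mathfrak{se}(3),$$
with $\nu_t$ independent of $\varepsilon$.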
-
[55]
Prediction Kernel: For a single mixand, the predicted density is
$$\bar{p}(x_t) = \int p(x_t \mid x_{t-1})\, \mathcal{N}_{\mathfrak{se}(3)}\big(\log(\mu^{-1} x_{t-1}); 0, \Sigma\big)\, dx_{t-1}.$$
-
[56]
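First-Order Pushforward: Write $X_{t-1} = \mu \exp(\varepsilon)$. Using the group adjoint and BCH,
$$\mu \exp(\varepsilon)\, \Delta T_t = \mu\, \Delta T_t \exp\!\big(\mathrm{Ad}_{\Delta T_t^{-1}}\, \varepsilon + O(\|\varepsilon\|^2)\big).$$
Post-multiplying by $\exp(\nu_t)$ and applying BCH again yields
$$\mu\, \Delta T_t \exp\!\big(\mathrm{Ad}_{\Delta T_t^{-1}}\, \varepsilon\big) \exp(\nu_t) = \mu\, \Delta T_t \exp\!\big(\mathrm{Ad}_{\Delta T_t^{-1}}\, \varepsilon + \nu_t + O(\|\varepsilon\|^2 + \|\nu_t\|^2)\big).$$
Neglecting higher-order terms, the updated error in the right-invariant chart at the predicted mea...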
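The adjoint step can be checked numerically, and the resulting covariance transport written out, in a short NumPy sketch. The twist ordering (rotation first, then translation), the Taylor-series matrix exponential, and all function names are assumptions chosen for a self-contained illustration; the covariance formula is the first-order pushforward $\Sigma^- = \mathrm{Ad}_{\Delta T^{-1}} \Sigma\, \mathrm{Ad}_{\Delta T^{-1}}^\top + Q_t$ quoted elsewhere on this page.

```python
import numpy as np

def skew(a):
    return np.array([[0.0, -a[2], a[1]], [a[2], 0.0, -a[0]], [-a[1], a[0], 0.0]])

def hat(xi):
    """se(3) vector (omega, v) -> 4x4 matrix; ordering is an assumption."""
    X = np.zeros((4, 4))
    X[:3, :3] = skew(xi[:3])
    X[:3, 3] = xi[3:]
    return X

def expm(A, terms=30):
    """Truncated Taylor series, adequate for the small matrices used here."""
    out, term = np.eye(len(A)), np.eye(len(A))
    for n in range(1, terms):
        term = term @ A / n
        out = out + term
    return out

def adjoint(T):
    """6x6 adjoint of T = [R p; 0 1] for the (omega, v) ordering."""
    R, p = T[:3, :3], T[:3, 3]
    Ad = np.zeros((6, 6))
    Ad[:3, :3] = R
    Ad[3:, 3:] = R
    Ad[3:, :3] = skew(p) @ R
    return Ad

def predict_cov(Sigma, dT, Q):
    A = adjoint(np.linalg.inv(dT))
    return A @ Sigma @ A.T + Q

dT = expm(hat(np.array([0.1, -0.2, 0.3, 0.5, 0.0, -0.1])))
eps = np.array([0.02, 0.01, -0.03, 0.05, -0.04, 0.01])
lhs = expm(hat(eps)) @ dT                                  # exp(eps) dT
rhs = dT @ expm(hat(adjoint(np.linalg.inv(dT)) @ eps))     # dT exp(Ad eps)
Sigma_pred = predict_cov(0.01 * np.eye(6), dT, 0.001 * np.eye(6))
```

Moving the perturbation past the increment is exact for matrix groups, so `lhs` and `rhs` agree to floating-point precision; the approximation in the text enters only when the two exponentials are combined into one.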
-
[57]
Mixtures and Weights: Since $\int p(x_t \mid x_{t-1})\, dx_t = 1$, prediction preserves mixture weights: if $p(x) = \sum_k w_k\, p_k(x)$ then $\bar{p}(x) = \sum_k w_k\, \bar{p}_k(x)$.
-
[58]
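Small-Increment Approximation: If $\Delta T_t = \exp(\xi_t)$ with $\|\xi_t\| \ll 1$, then
$$\mathrm{Ad}_{\Delta T_t^{-1}} = I - \mathrm{ad}(\xi_t) + O(\|\xi_t\|^2),$$
and the transported covariance expands as
$$\mathrm{Ad}_{\Delta T_t^{-1}}\, \Sigma\, \mathrm{Ad}_{\Delta T_t^{-1}}^\top = \Sigma - \mathrm{ad}(\xi_t)\, \Sigma - \Sigma\, \mathrm{ad}(\xi_t)^\top + O(\|\xi_t\|^2 \|\Sigma\|).$$
At high update rates (small $\|\xi_t\|$) and when covariances are maintained in the updated right-invariant chart, a common conservative approximation is $\Sigma^- \approx \Sigma$...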
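For readers who want the intermediate step, the expansion above follows from substituting the first-order adjoint into the transport formula. The block below is a worked restatement of the excerpt's own identity under the same assumptions, not an additional result:

```latex
\begin{aligned}
\mathrm{Ad}_{\Delta T_t^{-1}}\,\Sigma\,\mathrm{Ad}_{\Delta T_t^{-1}}^{\top}
&= \big(I - \mathrm{ad}(\xi_t) + O(\|\xi_t\|^2)\big)\,\Sigma\,
   \big(I - \mathrm{ad}(\xi_t) + O(\|\xi_t\|^2)\big)^{\top} \\
&= \Sigma - \mathrm{ad}(\xi_t)\,\Sigma - \Sigma\,\mathrm{ad}(\xi_t)^{\top}
   + \underbrace{\mathrm{ad}(\xi_t)\,\Sigma\,\mathrm{ad}(\xi_t)^{\top}}_{O(\|\xi_t\|^2\|\Sigma\|)}
   + O(\|\xi_t\|^2\|\Sigma\|).
\end{aligned}
```

The two first-order terms are linear in $\xi_t$, so they shrink in proportion to the per-step motion; this is why the correction becomes negligible at high update rates.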