arxiv: 2510.09574 · v2 · submitted 2025-10-10 · 💻 cs.RO

Online Structure Learning and Planning for Autonomous Robot Navigation using Active Inference

Pith reviewed 2026-05-18 07:37 UTC · model grok-4.3

classification 💻 cs.RO
keywords active inference · robot navigation · topological mapping · expected free energy · self-supervised learning · autonomous navigation · generative model · online planning

The pith

AIMAPP unifies mapping, localisation and planning in one generative model for online robot navigation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents AIMAPP as a framework that combines mapping, localisation and decision-making into a single generative model for robots moving through unfamiliar spaces. The robot builds a sparse topological map incrementally, learns state transitions from its own experience, and chooses actions by minimising expected free energy so that it can both explore and pursue goals. The approach runs fully self-supervised on standard hardware, tolerates sensor noise and drift, and was tested in large real and simulated environments against conventional planners. A reader would care because it offers a way for robots to operate without pre-built maps or separate training phases for each new setting.
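The incremental map building described above can be pictured with a minimal Python sketch. It assumes a plain distance-threshold rule for node creation; the class name `TopoMap`, the `min_spacing` parameter and the threshold rule itself are illustrative assumptions, not the mechanism stated in the paper.

```python
import math


class TopoMap:
    """Hypothetical sparse topological map: nodes are (x, y) points, edges are traversals."""

    def __init__(self, min_spacing):
        self.min_spacing = min_spacing  # assumed minimum distance between two nodes
        self.nodes = []                 # list of (x, y) tuples, index = node id
        self.edges = set()              # undirected edges stored as (low_id, high_id)

    def nearest(self, pos):
        """Return (node_id, distance) of the closest node, or (None, inf) if the map is empty."""
        best_id, best_dist = None, math.inf
        for node_id, node in enumerate(self.nodes):
            dist = math.dist(node, pos)
            if dist < best_dist:
                best_id, best_dist = node_id, dist
        return best_id, best_dist

    def visit(self, pos, prev_id=None):
        """Snap to an existing node if one is close enough, otherwise create a new node."""
        node_id, dist = self.nearest(pos)
        if node_id is None or dist >= self.min_spacing:
            self.nodes.append(pos)
            node_id = len(self.nodes) - 1
        # Record the traversal so the graph only contains experienced links.
        if prev_id is not None and prev_id != node_id:
            self.edges.add((min(prev_id, node_id), max(prev_id, node_id)))
        return node_id
```

Keeping nodes at least `min_spacing` apart is what makes the map sparse in this sketch; the edges only record links the agent has actually travelled.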

Core claim

AIMAPP unifies mapping, localisation, and decision-making within a single generative model. The agent builds and updates a sparse topological map online, learns state transitions dynamically, and plans actions by minimising Expected Free Energy. This allows it to balance goal-directed and exploratory behaviours. The system is implemented as a ROS-compatible, sensor- and robot-agnostic framework that operates fully self-supervised, remains resilient to sensor failure and odometric drift, and supports both exploration and goal-directed navigation without pre-training.

What carries the argument

The unified generative model that maintains a sparse topological map while using expected free energy minimisation to select actions and update transition beliefs.
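To make the action-selection step concrete, here is a minimal Python sketch of the textbook discrete-state decomposition of expected free energy into risk plus ambiguity, followed by a softmax over policies. It is not taken from the paper: the function names, the `precision` parameter and the one-step horizon are assumptions, and the figure captions indicate that the authors evaluate policies with MCTS, which this sketch does not attempt.

```python
import numpy as np


def expected_free_energy(A, qs, log_C, eps=1e-16):
    """One-step expected free energy for a predicted state belief.

    A     : observation likelihood, shape (n_obs, n_states), columns sum to 1
    qs    : predicted state belief under a policy, shape (n_states,)
    log_C : log prior preferences over observations, shape (n_obs,)
    """
    qo = A @ qs                                            # predicted observations
    risk = np.sum(qo * (np.log(qo + eps) - log_C))         # KL[Q(o|pi) || P(o)]
    entropy_per_state = -np.sum(A * np.log(A + eps), axis=0)
    ambiguity = entropy_per_state @ qs                     # expected observation entropy
    return float(risk + ambiguity)


def select_policy(G, precision=1.0):
    """Softmax over negative expected free energy: lower G means higher probability."""
    logits = -precision * np.asarray(G, dtype=float)
    logits -= logits.max()                                 # numerical stability
    p = np.exp(logits)
    return p / p.sum()
```

The risk term pulls the agent toward preferred observations (goal-directed behaviour), while the ambiguity term penalises states whose observations are uninformative. Full treatments also include a parameter-information-gain term that rewards visiting poorly learned transitions; it is omitted here for brevity.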

Load-bearing premise

The topological map and learned transition probabilities stay accurate enough for planning even when sensors are noisy, the robot drifts, or the environment changes, all without external corrections or manual parameter tuning.

What would settle it

Run the robot through a large space with added sensor noise and repeated layout changes; if it repeatedly fails to reach goals or becomes lost while using only its internal model and without any external map updates, the central claim does not hold.

Figures

Figures reproduced from arXiv: 2510.09574 by Bart Dhoedt, Daria de Tinguy, Emilio Gamba, Tim Verbelen.

Figure 1. Our model coarsely localises itself based on local motion and …
Figure 2. POMDP of our model. Cat stands for Categorical. The model integrates states (st), positions (pt), and observations (ot) over time, guided by policies (π) and expected free energy (G). The categorical distributions define transition and observation likelihoods: Ap (position likelihoods), Ao (observation likelihoods), Bp (position transitions), and Bs (state transitions). This structure underpins the inferen…
Figure 3. Influence of a node at position (0,0) on adjacent node creation for …
Figure 4. Schematic of what could be the topological map in a simple 2-room …
Figure 5. Overview of the system architecture. In grey, we have modules …
Figure 6. Impact of drift on the agent’s localisation. The top row illustrates the …
Figure 7. Coverage efficiency of each model in the largest warehouse and home, …
Figure 8. The environment can adapt to change in real-time, here a box was dis…
Figure 9. Drift comparison in the (x, y) plane between our agent’s internal belief (model odometry), the robot’s onboard sensor odometry, and the ground-truth trajectory measured with Qualisys. Missing segments in the plot correspond to gaps in ground-truth perception.
TABLE II. RMSE over x and y averaged over 5 runs per model. AIMAPP Frontiers Manual model sensor sensor sensor RMSE (x,y) 1.83 ± 0.77 1.65 ± 0.79 1.68…
Figure 10. MCTS path “states rewards” (free energy minimisation) considering an observation held by two states (circled green on figures). The higher the …
Figure 11. Travelled distance to goals vs A* expected distance from robot …
Figure 12. In the big warehouse and garage, examples of paths taken by the agent (red, green, blue), from the start to the goal image presented along the ideal …
Figure 13. The measured impact of AIMAPP with MCTS policy evaluation and …
Figure 14. Turtlebot3 waffle robot was used in simulation, while tests have been …
Figure 15. The three Amazon warehouse environments and the house used in Gazebo, as well as the parking lot, small house and warehouse used in the real …
Figure 16. Coverage efficiency of each model over 5 runs in small environments, …
Figure 17. Approximate exploration trajectories from black to yellow. We can notice how some agents never enter the bedroom and playroom, as their Lidar …
Figure 18. Coverage of the real-world parking lot by AIMAPP compared to a …
Figure 19. Example of the FAEL getting stuck in an open area.
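Figure 9 and Table II report drift as an RMSE over x and y against a motion-capture ground truth with missing segments. A minimal Python sketch of one way such a figure could be computed follows. It assumes the error is the Euclidean distance per time step and that ground-truth gaps are encoded as NaN rows; both are assumptions, since the exact definition is not given on this page, and `rmse_xy` is a hypothetical name.

```python
import numpy as np


def rmse_xy(estimated, ground_truth):
    """RMSE of planar position error, skipping steps where the ground truth is missing.

    estimated    : array-like, shape (T, 2), estimated (x, y) per time step
    ground_truth : array-like, shape (T, 2), NaN rows mark gaps in ground-truth perception
    """
    est = np.asarray(estimated, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)
    valid = ~np.isnan(gt).any(axis=1)          # keep only steps with a ground-truth fix
    err = est[valid] - gt[valid]
    return float(np.sqrt(np.mean(np.sum(err ** 2, axis=1))))
```

Calling this once per run and then taking the mean and standard deviation over runs would give a "mean ± std" entry of the kind shown in Table II.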
Abstract

Autonomous navigation in unfamiliar environments requires robots to simultaneously explore, localise, and plan under uncertainty, without relying on predefined maps or extensive training. We present Active Inference MAPping and Planning (AIMAPP), a framework unifying mapping, localisation, and decision-making within a single generative model, drawing on cognitive-mapping concepts from animal navigation (topological organisation, discrete spatial representations and predictive belief updating) as design inspiration. The agent builds and updates a sparse topological map online, learns state transitions dynamically, and plans actions by minimising Expected Free Energy. This allows it to balance goal-directed and exploratory behaviours. We implemented AIMAPP as a ROS-compatible system that is sensor and robot-agnostic and integrates with diverse hardware configurations. It operates in a fully self-supervised manner, is resilient to sensor failure, continues operating under odometric drift, and supports both exploration and goal-directed navigation without any pre-training. We evaluate the system in large-scale real and simulated environments against state-of-the-art planning baselines, demonstrating its adaptability to ambiguous observations, environmental changes, and sensor noise. The model offers a modular, self-supervised solution to scalable navigation in unstructured settings. AIMAPP is available at https://github.com/decide-ugent/aimapp.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces AIMAPP, an active-inference framework that unifies mapping, localisation, and planning for autonomous robot navigation within a single generative model. The agent builds and updates a sparse topological map online, learns state-transition probabilities dynamically from raw observations, and selects actions by minimising Expected Free Energy, thereby balancing exploration and goal-directed behaviour. The system is implemented as a ROS-compatible, sensor- and robot-agnostic package that operates fully self-supervised, is claimed to remain resilient to sensor noise, odometric drift, and environmental changes, and is evaluated against planning baselines in large-scale real-world and simulated environments.

Significance. If the central claims are substantiated, the work supplies a modular, cognitively inspired alternative to conventional SLAM-plus-planning pipelines that avoids pre-training, external pose correction, and hand-tuned parameters. The open-source release and hardware-agnostic design would strengthen its utility for scalable navigation in unstructured settings.

major comments (3)
  1. [Abstract] Abstract: the assertion that the system 'continues operating under odometric drift' and remains 'self-supervised' is load-bearing for the unification claim, yet no quantitative metrics (e.g., node-identity consistency, transition-prediction error under controlled drift levels, or loop-closure frequency) are reported to demonstrate that local Dirichlet updates preserve a faithful approximation of the true dynamics.
  2. [Evaluation] Evaluation section: performance is reported against baselines, but the absence of ablation studies isolating the contribution of online transition learning versus the topological map construction, together with missing error bars on success rates and path-length metrics, leaves the resilience and adaptability claims only partially supported.
  3. [Model description] Model description: the manuscript does not supply the explicit update rules or free-energy functional for the joint inference over map structure, transition probabilities, and policy, making it impossible to verify that Expected Free Energy minimisation remains well-defined when node identities become ambiguous under sustained drift.
minor comments (2)
  1. [Notation] Notation for the generative model and free-energy terms should be introduced with a single, self-contained table or appendix to improve readability.
  2. [Figures] Figure captions for the real-world experiments should explicitly state the sensor suite, drift magnitude, and environmental changes tested.
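One of the metrics requested in major comment 1, transition-prediction error, could be measured along the following lines. This Python sketch assumes a reference transition matrix is available (for example from a simulator) and uses the mean total-variation distance between corresponding columns; the metric choice and the function name are illustrative assumptions, not something the paper or the referee specifies.

```python
import numpy as np


def transition_prediction_error(b_learned, b_reference):
    """Mean total-variation distance between learned and reference transition columns.

    Both matrices have shape (n_next_states, n_prev_states) and columns summing to 1.
    Returns 0.0 for identical matrices and 1.0 when every column is maximally wrong.
    """
    learned = np.asarray(b_learned, dtype=float)
    reference = np.asarray(b_reference, dtype=float)
    tv_per_column = 0.5 * np.abs(learned - reference).sum(axis=0)
    return float(tv_per_column.mean())
```

Tracking this value while injecting increasing levels of simulated odometric drift would show whether the learned dynamics degrade gracefully or collapse.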

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below, indicating the revisions we intend to incorporate to strengthen the manuscript.

  1. Referee: [Abstract] Abstract: the assertion that the system 'continues operating under odometric drift' and remains 'self-supervised' is load-bearing for the unification claim, yet no quantitative metrics (e.g., node-identity consistency, transition-prediction error under controlled drift levels, or loop-closure frequency) are reported to demonstrate that local Dirichlet updates preserve a faithful approximation of the true dynamics.

    Authors: We appreciate the referee drawing attention to the need for more targeted quantitative support. While the experimental results demonstrate continued successful navigation and mapping under real odometric drift and sensor noise, we agree that dedicated metrics would better substantiate the claims regarding local Dirichlet updates. In the revised manuscript we will add quantitative evaluations of node-identity consistency and transition-prediction error under controlled levels of simulated drift, together with loop-closure statistics. revision: yes

  2. Referee: [Evaluation] Evaluation section: performance is reported against baselines, but the absence of ablation studies isolating the contribution of online transition learning versus the topological map construction, together with missing error bars on success rates and path-length metrics, leaves the resilience and adaptability claims only partially supported.

    Authors: We concur that ablation studies and statistical reporting would clarify the individual contributions. We will introduce ablations that compare the full model against variants with fixed (non-online) transition probabilities and against variants without dynamic map updates. In addition, all success-rate and path-length results will be reported with error bars or standard deviations across repeated trials in the revised evaluation section. revision: yes

  3. Referee: [Model description] Model description: the manuscript does not supply the explicit update rules or free-energy functional for the joint inference over map structure, transition probabilities, and policy, making it impossible to verify that Expected Free Energy minimisation remains well-defined when node identities become ambiguous under sustained drift.

    Authors: The generative model, state-transition learning via Dirichlet updates, and Expected Free Energy policy selection are described in the Model Description section. However, we acknowledge that the explicit joint update equations and the full free-energy functional were not presented at a level of detail sufficient for independent verification under node ambiguity. We will expand this section (or add an appendix) with the precise update rules for map structure, transition probabilities, and policy, together with a brief analysis of how Expected Free Energy minimisation behaves when node identities are uncertain. revision: yes
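The Dirichlet updates discussed in this exchange can be illustrated with a short Python sketch modelled on the transition update quoted in the appendix excerpt (Equation 9). Reading the first product as an outer product of state beliefs and the remaining ones as element-wise scaling is an assumption, as are the array layout, the function names and the learning-rate value used in the usage note.

```python
import numpy as np


def update_transition_counts(b_counts, q_prev, q_curr, action, lr):
    """Pseudo-count update for one action, in the spirit of Equation 9.

    b_counts : pseudo-counts, shape (n_next_states, n_prev_states, n_actions)
    q_prev   : belief over the previous state, shape (n_states,)
    q_curr   : belief over the current state given the policy, shape (n_states,)
    lr       : situation-dependent learning rate (lambda in the paper)
    """
    joint = np.outer(q_curr, q_prev)                  # evidence for each (next, prev) pair
    updated = b_counts.copy()
    updated[:, :, action] += joint * b_counts[:, :, action] * lr
    return updated


def normalise_columns(b_counts):
    """Turn pseudo-counts into transition probabilities P(next | prev, action)."""
    return b_counts / b_counts.sum(axis=0, keepdims=True)
```

For example, starting from uniform counts over two states, a confident move from state 0 to state 1 with `lr=2.0` raises the count for that transition from 1 to 3, so the normalised probability of reaching state 1 from state 0 becomes 0.75. Because the increment is scaled by the existing count, transitions whose count is zero stay at zero under this reading.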

Circularity Check

0 steps flagged

No significant circularity in the derivation chain.

The paper presents AIMAPP as an integration of mapping, localisation and planning inside a single active-inference generative model that builds a sparse topological map online, learns transitions dynamically and selects actions by minimising Expected Free Energy. These steps are described as direct applications of established active-inference constructs and cognitive-mapping principles rather than as quantities derived from the same fitted parameters or self-citations that are then re-labelled as predictions. The system is evaluated against external planning baselines in both real and simulated environments, and the claims of resilience to drift and sensor noise are supported by empirical runs rather than by internal re-use of the learned quantities. No equation or procedural step in the manuscript reduces a claimed result to an input by construction, so the derivation remains self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework rests on standard active-inference assumptions and the domain premise that a sparse topological graph plus learned transitions suffice for reliable navigation under uncertainty.

axioms (1)
  • domain assumption: A sparse topological representation plus learned transition probabilities is sufficient to support both exploration and goal-directed navigation under sensor noise and drift.
    Invoked throughout the description of map building and planning.

pith-pipeline@v0.9.0 · 5755 in / 1201 out tokens · 30923 ms · 2026-05-18T07:37:52.776160+00:00 · methodology

Reference graph

Works this paper leans on

63 extracted references · 63 canonical work pages · 2 internal anchors

  1. [1]

    Past, present and future of path-planning algorithms for mobile robot navigation in dynamic environments,

    H. S. Hewawasam, M. Y. Ibrahim, and G. K. Appuhamillage, “Past, present and future of path-planning algorithms for mobile robot navigation in dynamic environments,” IEEE Open Journal of the Industrial Electronics Society, vol. 3, pp. 353–365, 2022

  2. [2]

    A survey on reinforcement learning applications in slam,

    M. Dehghani Tezerjani, M. Khoshnazar, M. Tangestanizade, and Q. Yang, “A survey on reinforcement learning applications in slam,” 07 2024

  3. [3]

    Learning to navigate from scratch using world models and curiosity: the good, the bad, and the ugly,

    D. de Tinguy, S. Remmery, P. Mazzaglia, T. Verbelen, and B. Dhoedt, “Learning to navigate from scratch using world models and curiosity: the good, the bad, and the ugly,” 2023

  4. [4]

    Etpnav: Evolving topological planning for vision-language navigation in continuous environments,

    D. An, H. Wang, W. Wang, Z. Wang, Y. Huang, K. He, and L. Wang, “Etpnav: Evolving topological planning for vision-language navigation in continuous environments,” 2024

  5. [5]

    ORB-SLAM3: An accurate open-source library for visual, visual-inertial and multi-map SLAM,

    C. Campos, R. Elvira, J. J. Gomez, J. M. M. Montiel, and J. D. Tardos, “ORB-SLAM3: An accurate open-source library for visual, visual-inertial and multi-map SLAM,” IEEE Transactions on Robotics, vol. 37, no. 6, pp. 1874–1890, 2021

  6. [6]

    Frontier based exploration for autonomous robot,

    A. Topiwala, P. Inani, and A. Kathpal, “Frontier based exploration for autonomous robot,” 2018

  7. [7]

    Learning to explore using active neural slam,

    D. S. Chaplot, D. Gandhi, S. Gupta, A. Gupta, and R. Salakhutdinov, “Learning to explore using active neural slam,” in International Conference on Learning Representations (ICLR), 2020

  8. [8]

    T. Parr, G. Pezzulo, and K. Friston, Active Inference: The Free Energy Principle in Mind, Brain, and Behavior. The MIT Press, 03 2022

  9. [9]

    World model learning and inference,

    K. Friston, R. J. Moran, Y. Nagai, T. Taniguchi, H. Gomi, and J. Tenenbaum, “World model learning and inference,” Neural Networks, vol. 144, pp. 573–590, 2021

  10. [10]

    Learning dynamic cognitive map with autonomous navigation,

    D. de Tinguy, T. Verbelen, and B. Dhoedt, “Learning dynamic cognitive map with autonomous navigation,” Frontiers in Computational Neuroscience, vol. 18, Dec. 2024

  11. [11]

    Fael: Fast autonomous exploration for large-scale environments with a mobile robot,

    J. Huang, B. Zhou, Z. Fan, Y. Zhu, Y. Jie, L. Li, and H. Cheng, “Fael: Fast autonomous exploration for large-scale environments with a mobile robot,” IEEE Robotics and Automation Letters, vol. 8, pp. 1667–1674, 2023

  12. [12]

    Graph-based subterranean exploration path planning using aerial and legged robots,

    T. Dang, M. Tranzatto, S. Khattak, F. Mascarich, K. Alexis, and M. Hutter, “Graph-based subterranean exploration path planning using aerial and legged robots,” Journal of Field Robotics, vol. 37, no. 8, pp. 1363–1388, 2020. Wiley Online Library

  13. [13]

    nav2

    nav2, “nav2,” 2021. Accessed: 2024-12-01

  14. [14]

    Lightweight 3-d localization and mapping for solid-state lidar,

    H. Wang, C. Wang, and L. Xie, “Lightweight 3-d localization and mapping for solid-state lidar,” IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 1801–1807, 2021

  15. [15]

    Structure plp-slam: Efficient sparse mapping and localization using point, line and plane for monocular, rgb-d and stereo cameras,

    F. Shu, J. Wang, A. Pagani, and D. Stricker, “Structure plp-slam: Efficient sparse mapping and localization using point, line and plane for monocular, rgb-d and stereo cameras,” 2023

  16. [16]

    FAST-LIO2: fast direct lidar-inertial odometry,

    W. Xu, Y. Cai, D. He, J. Lin, and F. Zhang, “FAST-LIO2: fast direct lidar-inertial odometry,” CoRR, vol. abs/2107.06829, 2021

  17. [17]

    Review of autonomous path planning algorithms for mobile robots,

    H. Qin, S. Shao, T. Wang, X. Yu, Y. Jiang, and Z. Cao, “Review of autonomous path planning algorithms for mobile robots,” Drones, vol. 7, no. 3, 2023

  18. [18]

    Autonomous mapless navigation on uneven terrains,

    H. Jardali, M. Ali, and L. Liu, “Autonomous mapless navigation on uneven terrains,” 2024 IEEE International Conference on Robotics and Automation (ICRA), pp. 13227–13233, 2024

  19. [19]

    Map induction: Compositional spatial submap learning for efficient exploration in novel environments,

    S. Sharma, A. Curtis, M. Kryven, J. B. Tenenbaum, and I. R. Fiete, “Map induction: Compositional spatial submap learning for efficient exploration in novel environments,” CoRR, vol. abs/2110.12301, 2021

  20. [20]

    Byol-explore: Exploration by bootstrapped prediction,

    Z. D. Guo, S. Thakoor, M. Pîslar, B. A. Pires, F. Altché, C. Tallec, A. Saade, D. Calandriello, J.-B. Grill, Y. Tang, M. Valko, R. Munos, M. G. Azar, and B. Piot, “Byol-explore: Exploration by bootstrapped prediction,” 2022

  21. [21]

    Rapid exploration for open-world navigation with latent goal models,

    D. Shah, B. Eysenbach, G. Kahn, N. Rhinehart, and S. Levine, “Rapid exploration for open-world navigation with latent goal models,” 2023

  22. [22]

    Viking: Vision-based kilometer-scale navigation with geographic hints,

    D. Shah and S. Levine, “Viking: Vision-based kilometer-scale navigation with geographic hints,” in Robotics: Science and Systems XVIII, Robotics: Science and Systems Foundation, June 2022

  23. [23]

    Nomad: Goal masked diffusion policies for navigation and exploration,

    A. Sridhar, D. Shah, C. Glossop, and S. Levine, “Nomad: Goal masked diffusion policies for navigation and exploration,” 2023

  24. [24]

    Discovering and achieving goals via world models,

    R. Mendonca, O. Rybkin, K. Daniilidis, D. Hafner, and D. Pathak, “Discovering and achieving goals via world models,” 2021

  25. [25]

    Learning to Navigate in Complex Environments

    P. Mirowski, R. Pascanu, F. Viola, H. Soyer, A. J. Ballard, A. Banino, M. Denil, R. Goroshin, L. Sifre, K. Kavukcuoglu, D. Kumaran, and R. Hadsell, “Learning to navigate in complex environments,” CoRR, vol. abs/1611.03673, 2016

  26. [26]

    Ratslam: a hippocampal model for simultaneous localization and mapping,

    M. Milford, G. Wyeth, and D. Prasser, “Ratslam: a hippocampal model for simultaneous localization and mapping,” in IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA ’04. 2004, vol. 1, pp. 403–408 Vol.1, 2004

  27. [27]

    Bio-inspired intelligence with applications to robotics: a survey,

    J. Li, Z. Xu, D. Zhu, K. Dong, T. Yan, Z. Zeng, and S. X. Yang, “Bio-inspired intelligence with applications to robotics: a survey,” Intelligence and Robotics, vol. 1, no. 1, 2021

  28. [28]

    Planning and navigation as active inference,

    R. Kaplan and K. Friston, “Planning and navigation as active inference,” bioRxiv, 12 2017

  29. [29]

    Human spatial representation: What we cannot learn from the studies of rodent navigation,

    M. Zhao, “Human spatial representation: What we cannot learn from the studies of rodent navigation,” Journal of Neurophysiology, vol. 120, 08 2018

  30. [30]

    Spatial and temporal hierarchy for autonomous navigation using active inference in minigrid environment,

    D. de Tinguy, T. Van de Maele, T. Verbelen, and B. Dhoedt, “Spatial and temporal hierarchy for autonomous navigation using active inference in minigrid environment,” Entropy, vol. 26, no. 1, p. 32, 2024

  31. [31]

    Structure learning enhances concept formation in synthetic active inference agents,

    V. Neacsu, M. B. Mirza, R. A. Adams, and K. J. Friston, “Structure learning enhances concept formation in synthetic active inference agents,” PLOS ONE, vol. 17, pp. 1–34, 11 2022

  32. [32]

    Exploring action-oriented models via active inference for autonomous vehicles,

    S. Nozari, A. Krayani, P. Marin, L. Marcenaro, D. Martín Gómez, and C. Regazzoni, “Exploring action-oriented models via active inference for autonomous vehicles,” EURASIP Journal on Advances in Signal Processing, vol. 2024, 10 2024

  33. [33]

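    Generalized simultaneous localization and mapping (g-slam) as unification framework for natural and artificial intelligences: towards reverse engineering the hippocampal/entorhinal system and principles of high-level cognition,

    A. Safron, O. Çatal, and T. Verbelen, “Generalized simultaneous localization and mapping (g-slam) as unification framework for natural and artificial intelligences: towards reverse engineering the hippocampal/entorhinal system and principles of high-level cognition,” Frontiers in Systems Neuroscience, vol. 16, 2022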

  34. [34]

    How to build a cognitive map,

    J. C. R. Whittington, D. McCaffary, J. J. W. Bakermans, and T. E. J. Behrens, “How to build a cognitive map,” Nature Neuroscience, vol. 25, no. 10, pp. 1257–1272, 2022

  35. [35]

    Active inference as a theory of sentient behavior,

    G. Pezzulo, T. Parr, and K. Friston, “Active inference as a theory of sentient behavior,” Biological Psychology, vol. 186, p. 108741, 2024

  36. [36]

    Active inference and learning,

    K. Friston, T. FitzGerald, F. Rigoli, P. Schwartenbeck, J. O. Doherty, and G. Pezzulo, “Active inference and learning,” Neuroscience and Biobehavioral Reviews, vol. 68, pp. 862–879, 2016

  37. [37]

    Mice in a labyrinth show rapid learning, sudden insight, and efficient exploration,

    M. Rosenberg, T. Zhang, P. Perona, and M. Meister, “Mice in a labyrinth show rapid learning, sudden insight, and efficient exploration,” eLife, vol. 10, p. e66175, jul 2021

  38. [38]

    A survey of monte carlo tree search methods,

    C. B. Browne, E. Powley, D. Whitehouse, S. M. Lucas, P. I. Cowling, P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, and S. Colton, “A survey of monte carlo tree search methods,” IEEE Transactions on Computational Intelligence and AI in Games, vol. 4, no. 1, pp. 1–43, 2012

  39. [39]

    Deep active inference agents using monte-carlo methods,

    Z. Fountas, N. Sajid, P. A. M. Mediano, and K. Friston, “Deep active inference agents using monte-carlo methods,” 2020

  40. [40]

    Active inference and intentional behaviour,

    K. J. Friston, T. Salvatori, T. Isomura, A. Tschantz, A. Kiefer, T. Verbelen, M. Koudahl, A. Paul, T. Parr, A. Razi, B. Kagan, C. L. Buckley, and M. J. D. Ramstead, “Active inference and intentional behaviour,” 2023

  41. [41]

    Clone-structured graph representations enable flexible learning and vicarious evaluation of cognitive maps,

    D. George, R. Rikhye, N. Gothoskar, J. S. Guntupalli, A. Dedieu, and M. Lázaro-Gredilla, “Clone-structured graph representations enable flexible learning and vicarious evaluation of cognitive maps,” Nature Communications, vol. 12, 04 2021

  42. [42]

    Structuring knowledge with cognitive maps and cognitive graphs,

    M. Peer, I. K. Brunec, N. S. Newcombe, and R. A. Epstein, “Structuring knowledge with cognitive maps and cognitive graphs,” Trends in Cognitive Sciences, vol. 25, no. 1, pp. 37–54, 2021

  43. [43]

    Do humans integrate routes into a cognitive map? map- versus landmark-based navigation of novel shortcuts.,

    P. Foo, W. Warren, A. Duchon, and M. Tarr, “Do humans integrate routes into a cognitive map? map- versus landmark-based navigation of novel shortcuts.,” Journal of Experimental Psychology: Learning, Memory, and Cognition, vol. 31, pp. 195–215, 04 2005

  44. [44]

    The cognitive map in humans: Spatial navigation and beyond,

    R. Epstein, E. Z. Patai, J. Julian, and H. Spiers, “The cognitive map in humans: Spatial navigation and beyond,” Nature Neuroscience, vol. 20, pp. 1504–1513, 10 2017

  45. [45]

    Wormholes in virtual space: From cognitive maps to cognitive graphs,

    W. H. Warren, D. B. Rothman, B. H. Schnapp, and J. D. Ericson, “Wormholes in virtual space: From cognitive maps to cognitive graphs,” Cognition, vol. 166, pp. 152–163, 2017

  46. [46]

    Vision-based place recognition: how low can you go?,

    M. Milford, “Vision-based place recognition: how low can you go?,” The International Journal of Robotics Research, vol. 32, no. 7, pp. 766–789, 2013

  47. [47]

    Object goal navigation using goal-oriented semantic exploration,

    D. S. Chaplot, D. Gandhi, A. Gupta, and R. Salakhutdinov, “Object goal navigation using goal-oriented semantic exploration,” 2020

  48. [48]

    Can an embodied agent find your “cat-shaped mug”? llm-based zero-shot object navigation,

    V. S. Dorbala, J. F. Mullen, and D. Manocha, “Can an embodied agent find your “cat-shaped mug”? llm-based zero-shot object navigation,” IEEE Robotics and Automation Letters, vol. 9, pp. 4083–4090, May 2024

  49. [49]

    aws-robomaker-small-warehouse-world,

    aws-robotics, “aws-robomaker-small-warehouse-world,” 2020. Accessed: 2024-08-01

  50. [50]

    aws-robomaker-small-house-world,

    aws-robotics, “aws-robomaker-small-house-world,” 2021. Accessed: 2024-10-01

  51. [51]

    Autonomous mobile vehicle using ros2 and 2d-lidar and slam navigation,

    P. Gyanani, M. Agarwal, R. Osari, et al., “Autonomous mobile vehicle using ros2 and 2d-lidar and slam navigation,” Research Square, vol. Preprint (Version 1), May 2024. Available at Research Square

  52. [52]

    Team cerberus wins the darpa subterranean challenge: Technical overview and lessons learned,

    M. Tranzatto, M. Dharmadhikari, L. Bernreiter, M. Camurri, S. Khattak, F. Mascarich, P. Pfreundschuh, D. Wisth, S. Zimmermann, M. Kulkarni, V. Reijgwart, B. Casseau, T. Homberger, P. D. Petris, L. Ott, W. Tubby, G. Waibel, H. Nguyen, C. Cadena, R. Buchanan, L. Wellhausen, N. Khedekar, O. Andersson, L. Zhang, T. Miki, T. Dang, M. Mattamala, M. Montenegro,...

  53. [53]

    Voxblox: Incremental 3D Euclidean Signed Distance Fields for On-Board MAV Planning

    H. Oleynikova, Z. Taylor, M. Fehr, J. I. Nieto, and R. Siegwart, “Voxblox: Building 3d signed distance fields for planning,” CoRR, vol. abs/1611.03631, 2016

  54. [54]

    UFOMap: An efficient probabilistic 3D mapping framework that embraces the unknown,

    D. Duberg and P. Jensfelt, “UFOMap: An efficient probabilistic 3D mapping framework that embraces the unknown,” IEEE Robotics and Automation Letters, vol. 5, no. 4, pp. 6411–6418, 2020

  55. [55]

    clearpathrobotics, “Jackal.” Accessed: 2025-07-28

  56. [56]

    rosbotxl

    husarion, “rosbotxl.” Accessed: 2025-07-28

  57. [57]

    Assembly responses of hippocampal ca1 place cells predict learned behavior in goal-directed spatial tasks on the radial eight-arm maze,

    H. Xu, P. Baracskay, J. O’Neill, and J. Csicsvari, “Assembly responses of hippocampal ca1 place cells predict learned behavior in goal-directed spatial tasks on the radial eight-arm maze,” Neuron, vol. 101, no. 1, pp. 119–132.e4, 2019

  58. [58]

    The information geometry of space and time,

    A. Caticha, “The information geometry of space and time,” 2005

  59. [59]

    Grid cell-inspired fragmentation and recall for efficient map building,

    J. Hwang, Z.-W. Hong, E. Chen, A. Boopathy, P. Agrawal, and I. Fiete, “Grid cell-inspired fragmentation and recall for efficient map building,” 2024

  60. [60]

    Efficient autonomous exploration planning of large-scale 3-d environments,

    M. Selin, M. Tiger, D. Duberg, F. Heintz, and P. Jensfelt, “Efficient autonomous exploration planning of large-scale 3-d environments,” IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1699–1706, 2019

  61. [61]

    A message passing realization of expected free energy minimization,

    W. W. L. Nuijten, M. Lukashchuk, T. van de Laar, and B. de Vries, “A message passing realization of expected free energy minimization,” 2025.

    APPENDIX A: MODEL DEFINITION. A. Model Parameters. This section details the parameters used to configure and operate our active inference-based navigation model. These parameters define the structure of the agent’s inter...

  62. [62]

    Transition Update:

    $$B_\pi = B_\pi + Q(s_t \mid s_{t-1}, \pi)\, Q(s_{t-1}) * B_\pi * \lambda \qquad (9)$$

    To update its beliefs about the environment, the agent uses a Dirichlet-based pseudo-count mechanism (Equation 9) with situation-dependent learning rates (λ). These rates vary based on whether the agent:
      • successfully reaches a location,
      • is physically blocked while trying to move,
      • or anticipate...

  63. [63]

    Specifically, we compute a Z-score to measure how strongly the most likely state stands out compared to the others

    Uncertainty About Current State: To determine whether an agent is lost, we evaluate its certainty about the current state. Specifically, we compute a Z-score to measure how strongly the most likely state stands out compared to the others. If this dominance falls below a user-defined threshold (set to 4 in this work), the agent is considered uncertain about...
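The lost-agent check described in this excerpt can be sketched in Python as follows. The threshold of 4 comes from the text above; the exact Z-score definition does not, so this sketch assumes the most likely state is compared against the mean and standard deviation of the remaining states. The function names are hypothetical.

```python
import numpy as np


def state_dominance_zscore(qs, tol=1e-12):
    """How many standard deviations the most likely state stands above the others.

    Assumes the mean and standard deviation are taken over the non-maximal states.
    """
    qs = np.asarray(qs, dtype=float)
    if qs.size < 2:
        return float("inf")                 # a single state is trivially dominant
    top = int(np.argmax(qs))
    others = np.delete(qs, top)
    gap = qs[top] - others.mean()
    spread = others.std()
    if spread < tol:                        # all other states (nearly) equally likely
        return float("inf") if gap > tol else 0.0
    return float(gap / spread)


def is_lost(qs, threshold=4.0):
    """The agent is treated as uncertain when the dominance falls below the threshold."""
    return state_dominance_zscore(qs) < threshold
```

A sharply peaked belief such as `[0.9, 0.04, 0.03, 0.03]` passes the check, while a flat one such as `[0.3, 0.25, 0.25, 0.2]` is flagged as lost.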