TravExplorer: Cross-Floor Embodied Exploration via Traversability-Aware 3-D Planning

Han Zheng; Haoran Liu; Jinghao Wang; Ming Yang; Tong Qin; Yudong Huang; Zhe Chen

REVIEW 3 major objections 2 minor 52 references

TravExplorer couples zero-shot semantic guidance with traversability-aware 3-D planning to search for open-vocabulary targets across multiple floors in unseen buildings.

Reviewed by Pith at T0; open to challenge. T0 means a machine referee read the full paper against a public rubric. the ladder, T0–T4 →

Challenge this review Re-run · record.json Download PDF Read on arXiv ↗

T0 review · grok-4.3

2026-05-20 04:59 UTC pith:7P3WJSNU

load-bearing objection TravExplorer delivers a functional cross-floor navigation system with real-robot tests, though the key traversability mapping step could use more validation against sensor noise. the 3 major comments →

arxiv 2605.19958 v1 pith:7P3WJSNU submitted 2026-05-19 cs.RO

TravExplorer: Cross-Floor Embodied Exploration via Traversability-Aware 3-D Planning

Han Zheng , Zhe Chen , Yudong Huang , Haoran Liu , Jinghao Wang , Ming Yang , Tong Qin This is my paper

classification cs.RO

keywords cross-floor navigationtraversability-aware planningzero-shot object navigationvolumetric mappingembodied explorationhierarchical planneropen-vocabulary searchmulti-floor indoor environments

verification ladder T0 review T1 audit T2 compute T3 formal T4 reserved

The pith

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces TravExplorer as a framework for cross-floor embodied exploration that overcomes single-floor and planar assumptions in existing zero-shot object navigation systems. It builds and maintains a unified volumetric map from online sensor data that separates occupied structures from reachable support surfaces such as floors, stairs, and landings. Traversable frontiers are extracted from connected support surfaces, and a hierarchical planner uses semantic and geometric memories to tour object hypotheses while generating executable cross-floor motions. A lightweight guidance module reduces latency by aligning instance maps with spatial value maps. Large-scale simulations and real-world tests on a quadruped robot show advantages over standard baselines in both single-floor and multi-floor indoor settings without prior maps.

Core claim

TravExplorer maintains a unified volumetric map that distinguishes occupied structures from robot-reachable support surfaces and extracts traversable frontiers from connected support surfaces including floors, stairs, and landings. It aligns a probabilistic instance map from online open-vocabulary segmentation with a spatial value map from fast image-to-text matching, then applies a hierarchical planner for target-aware frontier touring over object hypotheses, traversable frontiers, and stair landmarks, generating motions through foothold-guided 3-D search and vertically constrained local trajectory optimization.

What carries the argument

The unified volumetric map that distinguishes occupied structures from robot-reachable support surfaces, together with the hierarchical planner that performs target-aware frontier touring and foothold-guided 3-D search.

Load-bearing premise

Online sensor data alone suffices to reliably separate occupied structures from robot-reachable support surfaces such as stairs and landings in real buildings.

What would settle it

A sequence of real-building trials in which the robot repeatedly misclassifies stairs as obstacles or fails to reach connected landings from sensor data would show the volumetric map cannot support the claimed cross-floor capability.

Watch this falsifier — get emailed when new claim-graph text bears on it.

If this is right

Robots gain the ability to traverse stairs and landings in multi-level indoor spaces using only online perception.
Open-vocabulary target search becomes feasible across single-floor and cross-floor environments without prior maps or human intervention.
Hierarchical planning over semantic and geometric memories produces consistent gains over representative ObjectNav baselines in thousands of simulated episodes.
FOV-aware active perception resolves incomplete observations that arise during vertical movement between floors.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same traversability distinction could be tested for navigation on outdoor slopes or ramps with elevation changes.
Replacing the current segmentation module with higher-accuracy vision-language models might further lower semantic-reasoning delays.
The foothold-guided local search could be evaluated on legged platforms in warehouses containing mezzanines and catwalks.
Scaling the volumetric map to taller structures would test whether frontier extraction remains efficient as the number of floors increases.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit.

Desk Editor's Note

TravExplorer delivers a functional cross-floor navigation system with real-robot tests, though the key traversability mapping step could use more validation against sensor noise.

read the letter

The main thing to know is that this paper presents a system called TravExplorer that enables robots to search for objects across multiple floors in buildings without prior maps, by using 3D planning that accounts for what surfaces are actually traversable. It builds on existing 3D mapping and open-vocabulary segmentation but adds a traversability-aware approach to extract frontiers from floors, stairs, and landings. The hierarchical planner then decides on target-aware touring and generates motions with foothold guidance and trajectory optimization. They also include a lightweight module to align semantic and spatial maps to cut down on reasoning time. The experiments include a large number of simulated runs on HM3D and MP3D showing advantages over ObjectNav methods, plus fifty real-world tests on a Unitree Go2 robot in single and cross-floor settings. This setup does a good job of addressing the single-floor limitation in previous work. The real-robot validation is a plus, as it shows the system can operate without human help in indoor environments. One soft spot is the reliance on the unified volumetric map to correctly identify reachable support surfaces from noisy online sensor data. In practice, depth sensor issues or complex stair shapes could lead to errors in frontier connectivity, which would affect the planner. The paper does not provide detailed accuracy metrics for this component or ablations showing its impact, so it's difficult to gauge how robust the cross-floor performance really is beyond the overall success rates reported. Another minor point is the lack of error bars or specifics on episode selection in the simulations, which makes the quantitative claims a bit harder to interpret at first glance. Overall, this work is for roboticists focused on embodied exploration and navigation in realistic, multi-level spaces. Readers interested in practical deployments in homes or offices would find the methods useful. I think it deserves a serious referee because it tackles an important practical problem with both simulation scale and real hardware tests, even if additional analysis on the traversability module would help.

Referee Report

3 major / 2 minor

Summary. The paper proposes TravExplorer, a cross-floor embodied exploration framework for zero-shot object navigation in unseen multi-floor indoor environments. It couples open-vocabulary semantic guidance with traversability-aware 3-D planning by maintaining a unified volumetric map that distinguishes occupied structures from robot-reachable support surfaces (floors, stairs, landings) and extracts traversable frontiers. Additional components include FOV-aware active perception, a lightweight guidance module aligning probabilistic instance maps with spatial value maps, and a hierarchical planner performing target-aware frontier touring with foothold-guided 3-D search and vertically constrained trajectory optimization. The work reports performance advantages over ObjectNav baselines in 4,195 simulated episodes on HM3D and MP3D, plus validation via 50 real-world trials on a Unitree Go2 robot without prior maps.

Significance. If the traversability mapping and planning components prove reliable, the framework would address a clear limitation in existing ZSON systems by enabling practical navigation across floors and vertical spaces in real buildings. The integration of geometric memory with open-vocabulary semantics and the commitment to code release are strengths that support reproducibility and extension by the community.

major comments (3)

The experimental results section reports consistent advantages over baselines across 4,195 simulated episodes but provides no error bars, ablation studies on the volumetric map or planner components, or details on episode selection and success criteria. This weakens the ability to evaluate the internal validity of the quantitative claims.
The real-world validation with 50 Unitree Go2 trials is presented as evidence for cross-floor open-vocabulary search, yet no per-component metrics are given on traversability classification accuracy (e.g., precision/recall for reachable vs. occupied voxels) under real sensor noise and partial observations. This is load-bearing for the central claim that the unified volumetric map enables reliable frontier extraction without prior maps.
In the method description of the unified volumetric map and frontier extraction, the approach to distinguishing robot-reachable support surfaces (including stairs and landings) from occupied structures relies on online sensor data alone; however, the handling of depth/LiDAR noise, occlusions, and varying geometries is not specified in sufficient detail to assess robustness.

minor comments (2)

The abstract states 'consistent advantages' without naming the specific metrics (e.g., success rate, SPL, or navigation efficiency) used for comparison with ObjectNav baselines.
Figure captions and method diagrams would benefit from clearer labeling of the hierarchical planner stages and how traversable frontiers connect across floors.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and indicate the revisions planned for the next version.

read point-by-point responses

Referee: The experimental results section reports consistent advantages over baselines across 4,195 simulated episodes but provides no error bars, ablation studies on the volumetric map or planner components, or details on episode selection and success criteria. This weakens the ability to evaluate the internal validity of the quantitative claims.

Authors: We agree that error bars, ablation studies, and clearer experimental details would strengthen the evaluation. In the revised manuscript we will add standard error bars to the success rate and SPL results. We will also include ablation studies that remove the traversability mapping and the hierarchical planner in turn. Episode selection details will be expanded to note random sampling from the HM3D and MP3D validation splits using the standard ObjectNav criteria of reaching within 0.1 m of the target within 500 steps. revision: yes
Referee: The real-world validation with 50 Unitree Go2 trials is presented as evidence for cross-floor open-vocabulary search, yet no per-component metrics are given on traversability classification accuracy (e.g., precision/recall for reachable vs. occupied voxels) under real sensor noise and partial observations. This is load-bearing for the central claim that the unified volumetric map enables reliable frontier extraction without prior maps.

Authors: We acknowledge that voxel-level precision/recall metrics on traversability would provide stronger quantitative support. Our real-world evaluation centered on end-to-end task success across the 50 trials. Obtaining accurate ground-truth voxel labels for reachable versus occupied space in unmapped real environments proved impractical during the study. In revision we will explicitly discuss this limitation and note that the observed successful stair and landing traversals offer qualitative evidence of map utility, while suggesting such granular metrics as future work. revision: partial
Referee: In the method description of the unified volumetric map and frontier extraction, the approach to distinguishing robot-reachable support surfaces (including stairs and landings) from occupied structures relies on online sensor data alone; however, the handling of depth/LiDAR noise, occlusions, and varying geometries is not specified in sufficient detail to assess robustness.

Authors: We thank the referee for highlighting this gap. The revised method section will add explicit descriptions of noise mitigation via probabilistic occupancy updates and voxel filtering, multi-view temporal integration to address occlusions, and geometric heuristics (surface normal consistency and height-continuity checks) used to classify support surfaces. These mechanisms are intended to remain robust across varying stair geometries and partial observations. revision: yes

Circularity Check

0 steps flagged

No circularity detected; empirical system validated against external benchmarks

full rationale

The paper describes an engineering framework for cross-floor navigation using a unified volumetric map, traversability-aware planning, and hierarchical target-aware touring. Claims rest on 4,195 simulated episodes across HM3D/MP3D and 50 real-world Unitree Go2 trials, framed as direct empirical comparisons to ObjectNav baselines. No equations, fitted parameters, or derivations appear that reduce outputs to inputs by construction. No self-citation chains or ansatzes are invoked as load-bearing uniqueness theorems. The central mapping and planning components are presented as independent contributions tested on standard datasets, making the reported advantages self-contained against those external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; full manuscript required to populate the ledger.

pith-pipeline@v0.9.0 · 5815 in / 1086 out tokens · 35826 ms · 2026-05-20T04:59:13.062377+00:00 · methodology

0 comments

read the original abstract

Zero-shot Object Navigation (ZSON) has shown promise for open-vocabulary target search in unseen environments, yet most existing systems remain tied to planar representations and single-floor assumptions. These assumptions become inadequate in real buildings, where navigation involves floors, stairs, landings, and vertically overlapping spaces. This article presents TravExplorer, a cross-floor embodied exploration framework that couples zero-shot semantic guidance with traversability-aware 3-D planning. TravExplorer maintains a unified volumetric map that distinguishes occupied structures from robot-reachable support surfaces and extracts traversable frontiers from connected support surfaces, including floors, stairs, and landings. A FOV-aware active perception strategy further resolves incomplete observations during cross-floor traversal. To reduce semantic-reasoning latency, a lightweight guidance module aligns a probabilistic instance map from online open-vocabulary segmentation with a spatial value map from fast image-to-text matching. Based on these geometric and semantic memories, a hierarchical planner performs target-aware frontier touring over object hypotheses, traversable frontiers, and stair landmarks, and generates executable cross-floor motions through foothold-guided 3-D search and vertically constrained local trajectory optimization. Experiments over 4,195 simulated episodes on HM3D and MP3D demonstrate consistent advantages over representative ObjectNav baselines. Fifty real-world trials on a Unitree Go2 further validate open-vocabulary target search across single-floor and cross-floor indoor environments without prior maps or human intervention. The code will be released at https://github.com/wuyi2121/TravExplorer.

Figures

Figures reproduced from arXiv: 2605.19958 by Han Zheng, Haoran Liu, Jinghao Wang, Ming Yang, Tong Qin, Yudong Huang, Zhe Chen.

**Figure 2.** Figure 2: System overview of TravExplorer. The quadruped robot receives RGB-D and odometry observations and incrementally maintains volumetric occupancy, [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: 3-D traversable frontier representation. Blue, gray, and inflated voxels [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 5.** Figure 5: Traversability-aware spatial value map on a staircase. Image-text [PITH_FULL_IMAGE:figures/full_fig_p006_5.png] view at source ↗

**Figure 6.** Figure 6: Cross-floor exploration process. (a) The robot initializes the map and frontier candidates from early RGB-D observations. (b) TravExplorer expands the [PITH_FULL_IMAGE:figures/full_fig_p007_6.png] view at source ↗

**Figure 7.** Figure 7: Performance comparison on HM3D across single-floor, multi-floor, and [PITH_FULL_IMAGE:figures/full_fig_p010_7.png] view at source ↗

**Figure 9.** Figure 9: Qualitative mapping results in six representative indoor scenes. Each column shows the original scene geometry, the reconstructed traversability map, and [PITH_FULL_IMAGE:figures/full_fig_p011_9.png] view at source ↗

**Figure 10.** Figure 10: Qualitative comparison of frontier representations. (a) TravExplorer [PITH_FULL_IMAGE:figures/full_fig_p011_10.png] view at source ↗

**Figure 11.** Figure 11: Mapping ablation on exploration efficiency. TravExplorer reaches [PITH_FULL_IMAGE:figures/full_fig_p012_11.png] view at source ↗

**Figure 13.** Figure 13: Real-world deployment results of TravExplorer in four representative indoor environments. The first row shows deployment in a multi-floor villa, where [PITH_FULL_IMAGE:figures/full_fig_p013_13.png] view at source ↗

discussion (0)

Reference graph

Works this paper leans on

52 extracted references · 52 canonical work pages · 2 internal anchors

[1]

CoWs on PASTURE: Baselines and benchmarks for language-driven zero-shot object navigation,

S. Y. Gadre, M. Wortsman, G. Ilharco, L. Schmidt, and S. Song, “CoWs on PASTURE: Baselines and benchmarks for language-driven zero-shot object navigation,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 23 171– 23 181

work page 2023
[2]

L3MVN: Leveraging large language models for visual target navigation,

B. Yu, H. Kasaei, and M. Cao, “L3MVN: Leveraging large language models for visual target navigation,” in2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023, pp. 3554– 3560

work page 2023
[3]

VLFM: Vision- language frontier maps for zero-shot semantic navigation,

N. Yokoyama, S. Ha, D. Batra, J. Wang, and B. Bucher, “VLFM: Vision- language frontier maps for zero-shot semantic navigation,” in2024 IEEE International Conference on Robotics and Automation (ICRA), 2024, pp. 42–48

work page 2024
[4]

SG-Nav: Online 3d scene graph prompting for llm-based zero-shot object navigation,

H. Yin, X. Xu, Z. Wu, J. Zhou, and J. Lu, “SG-Nav: Online 3d scene graph prompting for llm-based zero-shot object navigation,” inAdvances in Neural Information Processing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang, Eds., vol. 37, 2024

work page 2024
[5]

ApexNav: An Adaptive Exploration Strategy for Zero-Shot Object Navigation with Target-Centric Semantic Fusion

M. Zhang, Y. Du, C. Wu, J. Zhou, Z. Qi, J. Ma, and B. Zhou, “ApexNav: An adaptive exploration strategy for zero-shot object navigation with target- centric semantic fusion,”CoRR, vol. abs/2504.14478, 2025

work page arXiv 2025
[6]

STRIVE: Structured representation integrating VLM reasoning for efficient object navigation,

H. Zhu, Z. Li, Z. Liu, W. Wang, J. Zhang, J. Francis, and J. Oh, “STRIVE: Structured representation integrating VLM reasoning for efficient object navigation,”CoRR, vol. abs/2505.06729, 2025

work page arXiv 2025
[7]

SysNav: Multi-level systematic cooperation enables real-world, cross-embodiment object navigation,

H. Zhu, Z. Li, Z. Liu, K. Guo, Z. Lin, Y. Cai, G. Chen, C. Lv, W. Wang, J. Oh, and J. Zhang, “SysNav: Multi-level systematic cooperation enables real-world, cross-embodiment object navigation,”CoRR, vol. abs/2603.06914, 2026

work page arXiv 2026
[8]

USS-Nav: Unified spatio-semantic scene graph for lightweight UA V zero-shot object navigation,

W. Gai, Y. Gao, Y. Zhou, Y. Xie, Z. Liu, Y. Wu, X. Zhou, F. Gao, and Z. Meng, “USS-Nav: Unified spatio-semantic scene graph for lightweight UA V zero-shot object navigation,”CoRR, vol. abs/2602.00708, 2026

work page arXiv 2026
[9]

Multi-floor zero-shot object navigation policy,

L. Zhang, H. Wang, E. Xiao, X. Zhang, Q. Zhang, Z. Jiang, and R. Xu, “Multi-floor zero-shot object navigation policy,”CoRR, vol. abs/2409.10906, 2024

work page arXiv 2024
[10]

Stairway to success: An online floor-aware zero-shot object-goal navigation framework via LLM-driven coarse-to-fine exploration,

Z. Gong, R. Li, T. Hu, R. Qiu, L. Kong, L. Zhang, G. Zhao, Y. Ding, and J. Liang, “Stairway to success: An online floor-aware zero-shot object-goal navigation framework via LLM-driven coarse-to-fine exploration,”IEEE Robotics and Automation Letters, 2026

work page 2026
[11]

Efficient global navigational planning in 3-D structures based on point cloud tomography,

B. Yang, J. Cheng, B. Xue, J. Jiao, and M. Liu, “Efficient global navigational planning in 3-D structures based on point cloud tomography,” IEEE/ASME Transactions on Mechatronics, vol. 30, no. 1, pp. 321–332, 2025

work page 2025
[12]

A frontier-based approach for autonomous exploration,

B. Yamauchi, “A frontier-based approach for autonomous exploration,” inProceedings of the 1997 IEEE International Symposium on Com- putational Intelligence in Robotics and Automation (CIRA), 1997, pp. 146–151

work page 1997
[13]

Efficient dense frontier detection for 2-D graph SLAM based on occupancy grid submaps,

J. Orsulic, D. Miklic, and Z. Kovacic, “Efficient dense frontier detection for 2-D graph SLAM based on occupancy grid submaps,”IEEE Robotics and Automation Letters, vol. 4, no. 4, pp. 3569–3576, 2019

work page 2019
[14]

FAEL: Fast autonomous exploration for large-scale environments with a mobile robot,

J. Huang, B. Zhou, Z. Fan, Y. Zhu, Y. Jie, L. Li, and H. Cheng, “FAEL: Fast autonomous exploration for large-scale environments with a mobile robot,”IEEE Robotics and Automation Letters, vol. 8, no. 3, pp. 1667– 1674, 2023

work page 2023
[15]

Rapid autonomous exploration of large-scale environments for ground robots based on region partitioning,

Z. Wen, X. Liu, G. Lu, and J. Liu, “Rapid autonomous exploration of large-scale environments for ground robots based on region partitioning,” inProceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2025, pp. 13 067–13 073

work page 2025
[16]

FUEL: Fast UA V exploration using incremental frontier structure and hierarchical planning,

B. Zhou, Y. Zhang, X. Chen, and S. Shen, “FUEL: Fast UA V exploration using incremental frontier structure and hierarchical planning,”IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 779–786, 2021

work page 2021
[17]

FALCON: Fast autonomous aerial exploration using coverage path guidance,

Y. Zhang, X. Chen, C. M. Feng, B. Zhou, and S. Shen, “FALCON: Fast autonomous aerial exploration using coverage path guidance,”IEEE Transactions on Robotics, vol. 41, pp. 1365–1385, 2025

work page 2025
[18]

3D active metric-semantic SLAM,

Y. Tao, X. Liu, I. Spasojevic, S. Agarwal, and V. Kumar, “3D active metric-semantic SLAM,”IEEE Robotics and Automation Letters, vol. 9, no. 3, pp. 2989–2996, 2024

work page 2024
[19]

EPIC: A lightweight LiDAR- based UA V exploration framework for large-scale scenarios,

S. Geng, Z. Ning, F. Zhang, and B. Zhou, “EPIC: A lightweight LiDAR- based UA V exploration framework for large-scale scenarios,”IEEE Robotics and Automation Letters, vol. 10, no. 5, pp. 5090–5097, 2025

work page 2025
[20]

FLARE: Fast large-scale autonomous exploration guided by unknown regions,

X. Liu, M. Lin, S. Li, G. Xu, Z. Wang, H. Wu, and Y. Liu, “FLARE: Fast large-scale autonomous exploration guided by unknown regions,”IEEE Robotics and Automation Letters, vol. 10, no. 11, pp. 12 197–12 204, 2025

work page 2025
[21]

SMUG planner: A safe multi-goal planner for mobile robots in challenging environments,

C. Chen, J. Frey, P. Arm, and M. Hutter, “SMUG planner: A safe multi-goal planner for mobile robots in challenging environments,”IEEE Robotics and Automation Letters, vol. 8, no. 11, pp. 7170–7177, 2023

work page 2023
[22]

LRAE: Large- region-aware safe and fast autonomous exploration of ground robots for uneven terrains,

Q. Bi, X. Zhang, S. Zhang, R. Wang, L. Li, and J. Yuan, “LRAE: Large- region-aware safe and fast autonomous exploration of ground robots for uneven terrains,”IEEE Robotics and Automation Letters, vol. 9, no. 12, pp. 11 186–11 193, 2024

work page 2024
[23]

Portable planner for enhancing ground robots exploration performance in unstructured environments,

Y. Jia, W. Tang, H. Sun, J. Yang, B. Liu, and C. Wang, “Portable planner for enhancing ground robots exploration performance in unstructured environments,”IEEE Robotics and Automation Letters, vol. 9, no. 11, pp. 9295–9302, 2024

work page 2024
[24]

AAGE: Air- assisted ground robotic autonomous exploration in large-scale unknown environments,

L. Zheng, M. Wei, R. Mei, K. Xu, J. Huang, and H. Cheng, “AAGE: Air- assisted ground robotic autonomous exploration in large-scale unknown environments,”IEEE Transactions on Robotics, vol. 41, pp. 1918–1937, 2025

work page 1918
[25]

Probabilistic terrain mapping for mobile robots with uncertain localization,

P. Fankhauser, M. Bloesch, and M. Hutter, “Probabilistic terrain mapping for mobile robots with uncertain localization,”IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 3019–3026, 2018

work page 2018
[26]

Elevation mapping for locomotion and navigation using GPU,

T. Miki, L. Wellhausen, R. Grandia, F. Jenelten, T. Homberger, and M. Hutter, “Elevation mapping for locomotion and navigation using GPU,” in2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022, pp. 2273–2280

work page 2022
[27]

Perceptive locomotion through nonlinear model-predictive control,

R. Grandia, F. Jenelten, S. Yang, F. Farshidian, and M. Hutter, “Perceptive locomotion through nonlinear model-predictive control,”IEEE Transac- tions on Robotics, vol. 39, no. 5, pp. 3402–3421, 2023

work page 2023
[28]

An efficient trajectory planner for car-like robots on uneven terrain,

L. Xu, K. Chai, Z. Han, H. Liu, C. Xu, Y. Cao, and F. Gao, “An efficient trajectory planner for car-like robots on uneven terrain,” in2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023, pp. 2853–2860

work page 2023
[29]

Towards efficient trajectory generation for ground robots beyond 2D environment,

J. Wang, L. Xu, H. Fu, Z. Meng, C. Xu, Y. Cao, X. Lyu, and F. Gao, “Towards efficient trajectory generation for ground robots beyond 2D environment,”CoRR, vol. abs/2302.03323, 2023

work page arXiv 2023
[30]

Efficient trajectory generation based on traversable planes in 3D complex archi- tectural spaces,

M. Zhang, Z. Tian, Y. Xia, C. Xu, F. Gao, and Y. Cao, “Efficient trajectory generation based on traversable planes in 3D complex archi- tectural spaces,” in2025 IEEE International Conference on Robotics and Automation (ICRA), 2025, pp. 14 513–14 519

work page 2025
[31]

Real- time multilevel terrain-aware path planning for ground mobile robots in large-scale rough terrains,

Y. Li, K. Chen, Y. Wang, W. Zhang, J. Wang, H. Chen, and Y. Liu, “Real- time multilevel terrain-aware path planning for ground mobile robots in large-scale rough terrains,”IEEE Transactions on Robotics, vol. 41, pp. 4159–4179, 2025

work page 2025
[32]

TiFA: A terrain- informed navigation framework for articulated tracked robots in rescue missions,

Y. Wang, W. Zhang, Y. Li, J. Wang, and H. Chen, “TiFA: A terrain- informed navigation framework for articulated tracked robots in rescue missions,”The International Journal of Robotics Research, 2025, on- lineFirst

work page 2025
[33]

Object goal navigation using goal-oriented semantic exploration,

D. S. Chaplot, D. P. Gandhi, A. Gupta, and R. Salakhutdinov, “Object goal navigation using goal-oriented semantic exploration,” inAdvances in Neural Information Processing Systems, vol. 33, 2020, pp. 4247–4258

work page 2020
[34]

PONI: Potential functions for objectgoal navigation with interaction-free learning,

S. K. Ramakrishnan, D. S. Chaplot, Z. Al-Halah, J. Malik, and K. Grauman, “PONI: Potential functions for objectgoal navigation with interaction-free learning,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 18 868– 18 878

work page 2022
[35]

ZSON: Zero-shot object-goal navigation using multimodal goal embed- dings,

A. Majumdar, G. Aggarwal, B. Devnani, J. Hoffman, and D. Batra, “ZSON: Zero-shot object-goal navigation using multimodal goal embed- dings,” inAdvances in Neural Information Processing Systems, vol. 35, 2022

work page 2022
[36]

Yuncong Yang, Han Yang, Jiachen Zhou, Peihao Chen, Hongxin Zhang, Yilun Du, and Chuang Gan

T. Chabal, S. Chen, J. Ponce, and C. Schmid, “FOM-Nav: Frontier-object maps for object goal navigation,”CoRR, vol. abs/2512.01009, 2025

work page arXiv 2025
[37]

OpenFrontier: General Navigation with Visual-Language Grounded Frontiers

E. Padilla-Cerdio, B. Sun, M. Pollefeys, and H. Blum, “OpenFrontier: General navigation with visual-language grounded frontiers,”CoRR, vol. abs/2603.05377, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[38]

Objectnav revisited: On evaluation of embodied agents navigating to objects

D. Batra, A. Gokaslan, A. Kembhavi, O. Maksymets, R. Mottaghi, M. Savva, A. Toshev, and E. Wijmans, “ObjectNav revisited: On evaluation 15 of embodied agents navigating to objects,”CoRR, vol. abs/2006.13171, 2020

work page arXiv 2006
[39]

BLIP-2: Bootstrapping language-image pre-training with frozen image encoders and large language models,

J. Li, D. Li, S. Savarese, and S. C. H. Hoi, “BLIP-2: Bootstrapping language-image pre-training with frozen image encoders and large language models,” inProceedings of the 40th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, vol. 202. PMLR, 2023, pp. 19 730–19 742

work page 2023
[40]

A two-stage optimized next-view planning framework for 3-D unknown environment exploration, and structural reconstruction,

Z. Meng, H. Qin, Z. Chen, X. Chen, H. Sun, F. Lin, and M. H. A. Jr., “A two-stage optimized next-view planning framework for 3-D unknown environment exploration, and structural reconstruction,”IEEE Robotics and Automation Letters, vol. 2, no. 3, pp. 1680–1687, 2017

work page 2017
[41]

Ego-planner: An esdf-free gradient-based local planner for quadrotors,

X. Zhou, Z. Wang, H. Ye, C. Xu, and F. Gao, “Ego-planner: An esdf-free gradient-based local planner for quadrotors,”IEEE Robot. Autom. Lett., vol. 6, no. 2, pp. 478–485, 2021

work page 2021
[42]

Habitat navigation challenge 2023,

Habitat Team, “Habitat navigation challenge 2023,” https://github.com/ facebookresearch/habitat-challenge, 2023, accessed: 2026-05-17

work page 2023
[43]

Habitat-Matterport 3D Dataset (HM3D): 1000 large-scale 3d environments for embodied AI,

S. K. Ramakrishnan, A. Gokaslan, E. Wijmans, O. Maksymets, A. Clegg, J. M. Turner, E. Undersander, W. Galuba, A. Westbury, A. X. Chang, M. Savva, Y. Zhao, and D. Batra, “Habitat-Matterport 3D Dataset (HM3D): 1000 large-scale 3d environments for embodied AI,” inAdvances in Neural Information Processing Systems Datasets and Benchmarks Track, 2021

work page 2021
[44]

Matterport3D: Learning from RGB-D data in indoor environments,

A. X. Chang, A. Dai, T. Funkhouser, M. Halber, M. Nießner, M. Savva, S. Song, A. Zeng, and Y. Zhang, “Matterport3D: Learning from RGB-D data in indoor environments,” in2017 International Conference on 3D Vision (3DV), 2017, pp. 667–676

work page 2017
[45]

ESC: Exploration with soft commonsense constraints for zero-shot object navigation,

K. Zhou, K. Zheng, C. Pryor, Y. Shen, H. Jin, L. Getoor, and X. E. Wang, “ESC: Exploration with soft commonsense constraints for zero-shot object navigation,” inProceedings of the 40th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, vol. 202. PMLR, 2023, pp. 42 829–42 842

work page 2023
[46]

OpenFMNav: Towards open-set zero-shot object navigation via vision-language foundation models,

Y. Kuang, H. Lin, and M. Jiang, “OpenFMNav: Towards open-set zero-shot object navigation via vision-language foundation models,” in Findings of the Association for Computational Linguistics: NAACL 2024. Association for Computational Linguistics, 2024, pp. 338–351

work page 2024
[47]

TriHelper: Zero-shot object navigation with dynamic assistance,

L. Zhang, Q. Zhang, H. Wang, E. Xiao, Z. Jiang, H. Chen, and R. Xu, “TriHelper: Zero-shot object navigation with dynamic assistance,” in2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024

work page 2024
[48]

SAM 3: Segment Anything with Concepts

N. Carion, L. Gustafson, Y.-T. Hu, S. Debnath, R. Hu, D. S. Coll-Vinent, C. Ryali, K. V. Alwala, H. Khedr, A. Huanget al., “SAM 3: Segment anything with concepts,”CoRR, vol. abs/2511.16719, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[49]

An extension of the lin-kernighan-helsgaun tsp solver for constrained traveling salesman and vehicle routing problems,

K. Helsgaun, “An extension of the lin-kernighan-helsgaun tsp solver for constrained traveling salesman and vehicle routing problems,” Roskilde University, Roskilde, Denmark, Technical Report, 2017. [Online]. Available: http://www.akira.ruc.dk/ ∼keld/ research/LKH-3/LKH-3 REPORT.pdf

work page 2017
[50]

InstructNav: Zero-shot system for generic instruction navigation in unexplored environment,

Y. Long, W. Cai, H. Wang, G. Zhan, and H. Dong, “InstructNav: Zero-shot system for generic instruction navigation in unexplored environment,” in Proceedings of The 8th Conference on Robot Learning, ser. Proceedings of Machine Learning Research, vol. 270. PMLR, 2025, pp. 2049–2060

work page 2025
[51]

Fast-lio2: Fast direct lidar- inertial odometry,

W. Xu, Y. Cai, D. He, J. Lin, and F. Zhang, “Fast-lio2: Fast direct lidar- inertial odometry,”IEEE Transactions on Robotics, vol. 38, no. 4, pp. 2053–2073, 2022

work page 2053
[52]

Intel RealSense Depth Camera D435i,

Intel Corporation, “Intel RealSense Depth Camera D435i,” https://www.intel.com/content/www/us/en/products/sku/190004/ intel-realsense-depth-camera-d435i.html, 2026, accessed: 2026-05- 17

work page 2026

[1] [1]

CoWs on PASTURE: Baselines and benchmarks for language-driven zero-shot object navigation,

S. Y. Gadre, M. Wortsman, G. Ilharco, L. Schmidt, and S. Song, “CoWs on PASTURE: Baselines and benchmarks for language-driven zero-shot object navigation,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 23 171– 23 181

work page 2023

[2] [2]

L3MVN: Leveraging large language models for visual target navigation,

B. Yu, H. Kasaei, and M. Cao, “L3MVN: Leveraging large language models for visual target navigation,” in2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023, pp. 3554– 3560

work page 2023

[3] [3]

VLFM: Vision- language frontier maps for zero-shot semantic navigation,

N. Yokoyama, S. Ha, D. Batra, J. Wang, and B. Bucher, “VLFM: Vision- language frontier maps for zero-shot semantic navigation,” in2024 IEEE International Conference on Robotics and Automation (ICRA), 2024, pp. 42–48

work page 2024

[4] [4]

SG-Nav: Online 3d scene graph prompting for llm-based zero-shot object navigation,

H. Yin, X. Xu, Z. Wu, J. Zhou, and J. Lu, “SG-Nav: Online 3d scene graph prompting for llm-based zero-shot object navigation,” inAdvances in Neural Information Processing Systems, A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang, Eds., vol. 37, 2024

work page 2024

[5] [5]

ApexNav: An Adaptive Exploration Strategy for Zero-Shot Object Navigation with Target-Centric Semantic Fusion

M. Zhang, Y. Du, C. Wu, J. Zhou, Z. Qi, J. Ma, and B. Zhou, “ApexNav: An adaptive exploration strategy for zero-shot object navigation with target- centric semantic fusion,”CoRR, vol. abs/2504.14478, 2025

work page arXiv 2025

[6] [6]

STRIVE: Structured representation integrating VLM reasoning for efficient object navigation,

H. Zhu, Z. Li, Z. Liu, W. Wang, J. Zhang, J. Francis, and J. Oh, “STRIVE: Structured representation integrating VLM reasoning for efficient object navigation,”CoRR, vol. abs/2505.06729, 2025

work page arXiv 2025

[7] [7]

SysNav: Multi-level systematic cooperation enables real-world, cross-embodiment object navigation,

H. Zhu, Z. Li, Z. Liu, K. Guo, Z. Lin, Y. Cai, G. Chen, C. Lv, W. Wang, J. Oh, and J. Zhang, “SysNav: Multi-level systematic cooperation enables real-world, cross-embodiment object navigation,”CoRR, vol. abs/2603.06914, 2026

work page arXiv 2026

[8] [8]

USS-Nav: Unified spatio-semantic scene graph for lightweight UA V zero-shot object navigation,

W. Gai, Y. Gao, Y. Zhou, Y. Xie, Z. Liu, Y. Wu, X. Zhou, F. Gao, and Z. Meng, “USS-Nav: Unified spatio-semantic scene graph for lightweight UA V zero-shot object navigation,”CoRR, vol. abs/2602.00708, 2026

work page arXiv 2026

[9] [9]

Multi-floor zero-shot object navigation policy,

L. Zhang, H. Wang, E. Xiao, X. Zhang, Q. Zhang, Z. Jiang, and R. Xu, “Multi-floor zero-shot object navigation policy,”CoRR, vol. abs/2409.10906, 2024

work page arXiv 2024

[10] [10]

Stairway to success: An online floor-aware zero-shot object-goal navigation framework via LLM-driven coarse-to-fine exploration,

Z. Gong, R. Li, T. Hu, R. Qiu, L. Kong, L. Zhang, G. Zhao, Y. Ding, and J. Liang, “Stairway to success: An online floor-aware zero-shot object-goal navigation framework via LLM-driven coarse-to-fine exploration,”IEEE Robotics and Automation Letters, 2026

work page 2026

[11] [11]

Efficient global navigational planning in 3-D structures based on point cloud tomography,

B. Yang, J. Cheng, B. Xue, J. Jiao, and M. Liu, “Efficient global navigational planning in 3-D structures based on point cloud tomography,” IEEE/ASME Transactions on Mechatronics, vol. 30, no. 1, pp. 321–332, 2025

work page 2025

[12] [12]

A frontier-based approach for autonomous exploration,

B. Yamauchi, “A frontier-based approach for autonomous exploration,” inProceedings of the 1997 IEEE International Symposium on Com- putational Intelligence in Robotics and Automation (CIRA), 1997, pp. 146–151

work page 1997

[13] [13]

Efficient dense frontier detection for 2-D graph SLAM based on occupancy grid submaps,

J. Orsulic, D. Miklic, and Z. Kovacic, “Efficient dense frontier detection for 2-D graph SLAM based on occupancy grid submaps,”IEEE Robotics and Automation Letters, vol. 4, no. 4, pp. 3569–3576, 2019

work page 2019

[14] [14]

FAEL: Fast autonomous exploration for large-scale environments with a mobile robot,

J. Huang, B. Zhou, Z. Fan, Y. Zhu, Y. Jie, L. Li, and H. Cheng, “FAEL: Fast autonomous exploration for large-scale environments with a mobile robot,”IEEE Robotics and Automation Letters, vol. 8, no. 3, pp. 1667– 1674, 2023

work page 2023

[15] [15]

Rapid autonomous exploration of large-scale environments for ground robots based on region partitioning,

Z. Wen, X. Liu, G. Lu, and J. Liu, “Rapid autonomous exploration of large-scale environments for ground robots based on region partitioning,” inProceedings of the IEEE International Conference on Robotics and Automation (ICRA), 2025, pp. 13 067–13 073

work page 2025

[16] [16]

FUEL: Fast UA V exploration using incremental frontier structure and hierarchical planning,

B. Zhou, Y. Zhang, X. Chen, and S. Shen, “FUEL: Fast UA V exploration using incremental frontier structure and hierarchical planning,”IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 779–786, 2021

work page 2021

[17] [17]

FALCON: Fast autonomous aerial exploration using coverage path guidance,

Y. Zhang, X. Chen, C. M. Feng, B. Zhou, and S. Shen, “FALCON: Fast autonomous aerial exploration using coverage path guidance,”IEEE Transactions on Robotics, vol. 41, pp. 1365–1385, 2025

work page 2025

[18] [18]

3D active metric-semantic SLAM,

Y. Tao, X. Liu, I. Spasojevic, S. Agarwal, and V. Kumar, “3D active metric-semantic SLAM,”IEEE Robotics and Automation Letters, vol. 9, no. 3, pp. 2989–2996, 2024

work page 2024

[19] [19]

EPIC: A lightweight LiDAR- based UA V exploration framework for large-scale scenarios,

S. Geng, Z. Ning, F. Zhang, and B. Zhou, “EPIC: A lightweight LiDAR- based UA V exploration framework for large-scale scenarios,”IEEE Robotics and Automation Letters, vol. 10, no. 5, pp. 5090–5097, 2025

work page 2025

[20] [20]

FLARE: Fast large-scale autonomous exploration guided by unknown regions,

X. Liu, M. Lin, S. Li, G. Xu, Z. Wang, H. Wu, and Y. Liu, “FLARE: Fast large-scale autonomous exploration guided by unknown regions,”IEEE Robotics and Automation Letters, vol. 10, no. 11, pp. 12 197–12 204, 2025

work page 2025

[21] [21]

SMUG planner: A safe multi-goal planner for mobile robots in challenging environments,

C. Chen, J. Frey, P. Arm, and M. Hutter, “SMUG planner: A safe multi-goal planner for mobile robots in challenging environments,”IEEE Robotics and Automation Letters, vol. 8, no. 11, pp. 7170–7177, 2023

work page 2023

[22] [22]

LRAE: Large- region-aware safe and fast autonomous exploration of ground robots for uneven terrains,

Q. Bi, X. Zhang, S. Zhang, R. Wang, L. Li, and J. Yuan, “LRAE: Large- region-aware safe and fast autonomous exploration of ground robots for uneven terrains,”IEEE Robotics and Automation Letters, vol. 9, no. 12, pp. 11 186–11 193, 2024

work page 2024

[23] [23]

Portable planner for enhancing ground robots exploration performance in unstructured environments,

Y. Jia, W. Tang, H. Sun, J. Yang, B. Liu, and C. Wang, “Portable planner for enhancing ground robots exploration performance in unstructured environments,”IEEE Robotics and Automation Letters, vol. 9, no. 11, pp. 9295–9302, 2024

work page 2024

[24] [24]

AAGE: Air- assisted ground robotic autonomous exploration in large-scale unknown environments,

L. Zheng, M. Wei, R. Mei, K. Xu, J. Huang, and H. Cheng, “AAGE: Air- assisted ground robotic autonomous exploration in large-scale unknown environments,”IEEE Transactions on Robotics, vol. 41, pp. 1918–1937, 2025

work page 1918

[25] [25]

Probabilistic terrain mapping for mobile robots with uncertain localization,

P. Fankhauser, M. Bloesch, and M. Hutter, “Probabilistic terrain mapping for mobile robots with uncertain localization,”IEEE Robotics and Automation Letters, vol. 3, no. 4, pp. 3019–3026, 2018

work page 2018

[26] [26]

Elevation mapping for locomotion and navigation using GPU,

T. Miki, L. Wellhausen, R. Grandia, F. Jenelten, T. Homberger, and M. Hutter, “Elevation mapping for locomotion and navigation using GPU,” in2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022, pp. 2273–2280

work page 2022

[27] [27]

Perceptive locomotion through nonlinear model-predictive control,

R. Grandia, F. Jenelten, S. Yang, F. Farshidian, and M. Hutter, “Perceptive locomotion through nonlinear model-predictive control,”IEEE Transac- tions on Robotics, vol. 39, no. 5, pp. 3402–3421, 2023

work page 2023

[28] [28]

An efficient trajectory planner for car-like robots on uneven terrain,

L. Xu, K. Chai, Z. Han, H. Liu, C. Xu, Y. Cao, and F. Gao, “An efficient trajectory planner for car-like robots on uneven terrain,” in2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023, pp. 2853–2860

work page 2023

[29] [29]

Towards efficient trajectory generation for ground robots beyond 2D environment,

J. Wang, L. Xu, H. Fu, Z. Meng, C. Xu, Y. Cao, X. Lyu, and F. Gao, “Towards efficient trajectory generation for ground robots beyond 2D environment,”CoRR, vol. abs/2302.03323, 2023

work page arXiv 2023

[30] [30]

Efficient trajectory generation based on traversable planes in 3D complex archi- tectural spaces,

M. Zhang, Z. Tian, Y. Xia, C. Xu, F. Gao, and Y. Cao, “Efficient trajectory generation based on traversable planes in 3D complex archi- tectural spaces,” in2025 IEEE International Conference on Robotics and Automation (ICRA), 2025, pp. 14 513–14 519

work page 2025

[31] [31]

Real- time multilevel terrain-aware path planning for ground mobile robots in large-scale rough terrains,

Y. Li, K. Chen, Y. Wang, W. Zhang, J. Wang, H. Chen, and Y. Liu, “Real- time multilevel terrain-aware path planning for ground mobile robots in large-scale rough terrains,”IEEE Transactions on Robotics, vol. 41, pp. 4159–4179, 2025

work page 2025

[32] [32]

TiFA: A terrain- informed navigation framework for articulated tracked robots in rescue missions,

Y. Wang, W. Zhang, Y. Li, J. Wang, and H. Chen, “TiFA: A terrain- informed navigation framework for articulated tracked robots in rescue missions,”The International Journal of Robotics Research, 2025, on- lineFirst

work page 2025

[33] [33]

Object goal navigation using goal-oriented semantic exploration,

D. S. Chaplot, D. P. Gandhi, A. Gupta, and R. Salakhutdinov, “Object goal navigation using goal-oriented semantic exploration,” inAdvances in Neural Information Processing Systems, vol. 33, 2020, pp. 4247–4258

work page 2020

[34] [34]

PONI: Potential functions for objectgoal navigation with interaction-free learning,

S. K. Ramakrishnan, D. S. Chaplot, Z. Al-Halah, J. Malik, and K. Grauman, “PONI: Potential functions for objectgoal navigation with interaction-free learning,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 18 868– 18 878

work page 2022

[35] [35]

ZSON: Zero-shot object-goal navigation using multimodal goal embed- dings,

A. Majumdar, G. Aggarwal, B. Devnani, J. Hoffman, and D. Batra, “ZSON: Zero-shot object-goal navigation using multimodal goal embed- dings,” inAdvances in Neural Information Processing Systems, vol. 35, 2022

work page 2022

[36] [36]

Yuncong Yang, Han Yang, Jiachen Zhou, Peihao Chen, Hongxin Zhang, Yilun Du, and Chuang Gan

T. Chabal, S. Chen, J. Ponce, and C. Schmid, “FOM-Nav: Frontier-object maps for object goal navigation,”CoRR, vol. abs/2512.01009, 2025

work page arXiv 2025

[37] [37]

OpenFrontier: General Navigation with Visual-Language Grounded Frontiers

E. Padilla-Cerdio, B. Sun, M. Pollefeys, and H. Blum, “OpenFrontier: General navigation with visual-language grounded frontiers,”CoRR, vol. abs/2603.05377, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[38] [38]

Objectnav revisited: On evaluation of embodied agents navigating to objects

D. Batra, A. Gokaslan, A. Kembhavi, O. Maksymets, R. Mottaghi, M. Savva, A. Toshev, and E. Wijmans, “ObjectNav revisited: On evaluation 15 of embodied agents navigating to objects,”CoRR, vol. abs/2006.13171, 2020

work page arXiv 2006

[39] [39]

BLIP-2: Bootstrapping language-image pre-training with frozen image encoders and large language models,

J. Li, D. Li, S. Savarese, and S. C. H. Hoi, “BLIP-2: Bootstrapping language-image pre-training with frozen image encoders and large language models,” inProceedings of the 40th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, vol. 202. PMLR, 2023, pp. 19 730–19 742

work page 2023

[40] [40]

A two-stage optimized next-view planning framework for 3-D unknown environment exploration, and structural reconstruction,

Z. Meng, H. Qin, Z. Chen, X. Chen, H. Sun, F. Lin, and M. H. A. Jr., “A two-stage optimized next-view planning framework for 3-D unknown environment exploration, and structural reconstruction,”IEEE Robotics and Automation Letters, vol. 2, no. 3, pp. 1680–1687, 2017

work page 2017

[41] [41]

Ego-planner: An esdf-free gradient-based local planner for quadrotors,

X. Zhou, Z. Wang, H. Ye, C. Xu, and F. Gao, “Ego-planner: An esdf-free gradient-based local planner for quadrotors,”IEEE Robot. Autom. Lett., vol. 6, no. 2, pp. 478–485, 2021

work page 2021

[42] [42]

Habitat navigation challenge 2023,

Habitat Team, “Habitat navigation challenge 2023,” https://github.com/ facebookresearch/habitat-challenge, 2023, accessed: 2026-05-17

work page 2023

[43] [43]

Habitat-Matterport 3D Dataset (HM3D): 1000 large-scale 3d environments for embodied AI,

S. K. Ramakrishnan, A. Gokaslan, E. Wijmans, O. Maksymets, A. Clegg, J. M. Turner, E. Undersander, W. Galuba, A. Westbury, A. X. Chang, M. Savva, Y. Zhao, and D. Batra, “Habitat-Matterport 3D Dataset (HM3D): 1000 large-scale 3d environments for embodied AI,” inAdvances in Neural Information Processing Systems Datasets and Benchmarks Track, 2021

work page 2021

[44] [44]

Matterport3D: Learning from RGB-D data in indoor environments,

A. X. Chang, A. Dai, T. Funkhouser, M. Halber, M. Nießner, M. Savva, S. Song, A. Zeng, and Y. Zhang, “Matterport3D: Learning from RGB-D data in indoor environments,” in2017 International Conference on 3D Vision (3DV), 2017, pp. 667–676

work page 2017

[45] [45]

ESC: Exploration with soft commonsense constraints for zero-shot object navigation,

K. Zhou, K. Zheng, C. Pryor, Y. Shen, H. Jin, L. Getoor, and X. E. Wang, “ESC: Exploration with soft commonsense constraints for zero-shot object navigation,” inProceedings of the 40th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, vol. 202. PMLR, 2023, pp. 42 829–42 842

work page 2023

[46] [46]

OpenFMNav: Towards open-set zero-shot object navigation via vision-language foundation models,

Y. Kuang, H. Lin, and M. Jiang, “OpenFMNav: Towards open-set zero-shot object navigation via vision-language foundation models,” in Findings of the Association for Computational Linguistics: NAACL 2024. Association for Computational Linguistics, 2024, pp. 338–351

work page 2024

[47] [47]

TriHelper: Zero-shot object navigation with dynamic assistance,

L. Zhang, Q. Zhang, H. Wang, E. Xiao, Z. Jiang, H. Chen, and R. Xu, “TriHelper: Zero-shot object navigation with dynamic assistance,” in2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024

work page 2024

[48] [48]

SAM 3: Segment Anything with Concepts

N. Carion, L. Gustafson, Y.-T. Hu, S. Debnath, R. Hu, D. S. Coll-Vinent, C. Ryali, K. V. Alwala, H. Khedr, A. Huanget al., “SAM 3: Segment anything with concepts,”CoRR, vol. abs/2511.16719, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[49] [49]

An extension of the lin-kernighan-helsgaun tsp solver for constrained traveling salesman and vehicle routing problems,

K. Helsgaun, “An extension of the lin-kernighan-helsgaun tsp solver for constrained traveling salesman and vehicle routing problems,” Roskilde University, Roskilde, Denmark, Technical Report, 2017. [Online]. Available: http://www.akira.ruc.dk/ ∼keld/ research/LKH-3/LKH-3 REPORT.pdf

work page 2017

[50] [50]

InstructNav: Zero-shot system for generic instruction navigation in unexplored environment,

Y. Long, W. Cai, H. Wang, G. Zhan, and H. Dong, “InstructNav: Zero-shot system for generic instruction navigation in unexplored environment,” in Proceedings of The 8th Conference on Robot Learning, ser. Proceedings of Machine Learning Research, vol. 270. PMLR, 2025, pp. 2049–2060

work page 2025

[51] [51]

Fast-lio2: Fast direct lidar- inertial odometry,

W. Xu, Y. Cai, D. He, J. Lin, and F. Zhang, “Fast-lio2: Fast direct lidar- inertial odometry,”IEEE Transactions on Robotics, vol. 38, no. 4, pp. 2053–2073, 2022

work page 2053

[52] [52]

Intel RealSense Depth Camera D435i,

Intel Corporation, “Intel RealSense Depth Camera D435i,” https://www.intel.com/content/www/us/en/products/sku/190004/ intel-realsense-depth-camera-d435i.html, 2026, accessed: 2026-05- 17

work page 2026