HOTFLoc++: End-to-End Hierarchical LiDAR Place Recognition, Re-Ranking, and 6-DoF Metric Localisation in Forests
Pith reviewed 2026-05-17 23:34 UTC · model grok-4.3
The pith
An octree transformer with joint optimisation of recognition, re-ranking and localisation improves forest LiDAR performance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Leveraging an octree-based transformer to extract features at multiple granularities, together with learnable multi-scale geometric verification and joint optimisation of place recognition with re-ranking and localisation, enforces multi-scale geometric consistency and thereby improves convergence and reduces re-ranking failures in forest environments with high clutter and self-similarity.
What carries the argument
Octree-based transformer that produces multi-granularity features, paired with learnable multi-scale geometric verification and a joint training protocol that ties place recognition, re-ranking and 6-DoF localisation together.
Load-bearing premise
That the octree hierarchy plus joint optimisation of place recognition, re-ranking and localisation will enforce multi-scale geometric consistency and thereby improve convergence and reduce re-ranking failures specifically in forest environments with high clutter and self-similarity.
What would settle it
If ablation tests on CS-Wild-Places show that removing the multi-scale re-ranking module fails to cut average localisation error by roughly half or that Recall@1 does not rise by about 30 points, the benefit claimed for the joint hierarchical optimisation would be refuted.
read the original abstract
This article presents HOTFLoc++, an end-to-end hierarchical framework for LiDAR place recognition, re-ranking, and 6-DoF metric localisation in forests. Leveraging an octree-based transformer, our approach extracts features at multiple granularities to increase robustness to clutter, self-similarity, and viewpoint changes in challenging scenarios, including ground-to-ground and ground-to-aerial in forest and urban environments. We propose learnable multi-scale geometric verification to reduce re-ranking failures due to degraded single-scale correspondences. Our joint training protocol enforces multi-scale geometric consistency of the octree hierarchy via joint optimisation of place recognition with re-ranking and localisation, improving place recognition convergence. Our system achieves comparable or lower localisation errors to baselines, with runtime improvements of almost two orders of magnitude over RANSAC-based registration for dense point clouds. Experimental results on public datasets show the superiority of our approach compared to state-of-the-art methods, achieving an average Recall@1 of 90.7% on CS-Wild-Places: an improvement of 29.6 percentage points over baselines, while maintaining high performance on single-source benchmarks with an average Recall@1 of 91.7% and 97.9% on Wild-Places and MulRan, respectively. Our method achieves under 2m and 5$^{\circ}$ error for 97.2% of 6-DoF registration attempts, with our multi-scale re-ranking module reducing localisation errors by ~2x on average. The code is available at https://github.com/csiro-robotics/HOTFLoc.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents HOTFLoc++, an end-to-end hierarchical framework for LiDAR place recognition, re-ranking, and 6-DoF metric localisation in forests. It employs an octree-based transformer to extract multi-granularity features for robustness to clutter and self-similarity, introduces learnable multi-scale geometric verification to reduce re-ranking failures, and uses joint optimisation of place recognition with re-ranking and localisation to enforce multi-scale geometric consistency. Experiments on public datasets claim an average Recall@1 of 90.7% on CS-Wild-Places (29.6 pp improvement), 91.7% on Wild-Places, 97.9% on MulRan, under 2 m / 5° error for 97.2% of registrations, ~2x error reduction from the re-ranking module, and nearly two orders of magnitude runtime improvement over RANSAC.
Significance. If the empirical results hold under fair baselines and proper ablations, the work would represent a meaningful advance for LiDAR localisation in cluttered, self-similar forest environments where single-scale methods often fail. Code release supports reproducibility. The hierarchical octree design and joint training protocol address a practically relevant gap, though the magnitude of gains depends on isolating the contribution of the proposed components.
major comments (3)
- [Experimental Results / §5] The central attribution of the 29.6 pp Recall@1 lift on CS-Wild-Places and the ~2x localisation error reduction to the joint optimisation enforcing multi-scale geometric consistency is not supported by ablations that isolate this term from the backbone or dataset-specific tuning; no such forest-specific ablation isolating the joint loss is described.
- [Abstract and §4 (Joint Training Protocol)] The claim that joint training 'improves place recognition convergence' via multi-scale consistency lacks supporting evidence such as training curves, convergence metrics, or direct comparison of joint vs. staged optimisation on the forest datasets.
- [§5 (Experiments)] Without the full experimental section it is impossible to verify baseline fairness, data splits, statistical significance, or whether any post-hoc exclusions affect the reported 90.7% Recall@1, 97.2% success rate, and runtime claims.
minor comments (2)
- [§3.3] Clarify the exact learnable parameters in the multi-scale geometric verification module and whether they remain fixed or are re-optimised at inference.
- [§3.1] Add explicit discussion of how the octree hierarchy interacts with viewpoint changes in ground-to-aerial scenarios.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback, which highlights opportunities to strengthen the experimental validation of our contributions. We address each major comment below with clarifications and proposed revisions. Where the manuscript lacks explicit isolation of components, we commit to adding the necessary ablations and evidence in the revised version.
read point-by-point responses
-
Referee: [Experimental Results / §5] The central attribution of the 29.6 pp Recall@1 lift on CS-Wild-Places and the ~2x localisation error reduction to the joint optimisation enforcing multi-scale geometric consistency is not supported by ablations that isolate this term from the backbone or dataset-specific tuning; no such forest-specific ablation isolating the joint loss is described.
Authors: We acknowledge that the current ablations in §5.3 focus on the hierarchical octree backbone and multi-scale re-ranking modules but do not isolate the joint loss term specifically on forest data. To directly address this, we will add a new ablation table in the revised §5 comparing joint optimisation against staged training (place recognition first, then re-ranking/localisation) on CS-Wild-Places. This will quantify the incremental Recall@1 gain and error reduction attributable to the joint multi-scale consistency loss, separate from backbone or hyperparameter effects. revision: yes
-
Referee: [Abstract and §4 (Joint Training Protocol)] The claim that joint training 'improves place recognition convergence' via multi-scale consistency lacks supporting evidence such as training curves, convergence metrics, or direct comparison of joint vs. staged optimisation on the forest datasets.
Authors: The manuscript describes the joint training protocol in §4 but does not include training curves or quantitative convergence comparisons. We agree this evidence would better support the claim. In revision we will add training loss and Recall@1 curves (joint vs. staged) for the CS-Wild-Places and Wild-Places datasets in §5 or the supplementary material, showing faster convergence and improved final metrics under joint optimisation. revision: yes
-
Referee: [§5 (Experiments)] Without the full experimental section it is impossible to verify baseline fairness, data splits, statistical significance, or whether any post-hoc exclusions affect the reported 90.7% Recall@1, 97.2% success rate, and runtime claims.
Authors: Section 5 of the manuscript details the datasets, standard splits (following Wild-Places and MulRan protocols, with CS-Wild-Places using the provided cross-season partitions), baseline implementations (official code or re-implementations with matched hyperparameters), and evaluation metrics. All reported queries are included with no post-hoc exclusions. Statistical significance for retrieval is reported as mean over the full test set; for registration we average over 5 random seeds where stochasticity is present. To improve clarity we will insert a concise experimental setup summary table at the start of §5 in the revision. revision: partial
Circularity Check
No significant circularity; claims rest on empirical benchmarks
full rationale
The paper describes an octree-based transformer architecture, learnable multi-scale geometric verification, and a joint training protocol that optimizes place recognition together with re-ranking and localisation. These are presented as design choices whose benefits are measured via Recall@1 and 6-DoF error metrics on external datasets (CS-Wild-Places, Wild-Places, MulRan). No equations, fitted parameters, or self-citations are shown that reduce any central result to its own inputs by construction; the performance numbers are reported directly from experiments rather than derived tautologically from the method definition itself.
Axiom & Free-Parameter Ledger
free parameters (1)
- multi-scale geometric verification parameters
axioms (1)
- domain assumption Octree-based multi-granularity features increase robustness to clutter, self-similarity and viewpoint changes in forests
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/AlexanderDuality.leanalexander_duality_circle_linking unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Leveraging an octree-based transformer, our approach extracts features at multiple granularities... learnable multi-scale geometric verification
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
joint training protocol enforces multi-scale geometric consistency of the octree hierarchy via joint optimisation
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
Paired-CSLiDAR: Height-Stratified Registration for Cross-Source Aerial-Ground LiDAR Pose Refinement
Paired-CSLiDAR benchmark and Residual-Guided Stratified Registration achieve 86% success at 0.75 m RMSE on 9,012 cross-source pairs by height-stratified ICP and confidence-gated selection.
Reference graph
Works this paper leans on
-
[1]
Spectral Geometric Verification: Re-Ranking Point Cloud Retrieval for Metric Localization,
K. Vidanapathirana, P. Moghadam, S. Sridharan, and C. Fookes, “Spectral Geometric Verification: Re-Ranking Point Cloud Retrieval for Metric Localization,”IEEE Robot. Automat. Lett., vol. 8, no. 5, pp. 2494–2501, May 2023
work page 2023
-
[2]
CrossLoc3D: Aerial- Ground Cross-Source 3D Place Recognition,
T. Guan, A. Muthuselvam, M. Hoover, X. Wang, J. Liang, A. J. Sathyamoorthy, D. Conover, and D. Manocha, “CrossLoc3D: Aerial- Ground Cross-Source 3D Place Recognition,”Proc. IEEE/CVF Int. Conf. Comput. Vis., pp. 11 301–11 310, 2023
work page 2023
-
[3]
L. Carvalho de Lima, E. Griffiths, M. Haghighat, S. Denman, C. Fookes, P. Borges, M. Brunig, and M. Ramezani, “Online 6DoF Global Localisation in Forests using Semantically-Guided Re- Localisation and Cross-View Factor-Graph Optimisation,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2025
work page 2025
-
[4]
E. Griffiths, M. Haghighat, S. Denman, C. Fookes, and M. Ramezani, “HOTFormerLoc: Hierarchical Octree Transformer for Versatile Li- dar Place Recognition Across Ground and Aerial Views,” inProc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2025, pp. 6648– 6658
work page 2025
-
[5]
Scan Context: Egocentric Spatial Descriptor for Place Recognition Within 3D Point Cloud Map,
G. Kim and A. Kim, “Scan Context: Egocentric Spatial Descriptor for Place Recognition Within 3D Point Cloud Map,” inProc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2018, pp. 4802–4809
work page 2018
-
[6]
RING++: Roto-Translation Invariant Gram for Global Localization on a Sparse Scan Map,
X. Xu, S. Lu, J. Wu, H. Lu, Q. Zhu, Y . Liao, R. Xiong, and Y . Wang, “RING++: Roto-Translation Invariant Gram for Global Localization on a Sparse Scan Map,”IEEE Trans. Robot., vol. 39, no. 6, pp. 4616– 4635, Dec. 2023
work page 2023
-
[7]
PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition,
M. A. Uy and G. H. Lee, “PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition,” inProc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., June 2018, pp. 4470–4479
work page 2018
-
[8]
DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization,
J. Du, R. Wang, and D. Cremers, “DH3D: Deep Hierarchical 3D Descriptors for Robust Large-Scale 6DoF Relocalization,” inProc. Eur. Conf. Comput. Vis., 2020, pp. 744–762
work page 2020
-
[9]
MinkLoc3D: Point Cloud Based Large-Scale Place Recognition,
J. Komorowski, “MinkLoc3D: Point Cloud Based Large-Scale Place Recognition,” inProc. IEEE Winter Conf. Appl. Comput. Vis., Jan. 2021, pp. 1789–1798
work page 2021
-
[10]
LCDNet: Deep Loop Closure Detection and Point Cloud Registration for LiDAR SLAM,
D. Cattaneo, M. Vaghi, and A. Valada, “LCDNet: Deep Loop Closure Detection and Point Cloud Registration for LiDAR SLAM,”IEEE Trans. Robot., vol. 38, no. 4, pp. 2074–2093, Aug. 2022
work page 2074
-
[11]
LoGG3D-Net: Locally Guided Global Descriptor Learn- ing for 3D Place Recognition,
K. Vidanapathirana, M. Ramezani, P. Moghadam, S. Sridharan, and C. Fookes, “LoGG3D-Net: Locally Guided Global Descriptor Learn- ing for 3D Place Recognition,” inProc. IEEE Int. Conf. Robot. Automat., 2022, pp. 2215–2221
work page 2022
-
[12]
Improving Point Cloud Based Place Recognition with Ranking-based Loss and Large Batch Training,
J. Komorowski, “Improving Point Cloud Based Place Recognition with Ranking-based Loss and Large Batch Training,” in26th Int. Conf. Pattern Recognit.IEEE, 2022, pp. 3699–3705
work page 2022
-
[13]
EgoNN: Egocen- tric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale,
J. Komorowski, M. Wysoczanska, and T. Trzcinski, “EgoNN: Egocen- tric Neural Network for Point Cloud Based 6DoF Relocalization at the City Scale,”IEEE Robot. Automat. Lett., vol. 7, no. 2, pp. 722–729, Apr. 2022
work page 2022
-
[14]
Pyramid Point Cloud Transformer for Large-Scale Place Recognition,
L. Hui, H. Yang, M. Cheng, J. Xie, and J. Yang, “Pyramid Point Cloud Transformer for Large-Scale Place Recognition,” inProc. IEEE/CVF Int. Conf. Comput. Vis., 2021, pp. 6098–6107
work page 2021
-
[15]
TransLoc3D: Point cloud based large-scale place recognition using adaptive receptive fields,
T.-X. Xu, Y .-C. Guo, Z. Li, G. Yu, Y .-K. Lai, and S.-H. Zhang, “TransLoc3D: Point cloud based large-scale place recognition using adaptive receptive fields,”Commun. Inf. Syst., vol. 23, no. 1, pp. 57– 83, 2023
work page 2023
-
[16]
SALSA: Swift Adaptive Lightweight Self-Attention for Enhanced LiDAR Place Recognition,
R. G. Goswami, N. Patel, P. Krishnamurthy, and F. Khorrami, “SALSA: Swift Adaptive Lightweight Self-Attention for Enhanced LiDAR Place Recognition,”IEEE Robot. Autom. Lett., vol. 9, no. 10, pp. 8242–8249, Oct. 2024
work page 2024
-
[17]
Wild-Places: A Large-Scale Dataset for Lidar Place Recognition in Unstructured Natural Environments,
J. Knights, K. Vidanapathirana, M. Ramezani, S. Sridharan, C. Fookes, and P. Moghadam, “Wild-Places: A Large-Scale Dataset for Lidar Place Recognition in Unstructured Natural Environments,” inProc. IEEE Int. Conf. Robot. Automat., 2023, pp. 11 322–11 328
work page 2023
-
[18]
M. A. Fischler and R. C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,”Commun. ACM, vol. 24, no. 6, pp. 381–395, June 1981
work page 1981
-
[19]
PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency,
X. Bai, Z. Luo, L. Zhou, H. Chen, L. Li, Z. Hu, H. Fu, and C.- L. Tai, “PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency,” inProc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., June 2021, pp. 15 854–15 864
work page 2021
-
[20]
CoFiNet: Reliable Coarse-to-fine Correspondences for Robust PointCloud Registration,
H. Yu, F. Li, M. Saleh, B. Busam, and S. Ilic, “CoFiNet: Reliable Coarse-to-fine Correspondences for Robust PointCloud Registration,” inProc. Adv. Neural Inf. Process. Syst., vol. 34, 2021, pp. 23 872– 23 884
work page 2021
-
[21]
GeoTransformer: Fast and Robust Point Cloud Registration With Geometric Transformer,
Z. Qin, H. Yu, C. Wang, Y . Guo, Y . Peng, S. Ilic, D. Hu, and K. Xu, “GeoTransformer: Fast and Robust Point Cloud Registration With Geometric Transformer,”IEEE Trans. Pattern Anal. Machine Intell., vol. 45, no. 8, pp. 9806–9821, Aug. 2023
work page 2023
-
[22]
GeoAdapt: Self-Supervised Test-Time Adaptation in LiDAR Place Recognition Using Geometric Priors,
J. Knights, S. Hausler, S. Sridharan, C. Fookes, and P. Moghadam, “GeoAdapt: Self-Supervised Test-Time Adaptation in LiDAR Place Recognition Using Geometric Priors,”IEEE Robot. Automat. Lett., vol. 9, no. 1, pp. 915–922, Jan. 2024
work page 2024
-
[23]
A spectral technique for correspon- dence problems using pairwise constraints,
M. Leordeanu and M. Hebert, “A spectral technique for correspon- dence problems using pairwise constraints,” inProc. 10th IEEE Int. Conf. Comput. Vis., vol. 2, Oct. 2005, pp. 1482–1489
work page 2005
-
[24]
Super- Glue: Learning Feature Matching With Graph Neural Networks,
P.-E. Sarlin, D. DeTone, T. Malisiewicz, and A. Rabinovich, “Super- Glue: Learning Feature Matching With Graph Neural Networks,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., June 2020, pp. 4937–4946
work page 2020
-
[25]
In Defense of the Triplet Loss for Person Re-Identification
A. Hermans, L. Beyer, and B. Leibe, “In Defense of the Triplet Loss for Person Re-Identification,” Nov. 2017, arXiv:1703.07737 [cs]
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[26]
MulRan: Multimodal Range Dataset for Urban Place Recognition,
G. Kim, Y . S. Park, Y . Cho, J. Jeong, and A. Kim, “MulRan: Multimodal Range Dataset for Urban Place Recognition,” inProc. IEEE Int. Conf. Robot. Automat., 2020, pp. 6246–6253
work page 2020
-
[27]
Sharpness-Aware Training for Free,
J. Du, D. Zhou, J. Feng, V . Tan, and J. T. Zhou, “Sharpness-Aware Training for Free,”Proc. Adv. Neural Inf. Process. Syst., Dec. 2022
work page 2022
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.