pith. machine review for the scientific record.

arxiv: 2604.11355 · v1 · submitted 2026-04-13 · 💻 cs.CV


LEADER: Learning Reliable Local-to-Global Correspondences for LiDAR Relocalization


Pith reviewed 2026-05-10 16:08 UTC · model grok-4.3

classification 💻 cs.CV
keywords LiDAR relocalization · 6-DoF pose estimation · geometric encoder · reliability loss · point cloud · outlier robustness · direct regression

The pith

A geometric encoder and truncated reliability loss improve LiDAR pose estimation by downweighting unreliable point predictions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces LEADER to address noise and outliers in learning-based LiDAR relocalization, where methods predict global 6-DoF poses directly without storing maps. It proposes a Robust Projection-based Geometric Encoder that extracts multi-scale geometric features from point clouds and pairs it with a Truncated Relative Reliability loss that reduces the weight of ambiguous or outlier predictions. This combination aims to produce more trustworthy local-to-global correspondences than treating every point equally. A reader would care because more robust pose estimates could support reliable navigation and mapping in complex real-world 3D environments such as urban driving.

Core claim

LEADER presents a Robust Projection-based Geometric Encoder that captures multi-scale geometric features to strengthen descriptiveness, together with a Truncated Relative Reliability loss that explicitly models point-wise ambiguity and downweights unreliable predictions, resulting in lower position and orientation errors than prior regression methods on the Oxford RobotCar and NCLT datasets.

What carries the argument

The Robust Projection-based Geometric Encoder, which projects LiDAR points to capture multi-scale geometric structure, and the Truncated Relative Reliability loss, which penalizes only sufficiently reliable point predictions while ignoring the rest.
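The paper's exact loss formulation is not reproduced in this review, but the truncation idea it describes can be sketched. In the minimal sketch below, the function name, the threshold `tau`, and the per-point L2 scene-coordinate error are illustrative assumptions, not the paper's definitions:

```python
import numpy as np

def truncated_reliability_loss(pred_coords, gt_coords, reliability, tau=0.5):
    """Illustrative sketch: weight per-point errors by a predicted
    reliability score and drop (truncate) points below threshold tau.
    Names and the L2 error choice are assumptions, not the paper's."""
    errors = np.linalg.norm(pred_coords - gt_coords, axis=1)  # per-point L2 error, shape (N,)
    mask = reliability >= tau          # keep only sufficiently reliable points
    if not mask.any():
        return 0.0                     # every point truncated: no training signal
    weights = reliability[mask] / reliability[mask].sum()  # renormalize kept weights
    return float((weights * errors[mask]).sum())
```

Points below `tau` contribute nothing, so a handful of gross outliers cannot dominate training; the threshold itself is the free parameter flagged in the ledger below.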

If this is right

  • Direct regression methods can achieve lower pose error without explicit map storage when unreliable points are explicitly downweighted.
  • Multi-scale geometric features extracted via projection improve the descriptiveness of local point representations for global pose regression.
  • Truncating the loss at a reliability threshold mitigates the effect of outliers that would otherwise dominate the training signal.
  • The same architecture yields measurable gains on both the Oxford RobotCar and NCLT benchmarks under standard evaluation protocols.
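The projection step itself is not specified in this review. A common choice for LiDAR is a spherical range-image projection, applied at several resolutions to obtain multi-scale features; the sketch below is one such projection under assumed field-of-view parameters, not the paper's encoder:

```python
import numpy as np

def spherical_range_image(points, h=32, w=512, fov_up=15.0, fov_down=-25.0):
    """Illustrative sketch of one projection scale: map 3-D LiDAR points
    (N, 3) onto an (h, w) range image via spherical coordinates. The
    field-of-view values are assumptions, not from the paper."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)                 # range per point
    yaw = np.arctan2(y, x)                             # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-8), -1.0, 1.0))
    fov_up_r, fov_down_r = np.radians(fov_up), np.radians(fov_down)
    u = ((yaw + np.pi) / (2 * np.pi) * w).astype(int) % w        # column index
    v = (fov_up_r - pitch) / (fov_up_r - fov_down_r) * h         # row index
    v = np.clip(v.astype(int), 0, h - 1)
    img = np.zeros((h, w))
    img[v, u] = r      # later points overwrite earlier ones at the same pixel
    return img
```

Calling this at, say, (32, 512) and (16, 256) gives a crude two-scale pyramid; production encoders typically keep the nearest return per pixel rather than the last one written.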

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The reliability-weighting idea could transfer to other regression tasks that predict continuous values from noisy sensor data.
  • Combining the encoder with explicit mapping modules might further reduce drift in long-term localization pipelines.
  • The projection-based design suggests a route to lighter models that still retain geometric detail without full 3-D convolutions.

Load-bearing premise

The reported error reductions are produced by the new encoder and truncated loss rather than by hidden differences in training protocols, data preprocessing, or hyperparameter choices versus the baselines.

What would settle it

Re-implement the baselines using the exact same training schedule, data splits, and preprocessing pipeline as LEADER and check whether the position-error reductions of 24.1% and 73.9% disappear.
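For reference, the arithmetic behind a quoted relative reduction is simple; the absolute error values in the comments below are invented placeholders chosen to reproduce the quoted percentages, not numbers from the paper:

```python
def relative_reduction(baseline_err, new_err):
    """Percent reduction of an error metric relative to a baseline."""
    return 100.0 * (baseline_err - new_err) / baseline_err

# Invented absolute errors that would reproduce the quoted percentages:
# a 2.0 m baseline falling to 1.518 m is a 24.1% reduction,
# and a 10.0 m baseline falling to 2.61 m is a 73.9% reduction.
```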

Figures

Figures reproduced from arXiv: 2604.11355 by Chenglu Wen, Cheng Wang, Dunqiang Liu, Jianshi Wu, Minghang Zhu, Sheng Ao, Siqi Shen, Wen Li.

Figure 1. Mean position error comparisons on NCLT and Oxford.
Figure 2. The pipeline of the proposed LEADER. Raw point clouds undergo spatial transformation to establish yaw-invariant spatial …
Figure 3. Comparison of point cloud density distribution.
Figure 4. Visualization results of part of the methods on trajectory 17-13-26-39 in the Quality-enhanced Oxford dataset. The black and red …
Figure 5. Visualization results of part of the methods on trajectory 2012-05-26. The black and red points represent the true and predicted …
Figure 6. All training trajectories and the 2012-05-26 test trajectory.
Figure 7. Cumulative distribution of position errors on NCLT.
Figure 9. Average Scene Point Error vs. Reliability Score Per …
Figure 11. Overview of the proposed encoder architecture.
Figure 13. Visualization of Reliability scores. Points are color …
Figure 14. Trajectory visualization of the NCLT dataset.
Figure 15. Ground truth trajectory visualization of Quality …
Original abstract

LiDAR relocalization has attracted increasing attention as it can deliver accurate 6-DoF pose estimation in complex 3D environments. Recent learning-based regression methods offer efficient solutions by directly predicting global poses without the need for explicit map storage. However, these methods often struggle in challenging scenes due to their equal treatment of all predicted points, which is vulnerable to noise and outliers. In this paper, we propose LEADER, a robust LiDAR-based relocalization framework enhanced by a simple, yet effective geometric encoder. Specifically, a Robust Projection-based Geometric Encoder architecture which captures multi-scale geometric features is first presented to enhance descriptiveness in geometric representation. A Truncated Relative Reliability loss is then formulated to model point-wise ambiguity and mitigate the influence of unreliable predictions. Extensive experiments on the Oxford RobotCar and NCLT datasets demonstrate that LEADER outperforms state-of-the-art methods, achieving 24.1% and 73.9% relative reductions in position error over existing techniques, respectively. The source code is released on https://github.com/JiansW/LEADER.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces LEADER, a learning-based framework for LiDAR relocalization that directly regresses 6-DoF poses. It proposes a Robust Projection-based Geometric Encoder to extract multi-scale geometric features from point clouds and a Truncated Relative Reliability loss to down-weight unreliable point predictions. Experiments on the Oxford RobotCar and NCLT datasets report that LEADER achieves 24.1% and 73.9% relative reductions in position error over prior state-of-the-art methods, with source code released.

Significance. If the reported gains are shown to arise specifically from the new encoder and loss under controlled conditions, the work would strengthen direct regression approaches for LiDAR relocalization by addressing noise and outlier sensitivity. The public code release is a clear positive for reproducibility and future comparisons.

major comments (2)
  1. [Experiments] Experiments section: the central performance claims rest on 24.1% and 73.9% relative position-error reductions, yet the manuscript provides no explicit statement that every baseline was re-implemented, trained from scratch, and evaluated under identical data splits, point-cloud preprocessing, augmentation, optimizer schedule, and hardware as LEADER. Without this protocol equivalence, the improvements cannot be confidently attributed to the Robust Projection-based Geometric Encoder or Truncated Relative Reliability loss.
  2. [Abstract and Experiments] Abstract and Experiments: no absolute error values, standard deviations, or error bars accompany the relative reductions, and no ablation tables isolate the contribution of the encoder versus the loss versus other design choices. These omissions make it difficult to assess whether the headline numbers are robust or sensitive to implementation details.
minor comments (2)
  1. [Abstract] The abstract refers to 'existing techniques' without naming the primary baselines; listing the main competing methods would improve immediate readability.
  2. [Method] Notation for the truncation threshold in the reliability loss is introduced without a clear symbol or equation reference in the early sections, which could be clarified for readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and will incorporate revisions to strengthen the experimental description and evaluation.

Point-by-point responses
  1. Referee: [Experiments] Experiments section: the central performance claims rest on 24.1% and 73.9% relative position-error reductions, yet the manuscript provides no explicit statement that every baseline was re-implemented, trained from scratch, and evaluated under identical data splits, point-cloud preprocessing, augmentation, optimizer schedule, and hardware as LEADER. Without this protocol equivalence, the improvements cannot be confidently attributed to the Robust Projection-based Geometric Encoder or Truncated Relative Reliability loss.

    Authors: We agree that explicit protocol equivalence is necessary to attribute gains to the proposed components. In the revised manuscript, we will add a dedicated paragraph in the Experiments section stating that all baselines were re-implemented from their original publications, trained from scratch, and evaluated under identical conditions, including the same data splits, point-cloud preprocessing steps, augmentations, optimizer schedules, and hardware as LEADER. We will also reference the released source code to facilitate verification. revision: yes

  2. Referee: [Abstract and Experiments] Abstract and Experiments: no absolute error values, standard deviations, or error bars accompany the relative reductions, and no ablation tables isolate the contribution of the encoder versus the loss versus other design choices. These omissions make it difficult to assess whether the headline numbers are robust or sensitive to implementation details.

    Authors: We acknowledge that absolute metrics and ablations improve interpretability. We will revise the Abstract to report absolute position and rotation errors alongside the relative reductions. The Experiments section will be expanded with tables providing mean errors, standard deviations, and error bars on relevant figures. We will also include new ablation tables that isolate the contributions of the Robust Projection-based Geometric Encoder, the Truncated Relative Reliability loss, and other design choices. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical architecture and loss with independent experimental validation

full rationale

The paper introduces a new encoder architecture and a custom loss function, then reports empirical performance gains on public datasets. No derivation chain is presented that reduces a claimed result to its own inputs by construction, self-definition, or fitted-parameter renaming. The central claims rest on experimental comparisons rather than any mathematical identity or self-referential fitting. Self-citations, if present in the full text, are not load-bearing for the core method or the reported error reductions.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 2 invented entities

The central claim rests on the empirical effectiveness of two newly introduced components whose hyperparameters and architectural details are not visible in the abstract; no first-principles derivation is offered.

free parameters (1)
  • truncation threshold in reliability loss
    The point at which unreliable predictions are truncated is a design choice that must be set or tuned; its value is not given in the abstract.
axioms (1)
  • domain assumption: LiDAR point clouds contain sufficient geometric structure for direct 6-DoF pose regression
    Implicit in the choice of a regression-based relocalization pipeline.
invented entities (2)
  • Robust Projection-based Geometric Encoder (no independent evidence)
    purpose: Capture multi-scale geometric features from LiDAR scans
    New module introduced by the paper; no independent evidence outside the reported experiments.
  • Truncated Relative Reliability loss (no independent evidence)
    purpose: Model point-wise ambiguity and reduce influence of unreliable predictions
    New loss function introduced by the paper; no independent evidence outside the reported experiments.

pith-pipeline@v0.9.0 · 5506 in / 1392 out tokens · 47632 ms · 2026-05-10T16:08:49.350411+00:00 · methodology


Reference graph

Works this paper leans on

72 extracted references
