LSGS-Loc: Towards Robust 3DGS-Based Visual Localization for Large-Scale UAV Scenarios

Fang Xu; Tengfei Wang; Xiang Zhang; Xin Wang; Zongqian Zhan

arxiv: 2604.05402 · v1 · submitted 2026-04-07 · 💻 cs.CV · cs.RO

Pith reviewed 2026-05-10 19:32 UTC · model grok-4.3

keywords: visual localization · 3D Gaussian Splatting · UAV · pose initialization · photometric refinement · reliability masking · large-scale scenes

The pith

LSGS-Loc achieves robust visual localization in large-scale UAV scenes by adding scale-aware pose initialization and Laplacian masking to 3D Gaussian Splatting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents LSGS-Loc as a visual localization pipeline built for expansive UAV environments modeled with 3D Gaussian Splatting. It addresses weak initial poses and sensitivity to rendering artifacts by fusing scene-agnostic relative pose estimates with explicit scale constraints drawn from the 3DGS representation. During refinement, a Laplacian-based mask steers optimization toward reliable image regions and away from blur or floaters. Experiments on UAV benchmarks show higher accuracy than prior 3DGS methods on unordered image queries, which supports practical autonomous drone operation.

Core claim

LSGS-Loc is a visual localization pipeline for large-scale 3DGS scenes that combines scene-agnostic relative pose estimation with explicit 3DGS scale constraints to produce geometrically grounded initial poses without scene-specific training. In the refinement stage, Laplacian-based reliability masking directs photometric optimization to high-quality regions and away from reconstruction artifacts such as blur and floaters. The resulting method reaches state-of-the-art accuracy and robustness on large-scale UAV benchmarks for unordered image queries.
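The summary above does not spell out how the scale constraint enters the initialization, so the following is only a minimal sketch of the general idea. It assumes a relative pose estimator that returns a rotation and a scale-ambiguous translation direction from a retrieved reference view to the query, and it takes the metric `scale` as an input that would have to come from the 3DGS model. The names `compose_query_pose` and `camera_center` are hypothetical and are not taken from the paper or its code.

```python
import numpy as np


def compose_query_pose(T_ref_w2c, R_rel, t_dir, scale):
    """Compose a world-to-camera pose for the query image.

    T_ref_w2c : (4, 4) world-to-camera pose of the retrieved reference view.
    R_rel     : (3, 3) rotation taking reference-camera coordinates to
                query-camera coordinates.
    t_dir     : (3,) translation direction of the same transform; its length
                is treated as meaningless (scale-ambiguous).
    scale     : metric length of the translation. ASSUMPTION: supplied by the
                3DGS model; how the paper derives it is not stated here.
    """
    t_dir = np.asarray(t_dir, dtype=float)
    norm = np.linalg.norm(t_dir)
    if norm == 0.0:
        raise ValueError("translation direction must be non-zero")

    # Relative transform with the translation rescaled to metric units.
    T_rel = np.eye(4)
    T_rel[:3, :3] = R_rel
    T_rel[:3, 3] = scale * (t_dir / norm)

    # x_query = T_rel @ x_ref and x_ref = T_ref_w2c @ x_world.
    return T_rel @ T_ref_w2c


def camera_center(T_w2c):
    """Camera center in world coordinates for a world-to-camera pose."""
    R, t = T_w2c[:3, :3], T_w2c[:3, 3]
    return -R.T @ t
```

With an identity reference pose, doubling `scale` doubles the distance between the reference and query camera centers while leaving the rotation untouched, which is the ambiguity a scale constraint is meant to remove.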

What carries the argument

Scale-aware pose initialization that merges relative pose estimation with 3DGS scale constraints, paired with Laplacian reliability masking that filters unreliable regions during photometric refinement.
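As an illustration of what Laplacian reliability masking could look like, here is a hedged NumPy sketch. It assumes that a weak Laplacian response in the rendered image indicates blur and that only the sharpest fraction of pixels should contribute to the photometric loss. The `keep_fraction` parameter, the quantile threshold, and the function names are illustrative assumptions; the paper's actual mask definition may differ.

```python
import numpy as np


def laplacian(gray):
    """4-neighbour discrete Laplacian with replicated borders."""
    p = np.pad(np.asarray(gray, dtype=float), 1, mode="edge")
    return (
        p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
        - 4.0 * p[1:-1, 1:-1]
    )


def reliability_mask(rendered_gray, keep_fraction=0.5):
    """Boolean mask that keeps the pixels with the strongest Laplacian response.

    ASSUMPTION: low response in the rendering means blur, so those pixels are
    excluded. keep_fraction is a hypothetical tuning knob.
    """
    response = np.abs(laplacian(rendered_gray))
    threshold = np.quantile(response, 1.0 - keep_fraction)
    return response >= threshold


def masked_l1(rendered, query, mask):
    """Mean absolute photometric difference over the masked pixels only."""
    diff = np.abs(np.asarray(rendered, dtype=float) - np.asarray(query, dtype=float))
    if not mask.any():
        return 0.0
    return float(diff[mask].mean())
```

In a refinement loop, a loss like `masked_l1` would replace the unmasked photometric loss, so gradients with respect to the camera pose come only from regions the mask marks as reliable.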

Load-bearing premise

The scale-aware initialization and Laplacian masking will continue to deliver reliable results across varied large-scale UAV environments without scene-specific retraining or benchmark-specific tuning.

What would settle it

Evaluation on a fresh large-scale UAV dataset containing new terrain, lighting, or reconstruction artifacts where the method no longer exceeds baseline 3DGS accuracy or shows clear degradation from unmasked regions.

Figures

Figures reproduced from arXiv: 2604.05402 by Fang Xu, Tengfei Wang, Xiang Zhang, Xin Wang, Zongqian Zhan.

**Figure 1.** Workflow of the proposed LSGS-Loc. (1) Scene representation via 3DGS followed by reference retrieval; (2) Intermediate pose alignment based …

**Figure 2.** Visualization of the camera localization process. The Illustration …

**Figure 3.** Pass rate under strict thresholds (0.1°, 0.2 m) across 200 optimization iterations, averaged over all scenes. This indicates that the optimization stage successfully converges most residual errors from Phase 2, underscoring the robustness of our complete pipeline. The qualitative results in …

**Figure 4.** Qualitative comparison of different optimization methods. The diagonal partitions the ground-truth query (lower-left) and the rendered image …
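The pass rate reported in Figure 3 depends on per-query rotation and translation errors. The sketch below uses the common convention (geodesic rotation angle in degrees, Euclidean distance between camera centers in meters) and counts a query as passing when both errors are within the thresholds. Whether the paper uses exactly these definitions, or inclusive thresholds, is an assumption.

```python
import numpy as np


def rotation_error_deg(R_est, R_gt):
    """Geodesic angle between two rotation matrices, in degrees."""
    cos_angle = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from round-off.
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


def translation_error_m(c_est, c_gt):
    """Euclidean distance between estimated and true camera centers."""
    return float(np.linalg.norm(np.asarray(c_est, dtype=float) - np.asarray(c_gt, dtype=float)))


def pass_rate(rot_errs_deg, trans_errs_m, rot_thresh_deg=0.1, trans_thresh_m=0.2):
    """Fraction of queries whose rotation AND translation errors are within
    the thresholds. Defaults mirror the (0.1 deg, 0.2 m) setting of Figure 3;
    inclusive comparison is an assumption.
    """
    rot = np.asarray(rot_errs_deg, dtype=float)
    trans = np.asarray(trans_errs_m, dtype=float)
    return float(np.mean((rot <= rot_thresh_deg) & (trans <= trans_thresh_m)))
```

Evaluating `pass_rate` on the poses produced after each optimization iteration would give a curve of the kind the figure caption describes.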

Original abstract

Visual localization in large-scale UAV scenarios is a critical capability for autonomous systems, yet it remains challenging due to geometric complexity and environmental variations. While 3D Gaussian Splatting (3DGS) has emerged as a promising scene representation, existing 3DGS-based visual localization methods struggle with robust pose initialization and sensitivity to rendering artifacts in large-scale settings. To address these limitations, we propose LSGS-Loc, a novel visual localization pipeline tailored for large-scale 3DGS scenes. Specifically, we introduce a scale-aware pose initialization strategy that combines scene-agnostic relative pose estimation with explicit 3DGS scale constraints, enabling geometrically grounded localization without scene-specific training. Furthermore, in the pose refinement, to mitigate the impact of reconstruction artifacts such as blur and floaters, we develop a Laplacian-based reliability masking mechanism that guides photometric refinement toward high-quality regions. Extensive experiments on large-scale UAV benchmarks demonstrate that our method achieves state-of-the-art accuracy and robustness for unordered image queries, significantly outperforming existing 3DGS-based approaches. Code is available at: https://github.com/xzhang-z/LSGS-Loc

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

LSGS-Loc adds scale-aware initialization from relative poses plus Laplacian masking to 3DGS localization for UAVs, delivering practical gains on benchmarks but with generalization still unproven.

LSGS-Loc takes existing 3D Gaussian Splatting and adds two targeted fixes for large-scale UAV visual localization. The first is scale-aware pose initialization that combines scene-agnostic relative pose estimation with explicit scale constraints pulled from the 3DGS model itself. The second is a Laplacian-based reliability mask during photometric refinement that steers the optimization toward cleaner regions and away from floaters and blur. Both moves are presented as new relative to prior 3DGS localization work, and they directly tackle the initialization failures and artifact sensitivity that show up in big outdoor scenes.

The paper tests on unordered image queries, which is realistic for UAV data, and reports better accuracy than other 3DGS methods on the UAV benchmarks they used. The code is released, which makes the claims checkable.

The evaluation stays within a small set of existing large-scale UAV datasets. Those sets likely share similar altitude ranges, lighting conditions, and reconstruction artifacts, so the reported improvements could be narrower than they appear. Without broader or out-of-distribution testing it is hard to tell whether the scale constraints and masking will transfer cleanly to new environments. The approach itself looks standard and free of circular reasoning.

This is aimed at researchers and engineers working on drone navigation, aerial mapping, and 3D scene representations for robotics. A reader in that area would get concrete implementation ideas and a working baseline to try. The paper has enough focused engineering value and public code to deserve a serious referee rather than a desk reject, even if the review will probably press on the evaluation scope.

Referee Report

3 major / 2 minor

Summary. The paper introduces LSGS-Loc, a visual localization pipeline for large-scale UAV scenarios based on 3D Gaussian Splatting. It proposes a scale-aware pose initialization that combines scene-agnostic relative pose estimation with explicit 3DGS scale constraints, and a Laplacian-based reliability masking mechanism to guide photometric refinement away from reconstruction artifacts such as blur and floaters. The central claim is that extensive experiments on large-scale UAV benchmarks show state-of-the-art accuracy and robustness for unordered image queries, significantly outperforming prior 3DGS-based methods, without requiring scene-specific training.

Significance. If the empirical claims hold after addressing the noted issues, the work would provide a practical advance in 3DGS-based localization for UAVs by improving initialization robustness and artifact handling in large-scale settings. The public code release supports reproducibility and allows direct verification of the reported gains.

major comments (3)

[Experiments] Experiments section: The claim that the method generalizes robustly across diverse large-scale UAV environments rests on evaluation confined to a small number of existing UAV benchmarks. No cross-benchmark transfer tests or out-of-distribution evaluation (e.g., varying altitudes, lighting, or reconstruction quality) are reported, which is load-bearing for the central assertion of scene-agnostic robustness and SOTA performance on unordered queries.
[Method] Method section (scale-aware initialization): The explicit 3DGS scale constraints are described as enabling geometrically grounded localization, but the manuscript does not provide a quantitative analysis of how these constraints interact with the relative-pose estimator under varying scene scales or reconstruction errors; this detail is necessary to substantiate that the gains are method-intrinsic rather than benchmark-specific.
[Experiments] Experiments / ablation studies: The abstract asserts SOTA accuracy and robustness, yet the provided description lacks detailed error distributions, failure-case analysis, or full ablation tables isolating the contribution of Laplacian masking versus scale constraints; without these, it is impossible to confirm that the reported improvements are not influenced by post-hoc benchmark choices.

minor comments (2)

[Introduction] The phrase 'unordered image queries' is used repeatedly but never formally defined; a brief clarification in the introduction or problem statement would improve readability.
[Method] Figure captions for the pipeline diagram should explicitly label the scale-constraint and Laplacian-masking modules to match the textual description.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments on our manuscript. We address each major comment point by point below, providing clarifications and committing to revisions where appropriate to strengthen the presentation of LSGS-Loc.

Referee: The claim that the method generalizes robustly across diverse large-scale UAV environments rests on evaluation confined to a small number of existing UAV benchmarks. No cross-benchmark transfer tests or out-of-distribution evaluation (e.g., varying altitudes, lighting, or reconstruction quality) are reported, which is load-bearing for the central assertion of scene-agnostic robustness and SOTA performance on unordered queries.

Authors: We agree that broader generalization tests would further support the claims. Our evaluations use standard large-scale UAV benchmarks that encompass variations in scene scale, altitude, lighting conditions, and reconstruction quality, with consistent outperformance over prior 3DGS methods on unordered queries. These benchmarks are representative of the target scenarios and were chosen for their public availability and relevance. In the revision, we will expand the experiments section with a detailed characterization of benchmark diversity, additional per-scene breakdowns, and an explicit discussion of limitations regarding cross-benchmark transfer and OOD robustness. We will also explore any feasible supplementary analysis using existing data splits.

Revision: partial

Referee: The explicit 3DGS scale constraints are described as enabling geometrically grounded localization, but the manuscript does not provide a quantitative analysis of how these constraints interact with the relative-pose estimator under varying scene scales or reconstruction errors; this detail is necessary to substantiate that the gains are method-intrinsic rather than benchmark-specific.

Authors: The scale-aware initialization integrates scene-agnostic relative pose estimation with explicit 3DGS scale constraints derived from the Gaussian representation to enforce geometric consistency during initialization. Ablation results in the manuscript already isolate the contribution of this module to overall accuracy. To provide the requested quantitative analysis, we will add new experiments in the revised manuscript that measure the interaction effects, including sensitivity to varying scene scales and controlled levels of reconstruction error (e.g., by perturbing the 3DGS model), using the existing benchmark data to demonstrate that the gains are intrinsic to the proposed constraints.

Revision: yes

Referee: The abstract asserts SOTA accuracy and robustness, yet the provided description lacks detailed error distributions, failure-case analysis, or full ablation tables isolating the contribution of Laplacian masking versus scale constraints; without these, it is impossible to confirm that the reported improvements are not influenced by post-hoc benchmark choices.

Authors: We acknowledge that more granular analysis would improve transparency. The manuscript already includes ablation studies demonstrating the individual and combined effects of the scale-aware initialization and Laplacian-based reliability masking. In the revision, we will expand the experiments section to include full ablation tables with isolated component contributions, statistical error distributions (e.g., median, percentiles, and histograms of pose errors), and a dedicated failure-case analysis highlighting scenarios where artifacts or initialization challenges persist. These additions will use the same benchmark results to substantiate the reported improvements.

Revision: yes
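The error-distribution statistics named in this response (median, percentiles, histograms of pose errors) are straightforward to compute once per-query errors are available. The following is a small sketch using NumPy; the function names, the percentile choices, and the bin count are illustrative assumptions rather than anything specified by the authors.

```python
import numpy as np


def summarize_errors(errors, percentiles=(50, 90, 95)):
    """Return selected percentiles of a list of per-query pose errors.

    The 50th percentile is the median. The percentile set is a hypothetical
    default, not one taken from the paper.
    """
    errors = np.asarray(errors, dtype=float)
    return {f"p{p}": float(np.percentile(errors, p)) for p in percentiles}


def error_histogram(errors, bins=10):
    """Histogram counts and bin edges over the observed error range."""
    counts, edges = np.histogram(np.asarray(errors, dtype=float), bins=bins)
    return counts, edges
```

Reporting these per scene, rather than only as an average over all scenes, would also show whether a few hard scenes dominate the tail of the distribution.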

Circularity Check

0 steps flagged

No circularity in derivation; method builds on external benchmarks and standard components

The paper presents LSGS-Loc as an engineering pipeline: scene-agnostic relative pose estimation augmented by explicit 3DGS scale constraints for initialization, followed by Laplacian reliability masking during photometric refinement. These are introduced as practical additions to address specific failure modes in large-scale 3DGS scenes, not as quantities derived from or fitted to the target localization accuracy. Performance is asserted via experiments on independent UAV benchmarks rather than any self-referential prediction or uniqueness theorem. No equations reduce the output metrics to the input definitions by construction, no load-bearing self-citations appear, and the central claims remain falsifiable against external data. The derivation chain is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The method rests on standard domain assumptions from 3D Gaussian Splatting and visual localization literature with no new free parameters, axioms, or invented entities introduced in the abstract description.

axioms (1)

domain assumption: Core 3DGS rendering and photometric consistency assumptions hold for large-scale UAV scenes.
The pipeline builds directly on existing 3DGS without re-deriving or questioning its foundational rendering model.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Passage: "scale-aware pose initialization strategy that combines scene-agnostic relative pose estimation with explicit 3DGS scale constraints... Laplacian-based reliability masking mechanism"

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking
Relation between the paper passage and the cited Recognition theorem: unclear
Passage: "Laplacian-driven reliability masking... guides photometric refinement toward high-quality regions"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

From coarse to fine: Robust hierarchical localization at large scale,

P.-E. Sarlin, C. Cadena, R. Siegwart, and M. Dymczyk, “From coarse to fine: Robust hierarchical localization at large scale,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2019, pp. 12 716–12 725

work page 2019

[2] [2]

Distinctive image features from scale-invariant key- points,

D. G. Lowe, “Distinctive image features from scale-invariant key- points,”International journal of computer vision, vol. 60, no. 2, pp. 91–110, 2004

work page 2004

[3] [3]

Inloc: Indoor visual localization with dense matching and view synthesis,

H. Taira, M. Okutomi, T. Sattler, M. Cimpoi, M. Pollefeys, J. Sivic, T. Pajdla, and A. Torii, “Inloc: Indoor visual localization with dense matching and view synthesis,” inProceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 7199–7209

work page 2018

[4] [4]

Posenet: A convolutional network for real-time 6-dof camera relocalization,

A. Kendall, M. Grimes, and R. Cipolla, “Posenet: A convolutional network for real-time 6-dof camera relocalization,” inProceedings of the IEEE international conference on computer vision, 2015, pp. 2938–2946

work page 2015

[5] [5]

Map- relative pose regression for visual re-localization,

S. Chen, T. Cavallari, V . A. Prisacariu, and E. Brachmann, “Map- relative pose regression for visual re-localization,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recogni- tion, 2024, pp. 20 665–20 674

work page 2024

[6] [6]

Accelerated coordi- nate encoding: Learning to relocalize in minutes using rgb and poses,

E. Brachmann, T. Cavallari, and V . A. Prisacariu, “Accelerated coordi- nate encoding: Learning to relocalize in minutes using rgb and poses,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 5044–5053

work page 2023

[7] [7]

Glace: Global local accelerated coordinate encoding,

F. Wang, X. Jiang, S. Galliani, C. V ogel, and M. Pollefeys, “Glace: Global local accelerated coordinate encoding,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 21 562–21 571

work page 2024

[8] [8]

R-score: Revisiting scene coordinate regression for robust large-scale visual localization,

X. Jiang, F. Wang, S. Galliani, C. V ogel, and M. Pollefeys, “R-score: Revisiting scene coordinate regression for robust large-scale visual localization,” inProceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 11 536–11 546

work page 2025

[9] [9]

3d gaussian splatting for real-time radiance field rendering

B. Kerbl, G. Kopanas, T. Leimk ¨uhler, G. Drettakis,et al., “3d gaussian splatting for real-time radiance field rendering.”ACM Trans. Graph., vol. 42, no. 4, pp. 139–1, 2023

work page 2023

[10] [10]

Gsloc: Visual localization with 3d gaussian splatting,

K. Botashev, V . Pyatov, G. Ferrer, and S. Lefkimmiatis, “Gsloc: Visual localization with 3d gaussian splatting,” in2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2024, pp. 5664–5671

work page 2024

[11] [11]

From sparse to dense: Camera relocalization with scene-specific detector from feature gaussian splatting,

Z. Huang, H. Yu, Y . Shentu, J. Yuan, and G. Zhang, “From sparse to dense: Camera relocalization with scene-specific detector from feature gaussian splatting,” inProceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 27 059–27 069

work page 2025

[12] C. Liu, S. Chen, Y. Bhalgat, S. Hu, M. Cheng, Z. Wang, V. A. Prisacariu, and T. Braud, “Gs-cpr: Efficient camera pose refinement via 3d gaussian splatting,” arXiv preprint arXiv:2408.11085, 2024

[13] H. Lu, H. Chen, H. Liu, S. Zhang, B. Xu, and Z. Liu, “3dgs lsr: Large scale relocation for autonomous driving based on 3d gaussian splatting,” arXiv preprint arXiv:2507.05661, 2025

[14] G. Sidorov, M. Mohrat, D. Gridusov, R. Rakhimov, and S. Kolyubin, “Gsplatloc: Grounding keypoint descriptors into 3d gaussian splatting for improved visual localization,” in 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2025, pp. 12 601–12 607

[15] S. Dong, S. Wang, S. Liu, L. Cai, Q. Fan, J. Kannala, and Y. Yang, “Reloc3r: Large-scale training of relative camera pose regression for generalizable, fast, and accurate visual localization,” in Proceedings of the Computer Vision and Pattern Recognition Conference, 2025, pp. 16 739–16 752

[16] K. Fodor and A. Rövid, “Gs-reloc: A gaussian-splatting relocalization method for robust and accurate mono camera pose estimation,” IEEE Access, 2025

[17] Z. Zhou, F. Hui, Y. Wu, and Y. Liu, “Six-dof pose estimation with efficient 3-d gaussian splatting representation for visual relocalization,” IEEE/ASME Transactions on Mechatronics, 2024

[18] Z. Xin, C. Dai, Y. Li, and C. Wu, “Gauloc: 3d gaussian splatting-based camera relocalization,” in Computer Graphics Forum, vol. 43, no. 7. Wiley Online Library, 2024, p. e15256

[19] Y. Cheng, J. Jiao, Y. Wang, and D. Kanoulas, “Logs: Visual localization via gaussian splatting with fewer training images,” in 2025 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2025, pp. 15 029–15 036

[20] B. Mildenhall, P. P. Srinivasan, M. Tancik, J. T. Barron, R. Ramamoorthi, and R. Ng, “Nerf: Representing scenes as neural radiance fields for view synthesis,” Communications of the ACM, vol. 65, no. 1, pp. 99–106, 2021

[21] S. Chen, X. Li, Z. Wang, and V. A. Prisacariu, “Dfnet: Enhance absolute pose regression with direct feature matching,” in European Conference on Computer Vision. Springer, 2022, pp. 1–17

[22] L. Yen-Chen, P. Florence, J. T. Barron, A. Rodriguez, P. Isola, and T.-Y. Lin, “inerf: Inverting neural radiance fields for pose estimation,” in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2021, pp. 1323–1330

[23] B. Zhao, L. Yang, M. Mao, H. Bao, and Z. Cui, “Pnerfloc: Visual localization with point-based neural radiance fields,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 7, 2024, pp. 7450–7459

[24] Q. Zhou, M. Maximov, O. Litany, and L. Leal-Taixé, “The nerfect match: Exploring nerf features for visual localization,” in European Conference on Computer Vision. Springer, 2024, pp. 108–127

[25] J. Liu, Q. Nie, Y. Liu, and C. Wang, “Nerf-loc: Visual localization with conditional neural radiance field,” arXiv preprint arXiv:2304.07979, 2023

[26] A. Moreau, N. Piasco, M. Bennehar, D. Tsishkou, B. Stanciulescu, and A. de La Fortelle, “Crossfire: Camera relocalization on self-supervised features from an implicit representation,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 252–262

[27] H. Germain, D. DeTone, G. Pascoe, T. Schmidt, D. Novotny, R. Newcombe, C. Sweeney, R. Szeliski, and V. Balntas, “Feature query networks: Neural surface description for camera pose refinement,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 5071–5081

[28] Z. Niu, Z. Tan, J. Zhang, X. Yang, and D. Hu, “Hgsloc: 3dgs-based heuristic camera pose refinement,” in 2025 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2025, pp. 1–7

[29] B. Matteo, T. Tsesmelis, S. James, F. Poiesi, and A. Del Bue, “6dgs: 6d pose estimation from a single image and a 3d gaussian splatting model,” in European Conference on Computer Vision. Springer, 2024, pp. 420–436

[30] F. Khatib, D. Moran, G. Trostianetsky, Y. Kasten, M. Galun, and R. Basri, “Gsvisloc: Generalizable visual localization for gaussian splatting scene representations,” arXiv preprint arXiv:2508.18242, 2025

[31] N. Keetha, A. Mishra, J. Karhade, K. M. Jatavallabhula, S. Scherer, M. Krishna, and S. Garg, “Anyloc: Towards universal visual place recognition,” IEEE Robotics and Automation Letters, vol. 9, no. 2, pp. 1286–1293, 2023

[32] B. Xiong, N. Zheng, J. Liu, and Z. Li, “Gauu-scene v2: Assessing the reliability of image-based metrics with expansive lidar image dataset using 3dgs and nerf,” arXiv preprint arXiv:2404.04880, 2024

[33] V. Ye, R. Li, J. Kerr, M. Turkulainen, B. Yi, Z. Pan, O. Seiskari, J. Ye, J. Hu, M. Tancik, and A. Kanazawa, “gsplat: An open-source library for gaussian splatting,” Journal of Machine Learning Research, vol. 26, no. 34, pp. 1–17, 2025

[34] D. DeTone, T. Malisiewicz, and A. Rabinovich, “Superpoint: Self-supervised interest point detection and description,” in Proceedings of the IEEE conference on computer vision and pattern recognition workshops, 2018, pp. 224–236

[35] P.-E. Sarlin, D. DeTone, T. Malisiewicz, and A. Rabinovich, “Superglue: Learning feature matching with graph neural networks,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2020, pp. 4938–4947