From Pairs to Sequences: Track-Aware Policy Gradients for Keypoint Detection

Bing Wang; Fangzhen Li; Guang Chen; Hangjun Ye; Hao Li; kuang Gao; Liwen Yang; Xudi Ge; Yepeng Liu; Yongchao Xu

arxiv: 2602.20630 · v4 · pith:NBSVUVNDnew · submitted 2026-02-24 · 💻 cs.CV

From Pairs to Sequences: Track-Aware Policy Gradients for Keypoint Detection

Yepeng Liu , Hao Li , Liwen Yang , Fangzhen Li , Xudi Ge , Yuliang Gu , kuang Gao , Bing Wang

show 3 more authors

Guang Chen Hangjun Ye Yongchao Xu

This is my paper

Pith reviewed 2026-05-21 12:05 UTC · model grok-4.3

classification 💻 cs.CV

keywords keypoint detectionreinforcement learningpolicy gradientstrack qualitystructure from motionsparse matchingimage sequencescomputer vision

0 comments

The pith

Keypoint detection trained via reinforcement learning on full image sequences produces more consistent long-term tracks than methods optimized only on image pairs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reframes keypoint detection as a sequential decision process rather than independent per-image choices. It trains a policy with gradients that receive rewards based on how well selected points maintain identity and distinctiveness across an entire sequence of views. This matters for 3D vision pipelines such as SfM and SLAM because those systems ultimately rely on keypoints that survive many viewpoint and lighting changes without breaking tracks. A reader would care if the resulting detectors deliver higher accuracy on downstream tasks like relative pose estimation and reconstruction without extra post-processing steps.

Core claim

The authors claim that an end-to-end RL agent called TraqPoint, guided by a track-aware reward that scores both consistency and distinctiveness of keypoints over multiple views, learns detectors whose output keypoints form higher-quality tracks when evaluated on sparse matching benchmarks.

What carries the argument

The track-aware reward inside the TraqPoint policy-gradient framework, which scores keypoints jointly across sequence views for consistency and distinctiveness.

If this is right

Relative pose estimation accuracy increases on standard sparse matching test sets.
3D reconstruction completeness and accuracy improve on the same benchmarks.
The learned detectors require no separate tuning or descriptor retraining to achieve the reported gains.
Keypoints selected by the policy remain effective under large viewpoint and illumination shifts typical of real sequences.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same sequential reward idea could be applied to other vision tasks that depend on persistent features over time, such as long-term object tracking.
Training directly on sequences may reduce reliance on hand-designed post-processing heuristics that current pair-based methods use to enforce consistency.
If the reward can be computed from cheap geometric proxies, the approach might scale to very long video streams without dense ground-truth tracks.

Load-bearing premise

The reward signal that rewards consistency and distinctiveness across multiple views correctly measures long-term track quality and produces improvements that transfer without dataset-specific adjustments.

What would settle it

Run the same evaluation benchmarks on a detector trained only on pairs but given an auxiliary loss that explicitly penalizes track breaks; if performance remains equal or better than TraqPoint, the necessity of the full sequence RL formulation is challenged.

Figures

Figures reproduced from arXiv: 2602.20630 by Bing Wang, Fangzhen Li, Guang Chen, Hangjun Ye, Hao Li, kuang Gao, Liwen Yang, Xudi Ge, Yepeng Liu, Yongchao Xu, Yuliang Gu.

**Figure 2.** Figure 2: Following the architectural design of RDD [ [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Overview of our proposed Sequence-Aware Keypoint Policy Learning framework: First, we select a reference frame from the [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗

**Figure 4.** Figure 4: Qualitative results on the MegaDepth dataset [ [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗

**Figure 5.** Figure 5: Ablation study on sequence length and the number of [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗

read the original abstract

Keypoint-based matching is a fundamental component of modern 3D vision systems, such as Structure-from-Motion (SfM) and SLAM. Most existing learning-based methods are trained on image pairs, a paradigm that fails to explicitly optimize for the long-term trackability of keypoints across sequences under challenging viewpoint and illumination changes. In this paper, we reframe keypoint detection as a sequential decision-making problem. We introduce TraqPoint, a novel, end-to-end Reinforcement Learning (RL) framework designed to optimize the \textbf{Tra}ck-\textbf{q}uality (Traq) of keypoints directly on image sequences. Our core innovation is a track-aware reward mechanism that jointly encourages the consistency and distinctiveness of keypoints across multiple views, guided by a policy gradient method. Extensive evaluations on sparse matching benchmarks, including relative pose estimation and 3D reconstruction, demonstrate that TraqPoint significantly outperforms some state-of-the-art (SOTA) keypoint detection and description methods.The code will be available at https://github.com/xiaomi-research/traqpoint.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper reframes keypoint detection as a sequential RL problem and introduces TraqPoint, an end-to-end framework that optimizes a track-aware reward (joint consistency and distinctiveness across multiple views) via policy gradients on image sequences. It claims this yields keypoints with superior long-term trackability, leading to significant outperformance over SOTA methods on sparse matching benchmarks including relative pose estimation and 3D reconstruction.

Significance. If the central claim holds, the shift from pair-wise to sequence-aware training via RL could improve robustness of learned keypoints under viewpoint and illumination changes, with potential impact on SfM and SLAM pipelines. The explicit optimization of multi-view track quality is a promising direction, though its advantage over pair-wise baselines requires clear isolation from architecture or data effects.

major comments (3)

[§3.3] §3.3, Reward Definition: The track-aware reward combines consistency and distinctiveness over fixed-length sequences; without explicit analysis of sequence length K or how the reward correlates with true long-horizon track quality (vs. short-term signals), it is unclear whether the sequential formulation delivers the claimed generalizable advantage or reduces to heuristics achievable by pair-wise training.
[§5.2] §5.2, Ablation Studies: The experiments report outperformance on pose estimation and reconstruction but lack controls isolating the contribution of the track-aware sequential reward from the underlying detector architecture or training data statistics. This is load-bearing for attributing gains to the policy-gradient formulation rather than other factors.
[§4.1] §4.1, Hyperparameter Handling: The weighting between consistency and distinctiveness is a free parameter; the manuscript provides no sensitivity analysis or dataset-independent selection procedure, raising the risk that reported improvements incorporate dataset-specific tuning and undermine the generalizability claim.

minor comments (2)

[Abstract] Abstract: The phrasing 'significantly outperforms some state-of-the-art' is imprecise; include the specific competing methods and key quantitative margins (e.g., AUC or reconstruction error deltas) to strengthen the summary.
[§2] §2, Related Work: Several recent sequence-modeling or RL-for-vision papers are not cited; adding them would better contextualize the novelty of the track-aware policy gradient approach.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and have revised the manuscript to incorporate additional analysis and experiments where needed.

read point-by-point responses

Referee: [§3.3] §3.3, Reward Definition: The track-aware reward combines consistency and distinctiveness over fixed-length sequences; without explicit analysis of sequence length K or how the reward correlates with true long-horizon track quality (vs. short-term signals), it is unclear whether the sequential formulation delivers the claimed generalizable advantage or reduces to heuristics achievable by pair-wise training.

Authors: We agree that further analysis of sequence length K and its relation to long-horizon tracking is important. In the revised manuscript we have expanded §3.3 with experiments varying K from 3 to 10 and a correlation analysis between the reward signal and long-term metrics such as average track length and repeatability over extended sequences. These results indicate that moderate sequence lengths improve sustained trackability beyond what pair-wise training achieves. revision: yes
Referee: [§5.2] §5.2, Ablation Studies: The experiments report outperformance on pose estimation and reconstruction but lack controls isolating the contribution of the track-aware sequential reward from the underlying detector architecture or training data statistics. This is load-bearing for attributing gains to the policy-gradient formulation rather than other factors.

Authors: We acknowledge the need for stronger isolation of the reward contribution. The revised §5.2 now includes additional ablations that (i) compare the full sequence-aware model against an identical architecture trained with pair-wise rewards on the same data, (ii) disable the track-aware terms while retaining the RL framework, and (iii) vary training data statistics. The new results attribute a substantial portion of the gains specifically to the track-aware policy-gradient formulation. revision: yes
Referee: [§4.1] §4.1, Hyperparameter Handling: The weighting between consistency and distinctiveness is a free parameter; the manuscript provides no sensitivity analysis or dataset-independent selection procedure, raising the risk that reported improvements incorporate dataset-specific tuning and undermine the generalizability claim.

Authors: We agree that sensitivity analysis was missing. In the revised manuscript we have added a sensitivity study in §4.1 evaluating the consistency-distinctiveness weight over a wide range on multiple datasets. Performance remains stable within a practical interval, and we now describe a validation-set procedure for selecting the weight that does not rely on test-set information. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper reframes keypoint detection as a sequential RL task and defines a track-aware reward explicitly in terms of external multi-view consistency and distinctiveness measures. No equations reduce by construction to fitted inputs, no self-citation chains bear the central claim, and the reward is not a renaming or ansatz smuggled from prior self-work. The policy-gradient updates optimize an independently specified objective, making the reported gains on pose and reconstruction benchmarks attributable to the sequential formulation rather than tautological reuse of training signals.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 1 invented entities

Abstract-only review limits visibility into concrete parameters or assumptions; the RL formulation implicitly relies on standard MDP and policy-gradient machinery plus the new reward definition.

free parameters (1)

Reward weighting between consistency and distinctiveness
Balance term required to combine the two components of the track-quality reward; value not stated.

axioms (1)

domain assumption Keypoint selection can be cast as a sequential decision process whose quality is measurable by cross-view consistency and distinctiveness.
Foundational premise that enables the RL reframing.

invented entities (1)

TraqPoint framework no independent evidence
purpose: End-to-end RL system for optimizing track-quality of keypoints on sequences
Newly proposed method and associated reward mechanism.

pith-pipeline@v0.9.0 · 5750 in / 1290 out tokens · 50519 ms · 2026-05-21T12:05:35.461799+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

55 extracted references · 55 canonical work pages

[1]

Three things ev- eryone should know to improve object retrieval

Relja Arandjelovi ´c and Andrew Zisserman. Three things ev- eryone should know to improve object retrieval. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 2911– 2918, 2012. 7

work page 2012
[2]

Scale-free image keypoints using differentiable persistent homology

Giovanni Barbarani, Francesco Vaccarino, Gabriele Trivi- gno, Marco Guerra, Gabriele Berton, and Carlo Masone. Scale-free image keypoints using differentiable persistent homology. InProc. Int. Conf. Mach. Learn., pages 2990– 3002, 2024. 2

work page 2024
[3]

Reinforced feature points: Optimizing feature detection and description for a high-level task

Aritra Bhowmik, Stefan Gumhold, Carsten Rother, and Eric Brachmann. Reinforced feature points: Optimizing feature detection and description for a high-level task. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 4948– 4957, 2020. 1, 2

work page 2020
[4]

The OpenCV library.Dr

Gary Bradski. The OpenCV library.Dr. Dobb’s Journal: Software Tools for the Professional Programmer, 25(11): 120–123, 2000. 5

work page 2000
[5]

RDD: Robust feature de- tector and descriptor using deformable transformer

Gonglin Chen, Tianwen Fu, Haiwei Chen, Wenbin Teng, Hanyuan Xiao, and Yajie Zhao. RDD: Robust feature de- tector and descriptor using deformable transformer. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 6394– 6403, 2025. 1, 2, 3, 5, 6, 7, 8

work page 2025
[6]

Orbeez-SLAM: A real-time monocular visual slam with orb features and nerf- realized mapping

Chi-Ming Chung, Yang-Che Tseng, Ya-Ching Hsu, Xiang- Qian Shi, Yun-Hung Hua, Jia-Fong Yeh, Wen-Chin Chen, Yi-Ting Chen, and Winston H Hsu. Orbeez-SLAM: A real-time monocular visual slam with orb features and nerf- realized mapping. InProc. IEEE Int. Conf. Robot. Autom., pages 9400–9406, 2023. 1

work page 2023
[7]

SIPs: Succinct interest points from unsuper- vised inlierness probability learning

Titus Cieslewski, Konstantinos G Derpanis, and Davide Scaramuzza. SIPs: Succinct interest points from unsuper- vised inlierness probability learning. InInternational Con- ference on 3D Vision, pages 604–613, 2019. 2

work page 2019
[8]

ScanNet: Richly-annotated 3D reconstructions of indoor scenes

Angela Dai, Angel X Chang, Manolis Savva, Maciej Hal- ber, Thomas Funkhouser, and Matthias Nießner. ScanNet: Richly-annotated 3D reconstructions of indoor scenes. In Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 5828–5839, 2017. 5, 6

work page 2017
[9]

SuperPoint: Self-supervised interest point detection and description

Daniel DeTone, Tomasz Malisiewicz, and Andrew Rabi- novich. SuperPoint: Self-supervised interest point detection and description. InProc. of IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, pages 224–236, 2018. 1, 2, 5, 6, 7

work page 2018
[10]

D2- Net: A trainable cnn for joint description and detection of local features

Mihai Dusmanu, Ignacio Rocco, Tomas Pajdla, Marc Polle- feys, Josef Sivic, Akihiko Torii, and Torsten Sattler. D2- Net: A trainable cnn for joint description and detection of local features. InProc. IEEE/CVF Conf. Comput. Vis. Pat- tern Recog., pages 8092–8101, 2019. 2, 7

work page 2019
[11]

DeDoDe v2: Analyzing and Improving the DeDoDe Keypoint Detec- tor

Johan Edstedt, Georg B ¨okman, and Zhenjun Zhao. DeDoDe v2: Analyzing and Improving the DeDoDe Keypoint Detec- tor. InProc. of IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, 2024. 1, 2, 3, 5, 6, 8

work page 2024
[12]

RoMa: Robust dense feature matching

Johan Edstedt, Qiyu Sun, Georg B ¨okman, M ˚arten Wadenb¨ack, and Michael Felsberg. RoMa: Robust dense feature matching. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 19790–19800, 2024. 2

work page 2024
[13]

DaD: Distilled reinforcement learn- ing for diverse keypoint detection.arXiv preprint arXiv:2503.07347, 2025

Johan Edstedt, Georg B ¨okman, M ˚arten Wadenb ¨ack, and Michael Felsberg. DaD: Distilled reinforcement learn- ing for diverse keypoint detection.arXiv preprint arXiv:2503.07347, 2025. 2

work page arXiv 2025
[14]

Are we ready for autonomous driving? the KITTI vision benchmark suite

Andreas Geiger, Philip Lenz, and Raquel Urtasun. Are we ready for autonomous driving? the KITTI vision benchmark suite. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 3354–3361, 2012. 7, 8

work page 2012
[15]

SiLK: Simple learned keypoints

Pierre Gleize, Weiyao Wang, and Matt Feiszli. SiLK: Simple learned keypoints. InProc. IEEE/CVF Int. Conf. Comput. Vis., pages 22499–22508, 2023. 1, 2

work page 2023
[16]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 770– 778, 2016. 8

work page 2016
[17]

OmniGlue: Generalizable feature match- ing with foundation model guidance

Hanwen Jiang, Arjun Karpur, Bingyi Cao, Qixing Huang, and Andre Araujo. OmniGlue: Generalizable feature match- ing with foundation model guidance. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., 2024. 8

work page 2024
[18]

Learn- ing to make keypoints sub-pixel accurate

Shinjeong Kim, Marc Pollefeys, and Daniel Barath. Learn- ing to make keypoints sub-pixel accurate. InProc. Eur. Conf. Comput. Vis., pages 413–431, 2024. 2

work page 2024
[19]

RIPE: Reinforcement learning on unlabeled image pairs for robust keypoint extraction

Johannes K ¨unzel, Anna Hilsmann, and Peter Eisert. RIPE: Reinforcement learning on unlabeled image pairs for robust keypoint extraction. InProc. IEEE/CVF Int. Conf. Comput. Vis., 2025. 1, 2, 5, 6, 7, 8

work page 2025
[20]

Self- supervised equivariant learning for oriented keypoint detec- tion

Jongmin Lee, Byungjin Kim, and Minsu Cho. Self- supervised equivariant learning for oriented keypoint detec- tion. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 4847–4857, 2022. 2

work page 2022
[21]

Ground- ing image matching in 3d with MASt3R

Vincent Leroy, Yohann Cabon, and J´erˆome Revaud. Ground- ing image matching in 3d with MASt3R. InProc. Eur. Conf. Comput. Vis., pages 71–91, 2024. 2

work page 2024
[22]

Decoupling makes weakly supervised local feature better

Kunhong Li, Longguang Wang, Li Liu, Qing Ran, Kai Xu, and Yulan Guo. Decoupling makes weakly supervised local feature better. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 15838–15848, 2022. 7

work page 2022
[23]

MegaDepth: Learning single-view depth prediction from internet photos

Zhengqi Li and Noah Snavely. MegaDepth: Learning single-view depth prediction from internet photos. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 2041– 2050, 2018. 1, 2, 3, 5, 6, 8

work page 2041
[24]

LightGlue: Local feature matching at light speed

Philipp Lindenberger, Paul-Edouard Sarlin, and Marc Polle- feys. LightGlue: Local feature matching at light speed. InProc. IEEE/CVF Int. Conf. Comput. Vis., pages 17627– 17638, 2023. 5, 6

work page 2023
[25]

LiftFeat: 3D geometry-aware local feature matching

Yepeng Liu, Wenpeng Lai, Zhou Zhao, Yuxuan Xiong, Jinchi Zhu, Jun Cheng, and Yongchao Xu. LiftFeat: 3D geometry-aware local feature matching. InProc. IEEE Int. Conf. Robot. Autom., pages 11714–11720, 2025. 1

work page 2025
[26]

Distinctive image features from scale- invariant keypoints.Int

David G Lowe. Distinctive image features from scale- invariant keypoints.Int. J. Comput. Vis., 60:91–110, 2004. 2, 5

work page 2004
[27]

ASLFeat: Learning local features of accurate shape and lo- calization

Zixin Luo, Lei Zhou, Xuyang Bai, Hongkai Chen, Jiahui Zhang, Yao Yao, Shiwei Li, Tian Fang, and Long Quan. ASLFeat: Learning local features of accurate shape and lo- calization. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 6589–6598, 2020. 7

work page 2020
[28]

Robust wide-baseline stereo from maximally stable ex- tremal regions.Image and vision computing, 22(10):761– 767, 2004

Jiri Matas, Ondrej Chum, Martin Urban, and Tom ´as Pa- jdla. Robust wide-baseline stereo from maximally stable ex- tremal regions.Image and vision computing, 22(10):761– 767, 2004. 2

work page 2004
[29]

ORB-SLAM2: An open- source slam system for monocular, stereo, and RGB-D cam- eras.IEEE Trans

Raul Mur-Artal and Juan D Tard´os. ORB-SLAM2: An open- source slam system for monocular, stereo, and RGB-D cam- eras.IEEE Trans. Robot., 33(5):1255–1262, 2017. 1

work page 2017
[30]

DINOv2: Learning robust visual features without supervision.Transactions on Machine Learning Research,

Maxime Oquab, Timoth ´ee Darcet, Th´eo Moutakanni, Huy V V o, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel HAZIZA, Francisco Massa, Alaaeldin El-Nouby, et al. DINOv2: Learning robust visual features without supervision.Transactions on Machine Learning Research,

work page
[31]

Ness-st: Detecting good and stable keypoints with a neural stability score and the Shi-Tomasi detector

Konstantin Pakulev, Alexander Vakhitov, and Gonzalo Fer- rer. Ness-st: Detecting good and stable keypoints with a neural stability score and the Shi-Tomasi detector. InProc. IEEE/CVF Int. Conf. Comput. Vis., pages 9578–9588, 2023. 2

work page 2023
[32]

Enhancing deformable lo- cal features by jointly learning to detect and describe key- points

Guilherme Potje, Felipe Cadar, Andr ´e Araujo, Renato Mar- tins, and Erickson R Nascimento. Enhancing deformable lo- cal features by jointly learning to detect and describe key- points. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 1306–1315, 2023. 2

work page 2023
[33]

XFeat: Accelerated fea- tures for lightweight image matching

Guilherme Potje, Felipe Cadar, Andr ´e Araujo, Renato Mar- tins, and Erickson R Nascimento. XFeat: Accelerated fea- tures for lightweight image matching. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 2682–2691, 2024. 2, 3, 5, 6, 7

work page 2024
[34]

R2D2: Reliable and repeatable detec- tor and descriptor.Adv

Jerome Revaud, Cesar De Souza, Martin Humenberger, and Philippe Weinzaepfel. R2D2: Reliable and repeatable detec- tor and descriptor.Adv. Neural Inf. Process. Syst., 32, 2019. 2

work page 2019
[35]

Faster and better: A machine learning approach to corner detection

Edward Rosten, Reid Porter, and Tom Drummond. Faster and better: A machine learning approach to corner detection. IEEE Trans. on Pattern Anal. and Mach. Intell., 32(1):105– 119, 2008. 2, 5

work page 2008
[36]

ORB: An efficient alternative to SIFT or SURF

Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary Bradski. ORB: An efficient alternative to SIFT or SURF. In Proc. IEEE/CVF Int. Conf. Comput. Vis., pages 2564–2571,

work page
[37]

S-TREK: Sequential translation and rotation equivariant keypoints for local fea- ture extraction

Emanuele Santellani, Christian Sormann, Mattia Rossi, An- dreas Kuhn, and Friedrich Fraundorfer. S-TREK: Sequential translation and rotation equivariant keypoints for local fea- ture extraction. InProc. IEEE/CVF Int. Conf. Comput. Vis., pages 9728–9737, 2023. 2

work page 2023
[38]

Gmm- ikrs: Gaussian mixture models for interpretable keypoint re- finement and scoring

Emanuele Santellani, Martin Zach, Christian Sormann, Mat- tia Rossi, Andreas Kuhn, and Friedrich Fraundorfer. Gmm- ikrs: Gaussian mixture models for interpretable keypoint re- finement and scoring. InProc. Eur. Conf. Comput. Vis., pages 77–93, 2024. 2

work page 2024
[39]

From coarse to fine: Robust hierarchical localization at large scale

Paul-Edouard Sarlin, Cesar Cadena, Roland Siegwart, and Marcin Dymczyk. From coarse to fine: Robust hierarchical localization at large scale. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 12716–12725, 2019. 1, 6

work page 2019
[40]

SuperGlue: Learning feature matching with graph neural networks

Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, and Andrew Rabinovich. SuperGlue: Learning feature matching with graph neural networks. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 4938–4947, 2020. 5, 6

work page 2020
[41]

Benchmarking 6DOF outdoor visual localization in changing conditions

Torsten Sattler, Will Maddern, Carl Toft, Akihiko Torii, Lars Hammarstrand, Erik Stenborg, Daniel Safari, Masatoshi Okutomi, Marc Pollefeys, Josef Sivic, et al. Benchmarking 6DOF outdoor visual localization in changing conditions. In Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 8601–8610, 2018. 6, 7

work page 2018
[42]

Quad-networks: unsupervised learning to rank for interest point detection

Nikolay Savinov, Akihito Seki, Lubor Ladicky, Torsten Sat- tler, and Marc Pollefeys. Quad-networks: unsupervised learning to rank for interest point detection. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 1822– 1830, 2017. 2

work page 2017
[43]

Structure- from-motion revisited

Johannes L Schonberger and Jan-Michael Frahm. Structure- from-motion revisited. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 4104–4113, 2016. 1, 2, 7

work page 2016
[44]

Comparative evaluation of hand-crafted and learned local features

Johannes L Schonberger, Hans Hardmeier, Torsten Sattler, and Marc Pollefeys. Comparative evaluation of hand-crafted and learned local features. InProc. IEEE/CVF Conf. Com- put. Vis. Pattern Recog., pages 1482–1491, 2017. 6, 7

work page 2017
[45]

Oriane Sim ´eoni, Huy V . V o, Maximilian Seitzer, Federico Baldassarre, Maxime Oquab, Cijo Jose, Vasil Khalidov, Marc Szafraniec, Seungeun Yi, Micha ¨el Ramamonjisoa, Francisco Massa, Daniel Haziza, Luca Wehrstedt, Jianyuan Wang, Timoth´ee Darcet, Th´eo Moutakanni, Leonel Sentana, Claire Roberts, Andrea Vedaldi, Jamie Tolan, John Brandt, Camille Couprie,...

work page 2025
[46]

LoFTR: Detector-free local feature matching with transformers

Jiaming Sun, Zehong Shen, Yuang Wang, Hujun Bao, and Xiaowei Zhou. LoFTR: Detector-free local feature matching with transformers. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 8922–8931, 2021. 2

work page 2021
[47]

GLAM- points: Greedily learned accurate match points

Prune Truong, Stefanos Apostolopoulos, Agata Mosinska, Samuel Stucky, Carlos Ciller, and Sandro De Zanet. GLAM- points: Greedily learned accurate match points. InProc. IEEE/CVF Int. Conf. Comput. Vis., pages 10732–10741,

work page
[48]

DISK: Learning local features with policy gradient.Adv

Michał Tyszkiewicz, Pascal Fua, and Eduard Trulls. DISK: Learning local features with policy gradient.Adv. Neural Inf. Process. Syst., 33:14254–14265, 2020. 1, 2, 5, 6, 8

work page 2020
[49]

DUSt3R: Geometric 3D vision made easy

Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, and Jerome Revaud. DUSt3R: Geometric 3D vision made easy. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 20697–20709, 2024. 2

work page 2024
[50]

FeatureBooster: Boosting feature descriptors with a lightweight neural network

Xinjiang Wang, Zeyu Liu, yu Hu, Wei Xi, Wenxian Yu, and Danping Zou. FeatureBooster: Boosting feature descriptors with a lightweight neural network. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., 2023. 1

work page 2023
[51]

Efficient LoFTR: Semi-dense local feature matching with sparse-like speed

Yifan Wang, Xingyi He, Sida Peng, Dongli Tan, and Xiaowei Zhou. Efficient LoFTR: Semi-dense local feature matching with sparse-like speed. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 21666–21675, 2024. 2

work page 2024
[52]

Tree-based morse regions: A topological approach to local feature detection.IEEE Trans

Yongchao Xu, Pascal Monasse, Thierry G´eraud, and Laurent Najman. Tree-based morse regions: A topological approach to local feature detection.IEEE Trans. Image Process., 23 (12):5612–5625, 2014. 2

work page 2014
[53]

iSimLoc: Visual global local- ization for previously unseen environments with simulated images.IEEE Trans

Peng Yin, Ivan Cisneros, Shiqi Zhao, Ji Zhang, Howie Choset, and Sebastian Scherer. iSimLoc: Visual global local- ization for previously unseen environments with simulated images.IEEE Trans. Robot., 39(3):1893–1909, 2023. 1

work page 1909
[54]

ALIKE: Accurate and lightweight keypoint detection and descriptor extraction

Xiaoming Zhao, Xingming Wu, Jinyu Miao, Weihai Chen, Peter CY Chen, and Zhengguo Li. ALIKE: Accurate and lightweight keypoint detection and descriptor extraction. IEEE Trans. on Multimedia, 25:3101–3112, 2022. 6

work page 2022
[55]

ALIKED: A lighter keypoint and descriptor extraction network via deformable transformation.IEEE Trans

Xiaoming Zhao, Xingming Wu, Weihai Chen, Peter CY Chen, Qingsong Xu, and Zhengguo Li. ALIKED: A lighter keypoint and descriptor extraction network via deformable transformation.IEEE Trans. Instrum. Meas., 72:1–16, 2023. 1, 2, 3, 5, 6

work page 2023

[1] [1]

Three things ev- eryone should know to improve object retrieval

Relja Arandjelovi ´c and Andrew Zisserman. Three things ev- eryone should know to improve object retrieval. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 2911– 2918, 2012. 7

work page 2012

[2] [2]

Scale-free image keypoints using differentiable persistent homology

Giovanni Barbarani, Francesco Vaccarino, Gabriele Trivi- gno, Marco Guerra, Gabriele Berton, and Carlo Masone. Scale-free image keypoints using differentiable persistent homology. InProc. Int. Conf. Mach. Learn., pages 2990– 3002, 2024. 2

work page 2024

[3] [3]

Reinforced feature points: Optimizing feature detection and description for a high-level task

Aritra Bhowmik, Stefan Gumhold, Carsten Rother, and Eric Brachmann. Reinforced feature points: Optimizing feature detection and description for a high-level task. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 4948– 4957, 2020. 1, 2

work page 2020

[4] [4]

The OpenCV library.Dr

Gary Bradski. The OpenCV library.Dr. Dobb’s Journal: Software Tools for the Professional Programmer, 25(11): 120–123, 2000. 5

work page 2000

[5] [5]

RDD: Robust feature de- tector and descriptor using deformable transformer

Gonglin Chen, Tianwen Fu, Haiwei Chen, Wenbin Teng, Hanyuan Xiao, and Yajie Zhao. RDD: Robust feature de- tector and descriptor using deformable transformer. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 6394– 6403, 2025. 1, 2, 3, 5, 6, 7, 8

work page 2025

[6] [6]

Orbeez-SLAM: A real-time monocular visual slam with orb features and nerf- realized mapping

Chi-Ming Chung, Yang-Che Tseng, Ya-Ching Hsu, Xiang- Qian Shi, Yun-Hung Hua, Jia-Fong Yeh, Wen-Chin Chen, Yi-Ting Chen, and Winston H Hsu. Orbeez-SLAM: A real-time monocular visual slam with orb features and nerf- realized mapping. InProc. IEEE Int. Conf. Robot. Autom., pages 9400–9406, 2023. 1

work page 2023

[7] [7]

SIPs: Succinct interest points from unsuper- vised inlierness probability learning

Titus Cieslewski, Konstantinos G Derpanis, and Davide Scaramuzza. SIPs: Succinct interest points from unsuper- vised inlierness probability learning. InInternational Con- ference on 3D Vision, pages 604–613, 2019. 2

work page 2019

[8] [8]

ScanNet: Richly-annotated 3D reconstructions of indoor scenes

Angela Dai, Angel X Chang, Manolis Savva, Maciej Hal- ber, Thomas Funkhouser, and Matthias Nießner. ScanNet: Richly-annotated 3D reconstructions of indoor scenes. In Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 5828–5839, 2017. 5, 6

work page 2017

[9] [9]

SuperPoint: Self-supervised interest point detection and description

Daniel DeTone, Tomasz Malisiewicz, and Andrew Rabi- novich. SuperPoint: Self-supervised interest point detection and description. InProc. of IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, pages 224–236, 2018. 1, 2, 5, 6, 7

work page 2018

[10] [10]

D2- Net: A trainable cnn for joint description and detection of local features

Mihai Dusmanu, Ignacio Rocco, Tomas Pajdla, Marc Polle- feys, Josef Sivic, Akihiko Torii, and Torsten Sattler. D2- Net: A trainable cnn for joint description and detection of local features. InProc. IEEE/CVF Conf. Comput. Vis. Pat- tern Recog., pages 8092–8101, 2019. 2, 7

work page 2019

[11] [11]

DeDoDe v2: Analyzing and Improving the DeDoDe Keypoint Detec- tor

Johan Edstedt, Georg B ¨okman, and Zhenjun Zhao. DeDoDe v2: Analyzing and Improving the DeDoDe Keypoint Detec- tor. InProc. of IEEE Conf. Comput. Vis. Pattern Recognit. Workshops, 2024. 1, 2, 3, 5, 6, 8

work page 2024

[12] [12]

RoMa: Robust dense feature matching

Johan Edstedt, Qiyu Sun, Georg B ¨okman, M ˚arten Wadenb¨ack, and Michael Felsberg. RoMa: Robust dense feature matching. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 19790–19800, 2024. 2

work page 2024

[13] [13]

DaD: Distilled reinforcement learn- ing for diverse keypoint detection.arXiv preprint arXiv:2503.07347, 2025

Johan Edstedt, Georg B ¨okman, M ˚arten Wadenb ¨ack, and Michael Felsberg. DaD: Distilled reinforcement learn- ing for diverse keypoint detection.arXiv preprint arXiv:2503.07347, 2025. 2

work page arXiv 2025

[14] [14]

Are we ready for autonomous driving? the KITTI vision benchmark suite

Andreas Geiger, Philip Lenz, and Raquel Urtasun. Are we ready for autonomous driving? the KITTI vision benchmark suite. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 3354–3361, 2012. 7, 8

work page 2012

[15] [15]

SiLK: Simple learned keypoints

Pierre Gleize, Weiyao Wang, and Matt Feiszli. SiLK: Simple learned keypoints. InProc. IEEE/CVF Int. Conf. Comput. Vis., pages 22499–22508, 2023. 1, 2

work page 2023

[16] [16]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 770– 778, 2016. 8

work page 2016

[17] [17]

OmniGlue: Generalizable feature match- ing with foundation model guidance

Hanwen Jiang, Arjun Karpur, Bingyi Cao, Qixing Huang, and Andre Araujo. OmniGlue: Generalizable feature match- ing with foundation model guidance. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., 2024. 8

work page 2024

[18] [18]

Learn- ing to make keypoints sub-pixel accurate

Shinjeong Kim, Marc Pollefeys, and Daniel Barath. Learn- ing to make keypoints sub-pixel accurate. InProc. Eur. Conf. Comput. Vis., pages 413–431, 2024. 2

work page 2024

[19] [19]

RIPE: Reinforcement learning on unlabeled image pairs for robust keypoint extraction

Johannes K ¨unzel, Anna Hilsmann, and Peter Eisert. RIPE: Reinforcement learning on unlabeled image pairs for robust keypoint extraction. InProc. IEEE/CVF Int. Conf. Comput. Vis., 2025. 1, 2, 5, 6, 7, 8

work page 2025

[20] [20]

Self- supervised equivariant learning for oriented keypoint detec- tion

Jongmin Lee, Byungjin Kim, and Minsu Cho. Self- supervised equivariant learning for oriented keypoint detec- tion. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 4847–4857, 2022. 2

work page 2022

[21] [21]

Ground- ing image matching in 3d with MASt3R

Vincent Leroy, Yohann Cabon, and J´erˆome Revaud. Ground- ing image matching in 3d with MASt3R. InProc. Eur. Conf. Comput. Vis., pages 71–91, 2024. 2

work page 2024

[22] [22]

Decoupling makes weakly supervised local feature better

Kunhong Li, Longguang Wang, Li Liu, Qing Ran, Kai Xu, and Yulan Guo. Decoupling makes weakly supervised local feature better. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 15838–15848, 2022. 7

work page 2022

[23] [23]

MegaDepth: Learning single-view depth prediction from internet photos

Zhengqi Li and Noah Snavely. MegaDepth: Learning single-view depth prediction from internet photos. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 2041– 2050, 2018. 1, 2, 3, 5, 6, 8

work page 2041

[24] [24]

LightGlue: Local feature matching at light speed

Philipp Lindenberger, Paul-Edouard Sarlin, and Marc Polle- feys. LightGlue: Local feature matching at light speed. InProc. IEEE/CVF Int. Conf. Comput. Vis., pages 17627– 17638, 2023. 5, 6

work page 2023

[25] [25]

LiftFeat: 3D geometry-aware local feature matching

Yepeng Liu, Wenpeng Lai, Zhou Zhao, Yuxuan Xiong, Jinchi Zhu, Jun Cheng, and Yongchao Xu. LiftFeat: 3D geometry-aware local feature matching. InProc. IEEE Int. Conf. Robot. Autom., pages 11714–11720, 2025. 1

work page 2025

[26] [26]

Distinctive image features from scale- invariant keypoints.Int

David G Lowe. Distinctive image features from scale- invariant keypoints.Int. J. Comput. Vis., 60:91–110, 2004. 2, 5

work page 2004

[27] [27]

ASLFeat: Learning local features of accurate shape and lo- calization

Zixin Luo, Lei Zhou, Xuyang Bai, Hongkai Chen, Jiahui Zhang, Yao Yao, Shiwei Li, Tian Fang, and Long Quan. ASLFeat: Learning local features of accurate shape and lo- calization. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 6589–6598, 2020. 7

work page 2020

[28] [28]

Robust wide-baseline stereo from maximally stable ex- tremal regions.Image and vision computing, 22(10):761– 767, 2004

Jiri Matas, Ondrej Chum, Martin Urban, and Tom ´as Pa- jdla. Robust wide-baseline stereo from maximally stable ex- tremal regions.Image and vision computing, 22(10):761– 767, 2004. 2

work page 2004

[29] [29]

ORB-SLAM2: An open- source slam system for monocular, stereo, and RGB-D cam- eras.IEEE Trans

Raul Mur-Artal and Juan D Tard´os. ORB-SLAM2: An open- source slam system for monocular, stereo, and RGB-D cam- eras.IEEE Trans. Robot., 33(5):1255–1262, 2017. 1

work page 2017

[30] [30]

DINOv2: Learning robust visual features without supervision.Transactions on Machine Learning Research,

Maxime Oquab, Timoth ´ee Darcet, Th´eo Moutakanni, Huy V V o, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel HAZIZA, Francisco Massa, Alaaeldin El-Nouby, et al. DINOv2: Learning robust visual features without supervision.Transactions on Machine Learning Research,

work page

[31] [31]

Ness-st: Detecting good and stable keypoints with a neural stability score and the Shi-Tomasi detector

Konstantin Pakulev, Alexander Vakhitov, and Gonzalo Fer- rer. Ness-st: Detecting good and stable keypoints with a neural stability score and the Shi-Tomasi detector. InProc. IEEE/CVF Int. Conf. Comput. Vis., pages 9578–9588, 2023. 2

work page 2023

[32] [32]

Enhancing deformable lo- cal features by jointly learning to detect and describe key- points

Guilherme Potje, Felipe Cadar, Andr ´e Araujo, Renato Mar- tins, and Erickson R Nascimento. Enhancing deformable lo- cal features by jointly learning to detect and describe key- points. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 1306–1315, 2023. 2

work page 2023

[33] [33]

XFeat: Accelerated fea- tures for lightweight image matching

Guilherme Potje, Felipe Cadar, Andr ´e Araujo, Renato Mar- tins, and Erickson R Nascimento. XFeat: Accelerated fea- tures for lightweight image matching. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 2682–2691, 2024. 2, 3, 5, 6, 7

work page 2024

[34] [34]

R2D2: Reliable and repeatable detec- tor and descriptor.Adv

Jerome Revaud, Cesar De Souza, Martin Humenberger, and Philippe Weinzaepfel. R2D2: Reliable and repeatable detec- tor and descriptor.Adv. Neural Inf. Process. Syst., 32, 2019. 2

work page 2019

[35] [35]

Faster and better: A machine learning approach to corner detection

Edward Rosten, Reid Porter, and Tom Drummond. Faster and better: A machine learning approach to corner detection. IEEE Trans. on Pattern Anal. and Mach. Intell., 32(1):105– 119, 2008. 2, 5

work page 2008

[36] [36]

ORB: An efficient alternative to SIFT or SURF

Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary Bradski. ORB: An efficient alternative to SIFT or SURF. In Proc. IEEE/CVF Int. Conf. Comput. Vis., pages 2564–2571,

work page

[37] [37]

S-TREK: Sequential translation and rotation equivariant keypoints for local fea- ture extraction

Emanuele Santellani, Christian Sormann, Mattia Rossi, An- dreas Kuhn, and Friedrich Fraundorfer. S-TREK: Sequential translation and rotation equivariant keypoints for local fea- ture extraction. InProc. IEEE/CVF Int. Conf. Comput. Vis., pages 9728–9737, 2023. 2

work page 2023

[38] [38]

Gmm- ikrs: Gaussian mixture models for interpretable keypoint re- finement and scoring

Emanuele Santellani, Martin Zach, Christian Sormann, Mat- tia Rossi, Andreas Kuhn, and Friedrich Fraundorfer. Gmm- ikrs: Gaussian mixture models for interpretable keypoint re- finement and scoring. InProc. Eur. Conf. Comput. Vis., pages 77–93, 2024. 2

work page 2024

[39] [39]

From coarse to fine: Robust hierarchical localization at large scale

Paul-Edouard Sarlin, Cesar Cadena, Roland Siegwart, and Marcin Dymczyk. From coarse to fine: Robust hierarchical localization at large scale. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 12716–12725, 2019. 1, 6

work page 2019

[40] [40]

SuperGlue: Learning feature matching with graph neural networks

Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, and Andrew Rabinovich. SuperGlue: Learning feature matching with graph neural networks. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 4938–4947, 2020. 5, 6

work page 2020

[41] [41]

Benchmarking 6DOF outdoor visual localization in changing conditions

Torsten Sattler, Will Maddern, Carl Toft, Akihiko Torii, Lars Hammarstrand, Erik Stenborg, Daniel Safari, Masatoshi Okutomi, Marc Pollefeys, Josef Sivic, et al. Benchmarking 6DOF outdoor visual localization in changing conditions. In Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 8601–8610, 2018. 6, 7

work page 2018

[42] [42]

Quad-networks: unsupervised learning to rank for interest point detection

Nikolay Savinov, Akihito Seki, Lubor Ladicky, Torsten Sat- tler, and Marc Pollefeys. Quad-networks: unsupervised learning to rank for interest point detection. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 1822– 1830, 2017. 2

work page 2017

[43] [43]

Structure- from-motion revisited

Johannes L Schonberger and Jan-Michael Frahm. Structure- from-motion revisited. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 4104–4113, 2016. 1, 2, 7

work page 2016

[44] [44]

Comparative evaluation of hand-crafted and learned local features

Johannes L Schonberger, Hans Hardmeier, Torsten Sattler, and Marc Pollefeys. Comparative evaluation of hand-crafted and learned local features. InProc. IEEE/CVF Conf. Com- put. Vis. Pattern Recog., pages 1482–1491, 2017. 6, 7

work page 2017

[45] [45]

Oriane Sim ´eoni, Huy V . V o, Maximilian Seitzer, Federico Baldassarre, Maxime Oquab, Cijo Jose, Vasil Khalidov, Marc Szafraniec, Seungeun Yi, Micha ¨el Ramamonjisoa, Francisco Massa, Daniel Haziza, Luca Wehrstedt, Jianyuan Wang, Timoth´ee Darcet, Th´eo Moutakanni, Leonel Sentana, Claire Roberts, Andrea Vedaldi, Jamie Tolan, John Brandt, Camille Couprie,...

work page 2025

[46] [46]

LoFTR: Detector-free local feature matching with transformers

Jiaming Sun, Zehong Shen, Yuang Wang, Hujun Bao, and Xiaowei Zhou. LoFTR: Detector-free local feature matching with transformers. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 8922–8931, 2021. 2

work page 2021

[47] [47]

GLAM- points: Greedily learned accurate match points

Prune Truong, Stefanos Apostolopoulos, Agata Mosinska, Samuel Stucky, Carlos Ciller, and Sandro De Zanet. GLAM- points: Greedily learned accurate match points. InProc. IEEE/CVF Int. Conf. Comput. Vis., pages 10732–10741,

work page

[48] [48]

DISK: Learning local features with policy gradient.Adv

Michał Tyszkiewicz, Pascal Fua, and Eduard Trulls. DISK: Learning local features with policy gradient.Adv. Neural Inf. Process. Syst., 33:14254–14265, 2020. 1, 2, 5, 6, 8

work page 2020

[49] [49]

DUSt3R: Geometric 3D vision made easy

Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, and Jerome Revaud. DUSt3R: Geometric 3D vision made easy. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 20697–20709, 2024. 2

work page 2024

[50] [50]

FeatureBooster: Boosting feature descriptors with a lightweight neural network

Xinjiang Wang, Zeyu Liu, yu Hu, Wei Xi, Wenxian Yu, and Danping Zou. FeatureBooster: Boosting feature descriptors with a lightweight neural network. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., 2023. 1

work page 2023

[51] [51]

Efficient LoFTR: Semi-dense local feature matching with sparse-like speed

Yifan Wang, Xingyi He, Sida Peng, Dongli Tan, and Xiaowei Zhou. Efficient LoFTR: Semi-dense local feature matching with sparse-like speed. InProc. IEEE/CVF Conf. Comput. Vis. Pattern Recog., pages 21666–21675, 2024. 2

work page 2024

[52] [52]

Tree-based morse regions: A topological approach to local feature detection.IEEE Trans

Yongchao Xu, Pascal Monasse, Thierry G´eraud, and Laurent Najman. Tree-based morse regions: A topological approach to local feature detection.IEEE Trans. Image Process., 23 (12):5612–5625, 2014. 2

work page 2014

[53] [53]

iSimLoc: Visual global local- ization for previously unseen environments with simulated images.IEEE Trans

Peng Yin, Ivan Cisneros, Shiqi Zhao, Ji Zhang, Howie Choset, and Sebastian Scherer. iSimLoc: Visual global local- ization for previously unseen environments with simulated images.IEEE Trans. Robot., 39(3):1893–1909, 2023. 1

work page 1909

[54] [54]

ALIKE: Accurate and lightweight keypoint detection and descriptor extraction

Xiaoming Zhao, Xingming Wu, Jinyu Miao, Weihai Chen, Peter CY Chen, and Zhengguo Li. ALIKE: Accurate and lightweight keypoint detection and descriptor extraction. IEEE Trans. on Multimedia, 25:3101–3112, 2022. 6

work page 2022

[55] [55]

ALIKED: A lighter keypoint and descriptor extraction network via deformable transformation.IEEE Trans

Xiaoming Zhao, Xingming Wu, Weihai Chen, Peter CY Chen, Qingsong Xu, and Zhengguo Li. ALIKED: A lighter keypoint and descriptor extraction network via deformable transformation.IEEE Trans. Instrum. Meas., 72:1–16, 2023. 1, 2, 3, 5, 6

work page 2023