From Pairs to Sequences: Track-Aware Policy Gradients for Keypoint Detection
Pith reviewed 2026-05-21 12:05 UTC · model grok-4.3
The pith
Keypoint detection trained via reinforcement learning on full image sequences produces more consistent long-term tracks than methods optimized only on image pairs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that an end-to-end RL agent called TraqPoint, guided by a track-aware reward that scores both consistency and distinctiveness of keypoints over multiple views, learns detectors whose output keypoints form higher-quality tracks when evaluated on sparse matching benchmarks.
What carries the argument
The track-aware reward inside the TraqPoint policy-gradient framework, which scores keypoints jointly across sequence views for consistency and distinctiveness.
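As a rough illustration of what a reward of this kind might look like, the sketch below scores one candidate track from its per-view descriptors. The function name `track_reward`, the cosine-similarity measure, the margin-based distinctiveness term, and the weight `lam` are all assumptions made for illustration; the paper's actual reward definition is not reproduced on this page.

```python
import numpy as np

def track_reward(track_desc, visible, distractor_desc, lam=0.5):
    """Hypothetical track-aware reward for a single keypoint track.

    track_desc:      (K, D) unit descriptors of the keypoint in K views
    visible:         (K,) bool, True where the keypoint was re-detected
    distractor_desc: (M, D) unit descriptors of other keypoints, M >= 1
    lam:             assumed weight between the two terms
    """
    k = len(visible)
    # Consistency: fraction of views in which the keypoint was re-detected.
    consistency = visible.sum() / k

    seen = track_desc[visible]
    n = len(seen)
    if n < 2:
        return lam * consistency  # no pair of views to compare

    # Mean cosine similarity between different views of the same point.
    sim = seen @ seen.T
    same = (sim.sum() - np.trace(sim)) / (n * (n - 1))

    # Highest similarity to any other keypoint (the hardest distractor).
    hardest = (seen @ distractor_desc.T).max()

    # Distinctiveness: margin between same-track and distractor similarity.
    distinctiveness = same - hardest
    return lam * consistency + (1.0 - lam) * distinctiveness
```

Under these assumptions, a point that is re-detected in every view and whose descriptor stays far from all distractors receives the highest score, while a point that disappears in half the views loses half of its consistency term.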
If this is right
- Relative pose estimation accuracy increases on standard sparse matching test sets.
- 3D reconstruction completeness and accuracy improve on the same benchmarks.
- The learned detectors require no separate tuning or descriptor retraining to achieve the reported gains.
- Keypoints selected by the policy remain effective under large viewpoint and illumination shifts typical of real sequences.
Where Pith is reading between the lines
- The same sequential reward idea could be applied to other vision tasks that depend on persistent features over time, such as long-term object tracking.
- Training directly on sequences may reduce reliance on hand-designed post-processing heuristics that current pair-based methods use to enforce consistency.
- If the reward can be computed from cheap geometric proxies, the approach might scale to very long video streams without dense ground-truth tracks.
Load-bearing premise
The reward signal that rewards consistency and distinctiveness across multiple views correctly measures long-term track quality and produces improvements that transfer without dataset-specific adjustments.
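One way to probe this premise is to check whether keypoints that earn a high training reward also survive longer on sequences much longer than those used in training. The sketch below is a minimal version of such a check; the helper names, the use of the longest unbroken detection run as the track-length measure, and the choice of Pearson correlation are assumptions, not the paper's protocol.

```python
import numpy as np

def longest_run(detected):
    """Length of the longest unbroken run of detections in one track."""
    best = run = 0
    for d in detected:
        run = run + 1 if d else 0
        best = max(best, run)
    return best

def reward_track_correlation(rewards, detections):
    """Pearson correlation between per-keypoint reward and track length.

    rewards:    (N,) reward assigned to each keypoint during training
    detections: (N, T) bool, re-detection of each keypoint over T frames,
                with T assumed much larger than the training sequence length
    """
    lengths = np.array([longest_run(row) for row in detections], dtype=float)
    return np.corrcoef(rewards, lengths)[0, 1]
```

A correlation near zero on held-out long sequences would suggest that the reward mostly captures short-term signals.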
What would settle it
Run the same evaluation benchmarks on a detector trained only on pairs but given an auxiliary loss that explicitly penalizes track breaks; if performance remains equal or better than TraqPoint, the necessity of the full sequence RL formulation is challenged.
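The auxiliary loss in this proposed control experiment is not specified, so the following is only one plausible form of it. It assumes each keypoint has a per-view detection probability and penalizes the expected number of "detected, then missed" transitions between consecutive views; the function name and this particular formulation are hypothetical.

```python
import numpy as np

def track_break_penalty(p):
    """Expected number of track breaks per keypoint.

    p: (N, K) detection probabilities in [0, 1], one row per keypoint and
       one column per consecutive view. Hard 0/1 flags are also accepted,
       in which case the result is the exact average break count.
    """
    p = np.asarray(p, dtype=float)
    # A break between views t and t+1: detected at t, missed at t+1.
    # Treating the two detections as independent is an assumption.
    breaks = p[:, :-1] * (1.0 - p[:, 1:])
    return breaks.sum(axis=1).mean()
```

Because the penalty is a product of probabilities, it would remain differentiable if ported to an autodiff framework, which is what a pair-trained baseline would need in order to use it as a loss term.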
Original abstract
Keypoint-based matching is a fundamental component of modern 3D vision systems, such as Structure-from-Motion (SfM) and SLAM. Most existing learning-based methods are trained on image pairs, a paradigm that fails to explicitly optimize for the long-term trackability of keypoints across sequences under challenging viewpoint and illumination changes. In this paper, we reframe keypoint detection as a sequential decision-making problem. We introduce TraqPoint, a novel, end-to-end Reinforcement Learning (RL) framework designed to optimize the Track-quality (Traq) of keypoints directly on image sequences. Our core innovation is a track-aware reward mechanism that jointly encourages the consistency and distinctiveness of keypoints across multiple views, guided by a policy gradient method. Extensive evaluations on sparse matching benchmarks, including relative pose estimation and 3D reconstruction, demonstrate that TraqPoint significantly outperforms some state-of-the-art (SOTA) keypoint detection and description methods. The code will be available at https://github.com/xiaomi-research/traqpoint.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper reframes keypoint detection as a sequential RL problem and introduces TraqPoint, an end-to-end framework that optimizes a track-aware reward (joint consistency and distinctiveness across multiple views) via policy gradients on image sequences. It claims this yields keypoints with superior long-term trackability, leading to significant outperformance over SOTA methods on sparse matching benchmarks including relative pose estimation and 3D reconstruction.
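For readers less familiar with policy gradients, the sketch below shows the generic REINFORCE estimator for a categorical policy over candidate keypoint locations. It is a textbook formulation, not TraqPoint's training loop: the flat logit vector, the `reward_fn` callback, and the sample-mean baseline are simplifying assumptions.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def reinforce_grad(logits, reward_fn, n_samples=64, rng=None):
    """Monte Carlo policy-gradient estimate for a categorical keypoint policy.

    logits:    (P,) unnormalized scores over P candidate locations
    reward_fn: maps a sampled location index to a scalar reward
    Returns an estimate of d E[reward] / d logits.
    """
    rng = np.random.default_rng() if rng is None else rng
    probs = softmax(logits)
    picks = rng.choice(len(logits), size=n_samples, p=probs)
    rewards = np.array([reward_fn(i) for i in picks], dtype=float)
    # Sample-mean baseline: reduces variance at the cost of a small bias.
    baseline = rewards.mean()

    grad = np.zeros_like(logits, dtype=float)
    for i, r in zip(picks, rewards):
        # Gradient of log softmax at index i is onehot(i) - probs.
        score = -probs.copy()
        score[i] += 1.0
        grad += (r - baseline) * score
    return grad / n_samples
```

Ascending this gradient raises the probability of locations whose sampled reward beats the baseline, which is how a reward computed over a whole sequence could shape a detector without being differentiable itself.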
Significance. If the central claim holds, the shift from pair-wise to sequence-aware training via RL could improve robustness of learned keypoints under viewpoint and illumination changes, with potential impact on SfM and SLAM pipelines. The explicit optimization of multi-view track quality is a promising direction, though its advantage over pair-wise baselines requires clear isolation from architecture or data effects.
major comments (3)
- [§3.3] §3.3, Reward Definition: The track-aware reward combines consistency and distinctiveness over fixed-length sequences; without explicit analysis of sequence length K or how the reward correlates with true long-horizon track quality (vs. short-term signals), it is unclear whether the sequential formulation delivers the claimed generalizable advantage or reduces to heuristics achievable by pair-wise training.
- [§5.2] §5.2, Ablation Studies: The experiments report outperformance on pose estimation and reconstruction but lack controls isolating the contribution of the track-aware sequential reward from the underlying detector architecture or training data statistics. This is load-bearing for attributing gains to the policy-gradient formulation rather than other factors.
- [§4.1] §4.1, Hyperparameter Handling: The weighting between consistency and distinctiveness is a free parameter; the manuscript provides no sensitivity analysis or dataset-independent selection procedure, raising the risk that reported improvements incorporate dataset-specific tuning and undermine the generalizability claim.
minor comments (2)
- [Abstract] Abstract: The phrasing 'significantly outperforms some state-of-the-art' is imprecise; include the specific competing methods and key quantitative margins (e.g., AUC or reconstruction error deltas) to strengthen the summary.
- [§2] §2, Related Work: Several recent sequence-modeling or RL-for-vision papers are not cited; adding them would better contextualize the novelty of the track-aware policy gradient approach.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We address each major point below and have revised the manuscript to incorporate additional analysis and experiments where needed.
- Referee: [§3.3] §3.3, Reward Definition: The track-aware reward combines consistency and distinctiveness over fixed-length sequences; without explicit analysis of sequence length K or how the reward correlates with true long-horizon track quality (vs. short-term signals), it is unclear whether the sequential formulation delivers the claimed generalizable advantage or reduces to heuristics achievable by pair-wise training.
  Authors: We agree that further analysis of sequence length K and its relation to long-horizon tracking is important. In the revised manuscript we have expanded §3.3 with experiments varying K from 3 to 10 and a correlation analysis between the reward signal and long-term metrics such as average track length and repeatability over extended sequences. These results indicate that moderate sequence lengths improve sustained trackability beyond what pair-wise training achieves.
  Revision: yes
- Referee: [§5.2] §5.2, Ablation Studies: The experiments report outperformance on pose estimation and reconstruction but lack controls isolating the contribution of the track-aware sequential reward from the underlying detector architecture or training data statistics. This is load-bearing for attributing gains to the policy-gradient formulation rather than other factors.
  Authors: We acknowledge the need for stronger isolation of the reward contribution. The revised §5.2 now includes additional ablations that (i) compare the full sequence-aware model against an identical architecture trained with pair-wise rewards on the same data, (ii) disable the track-aware terms while retaining the RL framework, and (iii) vary training data statistics. The new results attribute a substantial portion of the gains specifically to the track-aware policy-gradient formulation.
  Revision: yes
- Referee: [§4.1] §4.1, Hyperparameter Handling: The weighting between consistency and distinctiveness is a free parameter; the manuscript provides no sensitivity analysis or dataset-independent selection procedure, raising the risk that reported improvements incorporate dataset-specific tuning and undermine the generalizability claim.
  Authors: We agree that sensitivity analysis was missing. In the revised manuscript we have added a sensitivity study in §4.1 evaluating the consistency-distinctiveness weight over a wide range on multiple datasets. Performance remains stable within a practical interval, and we now describe a validation-set procedure for selecting the weight that does not rely on test-set information.
  Revision: yes
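The validation-set procedure mentioned in the last response is not described on this page. A minimal sketch of what such a procedure could look like is given below; the function name, the averaging across datasets, and the `tolerance` used to define a stable range are all assumptions.

```python
import numpy as np

def select_weight(candidates, val_scores, tolerance=0.01):
    """Pick a reward weight from validation scores only.

    candidates: (W,) candidate consistency/distinctiveness weights
    val_scores: (W, S) validation metric (higher is better) for each
                weight on each of S validation datasets
    tolerance:  assumed slack for calling a weight "stable"
    Returns the chosen weight and the range spanned by all weights whose
    mean score is within `tolerance` of the best.
    """
    candidates = np.asarray(candidates, dtype=float)
    mean_score = np.asarray(val_scores, dtype=float).mean(axis=1)
    best = int(np.argmax(mean_score))
    stable = candidates[mean_score >= mean_score[best] - tolerance]
    return candidates[best], (stable.min(), stable.max())
```

A wide stable range across several datasets would support the generalizability claim; a narrow one that shifts between datasets would indicate that the weight needs per-dataset tuning.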
Circularity Check
No significant circularity in derivation chain
The paper reframes keypoint detection as a sequential RL task and defines a track-aware reward explicitly in terms of external multi-view consistency and distinctiveness measures. No equations reduce by construction to fitted inputs, no self-citation chains bear the central claim, and the reward is not a renaming or ansatz smuggled from prior self-work. The policy-gradient updates optimize an independently specified objective, making the reported gains on pose and reconstruction benchmarks attributable to the sequential formulation rather than tautological reuse of training signals.
Axiom & Free-Parameter Ledger
free parameters (1)
- Reward weighting between consistency and distinctiveness
axioms (1)
- Domain assumption: Keypoint selection can be cast as a sequential decision process whose quality is measurable by cross-view consistency and distinctiveness.
invented entities (1)
- TraqPoint framework (no independent evidence)