RBE-Flow: Recurrent Bayesian Estimation on Feature Manifolds for Cross-Modal Registration

Hongwei Ding; Mengzhu Ding; Xiaoke Ding; Xin Song; Xuecong Liu

arxiv: 2606.30492 · v1 · pith:YNSAKJ75new · submitted 2026-06-29 · 💻 cs.CV

Pith reviewed 2026-06-30 06:37 UTC · model grok-4.3

classification 💻 cs.CV

keywords cross-modal image registration · Bayesian estimation · feature manifolds · optical flow · uncertainty quantification · recurrent optimization · multi-sensor perception

The pith

RBE-Flow recasts cross-modal flow estimation as closed-loop recurrent Bayesian estimation on feature manifolds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to replace one-shot deterministic regression with a recurrent Bayesian process that alternates manifold optimization and probabilistic state updates. By generating flow observations and calibrated uncertainties in each cycle, then assimilating them through sigma-point projection while feeding posterior covariance back to damp subsequent steps, the loop is intended to navigate non-convex landscapes caused by radiometric and geometric mismatches. A geometry-aware rectified negative log-likelihood loss is introduced to keep variance estimates stable during training. The resulting system is shown to deliver higher accuracy than prior methods on three standard cross-modal benchmarks, especially when sub-pixel precision is required.

Core claim

The central claim is that dense cross-modal flow estimation can be solved as a self-correcting recurrent Bayesian estimation problem on learned feature manifolds. A Recurrent Manifold Optimization block produces observations and uncertainties, an Uncertainty-Adaptive Probabilistic Update assimilates them via deterministic sigma-point projection, and the resulting posterior covariance adaptively sets the damping of the next optimization step.

What carries the argument

The Recurrent Manifold Optimization (RMO) block iteratively produces flow observations with associated uncertainties, which are assimilated by the Uncertainty-Adaptive Probabilistic Update (UAPU) using deterministic sigma-point projection; the calibrated posterior covariance is fed back to regularize subsequent optimization steps.
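
The loop described above can be illustrated with a scalar toy model. This is a minimal sketch under stated assumptions, not the paper's implementation: the flow is reduced to one number, the RMO step is replaced by a hypothetical damped step toward a known optimum, the observation model `h` defaults to the identity, and `damping_from_variance` is an invented rule that only mirrors the stated behavior (more damping when the posterior variance is large).

```python
import math

def sigma_points(mean, var, kappa=2.0):
    """Return the three sigma points and weights of a scalar Gaussian."""
    spread = math.sqrt((1.0 + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (1.0 + kappa), 0.5 / (1.0 + kappa), 0.5 / (1.0 + kappa)]
    return points, weights

def sigma_point_update(prior_mean, prior_var, z, obs_var, h=lambda x: x):
    """Assimilate observation z (variance obs_var) into the prior state."""
    pts, w = sigma_points(prior_mean, prior_var)
    hz = [h(p) for p in pts]                      # project sigma points
    z_pred = sum(wi * hi for wi, hi in zip(w, hz))
    innov_var = sum(wi * (hi - z_pred) ** 2 for wi, hi in zip(w, hz)) + obs_var
    cross = sum(wi * (p - prior_mean) * (hi - z_pred)
                for wi, p, hi in zip(w, pts, hz))
    gain = cross / innov_var
    post_mean = prior_mean + gain * (z - z_pred)
    post_var = prior_var - gain * gain * innov_var
    return post_mean, post_var

def damping_from_variance(post_var, base=0.1, scale=1.0):
    """Hypothetical feedback rule: low confidence -> stronger damping."""
    return base + scale * post_var

def run_closed_loop(target, steps=5, mean=0.0, var=4.0, obs_var=1.0):
    """Alternate a damped toy optimization step with a sigma-point update."""
    lam = damping_from_variance(var)
    for _ in range(steps):
        # Stand-in for the RMO observation: damped step toward the optimum.
        z = mean + (target - mean) / (1.0 + lam)
        mean, var = sigma_point_update(mean, var, z, obs_var)
        lam = damping_from_variance(var)          # covariance feedback
    return mean, var, lam
```

With the identity observation model the update reduces to the ordinary Kalman update, which makes the toy easy to check by hand; a nonlinear `h` exercises the sigma-point projection proper.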

If this is right

The closed loop allows the optimizer to increase damping when predictive confidence is low and decrease it when high.
The method produces not only a flow field but also a spatially varying uncertainty map that reflects the posterior covariance after each update.
Training remains stable because the rectified NLL term prevents variance collapse while preserving geometric consistency.
Performance gains are largest under strict sub-pixel evaluation criteria on the tested remote-sensing and scene datasets.
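
To make the variance-collapse point concrete, the sketch below contrasts a plain Gaussian NLL with a clamped variant. It is an assumption standing in for the paper's loss: the geometry-aware term is not specified in the text available here, so only a generic log-variance clamp is shown, and the bounds `min_log_var` and `max_log_var` are illustrative values.

```python
import math

def plain_gaussian_nll(error, log_var):
    """Per-pixel Gaussian NLL (constant term dropped)."""
    return 0.5 * (math.exp(-log_var) * error ** 2 + log_var)

def rectified_gaussian_nll(error, log_var, min_log_var=-6.0, max_log_var=6.0):
    """Same NLL with the log-variance clamped to a fixed range.

    The clamp is a generic stabilizer, not the paper's rectification:
    it stops the loss from decreasing without bound as the predicted
    variance shrinks on pixels whose error is already near zero.
    """
    clamped = min(max(log_var, min_log_var), max_log_var)
    return 0.5 * (math.exp(-clamped) * error ** 2 + clamped)
```

For a zero-error pixel the plain loss keeps falling as `log_var` goes to minus infinity, whereas the clamped loss bottoms out at `0.5 * min_log_var`, so there is no incentive to push the variance below the floor.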

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same recurrent Bayesian structure could be applied to other vision tasks that currently rely on feed-forward regression over non-convex objectives.
If the sigma-point projection preserves calibration across modalities, the framework might supply reliable uncertainty estimates for downstream sensor-fusion pipelines.
Replacing the learned feature manifold with an explicit geometric manifold would test whether the Bayesian update itself, rather than the manifold learning, drives the improvement.

Load-bearing premise

The recurrent optimization steps produce uncertainties that remain meaningfully calibrated and can be fused without bias that the update rule cannot remove.

What would settle it

Disabling the covariance feedback to the optimization damping and observing no drop in sub-pixel accuracy on the same benchmarks would falsify the claim that the Bayesian loop is responsible for the reported gains.

Figures

Figures reproduced from arXiv: 2606.30492 by Hongwei Ding, Mengzhu Ding, Xiaoke Ding, Xin Song, Xuecong Liu.

**Figure 2.** Inlier correspondence visualization with state-of-the-art methods. [PITH_FULL_IMAGE:figures/full_fig_p012_2.png]

**Figure 3.** CMR curves under varying thresholds on (a) OSdataset, (b) WHU [PITH_FULL_IMAGE:figures/full_fig_p014_3.png]

Original abstract

Cross-modal image registration is essential for multi-sensor perception but remains fundamentally challenging due to severe non-linear radiometric discrepancies and geometric distortions. Existing deterministic matching methods lack uncertainty awareness, struggling to navigate the resulting highly non-convex optimization landscape and frequently accumulating errors in ambiguous regions. In this paper, we propose RBE-Flow, a novel framework that reformulates dense cross-modal flow estimation as a closed-loop recurrent Bayesian estimation problem on learned feature manifolds. Diverging from standard feed-forward regression, RBE-Flow establishes a robust self-correcting mechanism by deeply coupling feature-metric non-linear optimization with probabilistic state updates. Specifically, a Recurrent Manifold Optimization (RMO) block iteratively generates flow observations and their associated uncertainties, which are then optimally assimilated into the prior state via an Uncertainty-Adaptive Probabilistic Update (UAPU) using deterministic sigma-point projection. Crucially, the resulting calibrated posterior covariance is fed back to adaptively regularize the damping of subsequent optimization steps, allowing the system to modulate its convergence based on predictive confidence. To ensure stable probabilistic training, we introduce a hybrid supervision scheme featuring a geometry-aware rectified NLL loss that structurally prevents variance collapse. Extensive experiments on challenging OSdataset, WHU-OPT-SAR, and RoadScene benchmarks demonstrate that RBE-Flow consistently achieves state-of-the-art performance, outperforming existing methods by a significant margin, particularly under strict sub-pixel criteria. Project page: https://github.com/NEU-Liuxuecong/RBE-Flow

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

RBE-Flow adds a recurrent Bayesian loop to cross-modal registration, but the key calibration and bias claims rest on unverified assumptions.

The main thing here is a new named pipeline that turns dense cross-modal flow into a closed recurrent Bayesian estimation problem on feature manifolds. It couples a Recurrent Manifold Optimization block that produces flow observations plus uncertainties with an Uncertainty-Adaptive Probabilistic Update that assimilates them via sigma-point projection, then feeds the posterior covariance back to damp the next optimization step. A geometry-aware rectified NLL is added to keep training stable.

What the paper actually does is spell out this coupling and run it on three standard benchmarks (OSdataset, WHU-OPT-SAR, RoadScene). The abstract reports consistent gains over prior methods, especially at strict sub-pixel thresholds. That is concrete enough to be worth looking at if you work on multi-sensor registration.

The soft spot is exactly the one the stress-test flags. The performance margin is attributed to the self-correcting Bayesian loop, yet the description gives no reliability diagrams, ECE scores, or direct checks that the RMO uncertainties are calibrated or that the sigma-point assimilation stays bias-free. Without those, it is hard to know whether the gains come from the claimed mechanism or from the usual feature-matching and loss-engineering tricks. The hybrid supervision is described at a high level but not shown to be independent of the reported improvements.

This is the kind of paper that belongs in a reading group if your group cares about uncertainty-aware registration; the architecture is explicit and the benchmarks are public. It does not yet look like a finished result because the central probabilistic claims lack the supporting diagnostics. A serious editor should send it to review so the authors can supply the missing calibration evidence and any ablation that isolates the recurrent update from the rest of the pipeline.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes RBE-Flow, which reformulates dense cross-modal flow estimation as a closed-loop recurrent Bayesian estimation problem on learned feature manifolds. It introduces a Recurrent Manifold Optimization (RMO) block that iteratively generates flow observations and uncertainties, an Uncertainty-Adaptive Probabilistic Update (UAPU) that assimilates them via deterministic sigma-point projection, and covariance feedback to adaptively damp subsequent optimization steps. A hybrid supervision scheme with a geometry-aware rectified NLL loss is used for stable probabilistic training. Experiments on the OSdataset, WHU-OPT-SAR, and RoadScene benchmarks are reported to yield state-of-the-art performance, with particular gains under strict sub-pixel criteria.

Significance. If the RMO uncertainties prove calibrated and the sigma-point assimilation bias-free, the recurrent Bayesian loop with covariance feedback could provide a principled self-correcting mechanism for navigating non-convex landscapes induced by radiometric and geometric discrepancies in cross-modal registration. This would be a meaningful advance for uncertainty-aware multi-sensor perception, but the significance is currently conditional on verification that the performance margins arise from the claimed probabilistic components rather than other factors.

major comments (2)

[Abstract and method description] The central SOTA claim rests on RMO producing meaningfully calibrated flow observations and uncertainties, and on UAPU assimilating them via deterministic sigma-point projection without introducing uncorrectable bias. No reliability diagrams, ECE scores, or bias quantification are provided to confirm these conditions hold on the reported benchmarks; if they fail, the performance margin cannot be attributed to the self-correcting mechanism.
[§4 (Experiments)] Without empirical checks on uncertainty calibration or assimilation bias, it remains unclear whether the reported sub-pixel gains on OSdataset, WHU-OPT-SAR, and RoadScene are due to the recurrent Bayesian loop or to standard feature matching and training choices.
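
The kind of calibration check requested in these comments could look like the following sketch. It assumes per-pixel flow errors and predicted standard deviations are available as flat lists, and it uses interval coverage under a Gaussian assumption as a regression analogue of ECE; the function names and the synthetic data generator are hypothetical and not taken from the paper or its code.

```python
import random
from statistics import NormalDist

def coverage_curve(errors, sigmas, levels=(0.5, 0.68, 0.9, 0.95)):
    """For each nominal level p, measure how often |error| falls inside
    the central p-interval implied by the predicted sigma."""
    std_normal = NormalDist()
    curve = []
    for p in levels:
        half_width = std_normal.inv_cdf(0.5 + p / 2.0)  # two-sided quantile
        hits = sum(abs(e) <= half_width * s for e, s in zip(errors, sigmas))
        curve.append((p, hits / len(errors)))
    return curve

def calibration_error(curve):
    """Mean absolute gap between nominal and observed coverage."""
    return sum(abs(observed - p) for p, observed in curve) / len(curve)

def synthetic_errors(n, sigma_scale, seed=0):
    """Toy data: errors ~ N(0, s^2), reported sigma = sigma_scale * s.
    sigma_scale < 1 simulates an overconfident model."""
    rng = random.Random(seed)
    true_sigmas = [rng.uniform(0.5, 2.0) for _ in range(n)]
    errors = [rng.gauss(0.0, s) for s in true_sigmas]
    reported = [sigma_scale * s for s in true_sigmas]
    return errors, reported
```

Plotting the `(p, observed)` pairs from `coverage_curve` against the diagonal gives a reliability diagram; a curve below the diagonal indicates overconfident uncertainties.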

minor comments (2)

Notation for the rectified NLL and sigma-point projection could be clarified with explicit equations in the main text rather than relying solely on the method description.
[Abstract] The project page URL is referenced but not provided in the manuscript text.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We agree that additional empirical evidence is needed to substantiate the claims regarding uncertainty calibration and the bias-free assimilation in our proposed method. We will revise the manuscript accordingly to include these verifications.

Referee: [Abstract and method description] The central SOTA claim rests on RMO producing meaningfully calibrated flow observations and uncertainties, and on UAPU assimilating them via deterministic sigma-point projection without introducing uncorrectable bias. No reliability diagrams, ECE scores, or bias quantification are provided to confirm these conditions hold on the reported benchmarks; if they fail, the performance margin cannot be attributed to the self-correcting mechanism.

Authors: We acknowledge that the original manuscript does not include reliability diagrams, ECE scores, or explicit bias quantification for the RMO uncertainties and UAPU assimilation. The performance improvements are presented through comparative results on the benchmarks, but we concur that direct validation of the probabilistic components is essential for attributing the gains to the recurrent Bayesian mechanism. In the revised manuscript, we will add reliability diagrams and ECE scores for the flow uncertainties across the OSdataset, WHU-OPT-SAR, and RoadScene benchmarks. Additionally, we will provide a bias analysis by comparing the sigma-point projected updates against Monte Carlo approximations where feasible. This will confirm whether the conditions for the self-correcting mechanism hold.

Revision: yes

Referee: [§4 (Experiments)] Without empirical checks on uncertainty calibration or assimilation bias, it remains unclear whether the reported sub-pixel gains on OSdataset, WHU-OPT-SAR, and RoadScene are due to the recurrent Bayesian loop or to standard feature matching and training choices.

Authors: We agree that the experimental section would be strengthened by explicit checks isolating the contribution of the Bayesian components. The current results demonstrate SOTA performance, but without ablations on the UAPU or covariance feedback, the source of the sub-pixel gains is not fully isolated. We will include in the revision: calibration metrics as noted, and ablation studies comparing the full RBE-Flow against variants with deterministic updates or without covariance feedback. These additions will help clarify whether the gains stem from the recurrent Bayesian loop.

Revision: yes
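
The sigma-point versus Monte Carlo comparison mentioned in the first response can be sketched for a scalar state. This is an illustration under assumptions, not the authors' protocol: the nonlinearity `f` is an arbitrary test function, `kappa=2.0` is a conventional choice for a one-dimensional Gaussian, and the sample count and seed are arbitrary.

```python
import math
import random

def unscented_mean(f, mean, var, kappa=2.0):
    """Mean of f(x) for x ~ N(mean, var) via three sigma points."""
    spread = math.sqrt((1.0 + kappa) * var)
    w_center = kappa / (1.0 + kappa)
    w_side = 0.5 / (1.0 + kappa)
    return w_center * f(mean) + w_side * (f(mean + spread) + f(mean - spread))

def monte_carlo_mean(f, mean, var, n=100_000, seed=0):
    """Reference estimate of the same mean by random sampling."""
    rng = random.Random(seed)
    std = math.sqrt(var)
    return sum(f(rng.gauss(mean, std)) for _ in range(n)) / n

def projection_bias(f, mean, var):
    """Gap between the sigma-point mean and the Monte Carlo reference."""
    return unscented_mean(f, mean, var) - monte_carlo_mean(f, mean, var)
```

For `f = math.sin` the exact mean is `sin(mean) * exp(-var / 2)`, which gives a closed-form reference for both estimators; a gap that grows with `var` would indicate that the deterministic projection becomes biased when the prior is wide.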

Circularity Check

0 steps flagged

No circularity; derivation chain self-contained against external benchmarks

The abstract and method description introduce RMO for generating observations+uncertainties, UAPU assimilation via sigma-point projection, covariance feedback, and a rectified NLL loss to prevent collapse. None of these steps are shown (via equations or self-citation) to reduce by construction to fitted inputs or prior results; the hybrid loss is an explicit design choice for training stability, and SOTA claims rest on benchmark experiments rather than tautological re-labeling of fits. No load-bearing self-citation or uniqueness theorem is invoked. This is the normal case of an independent architectural proposal.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract only; no explicit free parameters, axioms, or invented entities are stated. The framework introduces named components (RMO, UAPU) whose internal assumptions cannot be audited from the given text.

Reference graph

Works this paper leans on

64 extracted references · 8 canonical work pages

[1]

In: International conference on machine learning

Amos, B., Kolter, J.Z.: Optnet: Differentiable optimization as a layer in neural networks. In: International conference on machine learning. pp. 136–145. PMLR (2017)

2017
[2]

In: European conference on computer vision

Chen, H., Luo, Z., Zhou, L., Tian, Y., Zhen, M., Fang, T., Mckinnon, D., Tsin, Y., Quan, L.: Aspanformer: Detector-free image matching with adaptive span trans- former. In: European conference on computer vision. pp. 20–36. Springer (2022)

2022
[3]

Medical Image Analysis100, 103385 (2025)

Chen, J., Liu, Y., Wei, S., Bian, Z., Subramanian, S., Carass, A., Prince, J.L., Du, Y.: A survey on deep learning in medical image registration: New technologies, uncertainty, evaluation metrics, and beyond. Medical Image Analysis100, 103385 (2025)

2025
[4]

IEEE Transactions on Instrumentation and Measurement 73, 1–16 (2024)

Dai, K., Wang, K., Xie, T., Sun, T., Zhang, J., Kong, Q., Jiang, Z., Li, R., Zhao, L., Omar, M.: Dsap: Dynamic sparse attention perception matcher for accurate local feature matching. IEEE Transactions on Instrumentation and Measurement 73, 1–16 (2024)

2024
[5]

In: Proc

DeTone, D., Malisiewicz, T., Rabinovich, A.: Superpoint: Self-supervised interest point detection and description. In: Proc. IEEE/CVF Conf. Comput. Vis. Pat- tern Recognit. Workshops. pp. 224–236 (2018).https://doi.org/10.1109/CVPRW. 2018.00060

work page doi:10.1109/cvprw 2018
[6]

In: Proceedings of the Computer Vision and Pattern Recognition Conference

Edstedt,J.,Mateus,A.,Jaenal,A.:Colabsfm:Collaborativestructure-from-motion by point cloud registration. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 6573–6583 (2025)

2025
[7]

In: Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition

Edstedt, J., Sun, Q., Bökman, G., Wadenbäck, M., Felsberg, M.: Roma: Robust dense feature matching. In: Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition. pp. 19790–19800 (2024)

2024
[8]

In: Proceedings of the IEEE conference on computer vision and pattern recognition

Gast, J., Roth, S.: Lightweight probabilistic deep networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 3369–3378 (2018)

2018
[9]

Nature629(8014), 1034–1040 (2024)

Gehrig, D., Scaramuzza, D.: Low-latency automotive vision with event cameras. Nature629(8014), 1034–1040 (2024)

2024
[10]

In: Proceedings of the AAAI conference on artificial intelligence

Giang, K.T., Song, S., Jo, S.: Topicfm: Robust and interpretable topic-assisted feature matching. In: Proceedings of the AAAI conference on artificial intelligence. vol. 37, pp. 2447–2455 (2023)

2023
[11]

arXiv preprint arXiv:2501.07556 (2025)

He, X., Yu, H., Peng, S., Tan, D., Shen, Z., Bao, H., Zhou, X.: Matchanything: Uni- versal cross-modality image matching with large-scale pre-training. arXiv preprint arXiv:2501.07556 (2025)

work page arXiv 2025
[12]

Information Fusion102, 102027 (2024)

Hou, Z., Liu, Y., Zhang, L.: Pos-gift: A geometric and intensity-invariant feature transformation for multimodal images. Information Fusion102, 102027 (2024)

2024
[13]

In: European Conference on Computer Vision

Hu, D., Peng, L., Chu, T., Zhang, X., Mao, Y., Bondell, H., Gong, M.: Uncertainty quantification in depth estimation via constrained ordinal regression. In: European Conference on Computer Vision. pp. 237–256. Springer (2022)

2022
[14]

In: European conference on computer vision

Huang, Z., Shi, X., Zhang, C., Wang, Q., Cheung, K.C., Qin, H., Dai, J., Li, H.: Flowformer: A transformer architecture for optical flow. In: European conference on computer vision. pp. 668–685. Springer (2022)

2022
[15]

IEEE Transactions on geoscience and remote sensing57(1), 574–586 (2018)

Ji, S., Wei, S., Lu, M.: Fully convolutional networks for multisource building ex- traction from an open aerial and satellite imagery data set. IEEE Transactions on geoscience and remote sensing57(1), 574–586 (2018)

2018
[16]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Jiang,H.,Karpur,A.,Cao,B.,Huang,Q.,Araujo,A.:Omniglue:Generalizablefea- ture matching with foundation model guidance. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 19865–19875 (2024) RBE-Flow for Cross-Modal Registration 17

2024
[17]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Jiang, H., Dang, Z., Wei, Z., Xie, J., Yang, J., Salzmann, M.: Robust outlier rejec- tion for 3d registration with variational bayes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 1148–1157 (June 2023)

2023
[18]

In: Proceedings of the IEEE/CVF international conference on computer vision

Jiang, S., Campbell, D., Lu, Y., Li, H., Hartley, R.: Learning to estimate hid- den motions with global motion aggregation. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 9772–9781 (2021)

2021
[19]

Information Fusion73, 22–71 (2021)

Jiang, X., Ma, J., Xiao, G., Shao, Z., Guo, X.: A review of multimodal image matching: Methods and applications. Information Fusion73, 22–71 (2021)

2021
[20]

In: European Conference on Computer Vision

Leroy, V., Cabon, Y., Revaud, J.: Grounding image matching in 3d with mast3r. In: European Conference on Computer Vision. pp. 71–91. Springer (2024)

2024
[21]

IEEE Transactions on Image Processing 29, 3296–3310 (2019)

Li, J., Hu, Q., Ai, M.: Rift: Multi-modal image matching based on radiation- variation insensitive feature transform. IEEE Transactions on Image Processing 29, 3296–3310 (2019)

2019
[22]

IEEE Transactions on Geoscience and Remote Sensing 60, 1–14 (2022)

Li, J., Hu, Q., Ai, M.: LNIFT: Locally normalized image for rotation invariant mul- timodal feature matching. IEEE Transactions on Geoscience and Remote Sensing 60, 1–14 (2022)

2022
[23]

arXiv (2023)

Li, J., Shi, P., Hu, Q., Zhang, Y.: Rift2: Speeding-up rift with a new rotation- invariance technique. arXiv (2023)

2023
[24]

Pattern Recognition158, 110972 (2025)

Li, W., Chen, Q., Gu, G., Sui, X.: Object matching of visible–infrared image based on attention mechanism and feature fusion. Pattern Recognition158, 110972 (2025)

2025
[25]

In: Proceedings of the Computer Vision and Pattern Recognition Conference

Li, X., Yang, W., Deng, J., Cheng, Z., Zhou, X., Zhang, T.: Implicit correspondence learning for image-to-point cloud registration. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 16922–16931 (2025)

2025
[26]

Advances in Neural Information Processing Systems37, 128430–128461 (2024)

Li, Y., Li, X., Li, W., Hou, Q., Liu, L., Cheng, M.M., Yang, J.: Sardet-100k: Towards open-source benchmark and toolkit for large-scale sar object detection. Advances in Neural Information Processing Systems37, 128430–128461 (2024)

2024
[27]

In: Proceedings of the IEEE/CVF international conference on com- puter vision

Lindenberger, P., Sarlin, P.E., Pollefeys, M.: Lightglue: Local feature matching at light speed. In: Proceedings of the IEEE/CVF international conference on com- puter vision. pp. 17627–17638 (2023)

2023
[28]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Liu, K., Yu, Y.: Revisiting the domain gap issue in non-cooperative spacecraft pose tracking. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 6864–6873 (2024)

2024
[29]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Liu, X., Ding, M., Sun, Z., Li, Z., Teng, X.: Crft: Consistent-recurrent feature flow transformer for cross-modal image registration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 34784–34794 (2026)

2026
[30]

Image and Vision Computing p

Liu, X., Kou, Y., Chen, L., Teng, X., Li, Z., Luo, J.: Multi-orientation multi-scale aggregated registration with cyclic back-projection for multimodal images. Image and Vision Computing p. 106089 (2026)

2026
[31]

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2026)

Liu, X., Sun, Z., Ding, H., Song, X., Zhang, S., Sun, Y.: Gaff: Global attention feature flow network for optical and sar image registration under geometric trans- formations. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2026)

2026
[32]

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing17, 18139– 18155 (2024)

Liu, X., Teng, X., Bian, Y., Li, Z., Yu, Q.: Shape-adaptive modality independent region descriptor for multimodal remote sensing image matching. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing17, 18139– 18155 (2024)

2024
[33]

IEEE Transactions on Instrumentation and Measurement71, 1–12 (2022) 18 M

Liu, X., Teng, X., Li, Z., Yu, Q., Bian, Y.: A fast algorithm for high accuracy airborne sar geolocation based on local linear approximation. IEEE Transactions on Instrumentation and Measurement71, 1–12 (2022) 18 M. Ding et al

2022
[34]

Chinese Journal of Aeronautics37(1), 271–286 (2024)

Liu, X., Teng, X., Luo, J., Li, Z., Yu, Q., Bian, Y.: Robust multi-sensor image matching based on normalized self-similarity region descriptor. Chinese Journal of Aeronautics37(1), 271–286 (2024)

2024
[35]

In: Proceedings of the Computer Vision and Pattern Recognition Conference

Mao, S., Lu, S., Du, Z., Jiao, L., Gou, S., Mou, L., Lu, X., Xiong, L., Zhang, Y.: Cross-rejective open-set sar image registration. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 23027–23036 (2025)

2025
[36]

In: Proceedings of the Computer Vision and Pattern Recognition Conference

Morimitsu, H., Zhu, X., Cesar, R.M., Ji, X., Yin, X.C.: Dpflow: Adaptive optical flow estimation with a dual-pyramid framework. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 17810–17820 (June 2025)

2025
[37]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Potje, G., Cadar, F., Araujo, A., Martins, R., Nascimento, E.R.: Xfeat: Acceler- ated features for lightweight image matching. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 2682–2691 (2024)

2024
[38]

In: Proceedings of the Computer Vision and Pattern Recognition Conference

Qin, H., Xu, T., Li, T., Chen, Z., Feng, T., Li, J.: Must: The first dataset and unified framework for multispectral uav single object tracking. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 16882–16891 (2025)

2025
[39]

Towards discriminability and diversity: Batch nuclear-norm maximization under label insufficient situations,

Sarlin, P.E., DeTone, D., Malisiewicz, T., Rabinovich, A.: Superglue: Learning feature matching with graph neural networks. In: Proc. IEEE/CVF Conf. Com- put. Vis. Pattern Recognit. pp. 4938–4947 (2020).https://doi.org/10.1109/ CVPR42600.2020.00499

work page arXiv 2020
[40]

In: Proceedings of the Computer Vision and Pattern Recognition Conference

Schusterbauer, J., Gui, M., Fundel, F., Ommer, B.: Diff2flow: Training flow match- ing models via diffusion model alignment. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 28347–28357 (2025)

2025
[41]

In: IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021, virtual, June 19-25, 2021

Sun, J., Shen, Z., Wang, Y., Bao, H., Zhou, X.: Loftr: Detector-free local feature matching with transformers. In: Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. pp. 8922–8931 (2021).https://doi.org/10.1109/CVPR46437.2021. 00881

work page doi:10.1109/cvpr46437.2021 2021
[42]

arXiv preprint arXiv:2511.00598 (2025)

Sun, Z., Zhi, S., Li, R., Xia, J., Liu, Y., Jiang, W.: Gdros: A geometry-guided dense registration framework for optical-sar images under large geometric trans- formations. arXiv preprint arXiv:2511.00598 (2025)

work page arXiv 2025
[43]

Reducibility among combinatorial problems

Teed, Z., Deng, J.: Raft: Recurrent all-pairs field transforms for optical flow. In: Proc. Eur. Conf. Comput. Vis. (2020).https://doi.org/10.1007/978-3-030- 58536-5_24

work page doi:10.1007/978-3-030- 2020
[44]

Advances in neural information processing systems34, 16558–16569 (2021)

Teed, Z., Deng, J.: Droid-slam: Deep visual slam for monocular, stereo, and rgb- d cameras. Advances in neural information processing systems34, 16558–16569 (2021)

2021
[45]

In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Truong, P., Danelljan, M., Timofte, R.: Glu-net: Global-local universal network for dense flow and correspondences. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 6258–6268 (2020)

2020
[46]

In: Proceedings of the IEEE/CVF con- ference on computer vision and pattern recognition

Truong, P., Danelljan, M., Van Gool, L., Timofte, R.: Learning accurate dense correspondences and when to trust them. In: Proceedings of the IEEE/CVF con- ference on computer vision and pattern recognition. pp. 5714–5724 (2021)

2021
[47]

In: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Tuzcuoğlu, Ö., Köksal, A., Sofu, B., Kalkan, S., Alatan, A.A.: Xoftr: Cross-modal feature matching transformer. In: Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. pp. 4275–4286 (2024).https://doi.org/10.1109/CVPRW63382.2024. 00431

work page doi:10.1109/cvprw63382.2024 2024
[48]

In: Proceedings of the Computer Vision and Pattern Recognition Conference

Wu, S., Wang, Y., Liu, X., Yang, Y., Wang, R., Guo, G., Doermann, D., Zhang, B.: Dfm: Differentiable feature matching for anomaly detection. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 15224–15233 (2025)

2025
[49]

In: Proceed- RBE-Flow for Cross-Modal Registration 19 ings of the IEEE/CVF conference on computer vision and pattern recognition

Wu, Z., Zheng, J., Ren, X., Vasluianu, F.A., Ma, C., Paudel, D.P., Van Gool, L., Timofte, R.: Single-model and any-modality for video object tracking. In: Proceed- RBE-Flow for Cross-Modal Registration 19 ings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 19156–19166 (2024)

2024
[50]

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing13, 5847–5861 (2020)

Xiang, Y., Tao, R., Wang, F., You, H., Han, B.: Automatic registration of optical and sar images via improved phase congruency model. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing13, 5847–5861 (2020)

2020
[51]

IEEE Trans

Xiang, Y., Wang, F., You, H.: Os-sift: A robust sift-like algorithm for high- resolution optical-to-sar image registration in suburban areas. IEEE Trans. Geosci. Remote Sens.56(6), 3078–3090 (2018).https://doi.org/10.1109/tgrs.2018. 2790483

work page doi:10.1109/tgrs.2018 2018
[52]

IEEE Transactions on Geoscience and Remote Sensing62, 1–13 (2024)

Xiao, Y., Zhang, C., Chen, Y., Jiang, B., Tang, J.: Adrnet: Affine and deformable registration networks for multimodal remote sensing images. IEEE Transactions on Geoscience and Remote Sensing62, 1–13 (2024)

2024
[53]

In: proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (2020)

Xu, H., Ma, J., Le, Z., Jiang, J., Guo, X.: Fusiondn: A unified densely connected network for image fusion. In: proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (2020)

2020
[54]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Xu, H., Ma, J., Yuan, J., Le, Z., Liu, W.: Rfnet: Unsupervised network for mutu- ally reinforcing multi-modal image registration and fusion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 19679–19688 (2022)

2022
[55]

IEEE Transactions on Pattern Analysis and Machine Intelligence 45(10), 12148–12166 (2023)

Xu, H., Yuan, J., Ma, J.: MURF: Mutually reinforcing multi-modal image registra- tion and fusion. IEEE Transactions on Pattern Analysis and Machine Intelligence 45(10), 12148–12166 (2023)

2023
[56]

In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

Xu, H., Zhang, J., Cai, J., Rezatofighi, H., Tao, D.: Gmflow: Learning optical flow via global matching. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 8121–8130 (2022)

2022
[57]

IEEE Transactions on Pattern Analysis and Machine Intelligence 45(11), 13961–13975 (2023)

Xu, H., Zhang, J., Cai, J., Rezatofighi, H., Tao, D.: Unifying flow, stereo and depth estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence 45(11), 13961–13975 (2023)

2023
[58]

Yang, B., Chen, J., Ye, M.: Towards grand unified representation learning for unsupervised visible-infrared person re-identification. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 11069–11079 (2023)

[59]

Ye, Y., Bruzzone, L., Shan, J., Bovolo, F., Zhu, Q.: Fast and robust matching for multimodal remote sensing image registration. IEEE Transactions on Geoscience and Remote Sensing 57(11), 9059–9070 (2019)

[60]

Zeng, J., Gu, Z., Liu, W., Cai, L., Cheng, J.: Uncertainty aware interest point detection and description. In: WACV. pp. 2144–2153 (2025)

[61]

Zhang, J., Xia, Z., Dong, M., Shen, S., Yue, L., Zheng, X.: Comatcher: Multi-view collaborative feature matching. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 21970–21980 (2025)

[62]

Zhang, K., Deng, Y., Ma, J., Favaro, P.: Adapting dense matching for homography estimation with grid-based acceleration. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 6294–6303 (2025)

[63]

Zhang, Y., Yao, Y., Zhang, W., Zhang, Y.: Histogram of the orientation of the weighted phase descriptor for multi-modal remote sensing image matching. ISPRS Journal of Photogrammetry and Remote Sensing 196, 1–15 (2023)

[64]

Zheng, C., Li, S., Wang, C., Zhang, B.: Msg: Robust multimodal remote sensing image matching using side window gaussian space. IEEE Transactions on Geoscience and Remote Sensing (2025)
