MatRes: Zero-Shot Test-Time Model Adaptation for Simultaneous Matching and Restoration

Kanggeon Lee; Kyoung Mu Lee; Soochahn Lee

arxiv: 2604.10081 · v1 · submitted 2026-04-11 · 💻 cs.CV · cs.AI

Pith reviewed 2026-05-10 16:44 UTC · model grok-4.3

keywords image restoration · geometric matching · test-time adaptation · zero-shot learning · correspondence estimation · image degradation · computer vision

The pith

Enforcing conditional similarity at matched points on one image pair lets a test-time method improve both restoration and geometric alignment without any training.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents MatRes as a way to handle real-world image pairs that are both degraded and taken from different viewpoints. These two problems interfere when solved separately, but the method claims they can help each other when a lightweight update is applied at test time. The update uses only the given pair, leaves all original models frozen, and requires no extra data or offline learning. A reader would care because many everyday photos come in mixed-quality sets of the same scene, and solving the tasks together could give better results than fixing one then the other.

Core claim

MatRes is a zero-shot test-time adaptation framework that jointly improves restoration quality and correspondence estimation using only a single low-quality and high-quality image pair. By enforcing conditional similarity at corresponding locations, MatRes updates only lightweight modules while keeping all pretrained components frozen, requiring no offline training or additional supervision. Extensive experiments across diverse combinations show that MatRes yields significant gains in both restoration and geometric alignment compared to using either restoration or matching models alone.

What carries the argument

Enforcement of conditional similarity at corresponding locations across the input pair, which updates only lightweight adaptation modules to make restoration and matching reinforce each other.
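As a rough illustration of what a similarity objective at matched locations can look like, the sketch below averages one minus cosine similarity over a list of correspondences. The function names, the dictionary feature layout, the cosine form, and the toy features are all assumptions made for illustration; this summary does not specify the paper's actual conditional similarity loss.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def similarity_loss(feat_lq, feat_hq, matches):
    """Mean (1 - cosine) over matched locations.

    feat_lq, feat_hq: dict mapping (row, col) -> feature vector (hypothetical layout).
    matches: list of ((row, col) in the LQ image, (row, col) in the HQ image).
    """
    if not matches:
        return 0.0
    total = 0.0
    for p_lq, p_hq in matches:
        total += 1.0 - cosine(feat_lq[p_lq], feat_hq[p_hq])
    return total / len(matches)

# Toy features, invented for this example only.
feat_hq = {(0, 0): [1.0, 0.0], (0, 1): [0.0, 1.0]}
feat_lq = {(1, 1): [2.0, 0.0], (1, 2): [1.0, 1.0]}
matches = [((1, 1), (0, 0)), ((1, 2), (0, 1))]

loss = similarity_loss(feat_lq, feat_hq, matches)
print(f"similarity loss over {len(matches)} matches: {loss:.4f}")
```

The loss is zero only when every matched pair of features points in the same direction, so it depends on both the restored content and the quality of the matches, which is the coupling the summary describes.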

If this is right

Restoration and matching can be performed on the same pair without one task harming the other.
The approach works on any combination of existing pretrained restoration and matching models.
No new training data or retraining is needed to handle viewpoint changes plus degradation.
Users can capture multiple shots of a scene and obtain both a cleaned image and reliable point matches from them.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same idea of using correspondence to guide adaptation might apply to other coupled tasks such as denoising followed by object detection.
If the method scales, it could reduce reliance on large clean training sets for restoration models.
Practical pipelines for mobile photography or surveillance could incorporate this joint step instead of separate restoration and alignment stages.

Load-bearing premise

That the single-pair conditional similarity signal is enough to drive mutual gains in restoration and matching while all original models stay frozen and no labels are available.

What would settle it

Running MatRes on a degraded pair and finding that neither the restored image quality metrics nor the number of correct correspondences improves over applying the two models independently.

Figures

Figures reproduced from arXiv: 2604.10081 by Kanggeon Lee, Kyoung Mu Lee, Soochahn Lee.

**Figure 1.** Zero-shot Test-Time Adaptation. MatRes enhances degraded and viewpoint-shifted image pairs by leveraging the mutual guidance of a matching and a restoration network to produce a restored and aligned output image.

**Figure 2.** Method Overview. Given an input pair (ILQ, IHQ), the pretrained generative prior model Mθ and the pretrained restoration network Rϕ remain frozen during adaptation (gradient flows through both models but their parameters are not updated). The zero-initialized adapter Hψ is the only trainable module, refined iteratively using losses computed on the IEQ and IHQ image pairs. At each adaptation step, Mθ extr…
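The adaptation scheme in the Figure 2 caption (frozen pretrained models, a zero-initialized adapter as the only trainable part) can be illustrated with a deliberately tiny toy. In this sketch, 1-D lists stand in for images, `frozen_restore` is a made-up stand-in for a pretrained restorer, the adapter is a two-parameter residual (`gain`, `bias`), and the loss is a squared error at matched indices. None of these choices come from the paper; they only show why zero initialization makes the first step reproduce the frozen output and how gradient updates touch the adapter alone.

```python
def frozen_restore(x):
    # Stand-in for a pretrained restoration network. Its "weight" (0.8) is
    # fixed and never updated during adaptation.
    return [0.8 * v for v in x]

def adapt(lq, hq, matches, steps=200, lr=0.5):
    """Test-time adaptation of a tiny residual adapter on a single pair.

    matches: list of (index in lq, index in hq) correspondences.
    Returns the adapter parameters and the per-step loss history.
    """
    gain, bias = 0.0, 0.0          # zero-initialized adapter
    restored = frozen_restore(lq)  # frozen output, computed once
    history = []
    n = len(matches)
    for _ in range(steps):
        loss = g_gain = g_bias = 0.0
        for i, j in matches:
            r = restored[i]
            out = r + gain * r + bias  # frozen output plus adapter residual
            err = out - hq[j]
            loss += err * err
            g_gain += 2.0 * err * r    # d(err^2)/d(gain)
            g_bias += 2.0 * err        # d(err^2)/d(bias)
        history.append(loss / n)
        gain -= lr * g_gain / n        # only adapter parameters change
        bias -= lr * g_bias / n
    return gain, bias, history

# Toy pair: the LQ signal is the HQ signal shifted by one index
# (a crude stand-in for a viewpoint change).
hq = [0.2, 0.4, 0.6, 0.8, 1.0]
lq = [0.9, 0.2, 0.4, 0.6, 0.8]
matches = [(1, 0), (2, 1), (3, 2), (4, 3)]

gain, bias, history = adapt(lq, hq, matches)
print(f"loss before adaptation: {history[0]:.4f}")
print(f"loss at the last step:  {history[-1]:.2e}")
print(f"learned gain={gain:.3f}, bias={bias:.3f}")
```

Because `gain` and `bias` start at zero, the first recorded loss is exactly the loss of the frozen restorer on the matched points, so any later decrease is attributable to the adapter.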

**Figure 3.** Qualitative Restoration Results. Restoration performance is compared with and without the matching network Mθ across three tasks. For super-resolution (SR), we evaluate EDSR [33], ESRGAN [45], SwinIR [32], and HAT [46]; for denoising, we evaluate DnCNN [47], N2V [35], N2N [34], and Restormer [31]; and for deblurring, we evaluate DeblurGAN [36], SRN [37], Restormer [31], and NAFNet [48]. The green box indi…

**Figure 4.** Qualitative Matching Results. Geometric transform estimation results for five matching networks (LoftUp [26], ART [27], DIFT [28], RoMa [29], Mast3R [30]) across SR, denoising, and deblurring tasks, evaluated with and without the restoration network Rϕ. The green box denotes the ground-truth transform, while the red box represents the estimated transform.

Original abstract

Real-world image pairs often exhibit both severe degradations and large viewpoint changes, making image restoration and geometric matching mutually interfering tasks when treated independently. In this work, we propose MatRes, a zero-shot test-time adaptation framework that jointly improves restoration quality and correspondence estimation using only a single low-quality and high-quality image pair. By enforcing conditional similarity at corresponding locations, MatRes updates only lightweight modules while keeping all pretrained components frozen, requiring no offline training or additional supervision. Extensive experiments across diverse combinations show that MatRes yields significant gains in both restoration and geometric alignment compared to using either restoration or matching models alone. MatRes offers a practical and widely applicable solution for real-world scenarios where users commonly capture multiple images of a scene with varying viewpoints and quality, effectively addressing the often-overlooked mutual interference between matching and restoration.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

MatRes adds a lightweight test-time adaptation step that tries to make restoration and matching help each other on one degraded pair, but the whole thing rests on getting usable initial matches from the frozen model.

MatRes is a zero-shot test-time method that updates only small adaptation modules on a single low-quality and high-quality image pair. It keeps the big pretrained restoration and matching networks frozen and uses a conditional similarity loss at corresponding locations to push both tasks to improve at once. No offline training or extra labels are needed. That framing directly targets the mutual interference that shows up when you have real-world photos with blur, noise, and viewpoint shifts at the same time. The setup is practical for users who just take several shots of a scene and want both a cleaner image and better correspondences without retraining anything.

The abstract reports clear gains over running either task in isolation, which is the right comparison to make. The experiments apparently cover several degradation and viewpoint combinations, so the claim is not just on toy cases.

The weakest link is the bootstrap. The frozen matcher still has to produce enough correct correspondences on the degraded input for the similarity constraint to have reliable locations to act on. If those initial matches are mostly noise, the adaptation modules get bad gradients and the mutual improvement may not happen. The description gives no explicit initialization trick, iterative loop, or robustness fix for this starting point, so the central assumption stays exposed. If the full paper shows ablations where performance holds even when the initial matcher is deliberately degraded, that would tighten the argument. Otherwise the gains could be limited to milder cases.

This is the kind of paper that belongs in a reading group for people working on real-world geometric vision pipelines. It is worth sending to peer review because the problem is concrete, the method is simple to implement, and the experimental claims are falsifiable even if the bootstrap issue needs more scrutiny.

Referee Report

1 major / 2 minor

Summary. The paper proposes MatRes, a zero-shot test-time adaptation framework for jointly performing image restoration and geometric matching on a single degraded low-quality image paired with a high-quality reference. It updates only lightweight adaptation modules by enforcing conditional similarity at corresponding locations while freezing all pretrained restoration and matching models, requiring no offline training or extra supervision. Experiments across diverse image combinations reportedly demonstrate significant gains in both restoration quality and correspondence accuracy over using either task model in isolation.

Significance. If the central claims hold, the work is significant for addressing the mutual interference between restoration and matching in real-world scenarios with degradations and viewpoint changes. The zero-shot, single-pair, test-time nature without additional supervision or training offers a practical solution for common user-captured image sets, and the extensive experiments across combinations provide empirical support for the mutual improvement idea.

major comments (1)

[§3] §3 (Method): The central mechanism relies on initial correspondences produced by the frozen matching model on the degraded input to define locations for conditional similarity enforcement. The manuscript does not specify an explicit initialization strategy, robustness filter, or iterative refinement loop to ensure these initial matches are reliable enough under severe degradations and viewpoint changes; if the starting correspondences are too noisy, the lightweight modules receive unreliable gradients and the mutual improvement may not materialize.

minor comments (2)

[Abstract] Abstract and §4: The phrase 'significant gains' is used repeatedly; replace with concrete quantitative improvements (e.g., PSNR deltas, matching accuracy percentages) and include error bars or statistical tests to allow readers to assess effect sizes.
[§4] §4 (Experiments): Clarify the exact combinations of restoration and matching backbones tested and whether the reported gains are consistent across all pairs or driven by a subset of easier cases.
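The kind of reporting requested in the minor comments (PSNR deltas with a spread, plus how many pairs actually improve) could be computed along the following lines. PSNR here uses the standard definition 10·log10(peak² / MSE); the helper names and the per-pair numbers are placeholders invented for this sketch and are not results from the paper.

```python
import math
import statistics

def psnr(ref, test, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length signals."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)

def summarize_gains(baseline_psnr, joint_psnr):
    """Per-pair PSNR deltas: mean, sample std, and count of improved pairs."""
    deltas = [j - b for b, j in zip(baseline_psnr, joint_psnr)]
    mean = statistics.mean(deltas)
    std = statistics.stdev(deltas) if len(deltas) > 1 else 0.0
    improved = sum(1 for d in deltas if d > 0)
    return mean, std, improved, len(deltas)

# Placeholder per-pair PSNR values (dB), NOT taken from the paper.
baseline = [24.0, 26.0, 28.0]  # restoration model applied alone
joint = [25.0, 26.5, 27.5]     # restoration with joint adaptation

mean_gain, std_gain, improved, total = summarize_gains(baseline, joint)
print(f"PSNR gain: {mean_gain:+.2f} ± {std_gain:.2f} dB "
      f"({improved}/{total} pairs improved)")
```

Reporting the improved-pair count next to the mean makes it visible when an average gain is driven by a subset of easier cases, which is the concern raised for §4.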

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the potential significance of MatRes for real-world image pairs with degradations and viewpoint changes. We address the single major comment point-by-point below and have prepared revisions to improve clarity.

Referee: [§3] §3 (Method): The central mechanism relies on initial correspondences produced by the frozen matching model on the degraded input to define locations for conditional similarity enforcement. The manuscript does not specify an explicit initialization strategy, robustness filter, or iterative refinement loop to ensure these initial matches are reliable enough under severe degradations and viewpoint changes; if the starting correspondences are too noisy, the lightweight modules receive unreliable gradients and the mutual improvement may not materialize.

Authors: We appreciate this observation on the initialization of correspondences. In the revised manuscript we have added an explicit statement in Section 3 clarifying that initial correspondences are obtained directly by running the frozen pretrained matching model on the degraded input paired with the high-quality reference; no additional preprocessing or selection is applied at this stage. We deliberately omit a separate robustness filter or iterative refinement loop at initialization to preserve the zero-shot, single-pair, no-training character of the framework. Instead, the conditional similarity loss applied during test-time adaptation of the lightweight modules serves as the mechanism for refinement: gradients update only the adaptation parameters while the core models remain frozen, allowing the system to improve effective alignment even when some initial matches are noisy. Our experiments across diverse degradation and viewpoint combinations already demonstrate that mutual gains occur in practice, including under severe conditions where standalone matching would fail. To further address the concern, the revision includes a short sensitivity discussion and an additional ablation that perturbs the initial matches to quantify robustness.

revision: yes
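One way an ablation that perturbs the initial matches could be set up is sketched below: a chosen fraction of correspondences is displaced by a random non-zero offset, and the surviving inlier rate is measured against the clean set. The helper names, the displacement scheme, and the toy matches are hypothetical; the rebuttal does not describe how the authors' ablation is implemented.

```python
import random

def perturb_matches(matches, outlier_ratio, max_shift, seed=0):
    """Displace a fraction of HQ-side match locations by a random non-zero offset.

    matches: list of ((row, col) in LQ, (row, col) in HQ).
    outlier_ratio: fraction of matches to corrupt, in [0, 1].
    max_shift: largest displacement (in pixels) per axis, at least 1.
    """
    rng = random.Random(seed)
    n_out = round(outlier_ratio * len(matches))
    bad = set(rng.sample(range(len(matches)), n_out))
    out = []
    for k, (p_lq, (y, x)) in enumerate(matches):
        if k in bad:
            dy = rng.choice([-1, 1]) * rng.randint(1, max_shift)
            dx = rng.choice([-1, 1]) * rng.randint(1, max_shift)
            out.append((p_lq, (y + dy, x + dx)))
        else:
            out.append((p_lq, (y, x)))
    return out

def inlier_rate(matches, reference, tol=0):
    """Fraction of matches whose HQ location is within tol pixels of the reference."""
    hits = 0
    for (_, (y, x)), (_, (y0, x0)) in zip(matches, reference):
        if max(abs(y - y0), abs(x - x0)) <= tol:
            hits += 1
    return hits / len(reference)

# Toy "clean" matches: every LQ pixel maps to the HQ pixel two columns to the right.
clean = [((r, c), (r, c + 2)) for r in range(2) for c in range(5)]
noisy = perturb_matches(clean, outlier_ratio=0.3, max_shift=3, seed=0)
rate = inlier_rate(noisy, clean)
print(f"inlier rate after perturbation: {rate:.2f}")
```

Sweeping `outlier_ratio` and re-running the adaptation on each perturbed set would give a robustness curve of the sort the referee asks for.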

Circularity Check

0 steps flagged

No significant circularity; method is a new empirical framework

The paper introduces MatRes as a zero-shot test-time adaptation framework that enforces conditional similarity at corresponding locations on a single degraded/high-quality pair to update only lightweight modules while freezing all pretrained models. No equations or steps reduce by construction to fitted inputs, self-definitions, or self-citation chains. Claims rest on the proposed adaptation procedure plus extensive experiments across combinations, which are independent of the target results. The skeptic's concern about bootstrap stability from poor initial matches is a validity issue, not circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based solely on the abstract, no explicit free parameters, axioms, or invented entities are detailed beyond reliance on pretrained models and the conditional similarity idea.

axioms (1)

Domain assumption: pretrained restoration and matching models exist and can be kept frozen during lightweight adaptation.
The framework explicitly keeps all pretrained components frozen.

Reference graph

Works this paper leans on

74 extracted references · 74 canonical work pages

[1] Grant Haskins, Uwe Kruger, and Pingkun Yan. Deep learning in medical image registration: a survey. Machine Vision and Applications, 31(1):8, 2020.
[2] Shibiao Xu, Shunpeng Chen, Rongtao Xu, Changwei Wang, Peng Lu, and Li Guo. Local feature matching using deep learning: A survey. Information Fusion, 107:102344, 2024.
[3] Shihua Zhang, Zizhuo Li, Kaining Zhang, Yifan Lu, Yuxin Deng, Linfeng Tang, Xingyu Jiang, and Jiayi Ma. Deep learning reforms image matching: A survey and outlook. arXiv preprint arXiv:2506.04619, 2025.
[4] Linwei Fan, Fan Zhang, Hui Fan, and Caiming Zhang. Brief review of image denoising techniques. Visual computing for industry, biomedicine, and art, 2(1):7, 2019.
[5] Zhihao Wang, Jian Chen, and Steven CH Hoi. Deep learning for image super-resolution: A survey. IEEE transactions on pattern analysis and machine intelligence, 43(10):3365–3387, 2020.
[6] Kaihao Zhang, Wenqi Ren, Wenhan Luo, Wei-Sheng Lai, Björn Stenger, Ming-Hsuan Yang, and Hongdong Li. Deep image deblurring: A survey. International Journal of Computer Vision, 130(9):2103–2130, 2022.
[7] Jingwen Su, Boyan Xu, and Hujun Yin. A survey of deep learning approaches to image restoration. Neurocomputing, 487:46–65, 2022.
[8] Michael Elad, Bahjat Kawar, and Gregory Vaksman. Image denoising: The deep learning revolution and beyond—a survey paper. SIAM Journal on Imaging Sciences, 16(3):1594–1654,
[9] Lujun Zhai, Yonghui Wang, Suxia Cui, and Yu Zhou. A comprehensive review of deep learning-based real-world image restoration. IEEE Access, 11:21049–21067, 2023.
[10] Bo Jiang, Jinxing Li, Yao Lu, Qing Cai, Huaibo Song, and Guangming Lu. Efficient image denoising using deep learning: A brief survey. Information Fusion, page 103013, 2025.
[11] Junjun Jiang, Zengyuan Zuo, Gang Wu, Kui Jiang, and Xianming Liu. A survey on all-in-one image restoration: Taxonomy, evaluation and future trends. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.
[12] Haichao Zhang and Lawrence Carin. Multi-shot imaging: joint alignment, deblurring and resolution-enhancement. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2925–2932, 2014.
[13] Goutam Bhat, Martin Danelljan, Luc Van Gool, and Radu Timofte. Deep burst super-resolution. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9209–9218, 2021.
[14] Bruno Lecouat, Jean Ponce, and Julien Mairal. Lucas-kanade reloaded: End-to-end super-resolution from raw image bursts. In Proceedings of the IEEE/CVF international conference on computer vision, pages 2370–2379, 2021.
[15] Shi Guo, Xi Yang, Jianqi Ma, Gaofeng Ren, and Lei Zhang. A differentiable two-stage alignment scheme for burst image reconstruction with large shift. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 17472–17481, 2022.
[16] Anita Sellent, Carsten Rother, and Stefan Roth. Stereo video deblurring. In European conference on computer vision, pages 558–575. Springer, 2016.
[17] Tae Hyun Kim, Mehdi SM Sajjadi, Michael Hirsch, and Bernhard Scholkopf. Spatio-temporal transformer network for video restoration. In Proceedings of the European conference on computer vision (ECCV), pages 106–122, 2018.
[18] Liyuan Pan, Yuchao Dai, Miaomiao Liu, Fatih Porikli, and Quan Pan. Joint stereo video deblurring, scene flow estimation and moving object segmentation. IEEE Transactions on Image Processing, 29:1748–1761, 2019.
[19] Yapeng Tian, Yulun Zhang, Yun Fu, and Chenliang Xu. Tdan: Temporally-deformable alignment network for video super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020.
[20] Vassileios Balntas, Karel Lenc, Andrea Vedaldi, and Krystian Mikolajczyk. Hpatches: A benchmark and evaluation of handcrafted and learned local descriptors, 2017.
[21] Tsung-Yi Lin, Michael Maire, Serge Belongie, Lubomir Bourdev, Ross Girshick, James Hays, Pietro Perona, Deva Ramanan, C. Lawrence Zitnick, and Piotr Dollár. Microsoft coco: Common objects in context, 2015.
[22] Yiming Zhao, Xinming Huang, and Ziming Zhang. Deep lucas-kanade homography for multimodal image alignment. In CVPR,
[23] Zhengqi Li and Noah Snavely. Megadepth: Learning single-view depth prediction from internet photos, 2018.
[24] Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nießner. Scannet: Richly-annotated 3d reconstructions of indoor scenes. In Proc. Computer Vision and Pattern Recognition (CVPR), IEEE, 2017.
[25] Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations, 2022.
[26] Haiwen Huang, Anpei Chen, Volodymyr Havrylov, Andreas Geiger, and Dan Zhang. Loftup: Learning a coordinate-based feature upsampler for vision foundation models, 2025.
[27] Kanggeon Lee, Soochahn Lee, and Kyoung Mu Lee. Auto-regressive transformation for image alignment. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 13569–13579, October 2025.
[28] Luming Tang, Menglin Jia, Qianqian Wang, Cheng Perng Phoo, and Bharath Hariharan. Emergent correspondence from image diffusion. arXiv preprint arXiv:2306.03881, 2023.
[29] Johan Edstedt, Qiyu Sun, Georg Bökman, Mårten Wadenbäck, and Michael Felsberg. RoMa: Robust Dense Feature Matching. IEEE Conference on Computer Vision and Pattern Recognition,
[30] Vincent Leroy, Yohann Cabon, and Jerome Revaud. Grounding image matching in 3d with mast3r, 2024.
[31] Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, and Ming-Hsuan Yang. Restormer: Efficient transformer for high-resolution image restoration. In CVPR, 2022.
[32] Jingyun Liang, Jiezhang Cao, Guolei Sun, Kai Zhang, Luc Van Gool, and Radu Timofte. Swinir: Image restoration using swin transformer. arXiv preprint arXiv:2108.10257, 2021.
[33] Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, and Kyoung Mu Lee. Enhanced deep residual networks for single image super-resolution. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, July 2017.
[34] Jaakko Lehtinen, Jacob Munkberg, Jon Hasselgren, Samuli Laine, Tero Karras, Miika Aittala, and Timo Aila. Noise2noise: Learning image restoration without clean data, 2018.
[35] Alexander Krull, Tim-Oliver Buchholz, and Florian Jug. Noise2void-learning denoising from single noisy images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2129–2137, 2019.
[36] Orest Kupyn, Volodymyr Budzan, Mykola Mykhailych, Dmytro Mishkin, and Jiri Matas. Deblurgan: Blind motion deblurring using conditional adversarial networks. ArXiv e-prints,
[37] Xin Tao, Hongyun Gao, Xiaoyong Shen, Jue Wang, and Jiaya Jia. Scale-recurrent network for deep image deblurring. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[38] Junyi Zhang, Charles Herrmann, Junhwa Hur, Luisa Polania Cabrera, Varun Jampani, Deqing Sun, and Ming-Hsuan Yang. A tale of two features: Stable diffusion complements dino for zero-shot semantic correspondence. In Advances in Neural Information Processing Systems (NeurIPS) 2023, 2023.
[39] Xinle Cheng, Congyue Deng, Adam Harley, Yixin Zhu, and Leonidas Guibas. Zero-shot image feature consensus with deep functional maps. arXiv preprint arXiv:2403.12038, 2024.
[40] Vishal Balaji Sivaraman, Muhammad Imran, Qingyue Wei, Preethika Muralidharan, Michelle R Tamplin, Isabella M Grumbach, Randy H Kardon, Jui-Kai Wang, Yuyin Zhou, and Wei Shao. Retinaregnet: A zero-shot approach for retinal image registration. Computers in Biology and Medicine, 186:109645,
[41] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. International Conference on Learning Representations (ICLR), 2021.
[42] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 10684–10695, 2022.
[43] Keyu Tian, Yi Jiang, Zehuan Yuan, Bingyue Peng, and Liwei Wang. Visual autoregressive modeling: Scalable image generation via next-scale prediction, 2024.
[44] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale, 2021.
[45] Xintao Wang, Ke Yu, Shixiang Wu, Jinjin Gu, Yihao Liu, Chao Dong, Yu Qiao, and Chen Change Loy. Esrgan: Enhanced super-resolution generative adversarial networks. In The European Conference on Computer Vision Workshops (ECCVW), September 2018.
[46] Xiangyu Chen, Xintao Wang, Jiantao Zhou, Yu Qiao, and Chao Dong. Activating more pixels in image super-resolution transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 22367–22377, June 2023.
[47] Kai Zhang, Wangmeng Zuo, Yunjin Chen, Deyu Meng, and Lei Zhang. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, 26(7):3142–3155, 2017.
[48] Liangyu Chen, Xiaojie Chu, Xiangyu Zhang, and Jian Sun. Simple baselines for image restoration. arXiv preprint arXiv:2204.04676, 2022.
[49] Jianrui Cai, Hui Zeng, Hongwei Yong, Zisheng Cao, and Lei Zhang. Toward real-world single image super-resolution: A new benchmark and a new model. In Proceedings of the IEEE International Conference on Computer Vision, 2019.
[50] Eirikur Agustsson and Radu Timofte. Ntire 2017 challenge on single image super-resolution: Dataset and study. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, July 2017.
[51] Marco Bevilacqua, Aline Roumy, Christine Guillemot, and Marie-Line Alberi-Morel. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. In Proceedings of the British Machine Vision Conference (BMVC),
[52] Roman Zeyde, Michael Elad, and Matan Protter. On single image scale-up using sparse representations. In Proceedings of the International Conference on Curves and Surfaces, 2010.
[53] Jia-Bin Huang, Abhishek Singh, and Narendra Ahuja. Single image super-resolution from transformed self-exemplars. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[54] David Martin, Charless Fowlkes, Doron Tal, and Jitendra Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2001.
[55] Seungjun Nah, Tae Hyun Kim, and Kyoung Mu Lee. Deep multi-scale convolutional neural network for dynamic scene deblurring. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
[56] Ziyi Shen, Wenguan Wang, Xiankai Lu, Jianbing Shen, Haibin Ling, Tingfa Xu, and Ling Shao. Human-aware motion deblurring. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2019.
[57]

Real-world blur dataset for learning and benchmarking deblur- ring algorithms

Jaesung Rim, Haeyun Lee, Jucheol Won, and Sunghyun Cho. Real-world blur dataset for learning and benchmarking deblur- ring algorithms. InProceedings of the European Conference on Computer Vision (ECCV), 2020. 5, 7, 8

work page 2020
[58]

Deep learning face attributes in the wild

Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. InProceedings of Interna- tional Conference on Computer Vision (ICCV), December 2015. 5, 7, 8 10

work page 2015
[59]

Autolut: Lut-based image super-resolution with au- tomatic sampling and adaptive residual learning, 2025

Yuheng Xu, Shijie Yang, Xin Liu, Jie Liu, Jie Tang, and Gang- shan Wu. Autolut: Lut-based image super-resolution with au- tomatic sampling and adaptive residual learning, 2025. 5, 7, 8

work page 2025
[60]

Catanet: Efficient content-aware token aggregation for lightweight image super- resolution.arXiv preprint arXiv:2503.06896, 2025

Xin Liu, Jie Liu, Jie Tang, and Gangshan Wu. Catanet: Efficient content-aware token aggregation for lightweight image super- resolution.arXiv preprint arXiv:2503.06896, 2025. 5, 7, 8

work page arXiv 2025
[61]

Im-lut: Interpolation mixing look-up tables for image super-resolution, 2025

Sejin Park, Sangmin Lee, Kyong Hwan Jin, and Seung-Won Jung. Im-lut: Interpolation mixing look-up tables for image super-resolution, 2025. 5, 7

work page 2025
[62]

Multi-stage progressive image restoration

Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, and Ling Shao. Multi-stage progressive image restoration. InCVPR,

work page
[63]

Robust image denoising through adversarial fre- quency mixup

Donghun Ryou, Inju Ha, Hyewon Yoo, Dongwan Kim, and Bo- hyung Han. Robust image denoising through adversarial fre- quency mixup. In2024 IEEE/CVF Conference on Computer Vi- sion and Pattern Recognition (CVPR), pages 2723–2732, 2024. 5, 7, 8

work page 2024
[64]

Maxim: Multi-axis mlp for image processing.CVPR, 2022

Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Pey- man Milanfar, Alan Bovik, and Yinxiao Li. Maxim: Multi-axis mlp for image processing.CVPR, 2022. 5, 7

work page 2022
[65]

Blur2blur: Blur conversion for unsu- pervised image deblurring on unknown domains

Bang-Dang Pham, Phong Tran, Anh Tran, Cuong Pham, Rang Nguyen, and Minh Hoai. Blur2blur: Blur conversion for unsu- pervised image deblurring on unknown domains. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pat- tern Recognition (CVPR), 2024. 5, 7, 8

work page 2024
[66]

Stripformer: Strip transformer for fast im- age deblurring, 2022

Fu-Jen Tsai, Yan-Tsung Peng, Yen-Yu Lin, Chung-Chi Tsai, and Chia-Wen Lin. Stripformer: Strip transformer for fast im- age deblurring, 2022. 5, 7, 8

work page 2022
[67]

Hierarchical integration diffusion model for realistic image deblurring

Zheng Chen, Yulun Zhang, Liu Ding, Xia Bin, Jinjin Gu, Linghe Kong, and Xin Yuan. Hierarchical integration diffusion model for realistic image deblurring. InNeurIPS, 2023. 5, 7

work page 2023
[68]

Minima: Modality invariant image matching

Jiangwei Ren, Xingyu Jiang, Zizhuo Li, Dingkang Liang, Xin Zhou, and Xiang Bai. Minima: Modality invariant image matching. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025. 5, 8

work page 2025
[69]

Learning affine correspondences by integrating geometric constraints

Pengju Sun, Banglei Guan, Zhenbao Yu, Yang Shang, Qifeng Yu, and Daniel Barath. Learning affine correspondences by integrating geometric constraints. InIEEE Conference on Computer Vision and Pattern Recognition, pages 27038–27048,

work page
[70]

Matchanything: Universal cross-modality image matching with large-scale pre-training

Xingyi He, Hao Yu, Sida Peng, Dongli Tan, Zehong Shen, Hujun Bao, and Xiaowei Zhou. Matchanything: Universal cross-modality image matching with large-scale pre-training. In Arxiv, 2025. 5, 8

work page 2025
[71]

Decoupled weight decay reg- ularization, 2019

Ilya Loshchilov and Frank Hutter. Decoupled weight decay reg- ularization, 2019. 5

work page 2019
[72]

The dual-bootstrap iterative closest point algorithm with application to retinal image registration.IEEE Transactions on Medical Imaging, 22, 2003

Charles Stewart, Chia-Ling Tsai, and Badrinath Roysam. The dual-bootstrap iterative closest point algorithm with application to retinal image registration.IEEE Transactions on Medical Imaging, 22, 2003. 7

work page 2003
[73]

Fire: Fundus image registration dataset.Modeling and Artificial Intelligence in Ophthalmology, 2017

Carlos Hernandez-Matas, Xenophon Zabulis, Areti Triantafyl- lou, Panagiota Anyfanti, Stella Douma, and Antonis A Argyros. Fire: Fundus image registration dataset.Modeling and Artificial Intelligence in Ophthalmology, 2017. 7

work page 2017
[74]

Image quality metrics: Psnr vs

Alain Hor ´e and Djemel Ziou. Image quality metrics: Psnr vs. ssim. In2010 20th International Conference on Pattern Recog- nition, pages 2366–2369, 2010. 7 11

work page 2010
