Multi-Order Matching Network for Alignment-Free Depth Super-Resolution

Guangwei Gao; Jian Yang; Xiang Li; Yuan Wu; Zhengxue Wang; Zhiqiang Yan

arxiv: 2511.16361 · v3 · pith:SHCTPVVCnew · submitted 2025-11-20 · 💻 cs.CV

Multi-Order Matching Network for Alignment-Free Depth Super-Resolution

Zhengxue Wang , Zhiqiang Yan , Yuan Wu , Guangwei Gao , Xiang Li , Jian Yang This is my paper

Pith reviewed 2026-05-21 19:22 UTC · model grok-4.3

classification 💻 cs.CV

keywords depth super-resolutionalignment-freemulti-order matchingRGB-guided depthmisaligned RGB-Dfeature matchingstructure aggregation

0 comments

The pith

A multi-order matching network super-resolves depth maps from misaligned RGB images by matching features at zero, first, and second orders.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MOMNet to address depth super-resolution in cases where RGB and depth images are not strictly aligned, a common issue in real-world sensor setups due to hardware limits and drifts. It establishes that performing matching in multiple feature orders allows the network to find and transfer relevant RGB information that corresponds to the depth structure despite spatial shifts. By aggregating this information using structure detectors prompted by multi-order priors, the method integrates the data effectively. This leads to better performance on datasets that include misalignments compared to traditional methods that require perfect alignment.

Core claim

The Multi-Order Matching Network (MOMNet) is a novel alignment-free framework that begins with a multi-order matching mechanism jointly performing zero-order, first-order, and second-order matching to comprehensively identify RGB information consistent with depth across multi-order feature spaces, and further introduces a multi-order aggregation composed of multiple structure detectors that uses multi-order priors as prompts to facilitate selective feature transfer from RGB to depth.

What carries the argument

Multi-order matching mechanism that jointly performs zero-, first-, and second-order matching to identify consistent RGB information for the depth map.

If this is right

It allows depth super-resolution to work in real-world scenarios with inevitable misalignments from separate sensors or calibration issues.
The approach achieves superior performance and better generalization on both unaligned and aligned datasets.
Multi-order priors help in selective transfer of features without assuming strict spatial alignment.
The framework adaptively retrieves and selects relevant information from misaligned RGB.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Extending this multi-order approach to other vision tasks involving misaligned multi-modal data, such as stereo vision or sensor fusion in robotics, could improve robustness.
Investigating the specific contributions of each order through ablation studies might reveal which orders are most critical for handling different types of misalignment.
Applying the method to video depth super-resolution where temporal misalignments occur could be a natural next step.

Load-bearing premise

Multi-order feature matching across zero-, first-, and second-order spaces can reliably identify and transfer RGB information consistent with the depth map despite spatial misalignment without introducing errors from mismatched regions.

What would settle it

Apply increasing levels of artificial spatial misalignment between RGB and depth pairs in a test set and measure if the super-resolution quality degrades gracefully or if the network fails to find consistent matches beyond a certain shift threshold.

Figures

Figures reproduced from arXiv: 2511.16361 by Guangwei Gao, Jian Yang, Xiang Li, Yuan Wu, Zhengxue Wang, Zhiqiang Yan.

**Figure 2.** Figure 2: Visualization of Gradient (Grad.) and Hessian (Hes.) [PITH_FULL_IMAGE:figures/full_fig_p002_2.png] view at source ↗

**Figure 3.** Figure 3: Overview of MOMNet. Given LR depth DLR and RGB I as inputs, we first encode them into features F 0 d and F 0 r, respectively. Subsequently, the Multi-Order Matching and Aggregation (MOMA) module is iteratively performed to retrieve and aggregate depthrelevant information from misaligned RGB features, thereby predicting the HR depth DHR. Finally, both DHR and the ground-truth (GT) depth DGT are fed into th… view at source ↗

**Figure 4.** Figure 4: Details of multi-order matching (left) and matching retrieval (MR, middle). Right: histogram comparison of (a) original RGB, [PITH_FULL_IMAGE:figures/full_fig_p004_4.png] view at source ↗

**Figure 5.** Figure 5: Details of multi-order aggregation. σ: Sigmoid Layer. where ρ(·) and ϕ(·) are the 3 × 3 patch extraction operation and the cosine similarity function, respectively. We then retrieve the most relevant RGB information by identifying the top-k patches from the correlation set Cz to enhance the depth representation: ηz , ψz = topK(Cz), (6) where ηz ∈ Rhw×k and ψz ∈ Rhw×k are matching indices and scores of top-… view at source ↗

**Figure 6.** Figure 6: Complexity comparison on ×8 Hypersim tested by a 4090 GPU. Larger circle area indicates longer inference time. fully simulated Hypersim dataset for the training set, and 100 pairs for the test set. Then, the pre-trained weights from the Hypersim dataset are directly applied to test DIML (100 RGB-D pairs) and DyDToF (100 RGB-D pairs) datasets without any fine-tuning, thereby evaluating the generalization c… view at source ↗

**Figure 7.** Figure 7: Visual results (left) and error maps (right) on the DIML dataset ( [PITH_FULL_IMAGE:figures/full_fig_p007_7.png] view at source ↗

**Figure 8.** Figure 8: Visual results (left) and error maps (right) for noise robustness (standard deviation [PITH_FULL_IMAGE:figures/full_fig_p007_8.png] view at source ↗

**Figure 9.** Figure 9: Robustness to different gaussian noise (standard devia [PITH_FULL_IMAGE:figures/full_fig_p007_9.png] view at source ↗

**Figure 11.** Figure 11: Ablation study of MOMNet with (a) MOMA numbers [PITH_FULL_IMAGE:figures/full_fig_p008_11.png] view at source ↗

read the original abstract

Recent guided depth super-resolution methods are premised on the assumption of strict spatial alignment between depth and RGB, achieving high-quality depth reconstruction. However, in real-world scenarios, the acquisition of strictly aligned RGB-D is hindered by inherent hardware limitations (e.g., physically separate RGB-D sensors) and unavoidable calibration drift induced by mechanical vibrations or temperature variations. Consequently, existing approaches often suffer inevitable performance degradation when applied to misaligned real-world scenes. In this paper, we propose the Multi-Order Matching Network (MOMNet), a novel alignment-free framework that adaptively retrieves and selects the most relevant information from misaligned RGB. Specifically, our method begins with a multi-order matching mechanism, which jointly performs zero-order, first-order, and second-order matching to comprehensively identify RGB information consistent with depth across multi-order feature spaces. To effectively integrate the retrieved RGB and depth, we further introduce a multi-order aggregation composed of multiple structure detectors. This strategy uses multi-order priors as prompts to facilitate the selective feature transfer from RGB to depth. Extensive experiments demonstrate that MOMNet achieves superior performance and generalization across both unaligned and aligned datasets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

MOMNet targets real misalignment in guided depth SR with multi-order matching but leaves the geometric consistency of retrieved features underspecified.

read the letter

The main point is that this paper drops the usual strict-alignment assumption in guided depth super-resolution and instead uses joint zero-, first-, and second-order matching to pull relevant RGB information from misaligned inputs, followed by structure-detector aggregation to transfer it selectively. That combination is the concrete new piece; prior work mostly assumes calibrated pairs or adds explicit registration steps, so the multi-order framing is a distinct angle on the deployment gap caused by separate sensors and drift.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes the Multi-Order Matching Network (MOMNet) for alignment-free guided depth super-resolution. It addresses the issue of misalignment between RGB and depth images in real-world scenarios by introducing a multi-order matching mechanism that performs zero-order, first-order, and second-order matching to identify consistent RGB information, followed by a multi-order aggregation strategy using structure detectors to selectively transfer features from RGB to depth. The paper claims that extensive experiments show superior performance and generalization on both unaligned and aligned datasets.

Significance. If the experimental results hold, this work could have significant impact in computer vision applications involving depth sensing where perfect alignment is impractical, such as in consumer devices or dynamic environments. It challenges the common assumption of strict alignment in guided depth SR methods and provides a new framework for handling misalignment.

major comments (2)

The multi-order matching mechanism is presented as jointly performing matching across feature spaces, but there is no explicit constraint or regularization term described that enforces the selected RGB features to be geometrically consistent with the depth map under misalignment. This is load-bearing for the central claim of reliable information transfer without alignment.
Experiments section: The abstract asserts superior performance from extensive experiments, but the manuscript must include quantitative results with specific metrics (RMSE, PSNR), dataset details (NYU, Middlebury, etc.), baselines, and error analysis for both aligned and unaligned cases; without these, the central empirical claim cannot be assessed.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. We address each major comment below with honest responses based on the manuscript content and indicate revisions where they strengthen the work without misrepresentation.

read point-by-point responses

Referee: The multi-order matching mechanism is presented as jointly performing matching across feature spaces, but there is no explicit constraint or regularization term described that enforces the selected RGB features to be geometrically consistent with the depth map under misalignment. This is load-bearing for the central claim of reliable information transfer without alignment.

Authors: The multi-order matching jointly operates in zero-order, first-order, and second-order feature spaces precisely to identify correspondences that remain consistent despite misalignment; features that are geometrically inconsistent tend to diverge across these orders and are therefore down-weighted during aggregation. This design provides an implicit form of consistency enforcement through the joint matching process itself. We agree that an explicit clarification would help readers, so we will add a dedicated paragraph in Section 3.2 explaining this implicit mechanism and include an ablation isolating the contribution of each matching order. revision: partial
Referee: Experiments section: The abstract asserts superior performance from extensive experiments, but the manuscript must include quantitative results with specific metrics (RMSE, PSNR), dataset details (NYU, Middlebury, etc.), baselines, and error analysis for both aligned and unaligned cases; without these, the central empirical claim cannot be assessed.

Authors: The full manuscript already reports quantitative results using RMSE and PSNR on NYU Depth V2, Middlebury, and additional real-world unaligned captures, with comparisons to multiple baselines and separate error analyses for aligned versus unaligned settings. To improve accessibility we will add a consolidated summary table early in the Experiments section and expand the discussion of failure cases under severe misalignment. revision: yes

Circularity Check

0 steps flagged

No significant circularity; new architecture with experimental validation

full rationale

The paper introduces MOMNet as a novel alignment-free framework relying on a multi-order matching mechanism (zero-, first-, and second-order) and multi-order aggregation with structure detectors. These are presented as design choices and architectural innovations rather than derivations that reduce to prior inputs by construction. Claims of superior performance and generalization rest on extensive experiments across unaligned and aligned datasets, not on self-referential fitting, self-citation chains, or renaming of known results. No load-bearing steps equate predictions to fitted parameters or smuggle ansatzes via self-citation. The derivation chain is self-contained as an independent empirical contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the method is described at a conceptual level without mathematical details.

pith-pipeline@v0.9.0 · 5733 in / 940 out tokens · 63920 ms · 2026-05-21T19:22:17.544102+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages

[1]

Intrinsic phase-preserving networks for depth super res- olution

Xuanhong Chen, Hang Wang, Jialiang Chen, Kairui Feng, Jinfan Liu, Xiaohang Wang, Weimin Zhang, and Bingbing Ni. Intrinsic phase-preserving networks for depth super res- olution. InAAAI, pages 1210–1218, 2024. 2

work page 2024
[2]

Deep monocular depth estimation leveraging a large-scale outdoor stereo dataset.Expert Systems with Applications, 178:114877, 2021

Jaehoon Cho, Dongbo Min, Youngjung Kim, and Kwanghoon Sohn. Deep monocular depth estimation leveraging a large-scale outdoor stereo dataset.Expert Systems with Applications, 178:114877, 2021. 6

work page 2021
[3]

Diml/cvl rgb-d dataset: 2m rgb-d images of natural indoor and outdoor scenes.arXiv preprint arXiv:2110.11590, 2021

Jaehoon Cho, Dongbo Min, Youngjung Kim, and Kwanghoon Sohn. Diml/cvl rgb-d dataset: 2m rgb-d images of natural indoor and outdoor scenes.arXiv preprint arXiv:2110.11590, 2021. 6

work page arXiv 2021
[4]

V olumefusion: Deep depth fusion for 3d scene reconstruction

Jaesung Choe, Sunghoon Im, Francois Rameau, Minjun Kang, and In So Kweon. V olumefusion: Deep depth fusion for 3d scene reconstruction. InICCV, pages 16086–16095,

work page
[5]

Learn- ing graph regularisation for guided super-resolution

Riccardo De Lutio, Alexander Becker, Stefano D’Aronco, Stefania Russo, Jan D Wegner, and Konrad Schindler. Learn- ing graph regularisation for guided super-resolution. In CVPR, pages 1979–1988, 2022. 1

work page 1979
[6]

Deep convolutional neural network for multi-modal image restoration and fusion.IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(10):3333–3348, 2020

Xin Deng and Pier Luigi Dragotti. Deep convolutional neural network for multi-modal image restoration and fusion.IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(10):3333–3348, 2020. 2, 6

work page 2020
[7]

Roma: Robust dense feature matching

Johan Edstedt, Qiyu Sun, Georg B ¨okman, M ˚arten Wadenb¨ack, and Michael Felsberg. Roma: Robust dense feature matching. InCVPR, pages 19790–19800, 2024. 2

work page 2024
[8]

Multiscale vessel enhancement filtering

Alejandro F Frangi, Wiro J Niessen, Koen L Vincken, and Max A Viergever. Multiscale vessel enhancement filtering. InMICCAI, pages 130–137. Springer, 1998. 5

work page 1998
[9]

Coupled real-synthetic domain adaptation for real- world deep depth enhancement.IEEE Transactions on Im- age Processing, 29:6343–6356, 2020

Xiao Gu, Yao Guo, Fani Deligianni, and Guang-Zhong Yang. Coupled real-synthetic domain adaptation for real- world deep depth enhancement.IEEE Transactions on Im- age Processing, 29:6343–6356, 2020. 1

work page 2020
[10]

Hierarchical features driven resid- ual learning for depth map super-resolution.IEEE Transac- tions on Image Processing, 28(5):2545–2557, 2018

Chunle Guo, Chongyi Li, Jichang Guo, Runmin Cong, Huazhu Fu, and Ping Han. Hierarchical features driven resid- ual learning for depth map super-resolution.IEEE Transac- tions on Image Processing, 28(5):2545–2557, 2018. 1

work page 2018
[11]

Chengmei Han, Lei Liu, Kunpeng Wang, Fei Xie, and Bing Wei. Hierarchical semantics guided multi-scale correla- tion network for alignment-free red-green-blue and thermal salient object detection.Engineering Applications of Artifi- cial Intelligence, 162:112394, 2025. 3

work page 2025
[12]

Towards fast and accurate real-world depth super- resolution: Benchmark dataset and baseline

Lingzhi He, Hongguang Zhu, Feng Li, Huihui Bai, Runmin Cong, Chunjie Zhang, Chunyu Lin, Meiqin Liu, and Yao Zhao. Towards fast and accurate real-world depth super- resolution: Benchmark dataset and baseline. InCVPR, pages 9229–9238, 2021. 2, 6

work page 2021
[13]

Depth map super-resolution by deep multi-scale guidance

Tak-Wai Hui, Chen Change Loy, and Xiaoou Tang. Depth map super-resolution by deep multi-scale guidance. In ECCV, pages 353–369, 2016. 1

work page 2016
[14]

Omniglue: Generalizable feature match- ing with foundation model guidance

Hanwen Jiang, Arjun Karpur, Bingyi Cao, Qixing Huang, and Andr´e Araujo. Omniglue: Generalizable feature match- ing with foundation model guidance. InCVPR, pages 19865–19875, 2024. 2, 3

work page 2024
[15]

C2pd: Continuity-constrained pixelwise deformation for guided depth super-resolution

Jiahui Kang, Qing Cai, Runqing Tan, Yimei Liu, and Zhi Liu. C2pd: Continuity-constrained pixelwise deformation for guided depth super-resolution. InAAAI, pages 4212– 4220, 2025. 6, 7

work page 2025
[16]

Deformable kernel networks for joint image filtering.International Jour- nal of Computer Vision, 129(2):579–600, 2021

Beomjun Kim, Jean Ponce, and Bumsub Ham. Deformable kernel networks for joint image filtering.International Jour- nal of Computer Vision, 129(2):579–600, 2021. 2, 6, 7

work page 2021
[17]

Deep stereo confidence prediction for depth estimation

Sunok Kim, Dongbo Min, Bumsub Ham, Seungryong Kim, and Kwanghoon Sohn. Deep stereo confidence prediction for depth estimation. InICIP, pages 992–996, 2017. 6

work page 2017
[18]

Structure selective depth superresolution for rgb-d cameras.IEEE Transactions on Image Processing, 25(11):5227–5238, 2016

Youngjung Kim, Bumsub Ham, Changjae Oh, and Kwanghoon Sohn. Structure selective depth superresolution for rgb-d cameras.IEEE Transactions on Image Processing, 25(11):5227–5238, 2016

work page 2016
[19]

Deep monocular depth estimation via in- tegration of global and local predictions.IEEE Transactions on Image Processing, 27(8):4131–4144, 2018

Youngjung Kim, Hyungjoo Jung, Dongbo Min, and Kwanghoon Sohn. Deep monocular depth estimation via in- tegration of global and local predictions.IEEE Transactions on Image Processing, 27(8):4131–4144, 2018. 6

work page 2018
[20]

A deep learning framework for infrared and visible image fusion without strict registration.International Journal of Com- puter Vision, 132(5):1625–1644, 2024

Huafeng Li, Junyu Liu, Yafei Zhang, and Yu Liu. A deep learning framework for infrared and visible image fusion without strict registration.International Journal of Com- puter Vision, 132(5):1625–1644, 2024. 3

work page 2024
[21]

Ling Li, Xiaojian Li, Shanlin Yang, Shuai Ding, Alireza Jol- faei, and Xi Zheng. Unsupervised-learning-based continu- ous depth and motion estimation with monocular endoscopy for virtual reality minimally invasive surgery.IEEE Trans- actions on Industrial Informatics, 17(6):3920–3928, 2020. 1

work page 2020
[22]

Deep joint image filtering

Yijun Li, Jia-Bin Huang, Narendra Ahuja, and Ming-Hsuan Yang. Deep joint image filtering. InECCV, pages 154–169,

work page
[23]

Joint image filtering with deep convolutional net- works.IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(8):1909–1923, 2019

Yijun Li, Jia-Bin Huang, Narendra Ahuja, and Ming-Hsuan Yang. Joint image filtering with deep convolutional net- works.IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(8):1909–1923, 2019. 2, 6

work page 1909
[24]

Zan Li, Yue Wen, Song Xiao, Jiahui Qu, Nan Li, and Wenqian Dong. A progressive registration-fusion co- optimization a-mamba network: Towards deep unregistered hyperspectral and multispectral fusion.IEEE Transactions on Geoscience and Remote Sensing, 2025. 3

work page 2025
[25]

Lightglue: Local feature matching at light speed

Philipp Lindenberger, Paul-Edouard Sarlin, and Marc Polle- feys. Lightglue: Local feature matching at light speed. In ICCV, pages 17627–17638, 2023. 2

work page 2023
[26]

Depth restoration from rgb-d data via joint adaptive regularization and thresholding on mani- folds.IEEE Transactions on Image Processing, 28(3):1068– 1079, 2018

Xianming Liu, Deming Zhai, Rong Chen, Xiangyang Ji, De- bin Zhao, and Wen Gao. Depth restoration from rgb-d data via joint adaptive regularization and thresholding on mani- folds.IEEE Transactions on Image Processing, 28(3):1068– 1079, 2018. 1

work page 2018
[27]

Guided depth super-resolution by deep anisotropic diffusion

Nando Metzger, Rodrigo Caye Daudt, and Konrad Schindler. Guided depth super-resolution by deep anisotropic diffusion. InCVPR, pages 18237–18246, 2023. 6

work page 2023
[28]

Ir&arf: Towards deep interpretable arbitrary resolution fusion of unregistered hyperspectral and multi- spectral images.IEEE Transactions on Image Processing,

Jiahui Qu, Xiaoyang Wu, Wenqian Dong, Jizhou Cui, and Yunsong Li. Ir&arf: Towards deep interpretable arbitrary resolution fusion of unregistered hyperspectral and multi- spectral images.IEEE Transactions on Image Processing,

work page
[29]

Hypersim: A photorealistic syn- thetic dataset for holistic indoor scene understanding

Mike Roberts, Jason Ramapuram, Anurag Ranjan, Atulit Kumar, Miguel Angel Bautista, Nathan Paczan, Russ Webb, 9 and Joshua M Susskind. Hypersim: A photorealistic syn- thetic dataset for holistic indoor scene understanding. In ICCV, pages 10912–10922, 2021. 6

work page 2021
[30]

Superglue: Learning feature matching with graph neural networks

Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, and Andrew Rabinovich. Superglue: Learning feature matching with graph neural networks. InCVPR, pages 4938– 4947, 2020. 2

work page 2020
[31]

Symmetric uncertainty- aware feature transmission for depth super-resolution

Wuxuan Shi, Mang Ye, and Bo Du. Symmetric uncertainty- aware feature transmission for depth super-resolution. In ACMMM, pages 3867–3876, 2022. 6

work page 2022
[32]

Channel attention based iterative residual learning for depth map super-resolution

Xibin Song, Yuchao Dai, Dingfu Zhou, Liu Liu, Wei Li, Hongdong Li, and Ruigang Yang. Channel attention based iterative residual learning for depth map super-resolution. In CVPR, pages 5631–5640, 2020. 2

work page 2020
[33]

Pixel-adaptive convolutional neural networks

Hang Su, Varun Jampani, Deqing Sun, Orazio Gallo, Erik Learned-Miller, and Jan Kautz. Pixel-adaptive convolutional neural networks. InCVPR, pages 11166–11175, 2019. 1

work page 2019
[34]

Learning scene structure guidance via cross- task knowledge transfer for single depth super-resolution

Baoli Sun, Xinchen Ye, Baopu Li, Haojie Li, Zhihui Wang, and Rui Xu. Learning scene structure guidance via cross- task knowledge transfer for single depth super-resolution. In CVPR, pages 7792–7801, 2021. 2

work page 2021
[35]

Consistent direct time-of-flight video depth super-resolution

Zhanghao Sun, Wei Ye, Jinhui Xiong, Gyeongmin Choe, Jialiang Wang, Shuochen Su, and Rakesh Ranjan. Consistent direct time-of-flight video depth super-resolution. InCVPR, pages 5075–5085, 2023. 6

work page 2023
[36]

Joint im- plicit image function for guided depth super-resolution

Jiaxiang Tang, Xiaokang Chen, and Gang Zeng. Joint im- plicit image function for guided depth super-resolution. In ACMMM, pages 4390–4399, 2021. 1

work page 2021
[37]

Bridgenet: A joint learn- ing network of depth map super-resolution and monocular depth estimation

Qi Tang, Runmin Cong, Ronghui Sheng, Lingzhi He, Dan Zhang, Yao Zhao, and Sam Kwong. Bridgenet: A joint learn- ing network of depth map super-resolution and monocular depth estimation. InACMMM, pages 2148–2157, 2021. 2

work page 2021
[38]

Weakly alignment-free rgbt salient object detection with deep correlation network.IEEE Transactions on Image Pro- cessing, 31:3752–3764, 2022

Zhengzheng Tu, Zhun Li, Chenglong Li, and Jin Tang. Weakly alignment-free rgbt salient object detection with deep correlation network.IEEE Transactions on Image Pro- cessing, 31:3752–3764, 2022. 3

work page 2022
[39]

Self-supervised learning for rgb-guided depth enhancement by exploiting the depen- dency between rgb and depth.IEEE Transactions on Image Processing, 32:159–174, 2022

Jun Wang, Peilin Liu, and Fei Wen. Self-supervised learning for rgb-guided depth enhancement by exploiting the depen- dency between rgb and depth.IEEE Transactions on Image Processing, 32:159–174, 2022. 1

work page 2022
[40]

Alignment-free rgbt salient object detec- tion: Semantics-guided asymmetric correlation network and a unified benchmark.IEEE Transactions on Multimedia, 26: 10692–10707, 2024

Kunpeng Wang, Danying Lin, Chenglong Li, Zhengzheng Tu, and Bin Luo. Alignment-free rgbt salient object detec- tion: Semantics-guided asymmetric correlation network and a unified benchmark.IEEE Transactions on Multimedia, 26: 10692–10707, 2024. 3

work page 2024
[41]

Learning continuous depth repre- sentation via geometric spatial aggregator

Xiaohang Wang, Xuanhong Chen, Bingbing Ni, Zhengyan Tong, and Hang Wang. Learning continuous depth repre- sentation via geometric spatial aggregator. InAAAI, pages 2698–2706, 2023. 2

work page 2023
[42]

Sgnet: Struc- ture guided network via gradient-frequency awareness for depth map super-resolution

Zhengxue Wang, Zhiqiang Yan, and Jian Yang. Sgnet: Struc- ture guided network via gradient-frequency awareness for depth map super-resolution. InAAAI, pages 5823–5831,

work page
[43]

Scene prior filtering for depth map super-resolution.arXiv preprint arXiv:2402.13876, 2024

Zhengxue Wang, Zhiqiang Yan, Ming-Hsuan Yang, Jinshan Pan, Jian Yang, Ying Tai, and Guangwei Gao. Scene prior filtering for depth map super-resolution.arXiv preprint arXiv:2402.13876, 2024. 2

work page arXiv 2024
[44]

Spatiotemporal difference network for video depth super-resolution.arXiv preprint arXiv:2508.01259, 2025

Zhengxue Wang, Yuan Wu, Xiang Li, Zhiqiang Yan, and Jian Yang. Spatiotemporal difference network for video depth super-resolution.arXiv preprint arXiv:2508.01259, 2025. 2

work page arXiv 2025
[45]

Dornet: A degradation oriented and regularized network for blind depth super-resolution

Zhengxue Wang, Zhiqiang Yan, Jinshan Pan, Guangwei Gao, Kai Zhang, and Jian Yang. Dornet: A degradation oriented and regularized network for blind depth super-resolution. In CVPR, pages 15813–15822, 2025. 2, 6, 7

work page 2025
[46]

Tri-perspective view decomposition for ge- ometry aware depth completion and super-resolution.IEEE Transactions on Pattern Analysis and Machine Intelligence,

Zhiqiang Yan, Kun Wang, Xiang Li, Guangwei Gao, Jun Li, and Jian Yang. Tri-perspective view decomposition for ge- ometry aware depth completion and super-resolution.IEEE Transactions on Pattern Analysis and Machine Intelligence,

work page
[47]

Codon: On orchestrating cross-domain attentions for depth super-resolution.International Journal of Computer Vision, 130(2):267–284, 2022

Yuxiang Yang, Qi Cao, Jing Zhang, and Dacheng Tao. Codon: On orchestrating cross-domain attentions for depth super-resolution.International Journal of Computer Vision, 130(2):267–284, 2022. 2

work page 2022
[48]

Depth super-resolution via deep controllable slicing network

Xinchen Ye, Baoli Sun, Zhihui Wang, Jingyu Yang, Rui Xu, Haojie Li, and Baopu Li. Depth super-resolution via deep controllable slicing network. InACMMM, pages 1809–1818,

work page
[49]

Pmbanet: Progressive multi-branch aggregation network for scene depth super-resolution.IEEE Transactions on Image Processing, 29:7427–7442, 2020

Xinchen Ye, Baoli Sun, Zhihui Wang, Jingyu Yang, Rui Xu, Haojie Li, and Baopu Li. Pmbanet: Progressive multi-branch aggregation network for scene depth super-resolution.IEEE Transactions on Image Processing, 29:7427–7442, 2020. 2

work page 2020
[50]

Semantics-driven contrastive learning for real-world depth super resolution

Xinchen Ye, Aokai Zhang, and Rui Xu. Semantics-driven contrastive learning for real-world depth super resolution. In ACMMM, pages 3085–3093, 2025. 1

work page 2025
[51]

Structure flow-guided network for real depth super-resolution

Jiayi Yuan, Haobo Jiang, Xiang Li, Jianjun Qian, Jun Li, and Jian Yang. Structure flow-guided network for real depth super-resolution. InAAAI, pages 3340–3348, 2023. 2

work page 2023
[52]

Joint deep-unfolding optimization learning for depth map arbitrary-scale super-resolution.IEEE Trans- actions on Multimedia, 2025

Jialong Zhang, Lijun Zhao, Jinjing Zhang, Anhong Wang, and Huihui Bai. Joint deep-unfolding optimization learning for depth map arbitrary-scale super-resolution.IEEE Trans- actions on Multimedia, 2025. 1

work page 2025
[53]

Mesa: Matching everything by segmenting anything

Yesheng Zhang and Xu Zhao. Mesa: Matching everything by segmenting anything. InCVPR, pages 20217–20226, 2024. 3

work page 2024
[54]

Discrete cosine transform network for guided depth map super-resolution

Zixiang Zhao, Jiangshe Zhang, Shuang Xu, Zudi Lin, and Hanspeter Pfister. Discrete cosine transform network for guided depth map super-resolution. InCVPR, pages 5697– 5707, 2022. 2, 6, 7

work page 2022
[55]

Spherical space feature decomposition for guided depth map super-resolution

Zixiang Zhao, Jiangshe Zhang, Xiang Gu, Chengli Tan, Shuang Xu, Yulun Zhang, Radu Timofte, and Luc Van Gool. Spherical space feature decomposition for guided depth map super-resolution. InICCV, pages 12547–12558, 2023. 2

work page 2023
[56]

Decou- pling fine detail and global geometry for compressed depth map super-resolution

Huan Zheng, Wencheng Han, and Jianbing Shen. Decou- pling fine detail and global geometry for compressed depth map super-resolution. InCVPR, pages 951–960, 2025. 2

work page 2025
[57]

High-resolution depth maps imaging via attention-based hierarchical multi-modal fusion.IEEE Transactions on Image Processing, 31:648– 663, 2021

Zhiwei Zhong, Xianming Liu, Junjun Jiang, Debin Zhao, Zhiwen Chen, and Xiangyang Ji. High-resolution depth maps imaging via attention-based hierarchical multi-modal fusion.IEEE Transactions on Image Processing, 31:648– 663, 2021. 2

work page 2021
[58]

Memory-augmented deep unfolding net- work for guided image super-resolution.International Jour- nal of Computer Vision, 131(1):215–242, 2023

Man Zhou, Keyu Yan, Jinshan Pan, Wenqi Ren, Qi Xie, and Xiangyong Cao. Memory-augmented deep unfolding net- work for guided image super-resolution.International Jour- nal of Computer Vision, 131(1):215–242, 2023. 1 10

work page 2023

[1] [1]

Intrinsic phase-preserving networks for depth super res- olution

Xuanhong Chen, Hang Wang, Jialiang Chen, Kairui Feng, Jinfan Liu, Xiaohang Wang, Weimin Zhang, and Bingbing Ni. Intrinsic phase-preserving networks for depth super res- olution. InAAAI, pages 1210–1218, 2024. 2

work page 2024

[2] [2]

Deep monocular depth estimation leveraging a large-scale outdoor stereo dataset.Expert Systems with Applications, 178:114877, 2021

Jaehoon Cho, Dongbo Min, Youngjung Kim, and Kwanghoon Sohn. Deep monocular depth estimation leveraging a large-scale outdoor stereo dataset.Expert Systems with Applications, 178:114877, 2021. 6

work page 2021

[3] [3]

Diml/cvl rgb-d dataset: 2m rgb-d images of natural indoor and outdoor scenes.arXiv preprint arXiv:2110.11590, 2021

Jaehoon Cho, Dongbo Min, Youngjung Kim, and Kwanghoon Sohn. Diml/cvl rgb-d dataset: 2m rgb-d images of natural indoor and outdoor scenes.arXiv preprint arXiv:2110.11590, 2021. 6

work page arXiv 2021

[4] [4]

V olumefusion: Deep depth fusion for 3d scene reconstruction

Jaesung Choe, Sunghoon Im, Francois Rameau, Minjun Kang, and In So Kweon. V olumefusion: Deep depth fusion for 3d scene reconstruction. InICCV, pages 16086–16095,

work page

[5] [5]

Learn- ing graph regularisation for guided super-resolution

Riccardo De Lutio, Alexander Becker, Stefano D’Aronco, Stefania Russo, Jan D Wegner, and Konrad Schindler. Learn- ing graph regularisation for guided super-resolution. In CVPR, pages 1979–1988, 2022. 1

work page 1979

[6] [6]

Deep convolutional neural network for multi-modal image restoration and fusion.IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(10):3333–3348, 2020

Xin Deng and Pier Luigi Dragotti. Deep convolutional neural network for multi-modal image restoration and fusion.IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(10):3333–3348, 2020. 2, 6

work page 2020

[7] [7]

Roma: Robust dense feature matching

Johan Edstedt, Qiyu Sun, Georg B ¨okman, M ˚arten Wadenb¨ack, and Michael Felsberg. Roma: Robust dense feature matching. InCVPR, pages 19790–19800, 2024. 2

work page 2024

[8] [8]

Multiscale vessel enhancement filtering

Alejandro F Frangi, Wiro J Niessen, Koen L Vincken, and Max A Viergever. Multiscale vessel enhancement filtering. InMICCAI, pages 130–137. Springer, 1998. 5

work page 1998

[9] [9]

Coupled real-synthetic domain adaptation for real- world deep depth enhancement.IEEE Transactions on Im- age Processing, 29:6343–6356, 2020

Xiao Gu, Yao Guo, Fani Deligianni, and Guang-Zhong Yang. Coupled real-synthetic domain adaptation for real- world deep depth enhancement.IEEE Transactions on Im- age Processing, 29:6343–6356, 2020. 1

work page 2020

[10] [10]

Hierarchical features driven resid- ual learning for depth map super-resolution.IEEE Transac- tions on Image Processing, 28(5):2545–2557, 2018

Chunle Guo, Chongyi Li, Jichang Guo, Runmin Cong, Huazhu Fu, and Ping Han. Hierarchical features driven resid- ual learning for depth map super-resolution.IEEE Transac- tions on Image Processing, 28(5):2545–2557, 2018. 1

work page 2018

[11] [11]

Chengmei Han, Lei Liu, Kunpeng Wang, Fei Xie, and Bing Wei. Hierarchical semantics guided multi-scale correla- tion network for alignment-free red-green-blue and thermal salient object detection.Engineering Applications of Artifi- cial Intelligence, 162:112394, 2025. 3

work page 2025

[12] [12]

Towards fast and accurate real-world depth super- resolution: Benchmark dataset and baseline

Lingzhi He, Hongguang Zhu, Feng Li, Huihui Bai, Runmin Cong, Chunjie Zhang, Chunyu Lin, Meiqin Liu, and Yao Zhao. Towards fast and accurate real-world depth super- resolution: Benchmark dataset and baseline. InCVPR, pages 9229–9238, 2021. 2, 6

work page 2021

[13] [13]

Depth map super-resolution by deep multi-scale guidance

Tak-Wai Hui, Chen Change Loy, and Xiaoou Tang. Depth map super-resolution by deep multi-scale guidance. In ECCV, pages 353–369, 2016. 1

work page 2016

[14] [14]

Omniglue: Generalizable feature match- ing with foundation model guidance

Hanwen Jiang, Arjun Karpur, Bingyi Cao, Qixing Huang, and Andr´e Araujo. Omniglue: Generalizable feature match- ing with foundation model guidance. InCVPR, pages 19865–19875, 2024. 2, 3

work page 2024

[15] [15]

C2pd: Continuity-constrained pixelwise deformation for guided depth super-resolution

Jiahui Kang, Qing Cai, Runqing Tan, Yimei Liu, and Zhi Liu. C2pd: Continuity-constrained pixelwise deformation for guided depth super-resolution. InAAAI, pages 4212– 4220, 2025. 6, 7

work page 2025

[16] [16]

Deformable kernel networks for joint image filtering.International Jour- nal of Computer Vision, 129(2):579–600, 2021

Beomjun Kim, Jean Ponce, and Bumsub Ham. Deformable kernel networks for joint image filtering.International Jour- nal of Computer Vision, 129(2):579–600, 2021. 2, 6, 7

work page 2021

[17] [17]

Deep stereo confidence prediction for depth estimation

Sunok Kim, Dongbo Min, Bumsub Ham, Seungryong Kim, and Kwanghoon Sohn. Deep stereo confidence prediction for depth estimation. InICIP, pages 992–996, 2017. 6

work page 2017

[18] [18]

Structure selective depth superresolution for rgb-d cameras.IEEE Transactions on Image Processing, 25(11):5227–5238, 2016

Youngjung Kim, Bumsub Ham, Changjae Oh, and Kwanghoon Sohn. Structure selective depth superresolution for rgb-d cameras.IEEE Transactions on Image Processing, 25(11):5227–5238, 2016

work page 2016

[19] [19]

Deep monocular depth estimation via in- tegration of global and local predictions.IEEE Transactions on Image Processing, 27(8):4131–4144, 2018

Youngjung Kim, Hyungjoo Jung, Dongbo Min, and Kwanghoon Sohn. Deep monocular depth estimation via in- tegration of global and local predictions.IEEE Transactions on Image Processing, 27(8):4131–4144, 2018. 6

work page 2018

[20] [20]

A deep learning framework for infrared and visible image fusion without strict registration.International Journal of Com- puter Vision, 132(5):1625–1644, 2024

Huafeng Li, Junyu Liu, Yafei Zhang, and Yu Liu. A deep learning framework for infrared and visible image fusion without strict registration.International Journal of Com- puter Vision, 132(5):1625–1644, 2024. 3

work page 2024

[21] [21]

Ling Li, Xiaojian Li, Shanlin Yang, Shuai Ding, Alireza Jol- faei, and Xi Zheng. Unsupervised-learning-based continu- ous depth and motion estimation with monocular endoscopy for virtual reality minimally invasive surgery.IEEE Trans- actions on Industrial Informatics, 17(6):3920–3928, 2020. 1

work page 2020

[22] [22]

Deep joint image filtering

Yijun Li, Jia-Bin Huang, Narendra Ahuja, and Ming-Hsuan Yang. Deep joint image filtering. InECCV, pages 154–169,

work page

[23] [23]

Joint image filtering with deep convolutional net- works.IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(8):1909–1923, 2019

Yijun Li, Jia-Bin Huang, Narendra Ahuja, and Ming-Hsuan Yang. Joint image filtering with deep convolutional net- works.IEEE Transactions on Pattern Analysis and Machine Intelligence, 41(8):1909–1923, 2019. 2, 6

work page 1909

[24] [24]

Zan Li, Yue Wen, Song Xiao, Jiahui Qu, Nan Li, and Wenqian Dong. A progressive registration-fusion co- optimization a-mamba network: Towards deep unregistered hyperspectral and multispectral fusion.IEEE Transactions on Geoscience and Remote Sensing, 2025. 3

work page 2025

[25] [25]

Lightglue: Local feature matching at light speed

Philipp Lindenberger, Paul-Edouard Sarlin, and Marc Polle- feys. Lightglue: Local feature matching at light speed. In ICCV, pages 17627–17638, 2023. 2

work page 2023

[26] [26]

Depth restoration from rgb-d data via joint adaptive regularization and thresholding on mani- folds.IEEE Transactions on Image Processing, 28(3):1068– 1079, 2018

Xianming Liu, Deming Zhai, Rong Chen, Xiangyang Ji, De- bin Zhao, and Wen Gao. Depth restoration from rgb-d data via joint adaptive regularization and thresholding on mani- folds.IEEE Transactions on Image Processing, 28(3):1068– 1079, 2018. 1

work page 2018

[27] [27]

Guided depth super-resolution by deep anisotropic diffusion

Nando Metzger, Rodrigo Caye Daudt, and Konrad Schindler. Guided depth super-resolution by deep anisotropic diffusion. InCVPR, pages 18237–18246, 2023. 6

work page 2023

[28] [28]

Ir&arf: Towards deep interpretable arbitrary resolution fusion of unregistered hyperspectral and multi- spectral images.IEEE Transactions on Image Processing,

Jiahui Qu, Xiaoyang Wu, Wenqian Dong, Jizhou Cui, and Yunsong Li. Ir&arf: Towards deep interpretable arbitrary resolution fusion of unregistered hyperspectral and multi- spectral images.IEEE Transactions on Image Processing,

work page

[29] [29]

Hypersim: A photorealistic syn- thetic dataset for holistic indoor scene understanding

Mike Roberts, Jason Ramapuram, Anurag Ranjan, Atulit Kumar, Miguel Angel Bautista, Nathan Paczan, Russ Webb, 9 and Joshua M Susskind. Hypersim: A photorealistic syn- thetic dataset for holistic indoor scene understanding. In ICCV, pages 10912–10922, 2021. 6

work page 2021

[30] [30]

Superglue: Learning feature matching with graph neural networks

Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, and Andrew Rabinovich. Superglue: Learning feature matching with graph neural networks. InCVPR, pages 4938– 4947, 2020. 2

work page 2020

[31] [31]

Symmetric uncertainty- aware feature transmission for depth super-resolution

Wuxuan Shi, Mang Ye, and Bo Du. Symmetric uncertainty- aware feature transmission for depth super-resolution. In ACMMM, pages 3867–3876, 2022. 6

work page 2022

[32] [32]

Channel attention based iterative residual learning for depth map super-resolution

Xibin Song, Yuchao Dai, Dingfu Zhou, Liu Liu, Wei Li, Hongdong Li, and Ruigang Yang. Channel attention based iterative residual learning for depth map super-resolution. In CVPR, pages 5631–5640, 2020. 2

work page 2020

[33] [33]

Pixel-adaptive convolutional neural networks

Hang Su, Varun Jampani, Deqing Sun, Orazio Gallo, Erik Learned-Miller, and Jan Kautz. Pixel-adaptive convolutional neural networks. InCVPR, pages 11166–11175, 2019. 1

work page 2019

[34] [34]

Learning scene structure guidance via cross- task knowledge transfer for single depth super-resolution

Baoli Sun, Xinchen Ye, Baopu Li, Haojie Li, Zhihui Wang, and Rui Xu. Learning scene structure guidance via cross- task knowledge transfer for single depth super-resolution. In CVPR, pages 7792–7801, 2021. 2

work page 2021

[35] [35]

Consistent direct time-of-flight video depth super-resolution

Zhanghao Sun, Wei Ye, Jinhui Xiong, Gyeongmin Choe, Jialiang Wang, Shuochen Su, and Rakesh Ranjan. Consistent direct time-of-flight video depth super-resolution. InCVPR, pages 5075–5085, 2023. 6

work page 2023

[36] [36]

Joint im- plicit image function for guided depth super-resolution

Jiaxiang Tang, Xiaokang Chen, and Gang Zeng. Joint im- plicit image function for guided depth super-resolution. In ACMMM, pages 4390–4399, 2021. 1

work page 2021

[37] [37]

Bridgenet: A joint learn- ing network of depth map super-resolution and monocular depth estimation

Qi Tang, Runmin Cong, Ronghui Sheng, Lingzhi He, Dan Zhang, Yao Zhao, and Sam Kwong. Bridgenet: A joint learn- ing network of depth map super-resolution and monocular depth estimation. InACMMM, pages 2148–2157, 2021. 2

work page 2021

[38] [38]

Weakly alignment-free rgbt salient object detection with deep correlation network.IEEE Transactions on Image Pro- cessing, 31:3752–3764, 2022

Zhengzheng Tu, Zhun Li, Chenglong Li, and Jin Tang. Weakly alignment-free rgbt salient object detection with deep correlation network.IEEE Transactions on Image Pro- cessing, 31:3752–3764, 2022. 3

work page 2022

[39] [39]

Self-supervised learning for rgb-guided depth enhancement by exploiting the depen- dency between rgb and depth.IEEE Transactions on Image Processing, 32:159–174, 2022

Jun Wang, Peilin Liu, and Fei Wen. Self-supervised learning for rgb-guided depth enhancement by exploiting the depen- dency between rgb and depth.IEEE Transactions on Image Processing, 32:159–174, 2022. 1

work page 2022

[40] [40]

Alignment-free rgbt salient object detec- tion: Semantics-guided asymmetric correlation network and a unified benchmark.IEEE Transactions on Multimedia, 26: 10692–10707, 2024

Kunpeng Wang, Danying Lin, Chenglong Li, Zhengzheng Tu, and Bin Luo. Alignment-free rgbt salient object detec- tion: Semantics-guided asymmetric correlation network and a unified benchmark.IEEE Transactions on Multimedia, 26: 10692–10707, 2024. 3

work page 2024

[41] [41]

Learning continuous depth repre- sentation via geometric spatial aggregator

Xiaohang Wang, Xuanhong Chen, Bingbing Ni, Zhengyan Tong, and Hang Wang. Learning continuous depth repre- sentation via geometric spatial aggregator. InAAAI, pages 2698–2706, 2023. 2

work page 2023

[42] [42]

Sgnet: Struc- ture guided network via gradient-frequency awareness for depth map super-resolution

Zhengxue Wang, Zhiqiang Yan, and Jian Yang. Sgnet: Struc- ture guided network via gradient-frequency awareness for depth map super-resolution. InAAAI, pages 5823–5831,

work page

[43] [43]

Scene prior filtering for depth map super-resolution.arXiv preprint arXiv:2402.13876, 2024

Zhengxue Wang, Zhiqiang Yan, Ming-Hsuan Yang, Jinshan Pan, Jian Yang, Ying Tai, and Guangwei Gao. Scene prior filtering for depth map super-resolution.arXiv preprint arXiv:2402.13876, 2024. 2

work page arXiv 2024

[44] [44]

Spatiotemporal difference network for video depth super-resolution.arXiv preprint arXiv:2508.01259, 2025

Zhengxue Wang, Yuan Wu, Xiang Li, Zhiqiang Yan, and Jian Yang. Spatiotemporal difference network for video depth super-resolution.arXiv preprint arXiv:2508.01259, 2025. 2

work page arXiv 2025

[45] [45]

Dornet: A degradation oriented and regularized network for blind depth super-resolution

Zhengxue Wang, Zhiqiang Yan, Jinshan Pan, Guangwei Gao, Kai Zhang, and Jian Yang. Dornet: A degradation oriented and regularized network for blind depth super-resolution. In CVPR, pages 15813–15822, 2025. 2, 6, 7

work page 2025

[46] [46]

Tri-perspective view decomposition for ge- ometry aware depth completion and super-resolution.IEEE Transactions on Pattern Analysis and Machine Intelligence,

Zhiqiang Yan, Kun Wang, Xiang Li, Guangwei Gao, Jun Li, and Jian Yang. Tri-perspective view decomposition for ge- ometry aware depth completion and super-resolution.IEEE Transactions on Pattern Analysis and Machine Intelligence,

work page

[47] [47]

Codon: On orchestrating cross-domain attentions for depth super-resolution.International Journal of Computer Vision, 130(2):267–284, 2022

Yuxiang Yang, Qi Cao, Jing Zhang, and Dacheng Tao. Codon: On orchestrating cross-domain attentions for depth super-resolution.International Journal of Computer Vision, 130(2):267–284, 2022. 2

work page 2022

[48] [48]

Depth super-resolution via deep controllable slicing network

Xinchen Ye, Baoli Sun, Zhihui Wang, Jingyu Yang, Rui Xu, Haojie Li, and Baopu Li. Depth super-resolution via deep controllable slicing network. InACMMM, pages 1809–1818,

work page

[49] [49]

Pmbanet: Progressive multi-branch aggregation network for scene depth super-resolution.IEEE Transactions on Image Processing, 29:7427–7442, 2020

Xinchen Ye, Baoli Sun, Zhihui Wang, Jingyu Yang, Rui Xu, Haojie Li, and Baopu Li. Pmbanet: Progressive multi-branch aggregation network for scene depth super-resolution.IEEE Transactions on Image Processing, 29:7427–7442, 2020. 2

work page 2020

[50] [50]

Semantics-driven contrastive learning for real-world depth super resolution

Xinchen Ye, Aokai Zhang, and Rui Xu. Semantics-driven contrastive learning for real-world depth super resolution. In ACMMM, pages 3085–3093, 2025. 1

work page 2025

[51] [51]

Structure flow-guided network for real depth super-resolution

Jiayi Yuan, Haobo Jiang, Xiang Li, Jianjun Qian, Jun Li, and Jian Yang. Structure flow-guided network for real depth super-resolution. InAAAI, pages 3340–3348, 2023. 2

work page 2023

[52] [52]

Joint deep-unfolding optimization learning for depth map arbitrary-scale super-resolution.IEEE Trans- actions on Multimedia, 2025

Jialong Zhang, Lijun Zhao, Jinjing Zhang, Anhong Wang, and Huihui Bai. Joint deep-unfolding optimization learning for depth map arbitrary-scale super-resolution.IEEE Trans- actions on Multimedia, 2025. 1

work page 2025

[53] [53]

Mesa: Matching everything by segmenting anything

Yesheng Zhang and Xu Zhao. Mesa: Matching everything by segmenting anything. InCVPR, pages 20217–20226, 2024. 3

work page 2024

[54] [54]

Discrete cosine transform network for guided depth map super-resolution

Zixiang Zhao, Jiangshe Zhang, Shuang Xu, Zudi Lin, and Hanspeter Pfister. Discrete cosine transform network for guided depth map super-resolution. InCVPR, pages 5697– 5707, 2022. 2, 6, 7

work page 2022

[55] [55]

Spherical space feature decomposition for guided depth map super-resolution

Zixiang Zhao, Jiangshe Zhang, Xiang Gu, Chengli Tan, Shuang Xu, Yulun Zhang, Radu Timofte, and Luc Van Gool. Spherical space feature decomposition for guided depth map super-resolution. InICCV, pages 12547–12558, 2023. 2

work page 2023

[56] [56]

Decou- pling fine detail and global geometry for compressed depth map super-resolution

Huan Zheng, Wencheng Han, and Jianbing Shen. Decou- pling fine detail and global geometry for compressed depth map super-resolution. InCVPR, pages 951–960, 2025. 2

work page 2025

[57] [57]

High-resolution depth maps imaging via attention-based hierarchical multi-modal fusion.IEEE Transactions on Image Processing, 31:648– 663, 2021

Zhiwei Zhong, Xianming Liu, Junjun Jiang, Debin Zhao, Zhiwen Chen, and Xiangyang Ji. High-resolution depth maps imaging via attention-based hierarchical multi-modal fusion.IEEE Transactions on Image Processing, 31:648– 663, 2021. 2

work page 2021

[58] [58]

Memory-augmented deep unfolding net- work for guided image super-resolution.International Jour- nal of Computer Vision, 131(1):215–242, 2023

Man Zhou, Keyu Yan, Jinshan Pan, Wenqi Ren, Qi Xie, and Xiangyong Cao. Memory-augmented deep unfolding net- work for guided image super-resolution.International Jour- nal of Computer Vision, 131(1):215–242, 2023. 1 10

work page 2023