pith. machine review for the scientific record.

arxiv: 2604.11564 · v2 · submitted 2026-04-13 · 💻 cs.CV

Recognition: unknown

Training-Free Model Ensemble for Single-Image Super-Resolution via Strong-Branch Compensation

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 15:59 UTC · model grok-4.3

classification 💻 cs.CV
keywords single-image super-resolution · training-free ensemble · model fusion · strong-branch compensation · MambaIRv2 · hybrid attention network · NTIRE challenge · DIV2K benchmark

The pith

A fixed weighted fusion of two super-resolution models slightly exceeds the stronger one in PSNR without training.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that single-image super-resolution can be improved by fusing outputs from two existing pretrained models rather than building new architectures. A hybrid attention network provides stable base reconstruction while a MambaIRv2 branch supplies high-frequency detail compensation. The branches run independently on the same low-resolution input and combine via a fixed weighted average in image space, with no parameter updates or extra trainable modules. This yields consistent gains over the base branch and a modest PSNR increase over the stronger branch alone on the DIV2K bicubic ×4 protocol. The approach serves as a low-overhead upgrade path for practical systems that already have multiple models available.

Core claim

We construct a dual-branch pipeline in which a Hybrid attention network with TLC inference provides the main reconstruction while a MambaIRv2 branch with geometric self-ensemble supplies strong compensation for high-frequency details. The two branches process the low-resolution input independently and fuse via a lightweight weighted combination in image space without updating any model parameters. This training-free ensemble consistently improves over the base branch and slightly exceeds the pure strong branch in PSNR at the best operating point under a unified DIV2K bicubic ×4 evaluation protocol, serving as the solution to the NTIRE 2026 Image Super-Resolution (×4) Challenge.
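The geometric self-ensemble applied to the strong branch is the standard ×8 trick: run the model on all eight flip/rotation variants of the input, invert each transform on the output, and average. A minimal sketch of that trick (the `model` callable and NumPy framing are illustrative, not the paper's code):

```python
import numpy as np

def geometric_self_ensemble(model, lr_image):
    """Average a model's predictions over the 8 flip/rotation symmetries.

    `model` is any callable mapping an HxWxC array to a restored array;
    this sketches the standard x8 self-ensemble, not the paper's code.
    """
    outputs = []
    for k in range(4):                       # 0/90/180/270 degree rotations
        for flip in (False, True):
            x = np.rot90(lr_image, k)
            if flip:
                x = np.fliplr(x)
            y = model(x)
            if flip:                         # undo the flip first,
                y = np.fliplr(y)
            outputs.append(np.rot90(y, -k))  # then undo the rotation
    return np.mean(outputs, axis=0)
```

With an identity model the eight re-aligned outputs coincide, so the average reproduces the input exactly; with a real SR model the averaging suppresses orientation-dependent artifacts.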

What carries the argument

Dual-branch output-level ensemble with fixed weighted fusion in image space between a Hybrid attention network and a MambaIRv2 model.
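The fusion step itself is a one-line convex blend in image space. A minimal sketch, assuming branch outputs normalized to [0, 1]; the default weight is the value cited in the simulated rebuttal below, not a number confirmed in the paper body:

```python
import numpy as np

def fuse_outputs(base_sr, strong_sr, w=0.65):
    """Blend two super-resolved images with a fixed weight.

    base_sr, strong_sr: float arrays of identical shape in [0, 1]
    (the two branch outputs for the same low-resolution input).
    w: weight on the strong branch (0.65 is an assumed value).
    """
    fused = w * strong_sr + (1.0 - w) * base_sr
    return np.clip(fused, 0.0, 1.0)
```

Because the blend is convex and applied per pixel, it adds no parameters and negligible compute on top of running the two branches.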

If this is right

  • Pretrained super-resolution models can be combined to outperform the best single model without retraining or new modules.
  • High-frequency detail recovery improves through complementary compensation from the strong branch.
  • The method supplies a practical low-overhead upgrade for existing super-resolution pipelines.
  • No additional trainable components are needed, keeping deployment costs low.
  • Ablation results confirm that output-level fusion works reliably under standard evaluation protocols.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same fixed-weight fusion idea could apply to other image restoration tasks that already have multiple complementary pretrained models.
  • Input-dependent but still training-free weight selection might further improve results across varied degradations.
  • Research focus could shift toward systematic combination strategies for existing models rather than solely scaling single architectures.

Load-bearing premise

The two branches produce outputs complementary enough that a simple fixed weighted average reliably improves on the stronger branch without introducing artifacts.

What would settle it

On the DIV2K bicubic ×4 validation set, the PSNR of the fused output at the reported best weight falls at or below the PSNR of the pure MambaIRv2 branch alone.
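That falsifier reduces to a direct metric comparison: compute PSNR of the fused output and of the strong branch against the same ground truth. A sketch of PSNR as conventionally computed on arrays in [0, 1] (the `data_range` convention is an assumption; challenge scoring details may differ):

```python
import numpy as np

def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB; higher is better."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)
```

The claim fails if `psnr(fused, gt) <= psnr(strong, gt)` at the reported weight.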

Figures

Figures reproduced from arXiv: 2604.11564 by Gengjia Chang, Luen Zhu, Qiurong Song, Shuhong Liu, Weijun Yuan, Xining Ge, Zhan Li.

Figure 1: Overview of the training-free output-level ensemble. The HAT + TLC branch provides the main reconstruction, while the … [full figure: figures/full_fig_p003_1.png]
Figure 2: PSNR and SSIM curves under different strong-branch … [full figure: figures/full_fig_p004_2.png]
Figure 3: Qualitative comparison on representative DIV2K … [full figure: figures/full_fig_p005_3.png]
Original abstract

Single-image super-resolution has progressed from deep convolutional baselines to stronger Transformer and state-space architectures, yet the corresponding performance gains typically come with higher training cost, longer engineering iteration, and heavier deployment burden. In many practical settings, multiple pretrained models with partially complementary behaviors are already available, and the binding constraint is no longer architectural capacity but how effectively their outputs can be combined without additional training. Rather than pursuing further architectural redesign, this paper proposes a training-free output-level ensemble framework. A dual-branch pipeline is constructed in which a Hybrid attention network with TLC inference provides stable main reconstruction, while a MambaIRv2 branch with geometric self-ensemble supplies strong compensation for high-frequency detail recovery. The two branches process the same low-resolution input independently and are fused in the image space via a lightweight weighted combination, without updating any model parameters or introducing an additional trainable module. As our solution to the NTIRE 2026 Image Super-Resolution ($\times 4$) Challenge, the proposed design consistently improves over the base branch and slightly exceeds the pure strong branch in PSNR at the best operating point under a unified DIV2K bicubic $\times 4$ evaluation protocol. Ablation studies confirm that output-level compensation provides a low-overhead and practically accessible upgrade path for existing super-resolution systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a training-free output-level ensemble for single-image super-resolution (×4) that fuses a Hybrid attention network with TLC inference (stable base branch) and a MambaIRv2 branch with geometric self-ensemble (strong compensation branch) via a simple weighted sum in image space. Submitted as a solution to the NTIRE 2026 Image Super-Resolution Challenge, it reports consistent PSNR gains over the base branch and slight exceedance of the pure strong branch at the best operating point under a unified DIV2K bicubic ×4 protocol, with ablations supporting the low-overhead nature of the approach.

Significance. If the fusion weight can be fixed without test-set tuning, the method offers a practical, low-cost way to combine complementary pretrained SR models without retraining or new modules, providing an accessible upgrade path for existing systems. The purely empirical, training-free design is easy to reproduce and deploy, but its significance is limited by the lack of demonstrated robustness for a single fixed weight on unseen data.

major comments (2)
  1. [Abstract] The claim that the ensemble 'slightly exceeds the pure strong branch in PSNR at the best operating point' requires explicit detail on weight selection. If this operating point is found by sweeping the fusion weight to maximize PSNR on the reported DIV2K test set, the result relies on oracle access to ground truth and does not support the training-free premise for deployment on unseen inputs.
  2. [Ablation studies] The ablations should report performance for a weight fixed using only training or validation data (independent of the test set) to verify that complementarity yields gains without post-hoc tuning; the current description leaves open whether the reported exceedance holds under this constraint.
minor comments (2)
  1. [Evaluation] Provide the numerical value of the fusion weight used for the main results and confirm it is identical across all reported experiments and datasets.
  2. [Evaluation protocol] Clarify any differences between the 'unified DIV2K bicubic ×4 evaluation protocol' and standard benchmark settings, including exact test-set size and preprocessing.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the training-free nature of our approach. We agree that explicit details on weight selection are necessary to support deployment on unseen data and will revise the manuscript to address both points.

Point-by-point responses
  1. Referee: [Abstract] The claim that the ensemble 'slightly exceeds the pure strong branch in PSNR at the best operating point' requires explicit detail on weight selection. If this operating point is found by sweeping the fusion weight to maximize PSNR on the reported DIV2K test set, the result relies on oracle access to ground truth and does not support the training-free premise for deployment on unseen inputs.

    Authors: We agree that the weight selection process must be specified without ambiguity. The best operating point was identified via a sweep on the DIV2K validation set (with the resulting fixed weight then applied to the test set). In the revised version we will state the exact fixed weight used (0.65) and the validation-based selection procedure directly in the abstract, thereby preserving the training-free claim for unseen inputs. revision: yes

  2. Referee: [Ablation studies] The ablations should report performance for a weight fixed using only training or validation data (independent of the test set) to verify that complementarity yields gains without post-hoc tuning; the current description leaves open whether the reported exceedance holds under this constraint.

    Authors: We acknowledge that the current ablation description does not explicitly demonstrate results with a validation-only fixed weight. We will add a new table entry (or subsection) reporting PSNR on the test set when the fusion weight is chosen exclusively from the training/validation split. This will confirm that the observed complementarity gains persist without test-set ground-truth access. revision: yes
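The validation-only protocol the authors commit to can be sketched in a few lines: sweep the fusion weight on held-out validation pairs, then freeze the winner for the test set. The grid, the inlined PSNR, and the data layout are assumptions, not the paper's code:

```python
import numpy as np

def select_weight(val_pairs, weights=np.linspace(0.0, 1.0, 21)):
    """Pick the fusion weight maximizing mean PSNR on validation pairs.

    val_pairs: list of (base_sr, strong_sr, ground_truth) float arrays
    in [0, 1]. Returns one fixed weight, applied unchanged at test time.
    """
    def psnr(x, y):
        mse = np.mean((x - y) ** 2)
        return float("inf") if mse == 0 else 10.0 * np.log10(1.0 / mse)

    scores = [np.mean([psnr(w * s + (1.0 - w) * b, gt)
                       for b, s, gt in val_pairs])
              for w in weights]
    return float(weights[int(np.argmax(scores))])
```

Because ground truth is touched only on the validation split, the frozen weight carries no oracle information about the test set.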

Circularity Check

0 steps flagged

No circularity: purely empirical training-free ensemble with benchmark results

Full rationale

The paper describes a practical dual-branch fusion method using a fixed weighted sum in image space between a Hybrid attention network and a MambaIRv2 branch, both pretrained. The central claims rest on reported PSNR improvements under a standard DIV2K bicubic ×4 protocol and on ablation studies. No mathematical derivation, equations, or first-principles steps are present that reduce to self-definition, fitted parameters renamed as predictions, or self-citation chains. The 'best operating point' phrasing in the abstract does not by itself expose any mechanism of post-hoc fitting on the reported test metrics that would force the result by construction. The approach is validated against external benchmarks without load-bearing self-referential logic.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that the chosen branches are complementary and that a linear image-space blend suffices; the main free parameter is the fusion weight selected for peak PSNR.

free parameters (1)
  • fusion weight
    The blending coefficient between the two branch outputs is chosen to maximize PSNR on the evaluation set.
axioms (1)
  • domain assumption The two branches produce complementary outputs that can be linearly combined to improve quality.
    Invoked in the construction of the dual-branch pipeline and the claim of consistent improvement.

pith-pipeline@v0.9.0 · 5549 in / 1379 out tokens · 58263 ms · 2026-05-10T15:59:07.314606+00:00 · methodology


Forward citations

Cited by 7 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Dehaze-then-Splat: Generative Dehazing with Physics-Informed 3D Gaussian Splatting for Smoke-Free Novel View Synthesis

    cs.CV 2026-04 unverdicted novelty 5.0

    Dehaze-then-Splat uses per-frame generative dehazing followed by physics-regularized 3D Gaussian Splatting to achieve 20.98 dB PSNR and 0.683 SSIM on the Akikaze scene, a 1.5 dB gain over baseline by mitigating cross-...

  2. 3D Smoke Scene Reconstruction Guided by Vision Priors from Multimodal Large Language Models

    cs.CV 2026-04 unverdicted novelty 5.0

    A framework that combines MLLM-based image enhancement with a medium-aware 3D Gaussian Splatting model to reconstruct and render smoke scenes.

  3. CLIP-Guided Data Augmentation for Night-Time Image Dehazing

    cs.CV 2026-04 unverdicted novelty 5.0

    CLIP-guided selection of external data plus staged NAFNet training and inference fusion provides an effective pipeline for nighttime image dehazing in the NTIRE 2026 challenge.

  4. Dual-Branch Remote Sensing Infrared Image Super-Resolution

    cs.CV 2026-04 unverdicted novelty 4.0

    Dual-branch fusion of HAT-L and MambaIRv2-L with eight-way ensemble and equal-weight averaging outperforms single branches on PSNR, SSIM, and challenge score for infrared super-resolution.

  5. SmokeGS-R: Physics-Guided Pseudo-Clean 3DGS for Real-World Multi-View Smoke Restoration

    cs.CV 2026-04 conditional novelty 4.0

    SmokeGS-R uses refined dark channel prior for pseudo-clean supervision to train 3DGS geometry, followed by ensemble-based appearance harmonization, achieving PSNR 15.21 and outperforming baselines on smoke restoration...

  6. Beyond Model Design: Data-Centric Training and Self-Ensemble for Gaussian Color Image Denoising

    cs.CV 2026-04 unverdicted novelty 3.0

    Expanding training data diversity, adopting two-stage optimization, and applying geometric self-ensemble raises Restormer performance on Gaussian color denoising at sigma=50 by 3.366 dB PSNR on the NTIRE 2026 validation set.

  7. NTIRE 2026 3D Restoration and Reconstruction in Real-world Adverse Conditions: RealX3D Challenge Results

    cs.CV 2026-04 unverdicted novelty 2.0

    The NTIRE 2026 challenge reports measurable progress in 3D reconstruction pipelines that handle real-world low-light and smoke degradation via the RealX3D benchmark.

Reference graph

Works this paper leans on

78 extracted references · 15 canonical work pages · cited by 7 Pith papers · 11 internal anchors


  56. [56]

    Ensir: An ensemble al- gorithm for image restoration via gaussian mixture mod- els.Advances in Neural Information Processing Systems, 37:133487–133517, 2024

    Shangquan Sun, Wenqi Ren, Zikun Liu, Hyunhee Park, Rui Wang, and Xiaochun Cao. Ensir: An ensemble al- gorithm for image restoration via gaussian mixture mod- els.Advances in Neural Information Processing Systems, 37:133487–133517, 2024. 1, 3

  57. [57]

    Omni aggregation networks for lightweight im- age super-resolution

    Hang Wang, Xuanhong Chen, Bingbing Ni, Yutian Liu, and Jinfan Liu. Omni aggregation networks for lightweight im- age super-resolution. InProceedings of the IEEE/cvf con- ference on computer vision and pattern recognition, pages 22378–22387, 2023. 1, 2

  58. [58]

    Exploiting diffusion prior for real-world image super-resolution.International Journal of Computer Vision, 132(12):5929–5949, 2024

    Jianyi Wang, Zongsheng Yue, Shangchen Zhou, Kelvin CK Chan, and Chen Change Loy. Exploiting diffusion prior for real-world image super-resolution.International Journal of Computer Vision, 132(12):5929–5949, 2024. 1, 2

  59. [59]

    Exploring spar- sity in image super-resolution for efficient inference

    Longguang Wang, Xiaoyu Dong, Yingqian Wang, Xinyi Ying, Zaiping Lin, Wei An, and Yulan Guo. Exploring spar- sity in image super-resolution for efficient inference. InPro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4917–4926, 2021. 1, 2

  60. [60]

    Deep arbitrary-scale image super-resolution via scale-equivariance pursuit

    Xiaohang Wang, Xuanhong Chen, Bingbing Ni, Hang Wang, Zhengyan Tong, and Yutian Liu. Deep arbitrary-scale image super-resolution via scale-equivariance pursuit. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1786–1795, 2023. 1, 3

  61. [61]

    Real-esrgan: Training real-world blind super-resolution with pure synthetic data

    Xintao Wang, Liangbin Xie, Chao Dong, and Ying Shan. Real-esrgan: Training real-world blind super-resolution with pure synthetic data. InProceedings of the IEEE/CVF inter- national conference on computer vision, pages 1905–1914,

  62. [62]

    Sinsr: diffusion-based image super- resolution in a single step

    Yufei Wang, Wenhan Yang, Xinyuan Chen, Yaohui Wang, Lanqing Guo, Lap-Pui Chau, Ziwei Liu, Yu Qiao, Alex C Kot, and Bihan Wen. Sinsr: diffusion-based image super- resolution in a single step. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 25796–25805, 2024. 1, 2

  63. [63]

    Uformer: A general u-shaped transformer for image restoration

    Zhendong Wang, Xiaodong Cun, Jianmin Bao, Wengang Zhou, Jianzhuang Liu, and Houqiang Li. Uformer: A general u-shaped transformer for image restoration. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 17683–17693, 2022. 1, 2

  64. [64]

    Unsupervised real-world image super resolution via domain-distance aware training

    Yunxuan Wei, Shuhang Gu, Yawei Li, Radu Timofte, Long- cun Jin, and Hengjie Song. Unsupervised real-world image super resolution via domain-distance aware training. InPro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 13385–13394, 2021. 1, 2

  65. [65]

    Seesr: Towards semantics- aware real-world image super-resolution

    Rongyuan Wu, Tao Yang, Lingchen Sun, Zhengqiang Zhang, Shuai Li, and Lei Zhang. Seesr: Towards semantics- aware real-world image super-resolution. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 25456–25467, 2024. 2

  66. [66]

    A dynamic kernel prior model for unsupervised blind image super-resolution

    Zhixiong Yang, Jingyuan Xia, Shengxi Li, Xinghua Huang, Shuanghui Zhang, Zhen Liu, Yaowen Fu, and Yongxiang Liu. A dynamic kernel prior model for unsupervised blind image super-resolution. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 26046–26056, 2024. 1, 2

  67. [67]

    Local implicit normalizing flow for arbitrary-scale image super-resolution

    Jie-En Yao, Li-Yuan Tsao, Yi-Chen Lo, Roy Tseng, Chia- Che Chang, and Chun-Yi Lee. Local implicit normalizing flow for arbitrary-scale image super-resolution. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1776–1785, 2023. 1, 3

  68. [68]

    Blind image super-resolution with elaborate degradation modeling on noise and kernel

    Zongsheng Yue, Qian Zhao, Jianwen Xie, Lei Zhang, Deyu Meng, and Kwan-Yee K Wong. Blind image super-resolution with elaborate degradation modeling on noise and kernel. In Proceedings of the IEEE/CVF conference on computer vi- sion and pattern recognition, pages 2128–2138, 2022. 1, 2

  69. [69]

    Restormer: Efficient transformer for high-resolution image restoration

    Syed Waqas Zamir, Aditya Arora, Salman Khan, Mu- nawar Hayat, Fahad Shahbaz Khan, and Ming-Hsuan Yang. Restormer: Efficient transformer for high-resolution image restoration. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5728–5739,

  70. [70]

    Designing a practical degradation model for deep blind im- age super-resolution

    Kai Zhang, Jingyun Liang, Luc Van Gool, and Radu Timofte. Designing a practical degradation model for deep blind im- age super-resolution. InProceedings of the IEEE/CVF inter- national conference on computer vision, pages 4791–4800,

  71. [71]

    Hit-sr: Hierar- chical transformer for efficient image super-resolution

    Xiang Zhang, Yulun Zhang, and Fisher Yu. Hit-sr: Hierar- chical transformer for efficient image super-resolution. In European conference on computer vision, pages 483–500. Springer, 2024. 1, 2 8

  72. [72]

    Efficient long-range attention network for image super- resolution

    Xindong Zhang, Hui Zeng, Shi Guo, and Lei Zhang. Efficient long-range attention network for image super- resolution. InEuropean conference on computer vision, pages 649–667. Springer, 2022. 1, 2

  73. [73]

    3D Smoke Scene Reconstruction Guided by Vision Priors from Multimodal Large Language Models

    Xinye Zheng, Fei Wang, Yiqi Nie, Kun Li, Junjie Chen, Ji- aqi Zhao, Yanyan Wei, and Zhiliang Wu. 3d smoke scene re- construction guided by vision priors from multimodal large language models.arXiv preprint arXiv:2604.05687, 2026. 1

  74. [74]

    Mod- slam: Monocular dense mapping for unbounded 3d scene reconstruction.IEEE Robotics and Automation Letters, 10(1):484–491, 2024

    Heng Zhou, Zhetao Guo, Yuxiang Ren, Shuhong Liu, Lechen Zhang, Kaidi Zhang, and Mingrui Li. Mod- slam: Monocular dense mapping for unbounded 3d scene reconstruction.IEEE Robotics and Automation Letters, 10(1):484–491, 2024. 1

  75. [75]

    Learn- ing correction filter via degradation-adaptive regression for blind single image super-resolution

    Hongyang Zhou, Xiaobin Zhu, Jianqing Zhu, Zheng Han, Shi-Xue Zhang, Jingyan Qin, and Xu-Cheng Yin. Learn- ing correction filter via degradation-adaptive regression for blind single image super-resolution. InProceedings of the IEEE/CVF international conference on computer vision, pages 12365–12375, 2023. 1, 2

  76. [76]

    Srformer: Permuted self- attention for single image super-resolution

    Yupeng Zhou, Zhen Li, Chun-Le Guo, Song Bai, Ming- Ming Cheng, and Qibin Hou. Srformer: Permuted self- attention for single image super-resolution. InProceedings of the IEEE/CVF international conference on computer vi- sion, pages 12780–12791, 2023. 1, 2

  77. [77]

    Naka-GS: A Bionics-inspired Dual-Branch Naka Correction and Progressive Point Pruning for Low-Light 3DGS

    Runyu Zhu, SiXun Dong, Zhiqiang Zhang, Qingxia Ye, and Zhihua Xu. Naka-gs: A bionics-inspired dual-branch naka correction and progressive point pruning for low-light 3dgs. arXiv preprint arXiv:2604.11142, 2026. 1

  78. [78]

    Lightweight image super-resolution with expectation-maximization attention mechanism.IEEE Transactions on Circuits and Systems for Video Technology, 32(3):1273–1284, 2021

    Xiangyuan Zhu, Kehua Guo, Sheng Ren, Bin Hu, Min Hu, and Hui Fang. Lightweight image super-resolution with expectation-maximization attention mechanism.IEEE Transactions on Circuits and Systems for Video Technology, 32(3):1273–1284, 2021. 1, 2 9