pith. sign in

arxiv: 2601.20306 · v2 · pith:UAB2QODXnew · submitted 2026-01-28 · 💻 cs.CV

TPGDiff: Hierarchical Triple-Prior Guided Diffusion for Image Restoration

Pith reviewed 2026-05-21 13:55 UTC · model grok-4.3

classification 💻 cs.CV
keywords image restorationdiffusion modelshierarchical priorsall-in-one restorationdegradation priorsstructural priorssemantic priors
0
0 comments X

The pith

A diffusion model restores diverse degraded images by injecting degradation priors at every step, structural cues into shallow layers, and semantic guidance into deep layers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces TPGDiff as a unified diffusion network for all-in-one image restoration. It routes degradation priors through the full trajectory to control the process adaptively, feeds multi-source structural priors to shallow layers to preserve fine details, and supplies distilled semantic priors only to deep layers to generate content without blurring spatial structures. This hierarchical split addresses the problem that early semantic injection often creates artifacts while pure degradation guidance fails on severe damage. A sympathetic reader would care because the approach suggests one model could replace multiple specialized restorers for noise, blur, low light, and other degradations.

Core claim

TPGDiff incorporates degradation priors throughout the diffusion trajectory, structural priors into shallow layers via multi-source cues, and semantic priors into deep layers through a distillation-driven extractor, producing hierarchical and complementary guidance that yields superior performance and generalization on both single- and multi-degradation benchmarks.

What carries the argument

The triple-prior guidance mechanism consisting of a degradation extractor for stage-adaptive diffusion control, multi-source structural cues for shallow-layer detail preservation, and a distillation-driven semantic extractor for reliable deep-layer content guidance.

If this is right

  • A single model handles both single-degradation and combined multi-degradation cases without retraining.
  • Content recovery improves in regions where degradation is severe compared with degradation-prior-only baselines.
  • Fine-grained details are captured by shallow structural cues without introducing extra blur.
  • High-level semantics remain usable under heavy corruption when confined to deep layers.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same depth-specific prior routing could be tested in non-diffusion architectures such as transformers or CNNs for restoration.
  • Real-world images with unseen degradation combinations would provide a direct check on whether the reported generalization holds outside benchmark sets.
  • Pairing the hierarchical prior scheme with faster diffusion samplers might reduce inference cost while retaining the quality gains.

Load-bearing premise

The distillation-driven semantic extractor continues to supply accurate high-level guidance even when inputs are severely degraded and that restricting semantic priors to deep layers prevents disruption of low-level spatial structures.

What would settle it

A controlled ablation on a heavy-degradation test set that removes either the deep-layer semantic priors or the shallow-layer structural priors and shows no drop or an increase in perceptual quality metrics.

Figures

Figures reproduced from arXiv: 2601.20306 by Axi Niu, Jiacong Tang, Qingsen Yan, Yanjie Tu.

Figure 1
Figure 1. Figure 1: (a) Existing methods inject prior information uniformly [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Overall architecture of TPGDiff. The framework explicitly models three types of priors from a low-quality input image and [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Design of the proposed structural extractor with multi [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Visual ablation study on prior components. Each prior [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Visual comparison results and other all-in-one image restoration methods on image denoising, low-light enhancement, image [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Per-pixel error distribution maps under different prior injection hierarchies. The injection of semantic and structural priors at [PITH_FULL_IMAGE:figures/full_fig_p018_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Visual results of applying our TPGDiff to various degradation recovery tasks, demonstrating a unified model capable of addressing [PITH_FULL_IMAGE:figures/full_fig_p019_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Visualization results for the image deraining task under different methods on the Rain100L dataset. Zoom in for best view. [PITH_FULL_IMAGE:figures/full_fig_p020_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Visualization results for the image denoising task under different methods on the CBSD68 dataset. Zoom in for best view. [PITH_FULL_IMAGE:figures/full_fig_p020_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Visualization results for the low-light image enhancement task under different methods on the LOL-v1 dataset. Zoom in for [PITH_FULL_IMAGE:figures/full_fig_p021_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Visualization results for the image deblurring task under different methods on the GoPro dataset. Zoom in for best view. [PITH_FULL_IMAGE:figures/full_fig_p021_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Visualization results for the image dehazing task under different methods on the SOTS dataset. Zoom in for best view. [PITH_FULL_IMAGE:figures/full_fig_p022_12.png] view at source ↗
read the original abstract

All-in-one image restoration aims to address diverse degradation types using a single unified model. Existing methods typically rely on degradation priors to guide restoration, yet often struggle to reconstruct content in severely degraded regions. Although recent works leverage semantic information to facilitate content generation, integrating it into the shallow layers of diffusion models often disrupts spatial structures (\emph{e.g.}, blurring artifacts). To address this issue, we propose a Triple-Prior Guided Diffusion (TPGDiff) network for unified image restoration. TPGDiff incorporates degradation priors throughout the diffusion trajectory, while introducing structural priors into shallow layers and semantic priors into deep layers, enabling hierarchical and complementary prior guidance for image reconstruction. Specifically, we leverage multi-source structural cues as structural priors to capture fine-grained details and guide shallow layers representations. To complement this design, we further develop a distillation-driven semantic extractor that yields robust semantic priors, ensuring reliable high-level guidance at deep layers even under severe degradations. Furthermore, a degradation extractor is employed to learn degradation-aware priors, enabling stage-adaptive control of the diffusion process across all timesteps. Extensive experiments on both single- and multi-degradation benchmarks demonstrate that TPGDiff achieves superior performance and generalization across diverse restoration scenarios. Our project page is: https://leoyjtu.github.io/tpgdiff-project.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces TPGDiff, a diffusion-based network for all-in-one image restoration. It incorporates degradation priors throughout the diffusion trajectory, structural priors (from multi-source cues) into shallow layers, and semantic priors (from a distillation-driven extractor) into deep layers to provide hierarchical guidance. The central claim is that this design improves content reconstruction under severe degradations without disrupting spatial structures, leading to superior performance and generalization on single- and multi-degradation benchmarks.

Significance. If the hierarchical prior placement and the robustness of the distilled semantic features hold, the work could advance unified restoration models by showing how complementary priors can be injected at different depths of a diffusion process. The distillation-driven semantic extractor and stage-adaptive degradation control represent concrete technical contributions that, if validated through targeted ablations, would strengthen the case for prior-guided diffusion in restoration tasks.

major comments (2)
  1. [§3.2] §3.2 (distillation-driven semantic extractor): The claim that this extractor 'yields robust semantic priors, ensuring reliable high-level guidance at deep layers even under severe degradations' is load-bearing for the central claim, yet the manuscript provides no ablation or analysis comparing student features to the teacher on heavily degraded inputs (e.g., severe blur or high noise). Without such evidence, it remains possible that semantic content collapses, undermining the deep-layer guidance and the asserted advantage over prior diffusion restorers.
  2. [§4] §4 (experiments): The reported superior performance on multi-degradation benchmarks lacks error bars, statistical significance tests, or detailed per-degradation breakdowns that would substantiate the generalization claim. This weakens the ability to evaluate whether the triple-prior hierarchy delivers consistent gains across the diverse scenarios highlighted in the abstract.
minor comments (2)
  1. [Abstract] The abstract refers to 'multi-source structural cues' without naming the sources; this detail should appear in the introduction or method overview for immediate clarity.
  2. [§3] Notation for the three prior extractors is introduced without a consolidated table; adding one would help readers track the distinct roles of degradation, structural, and semantic components.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below, providing clarifications and committing to revisions that strengthen the paper without altering its core contributions.

read point-by-point responses
  1. Referee: [§3.2] §3.2 (distillation-driven semantic extractor): The claim that this extractor 'yields robust semantic priors, ensuring reliable high-level guidance at deep layers even under severe degradations' is load-bearing for the central claim, yet the manuscript provides no ablation or analysis comparing student features to the teacher on heavily degraded inputs (e.g., severe blur or high noise). Without such evidence, it remains possible that semantic content collapses, undermining the deep-layer guidance and the asserted advantage over prior diffusion restorers.

    Authors: We agree that a targeted comparison of student and teacher features on heavily degraded inputs would provide stronger direct evidence for the robustness claim. While the end-to-end restoration results on severe degradation benchmarks indirectly support the utility of the distilled priors, we acknowledge the value of explicit validation. In the revised manuscript, we will add a new analysis subsection under §3.2 that includes quantitative comparisons (e.g., feature similarity metrics such as cosine similarity and perceptual distance) and qualitative visualizations of semantic features extracted from the student versus the teacher on inputs with severe blur and high noise. This addition will directly address the concern and further substantiate the hierarchical guidance design. revision: yes

  2. Referee: [§4] §4 (experiments): The reported superior performance on multi-degradation benchmarks lacks error bars, statistical significance tests, or detailed per-degradation breakdowns that would substantiate the generalization claim. This weakens the ability to evaluate whether the triple-prior hierarchy delivers consistent gains across the diverse scenarios highlighted in the abstract.

    Authors: We appreciate this observation on improving the statistical presentation of results. The current experiments report mean performance across benchmarks, but we concur that error bars, significance testing, and granular breakdowns would better support the generalization claims. In the revised manuscript, we will update the experimental section (§4) and associated tables/figures to include standard deviation error bars computed over multiple random seeds, paired statistical significance tests (e.g., t-tests) against baselines, and detailed per-degradation performance tables for the multi-degradation benchmarks. These additions will allow readers to more rigorously assess consistency across degradation types. revision: yes

Circularity Check

0 steps flagged

No significant circularity; claims rest on independent architectural choices and benchmark validation.

full rationale

The paper introduces TPGDiff as a novel hierarchical integration of degradation priors across the diffusion trajectory, structural priors in shallow layers, and semantic priors (via a distillation-driven extractor) in deep layers. This design is presented as an engineering solution to known limitations of prior diffusion restorers, with performance demonstrated through experiments on single- and multi-degradation benchmarks. No equations or steps in the abstract reduce the claimed gains to fitted parameters defined by the same model, nor do they rely on self-citations that bear the load of uniqueness or correctness. The derivation chain remains self-contained against external benchmarks and standard diffusion formulations.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review is abstract-only; no explicit free parameters, axioms, or invented entities can be identified from the given text.

pith-pipeline@v0.9.0 · 5767 in / 1111 out tokens · 47055 ms · 2026-05-21T13:55:21.217161+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

126 extracted references · 126 canonical work pages · 6 internal anchors

  1. [1]

    Defocus de- blurring using dual-pixel data

    Abdullah Abuolaim and Michael S Brown. Defocus de- blurring using dual-pixel data. InEuropean conference on computer vision, pages 111–126. Springer, 2020. 6, 16

  2. [2]

    Multimodal prompt perceiver: Empower adap- tiveness generalizability and fidelity for all-in-one image restoration

    Yuang Ai, Huaibo Huang, Xiaoqiang Zhou, Jiexiang Wang, and Ran He. Multimodal prompt perceiver: Empower adap- tiveness generalizability and fidelity for all-in-one image restoration. InProceedings of the IEEE/CVF Conference 8 on Computer Vision and Pattern Recognition, pages 25432– 25444, 2024. 1, 2, 7

  3. [3]

    Nh-haze: An image dehazing benchmark with non- homogeneous hazy and haze-free images

    Codruta O Ancuti, Cosmin Ancuti, and Radu Timo- fte. Nh-haze: An image dehazing benchmark with non- homogeneous hazy and haze-free images. InProceedings of the IEEE/CVF conference on computer vision and pat- tern recognition workshops, pages 444–445, 2020. 6, 16

  4. [4]

    Gans trained by a two time-scale update rule converge to a local nash equilibrium.Asian Journal of Applied Science and Engineering, 8(1):25–34,

    Naresh Babu Bynagari. Gans trained by a two time-scale update rule converge to a local nash equilibrium.Asian Journal of Applied Science and Engineering, 8(1):25–34,

  5. [5]

    Retinexformer: One-stage retinex-based transformer for low-light image enhance- ment

    Yuanhao Cai, Hao Bian, Jing Lin, Haoqian Wang, Radu Timofte, and Yulun Zhang. Retinexformer: One-stage retinex-based transformer for low-light image enhance- ment. InProceedings of the IEEE/CVF international con- ference on computer vision, pages 12504–12513, 2023. 7

  6. [6]

    Grids: Grouped multiple-degradation restoration with image degradation similarity

    Shuo Cao, Yihao Liu, Wenlong Zhang, Yu Qiao, and Chao Dong. Grids: Grouped multiple-degradation restoration with image degradation similarity. InEuropean Conference on Computer Vision, pages 70–87. Springer, 2024. 2

  7. [7]

    Gated context aggregation network for image dehazing and deraining

    Dongdong Chen, Mingming He, Qingnan Fan, Jing Liao, Liheng Zhang, Dongdong Hou, Lu Yuan, and Gang Hua. Gated context aggregation network for image dehazing and deraining. In2019 IEEE winter conference on applications of computer vision (WACV), pages 1375–1383. IEEE, 2019. 7

  8. [8]

    Unirestore: Unified perceptual and task-oriented image restoration model us- ing diffusion prior

    I Chen, Wei-Ting Chen, Yu-Wei Liu, Yuan-Chun Chiang, Sy-Yen Kuo, Ming-Hsuan Yang, et al. Unirestore: Unified perceptual and task-oriented image restoration model us- ing diffusion prior. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 17969–17979,

  9. [9]

    Hinet: Half instance normalization network for image restoration

    Liangyu Chen, Xin Lu, Jie Zhang, Xiaojie Chu, and Cheng- peng Chen. Hinet: Half instance normalization network for image restoration. InProceedings of the IEEE/CVF con- ference on computer vision and pattern recognition, pages 182–192, 2021. 18

  10. [10]

    Simple baselines for image restoration

    Liangyu Chen, Xiaojie Chu, Xiangyu Zhang, and Jian Sun. Simple baselines for image restoration. InEuropean con- ference on computer vision, pages 17–33. Springer, 2022. 1, 7, 18

  11. [11]

    Sparse sampling transformer with uncertainty- driven ranking for unified removal of raindrops and rain streaks

    Sixiang Chen, Tian Ye, Jinbin Bai, Erkang Chen, Jun Shi, and Lei Zhu. Sparse sampling transformer with uncertainty- driven ranking for unified removal of raindrops and rain streaks. InProceedings of the IEEE/CVF international con- ference on computer vision, pages 13106–13117, 2023. 1

  12. [12]

    Jstasr: Joint size and transparency- aware snow removal algorithm based on modified partial convolution and veiling effect removal

    Wei-Ting Chen, Hao-Yu Fang, Jian-Jiun Ding, Cheng-Che Tsai, and Sy-Yen Kuo. Jstasr: Joint size and transparency- aware snow removal algorithm based on modified partial convolution and veiling effect removal. InEuropean con- ference on computer vision, pages 754–770. Springer, 2020. 1

  13. [13]

    All snow removed: Single image desnowing algorithm using hierarchical dual-tree complex wavelet representation and contradict channel loss

    Wei-Ting Chen, Hao-Yu Fang, Cheng-Lin Hsieh, Cheng- Che Tsai, I Chen, Jian-Jiun Ding, Sy-Yen Kuo, et al. All snow removed: Single image desnowing algorithm using hierarchical dual-tree complex wavelet representation and contradict channel loss. InProceedings of the IEEE/CVF international conference on computer vision, pages 4196– 4205, 2021. 1

  14. [14]

    Learning a sparse transformer network for effective image deraining

    Xiang Chen, Hao Li, Mingqiang Li, and Jinshan Pan. Learning a sparse transformer network for effective image deraining. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5896–5905,

  15. [15]

    A comparative study of image restoration networks for general backbone network design

    Xiangyu Chen, Zheyuan Li, Yuandong Pu, Yihao Liu, Jiantao Zhou, Yu Qiao, and Chao Dong. A comparative study of image restoration networks for general backbone network design. InEuropean Conference on Computer Vi- sion, pages 74–91. Springer, 2024. 7

  16. [16]

    Foundir-v2: Optimizing pre-training data mixtures for image restoration foundation model.arXiv preprint arXiv:2512.09282, 2025

    Xiang Chen, Jinshan Pan, Jiangxin Dong, Jian Yang, and Jinhui Tang. Foundir-v2: Optimizing pre-training data mixtures for image restoration foundation model.arXiv preprint arXiv:2512.09282, 2025. 7

  17. [17]

    In- structir: High-quality image restoration following human instructions

    Marcos V Conde, Gregor Geigle, and Radu Timofte. In- structir: High-quality image restoration following human instructions. InEuropean Conference on Computer Vision, pages 1–21. Springer, 2024. 7

  18. [18]

    Exploring the potential of pooling techniques for universal image restora- tion.IEEE Transactions on Image Processing, 2025

    Yuning Cui, Wenqi Ren, and Alois Knoll. Exploring the potential of pooling techniques for universal image restora- tion.IEEE Transactions on Image Processing, 2025. 7

  19. [19]

    Adair: Adaptive all-in-one image restoration via frequency mining and mod- ulation

    Yuning Cui, Syed Waqas Zamir, Salman Khan, Alois Knoll, Mubarak Shah, and Fahad Shahbaz Khan. Adair: Adaptive all-in-one image restoration via frequency mining and mod- ulation. In13th international conference on learning repre- sentations, ICLR 2025, pages 57335–57356. International Conference on Learning Representations, ICLR, 2025. 7

  20. [20]

    Deepsn-net: Deep semi-smooth newton driven network for blind image restoration.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

    Xin Deng, Chenxiao Zhang, Lai Jiang, Jingyuan Xia, and Mai Xu. Deepsn-net: Deep semi-smooth newton driven network for blind image restoration.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025. 18

  21. [21]

    Restoration by generation with constrained priors

    Zheng Ding, Xuaner Zhang, Zhuowen Tu, and Zhihao Xia. Restoration by generation with constrained priors. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2567–2577, 2024. 2

  22. [22]

    A general decoupled learning framework for parameterized image operators.IEEE trans- actions on pattern analysis and machine intelligence, 43 (1):33–47, 2019

    Qingnan Fan, Dongdong Chen, Lu Yuan, Gang Hua, Neng- hai Yu, and Baoquan Chen. A general decoupled learning framework for parameterized image operators.IEEE trans- actions on pattern analysis and machine intelligence, 43 (1):33–47, 2019. 18

  23. [23]

    Iterative predictor-critic code decoding for real-world image dehaz- ing

    Jiayi Fu, Siyu Liu, Zikun Liu, Chun-Le Guo, Hyunhee Park, Ruiqi Wu, Guoqing Wang, and Chongyi Li. Iterative predictor-critic code decoding for real-world image dehaz- ing. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 12700–12709, 2025. 1

  24. [24]

    Weatherbench: A real-world benchmark dataset for all-in-one adverse weather image restoration

    Qiyuan Guan, Qianfeng Yang, Xiang Chen, Tianyu Song, Guiyue Jin, and Jiyu Jin. Weatherbench: A real-world benchmark dataset for all-in-one adverse weather image restoration. InProceedings of the 33rd ACM international conference on multimedia, pages 12607–12613, 2025. 6, 16

  25. [25]

    Image dehazing transformer with transmission-aware 3d position embedding

    Chun-Le Guo, Qixin Yan, Saeed Anwar, Runmin Cong, Wenqi Ren, and Chongyi Li. Image dehazing transformer with transmission-aware 3d position embedding. InPro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5812–5820, 2022. 1 9

  26. [26]

    Onerestore: A universal restoration framework for composite degradation

    Yu Guo, Yuan Gao, Yuxu Lu, Huilin Zhu, Ryan Wen Liu, and Shengfeng He. Onerestore: A universal restoration framework for composite degradation. InEuropean confer- ence on computer vision, pages 255–272. Springer, 2024. 2

  27. [27]

    Denoising dif- fusion probabilistic models.Advances in neural informa- tion processing systems, 33:6840–6851, 2020

    Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising dif- fusion probabilistic models.Advances in neural informa- tion processing systems, 33:6840–6851, 2020. 2

  28. [28]

    Uni- versal image restoration pre-training via degradation classi- fication.arXiv preprint arXiv:2501.15510, 2025

    JiaKui Hu, Lujia Jin, Zhengjian Yao, and Yanye Lu. Uni- versal image restoration pre-training via degradation classi- fication.arXiv preprint arXiv:2501.15510, 2025. 1, 7

  29. [29]

    Universal image restoration pre-training via masked degradation classification.arXiv preprint arXiv:2510.13282, 2025

    JiaKui Hu, Zhengjian Yao, Lujia Jin, Yinghao Chen, and Yanye Lu. Universal image restoration pre-training via masked degradation classification.arXiv preprint arXiv:2510.13282, 2025. 2, 7

  30. [30]

    Perceiver: General perception with iterative attention

    Andrew Jaegle, Felix Gimeno, Andy Brock, Oriol Vinyals, Andrew Zisserman, and Joao Carreira. Perceiver: General perception with iterative attention. InInternational confer- ence on machine learning, pages 4651–4664. PMLR, 2021. 4

  31. [31]

    A survey on all-in-one image restoration: Tax- onomy, evaluation and future trends.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

    Junjun Jiang, Zengyuan Zuo, Gang Wu, Kui Jiang, and Xi- anming Liu. A survey on all-in-one image restoration: Tax- onomy, evaluation and future trends.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025. 5

  32. [32]

    Autodir: Automatic all-in-one image restoration with latent diffusion

    Yitong Jiang, Zhaoyang Zhang, Tianfan Xue, and Jinwei Gu. Autodir: Automatic all-in-one image restoration with latent diffusion. InEuropean Conference on Computer Vi- sion, pages 340–359. Springer, 2024. 7

  33. [33]

    Dnf: Decouple and feedback network for seeing in the dark

    Xin Jin, Ling-Hao Han, Zhen Li, Chun-Le Guo, Zhi Chai, and Chongyi Li. Dnf: Decouple and feedback network for seeing in the dark. InProceedings of the IEEE/CVF con- ference on computer vision and pattern recognition, pages 18135–18144, 2023. 1

  34. [34]

    Musiq: Multi-scale image quality transformer

    Junjie Ke, Qifei Wang, Yilin Wang, Peyman Milanfar, and Feng Yang. Musiq: Multi-scale image quality transformer. InProceedings of the IEEE/CVF international conference on computer vision, pages 5148–5157, 2021. 6

  35. [35]

    Dual prompting image restoration with diffusion transformers

    Dehong Kong, Fan Li, Zhixin Wang, Jiaqi Xu, Renjing Pei, Wenbo Li, and WenQi Ren. Dual prompting image restoration with diffusion transformers. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 12809–12819, 2025. 2

  36. [36]

    Towards ef- fective multiple-in-one image restoration: A sequential and prompt learning strategy

    Xiangtao Kong, Chao Dong, and Lei Zhang. Towards ef- fective multiple-in-one image restoration: A sequential and prompt learning strategy.arXiv preprint arXiv:2401.03379,

  37. [37]

    Benchmark- ing single-image dehazing and beyond.IEEE transactions on image processing, 28(1):492–505, 2018

    Boyi Li, Wenqi Ren, Dengpan Fu, Dacheng Tao, Dan Feng, Wenjun Zeng, and Zhangyang Wang. Benchmark- ing single-image dehazing and beyond.IEEE transactions on image processing, 28(1):492–505, 2018. 6, 16

  38. [38]

    All-in-one image restoration for unknown corruption

    Boyun Li, Xiao Liu, Peng Hu, Zhongqin Wu, Jiancheng Lv, and Xi Peng. All-in-one image restoration for unknown corruption. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 17452– 17462, 2022. 1, 2, 7, 18

  39. [39]

    Promptcir: Blind compressed image restoration with prompt learning

    Bingchen Li, Xin Li, Yiting Lu, Ruoyu Feng, Mengxi Guo, Shijie Zhao, Li Zhang, and Zhibo Chen. Promptcir: Blind compressed image restoration with prompt learning. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6442–6452, 2024. 1

  40. [40]

    Mair: A locality-and continuity- preserving mamba for image restoration

    Boyun Li, Haiyu Zhao, Wenxin Wang, Peng Hu, Yuan- biao Gou, and Xi Peng. Mair: A locality-and continuity- preserving mamba for image restoration. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 7491–7501, 2025. 18

  41. [41]

    Embedding fourier for ultra-high-definition low-light im- age enhancement.arXiv preprint arXiv:2302.11831, 2023

    Chongyi Li, Chun-Le Guo, Man Zhou, Zhexin Liang, Shangchen Zhou, Ruicheng Feng, and Chen Change Loy. Embedding fourier for ultra-high-definition low-light im- age enhancement.arXiv preprint arXiv:2302.11831, 2023. 1

  42. [42]

    Fouriermamba: Fourier learning integration with state space models for image deraining.arXiv preprint arXiv:2405.19450, 2024

    Dong Li, Yidi Liu, Xueyang Fu, Senyan Xu, and Zheng- Jun Zha. Fouriermamba: Fourier learning integration with state space models for image deraining.arXiv preprint arXiv:2405.19450, 2024. 1

  43. [43]

    Foundir: Unleashing million-scale training data to advance foundation models for image restoration

    Hao Li, Xiang Chen, Jiangxin Dong, Jinhui Tang, and Jin- shan Pan. Foundir: Unleashing million-scale training data to advance foundation models for image restoration. In Proceedings of the IEEE/CVF international conference on computer vision, pages 12626–12636, 2025. 2, 7

  44. [44]

    Megasr: Mining customized semantics and expressive guidance for image super-resolution.arXiv preprint arXiv:2503.08096,

    Xinrui Li, Jianlong Wu, Xinchuan Huang, Chong Chen, Weili Guan, Xian-Sheng Hua, and Liqiang Nie. Megasr: Mining customized semantics and expressive guidance for image super-resolution.arXiv preprint arXiv:2503.08096,

  45. [45]

    Swinir: Image restoration using swin transformer

    Jingyun Liang, Jiezhang Cao, Guolei Sun, Kai Zhang, Luc Van Gool, and Radu Timofte. Swinir: Image restoration using swin transformer. InProceedings of the IEEE/CVF international conference on computer vision, pages 1833– 1844, 2021. 7, 18

  46. [46]

    Unsupervised image denoising in real-world scenarios via self-collaboration parallel generative adversarial branches

    Xin Lin, Chao Ren, Xiao Liu, Jie Huang, and Yinjie Lei. Unsupervised image denoising in real-world scenarios via self-collaboration parallel generative adversarial branches. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 12642–12652, 2023. 1

  47. [47]

    Residual denoising diffu- sion models

    Jiawei Liu, Qiang Wang, Huijie Fan, Yinong Wang, Yan- dong Tang, and Liangqiong Qu. Residual denoising diffu- sion models. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2773– 2783, 2024. 7, 18

  48. [48]

    Tape: Task-agnostic prior embedding for image restora- tion

    Lin Liu, Lingxi Xie, Xiaopeng Zhang, Shanxin Yuan, Xi- angyu Chen, Wengang Zhou, Houqiang Li, and Qi Tian. Tape: Task-agnostic prior embedding for image restora- tion. InEuropean conference on computer vision, pages 447–464. Springer, 2022. 18

  49. [49]

    Griddehazenet: Attention-based multi-scale network for image dehazing

    Xiaohong Liu, Yongrui Ma, Zhihao Shi, and Jun Chen. Griddehazenet: Attention-based multi-scale network for image dehazing. InProceedings of the IEEE/CVF inter- national conference on computer vision, pages 7314–7323,

  50. [50]

    Diff-plugin: Revitalizing details for diffusion-based low-level tasks

    Yuhao Liu, Zhanghan Ke, Fang Liu, Nanxuan Zhao, and Rynson WH Lau. Diff-plugin: Revitalizing details for diffusion-based low-level tasks. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4197–4208, 2024. 3 10

  51. [51]

    Desnownet: Context-aware deep network for snow removal.IEEE Transactions on Image Processing, 27(6): 3064–3073, 2018

    Yun-Fu Liu, Da-Wei Jaw, Shih-Chia Huang, and Jenq-Neng Hwang. Desnownet: Context-aware deep network for snow removal.IEEE Transactions on Image Processing, 27(6): 3064–3073, 2018. 1, 6

  52. [52]

    Decoupled Weight Decay Regularization

    Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization.arXiv preprint arXiv:1711.05101, 2017. 6

  53. [53]

    Visual-instructed degradation diffusion for all-in-one image restoration

    Wenyang Luo, Haina Qin, Zewen Chen, Libin Wang, Dan- dan Zheng, Yuming Li, Yufan Liu, Bing Li, and Weiming Hu. Visual-instructed degradation diffusion for all-in-one image restoration. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 12764–12777,

  54. [54]

    arXiv preprint arXiv:2310.01018 , volume=

    Ziwei Luo, Fredrik K Gustafsson, Zheng Zhao, Jens Sj¨olund, and Thomas B Sch ¨on. Controlling vision- language models for universal image restoration.arXiv preprint arXiv:2310.01018, 3(8), 2023. 2, 3, 6, 7, 18

  55. [55]

    Image restoration with mean-reverting stochastic differential equations.arXiv preprint arXiv:2301.11699, 2023

    Ziwei Luo, Fredrik K Gustafsson, Zheng Zhao, Jens Sj¨olund, and Thomas B Sch ¨on. Image restoration with mean-reverting stochastic differential equations.arXiv preprint arXiv:2301.11699, 2023. 3, 7, 15

  56. [56]

    Prores: Explor- ing degradation-aware visual prompt for universal image restoration.arXiv preprint arXiv:2306.13653, 2023

    Jiaqi Ma, Tianheng Cheng, Guoli Wang, Qian Zhang, Xinggang Wang, and Lefei Zhang. Prores: Explor- ing degradation-aware visual prompt for universal image restoration.arXiv preprint arXiv:2306.13653, 2023. 2

  57. [57]

    Waterloo exploration database: New challenges for image quality as- sessment models.IEEE Transactions on Image Processing, 26(2):1004–1016, 2016

    Kede Ma, Zhengfang Duanmu, Qingbo Wu, Zhou Wang, Hongwei Yong, Hongliang Li, and Lei Zhang. Waterloo exploration database: New challenges for image quality as- sessment models.IEEE Transactions on Image Processing, 26(2):1004–1016, 2016. 6, 16

  58. [58]

    A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics

    David Martin, Charless Fowlkes, Doron Tal, and Jitendra Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. InProceedings eighth IEEE international conference on computer vision. ICCV 2001, pages 416–423. IEEE, 2001. 6, 16

  59. [59]

    The power of context: How multimodality improves image super-resolution

    Kangfu Mei, Hossein Talebi, Mojtaba Ardakani, Vishal M Patel, Peyman Milanfar, and Mauricio Delbracio. The power of context: How multimodality improves image super-resolution. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 23141–23152,

  60. [60]

    Deep generalized unfolding networks for image restoration

    Chong Mou, Qian Wang, and Jian Zhang. Deep generalized unfolding networks for image restoration. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 17399–17410, 2022. 18

  61. [61]

    T2i-adapter: Learn- ing adapters to dig out more controllable ability for text-to- image diffusion models

    Chong Mou, Xintao Wang, Liangbin Xie, Yanze Wu, Jian Zhang, Zhongang Qi, and Ying Shan. T2i-adapter: Learn- ing adapters to dig out more controllable ability for text-to- image diffusion models. InProceedings of the AAAI con- ference on artificial intelligence, pages 4296–4304, 2024. 2

  62. [62]

    Deep multi-scale convolutional neural network for dynamic scene deblurring

    Seungjun Nah, Tae Hyun Kim, and Kyoung Mu Lee. Deep multi-scale convolutional neural network for dynamic scene deblurring. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 3883–3891,

  63. [63]

    Cloud re- moval advances: A comprehensive review and analysis for optical remote sensing images.IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing,

    Jin Ning, Lianbin Xie, Jie Yin, and Yiguang Liu. Cloud re- moval advances: A comprehensive review and analysis for optical remote sensing images.IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing,

  64. [64]

    Restoring vision in adverse weather conditions with patch-based denoising diffusion models.IEEE transactions on pattern analysis and machine intelligence, 45(8):10346–10357, 2023

    Ozan ¨Ozdenizci and Robert Legenstein. Restoring vision in adverse weather conditions with patch-based denoising diffusion models.IEEE transactions on pattern analysis and machine intelligence, 45(8):10346–10357, 2023. 7

  65. [65]

    Promptir: Prompting for all- in-one image restoration.Advances in Neural Information Processing Systems, 36:71275–71293, 2023

    Vaishnav Potlapalli, Syed Waqas Zamir, Salman H Khan, and Fahad Shahbaz Khan. Promptir: Prompting for all- in-one image restoration.Advances in Neural Information Processing Systems, 36:71275–71293, 2023. 1, 2, 7

  66. [66]

    Spire: Semantic prompt-driven image restoration

    Chenyang Qi, Zhengzhong Tu, Keren Ye, Mauricio Delbra- cio, Peyman Milanfar, Qifeng Chen, and Hossein Talebi. Spire: Semantic prompt-driven image restoration. InEu- ropean Conference on Computer Vision, pages 446–464. Springer, 2024. 2

  67. [67]

    Attentive generative adversarial network for rain- drop removal from a single image

    Rui Qian, Robby T Tan, Wenhan Yang, Jiajun Su, and Jiay- ing Liu. Attentive generative adversarial network for rain- drop removal from a single image. InProceedings of the IEEE conference on computer vision and pattern recogni- tion, pages 2482–2491, 2018. 6, 7, 16

  68. [68]

    Ffa-net: Feature fusion attention network for single image dehazing

    Xu Qin, Zhilin Wang, Yuanchao Bai, Xiaodong Xie, and Huizhu Jia. Ffa-net: Feature fusion attention network for single image dehazing. InProceedings of the AAAI confer- ence on artificial intelligence, pages 11908–11915, 2020. 6, 16

  69. [69]

    Xpsr: Cross-modal priors for diffusion-based image super-resolution

    Yunpeng Qu, Kun Yuan, Kai Zhao, Qizhi Xie, Jinhua Hao, Ming Sun, and Chao Zhou. Xpsr: Cross-modal priors for diffusion-based image super-resolution. InEuropean Conference on Computer Vision, pages 285–303. Springer,

  70. [70]

    Deep learning for seeing through window with raindrops

    Yuhui Quan, Shijie Deng, Yixin Chen, and Hui Ji. Deep learning for seeing through window with raindrops. In Proceedings of the IEEE/CVF international conference on computer vision, pages 2463–2471, 2019. 7

  71. [71]

    Neumann network with recursive kernels for single image defocus deblurring

    Yuhui Quan, Zicong Wu, and Hui Ji. Neumann network with recursive kernels for single image defocus deblurring. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 5754–5763, 2023. 1, 7

  72. [72]

    Learn- ing transferable visual models from natural language super- vision

    Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learn- ing transferable visual models from natural language super- vision. InInternational conference on machine learning, pages 8748–8763. PmLR, 2021. 4, 5

  73. [73]

    Awracle: All- weather image restoration using visual in-context learning

    Sudarshan Rajagopalan and Vishal M Patel. Awracle: All- weather image restoration using visual in-context learning. InProceedings of the AAAI Conference on Artificial Intel- ligence, pages 6675–6683, 2025. 18

  74. [74]

    Hierarchical Text-Conditional Image Generation with CLIP Latents

    Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. Hierarchical text-conditional image generation with clip latents.arXiv preprint arXiv:2204.06125, 1(2):3, 2022. 2

  75. [75]

    Progressive image deraining net- works: A better and simpler baseline

    Dongwei Ren, Wangmeng Zuo, Qinghua Hu, Pengfei Zhu, and Deyu Meng. Progressive image deraining net- works: A better and simpler baseline. InProceedings of 11 the IEEE/CVF conference on computer vision and pattern recognition, pages 3937–3946, 2019. 7

  76. [76]

    Learning to deblur using light field generated and real de- focus images

    Lingyan Ruan, Bin Chen, Jizhou Li, and Miuling Lam. Learning to deblur using light field generated and real de- focus images. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 16304– 16313, 2022. 7

  77. [77]

    Photorealistic text-to-image diffusion models with deep language understanding.Advances in neural in- formation processing systems, 35:36479–36494, 2022

    Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily L Denton, Kamyar Ghasemipour, Raphael Gontijo Lopes, Burcu Karagol Ayan, Tim Sali- mans, et al. Photorealistic text-to-image diffusion models with deep language understanding.Advances in neural in- formation processing systems, 35:36479–36494, 2022. 2

  78. [78]

    Multi-domain multi-scale diffusion model for low-light image enhancement

    Kai Shang, Mingwen Shao, Chao Wang, Yuanshuo Cheng, and Shuigen Wang. Multi-domain multi-scale diffusion model for low-light image enhancement. InProceedings of the AAAI conference on artificial intelligence, pages 4722– 4730, 2024. 1

  79. [79]

    Score-Based Generative Modeling through Stochastic Differential Equations

    Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score- based generative modeling through stochastic differential equations.arXiv preprint arXiv:2011.13456, 2020. 2, 15

  80. [80]

    Vision transformers for single image dehazing.IEEE Transactions on Image Processing, 32:1927–1941, 2023

    Yuda Song, Zhuqing He, Hui Qian, and Xin Du. Vision transformers for single image dehazing.IEEE Transactions on Image Processing, 32:1927–1941, 2023. 1, 7

Showing first 80 references.