Diffusion-Based sRGB Real Noise Generation via Prompt-Driven Noise Representation Learning

Dongjin Kim; Guanghui Wang; Jaekyun Ko; Soomin Lee; Tae Hyun Kim

arxiv: 2603.04870 · v2 · pith:6SD74NYRnew · submitted 2026-03-05 · 💻 cs.CV


Pith reviewed 2026-05-21 11:57 UTC · model grok-4.3

classification 💻 cs.CV

keywords: sRGB noise generation · diffusion models · prompt learning · real noise synthesis · image denoising · camera metadata · generative modeling


The pith

A diffusion model learns prompt features from limited pairs to generate realistic sRGB noise without camera metadata.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a Prompt-Driven Noise Generation (PNG) framework that uses diffusion to synthesize realistic noisy sRGB images. It learns high-dimensional prompt features that capture real-world noise characteristics directly from available noisy-clean pairs. This approach targets the scarcity of such pairs and removes the need for camera metadata that previous generative methods require during training and testing. A sympathetic reader would care because successful noise synthesis from limited data could expand the training sets available for real-world denoising models across different devices.

Core claim

The PNG model acquires high-dimensional prompt features that capture the characteristics of real-world input noise and creates a variety of realistic noisy images consistent with the distribution of the input noise, eliminating the dependency on explicit camera metadata.

What carries the argument

High-dimensional prompt features learned by the PNG diffusion model from noisy-clean image pairs to represent and synthesize input noise distributions.

Load-bearing premise

High-dimensional prompt features learned from limited noisy-clean pairs can reliably capture and generalize the full distribution of real-world sRGB noise across unseen devices and conditions without camera metadata.

What would settle it

Generated noisy images that fail to match the noise statistics or visual appearance of real captures from a previously unseen camera device would falsify the claim.
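One way to make "match the noise statistics" measurable is to compare histograms of the noise residual (noisy minus clean) for real and generated images using a discrete KL divergence. The sketch below is only an illustration under stated assumptions: the function names, the bin count, the value range, and the Gaussian toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def noise_histogram(noisy, clean, bins=256, value_range=(-1.0, 1.0)):
    """Normalized histogram of the noise residual (noisy - clean)."""
    residual = (np.asarray(noisy, dtype=np.float64)
                - np.asarray(clean, dtype=np.float64)).ravel()
    counts, _ = np.histogram(residual, bins=bins, range=value_range)
    # Additive smoothing keeps every bin strictly positive so log() is defined.
    smoothed = counts + 1e-8
    return smoothed / smoothed.sum()

def kl_divergence(p, q):
    """Discrete KL(p || q) for two strictly positive, normalized histograms."""
    return float(np.sum(p * np.log(p / q)))

# Toy data (assumption): Gaussian noise stands in for real camera noise.
rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(64, 64, 3))
real_noisy = clean + rng.normal(0.0, 0.05, size=clean.shape)
good_fake = clean + rng.normal(0.0, 0.05, size=clean.shape)  # same noise level
bad_fake = clean + rng.normal(0.0, 0.15, size=clean.shape)   # wrong noise level

p_real = noise_histogram(real_noisy, clean)
kl_good = kl_divergence(p_real, noise_histogram(good_fake, clean))
kl_bad = kl_divergence(p_real, noise_histogram(bad_fake, clean))
print(f"KL(real || matched)    = {kl_good:.4f}")
print(f"KL(real || mismatched) = {kl_bad:.4f}")
```

A generator that reproduces the real noise distribution should score close to the matched case. A large divergence on captures from an unseen camera would be the kind of failure described above.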

Figures

Figures reproduced from arXiv: 2603.04870 by Dongjin Kim, Guanghui Wang, Jaekyun Ko, Soomin Lee, Tae Hyun Kim.

**Figure 2.** Overview of the proposed method. (a) Training pipeline. (b) Inference pipeline.

**Figure 3.** (a) Sketch of the Prompt Autoencoder (PAE). (b) Details of Global and Local Prompt Blocks.

**Figure 4.** Visualization of synthetic noisy images on the SIDD

**Figure 5.** Visual comparison on denoising results with PSNR

Original abstract

Denoising in the sRGB image space is challenging due to large noise variability. Although end-to-end methods perform well, their effectiveness in real-world scenarios is limited by the scarcity of real noisy-clean image pairs, which are expensive and difficult to collect. To address this limitation, several generative methods have been developed to synthesize realistic noisy images from limited data. These approaches often rely on camera metadata during both training and testing to synthesize real-world noise. However, the lack of metadata or inconsistencies between devices restricts their usability. Therefore, we propose a novel framework called Prompt-Driven Noise Generation (PNG). This model is capable of acquiring high-dimensional prompt features that capture the characteristics of real-world input noise and creating a variety of realistic noisy images consistent with the distribution of the input noise. By eliminating the dependency on explicit camera metadata, our approach significantly enhances the generalizability and applicability of noise synthesis. Comprehensive experiments reveal that our model effectively produces realistic noisy images and show the successful application of these generated images in removing real-world noise across various benchmark datasets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes the Prompt-Driven Noise Generation (PNG) framework, a diffusion-based model that learns high-dimensional prompt features directly from input noisy sRGB images to capture real-world noise characteristics and synthesize diverse realistic noisy images matching the input noise distribution. The central contribution is the elimination of explicit camera metadata during both training and inference, with claims of improved generalizability demonstrated via application to denoising benchmarks across multiple datasets.

Significance. If the generalization claims hold, the work would be significant for practical real-world denoising pipelines, as it removes a key practical barrier (metadata availability and device consistency) that limits prior generative noise synthesis methods. The prompt-driven approach to noise representation learning could enable more flexible use of limited noisy-clean pairs for training data augmentation in sRGB space.

major comments (2)

[§3 (method) and §4.2 (cross-dataset experiments)] The claim of device-agnostic generalization without metadata is load-bearing, yet the reported results use benchmarks whose device distributions overlap with typical training collections; no ablation is described that trains on one set of devices and evaluates synthesis on completely disjoint unseen devices/conditions to isolate whether the learned prompts truly encode transferable statistics rather than sensor-specific patterns.
[Tables 2-4] While performance on denoising benchmarks is asserted, the absence of an explicit metadata-free ablation (e.g., comparing PNG against metadata-dependent baselines when metadata is withheld at test time) leaves the central advantage unquantified relative to prior work.
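The disjoint-device ablation requested in the first major comment amounts to a leave-devices-out split. A minimal sketch follows, assuming each sample carries a device label that is used only to build the evaluation protocol and is never shown to the model. The record layout, device names, and file names are hypothetical.

```python
def split_by_device(samples, held_out_devices):
    """Split (device_id, path) records so that no held-out device is in training."""
    held_out = set(held_out_devices)
    train = [s for s in samples if s[0] not in held_out]
    test = [s for s in samples if s[0] in held_out]
    if not train or not test:
        raise ValueError("both splits must be non-empty")
    return train, test

# Hypothetical records: (device label, noisy image file).
samples = [
    ("camera_a", "scene_001.png"),
    ("camera_a", "scene_002.png"),
    ("camera_b", "scene_003.png"),
    ("camera_c", "scene_004.png"),
    ("camera_c", "scene_005.png"),
]

train_set, test_set = split_by_device(samples, held_out_devices=["camera_c"])
print("train devices:", sorted({d for d, _ in train_set}))
print("test devices: ", sorted({d for d, _ in test_set}))
```

Rotating the held-out device across runs would show whether synthesis quality depends on having seen a given sensor during training.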

minor comments (2)

[Abstract] The abstract states that 'comprehensive experiments reveal...' but provides no numerical values, baselines, or error bars; moving a concise quantitative summary (e.g., PSNR/SSIM deltas on key datasets) into the abstract would improve readability.
[§3.1] Notation for the prompt embedding dimension and its relation to the diffusion timestep conditioning should be clarified in §3.1 to avoid ambiguity when readers compare against standard diffusion conditioning schemes.
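For readers who want a reference point for the second minor comment, the sketch below shows one common conditioning scheme: a sinusoidal timestep embedding is summed with a conditioning vector, and the result modulates features through a scale and a shift (FiLM-style). This is an assumption for comparison only. The text above does not confirm that PNG conditions this way, and every name and dimension here is hypothetical.

```python
import numpy as np

def timestep_embedding(t, dim):
    """Sinusoidal embedding of a scalar timestep; dim must be even."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    args = t * freqs
    return np.concatenate([np.cos(args), np.sin(args)])

def modulate(features, cond, w_scale, w_shift):
    """FiLM-style modulation: features * (1 + scale) + shift."""
    scale = cond @ w_scale  # shape: (feat_dim,)
    shift = cond @ w_shift  # shape: (feat_dim,)
    return features * (1.0 + scale) + shift

rng = np.random.default_rng(0)
embed_dim, feat_dim = 8, 4

# Hypothetical prompt vector with the same width as the timestep embedding.
prompt = rng.normal(size=embed_dim)
cond = timestep_embedding(10.0, embed_dim) + prompt

features = rng.normal(size=(5, feat_dim))
# Zero-initialized projections make the modulation an identity map at the start.
w_scale = np.zeros((embed_dim, feat_dim))
w_shift = np.zeros((embed_dim, feat_dim))
out = modulate(features, cond, w_scale, w_shift)
print(out.shape)
```

Stating how the prompt dimension relates to `embed_dim`, and whether the two embeddings are summed, concatenated, or injected separately, is the kind of detail the comment asks the authors to spell out.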

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address the major comments point by point below and outline the revisions we will make to strengthen the evidence supporting our claims of metadata-free generalization.


Referee: [§3 (method) and §4.2 (cross-dataset experiments)] The claim of device-agnostic generalization without metadata is load-bearing, yet the reported results use benchmarks whose device distributions overlap with typical training collections; no ablation is described that trains on one set of devices and evaluates synthesis on completely disjoint unseen devices/conditions to isolate whether the learned prompts truly encode transferable statistics rather than sensor-specific patterns.

Authors: We appreciate the referee pointing out this gap. Our cross-dataset experiments in §4.2 already span multiple real-world datasets collected under varying camera devices and imaging conditions, which provides some evidence of generalization. However, we agree that a dedicated ablation, in which the model is trained exclusively on images from one group of devices and evaluated on noise synthesis for completely disjoint devices and conditions, would more rigorously isolate whether the prompt features capture transferable noise statistics. We will add this controlled ablation study to the revised manuscript.

Revision: yes

Referee: [Tables 2-4] While performance on denoising benchmarks is asserted, the absence of an explicit metadata-free ablation (e.g., comparing PNG against metadata-dependent baselines when metadata is withheld at test time) leaves the central advantage unquantified relative to prior work.

Authors: We concur that directly quantifying the practical benefit of our metadata-free approach requires an explicit comparison. We will add an ablation to Tables 2-4 (and associated text) in which metadata-dependent baseline methods are evaluated with metadata withheld at test time, while PNG operates without any metadata. This will allow a head-to-head quantification of the advantage on the denoising benchmarks.

Revision: yes

Circularity Check

0 steps flagged

No circularity; standard data-driven generative modeling from observed pairs


The abstract and method description present a diffusion model that learns high-dimensional prompt embeddings directly from limited noisy-clean image pairs to match and synthesize noise distributions. This is a conventional supervised generative setup, and no quoted equation or step reduces a claimed prediction back to its own fitted inputs by construction. The provided text contains no load-bearing self-citations, uniqueness theorems, smuggled ansatzes, or renamings of known results. Generalization to unseen devices is asserted empirically rather than derived tautologically, so the central claim can be tested against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The approach rests on standard assumptions in diffusion modeling and prompt learning for generative tasks. Numerous learned parameters are expected in the neural network components, but specific free parameters are not detailed in the abstract.

free parameters (1)

prompt feature dimensionality
High-dimensional prompt features are learned to capture noise characteristics; the exact dimension and related hyperparameters are fitted during training.

axioms (1)

Domain assumption: Diffusion processes can model the distribution of real sRGB noise when conditioned on learned prompts
The framework builds the generation process on diffusion models conditioned via prompts.
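For orientation, the textbook form of a conditioned diffusion objective is written out below. It is the standard noise-prediction loss with an extra conditioning input, given as an assumption for illustration. The symbol $p$ (a learned prompt) and the choice of what $x_0$ represents (for example, a noisy image or a latent code of it) are hypothetical here, and the paper's actual parameterization may differ.

```latex
% Forward process: corrupt a data sample x_0 with Gaussian noise at step t
x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

% Training loss: predict the injected noise given x_t, t, and the prompt p
\mathcal{L}(\theta) = \mathbb{E}_{x_0,\, t,\, \epsilon}
\left[ \left\| \epsilon - \epsilon_\theta(x_t, t, p) \right\|_2^2 \right]
```

Under this reading, the axiom says that a network $\epsilon_\theta$ given only $p$ as side information can fit the real noise distribution, with no camera metadata among its inputs.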

invented entities (1)

Prompt-Driven Noise Generation (PNG) framework · no independent evidence
purpose: To synthesize realistic noisy images without camera metadata by learning prompt features
New model introduced to address limitations of metadata-dependent methods.


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "We propose a novel framework called Prompt-Driven Noise Generation (PNG). This model is capable of acquiring high-dimensional prompt features that capture the characteristics of real-world input noise"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: "the Prompt Encoder learns global and local prompt components, PGlobal and PLocal, as learnable parameters that encode real world noise characteristics"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

78 extracted references · 78 canonical work pages · 4 internal anchors

[1]

A high-quality denoising dataset for smartphone cameras

Abdelrahman Abdelhamed, Stephen Lin, and Michael S Brown. A high-quality denoising dataset for smartphone cameras. InCVPR, 2018. 1, 6

work page 2018
[2]

Noise flow: Noise modeling with con- ditional normalizing flows

Abdelrahman Abdelhamed, Marcus A Brubaker, and Michael S Brown. Noise flow: Noise modeling with con- ditional normalizing flows. InICCV, 2019. 1, 2, 5

work page 2019
[3]

Ntire 2020 challenge on real image denoising: Dataset, methods and results

Abdelrahman Abdelhamed, Mahmoud Afifi, Radu Timofte, and Michael S Brown. Ntire 2020 challenge on real image denoising: Dataset, methods and results. InCVPR, 2020. 1, 6

work page 2020
[4]

Language models are few-shot learners.NeurIPS, 2020

Tom Brown, Benjamin Mann, Nick Ryder, Melanie Sub- biah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners.NeurIPS, 2020. 2

work page 2020
[5]

Toward real-world single image super-resolution: A new benchmark and a new model

Jianrui Cai, Hui Zeng, Hongwei Yong, Zisheng Cao, and Lei Zhang. Toward real-world single image super-resolution: A new benchmark and a new model. InICCV, 2019. 7

work page 2019
[6]

Hinet: Half instance normalization network for image restoration

Liangyu Chen, Xin Lu, Jie Zhang, Xiaojie Chu, and Cheng- peng Chen. Hinet: Half instance normalization network for image restoration. InCVPR, 2021. 1

work page 2021
[7]

Simple baselines for image restoration

Liangyu Chen, Xiaojie Chu, Xiangyu Zhang, and Jian Sun. Simple baselines for image restoration. InECCV, 2022. 1

work page 2022
[8]

Masked and shuffled blind spot denoising for real-world images

Hamadi Chihaoui and Paolo Favaro. Masked and shuffled blind spot denoising for real-world images. InCVPR, 2024. 2

work page 2024
[9]

Imagenet: A large-scale hierarchical image database

Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. InCVPR, 2009. 5, 6

work page 2009
[10]

Scaling rectified flow transformers for high-resolution image synthesis

Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim En- tezari, Jonas M¨uller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, et al. Scaling rectified flow transformers for high-resolution image synthesis. InICML,

work page
[11]

srgb real noise synthesizing with neighboring correlation-aware noise model

Zixuan Fu, Lanqing Guo, and Bihan Wen. srgb real noise synthesizing with neighboring correlation-aware noise model. InCVPR, 2023. 1, 2, 5, 6, 3

work page 2023
[12]

Deep residual learning for image recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. InCVPR, 2016. 4

work page 2016
[13]

Denoising diffu- sion probabilistic models.NeurIPS, 2020

Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffu- sion probabilistic models.NeurIPS, 2020. 3

work page 2020
[14]

Estimation of non- normalized statistical models by score matching.JMLR, 2005

Aapo Hyv ¨arinen and Peter Dayan. Estimation of non- normalized statistical models by score matching.JMLR, 2005. 3

work page 2005
[15]

Fast camera image denoising on mobile gpus with deep learning, mobile ai 2021 challenge: Report

Andrey Ignatov, Kim Byeoung-su, Radu Timofte, and Ange- line Pouget. Fast camera image denoising on mobile gpus with deep learning, mobile ai 2021 challenge: Report. In CVPRW, 2021. 7

work page 2021
[16]

C2n: Practical generative noise modeling for real-world denoising

Geonwoon Jang, Wooseok Lee, Sanghyun Son, and Ky- oung Mu Lee. C2n: Practical generative noise modeling for real-world denoising. InICCV, 2021. 2, 6

work page 2021
[17]

Progressive growing of gans for improved quality, stability, and variation

Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of gans for improved quality, stability, and variation. InICLR, 2018. 2

work page 2018
[18]

Elucidating the design space of diffusion-based generative models.NeurIPS, 2022

Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models.NeurIPS, 2022. 3, 2

work page 2022
[19]

Analyzing and improving the training dynamics of diffusion models

Tero Karras, Miika Aittala, Jaakko Lehtinen, Janne Hellsten, Timo Aila, and Samuli Laine. Analyzing and improving the training dynamics of diffusion models. InCVPR, 2024. 2, 3

work page 2024
[20]

srgb real noise modeling via noise-aware sampling with normalizing flows

Dongjin Kim, Donggoo Jung, Sungyong Baik, and Tae Hyun Kim. srgb real noise modeling via noise-aware sampling with normalizing flows. InICLR, 2024. 1, 2, 5, 6, 3

work page 2024
[21]

Idf: Iterative dynamic filtering networks for generalizable image denoising

Dongjin Kim, Jaekyun Ko, Muhammad Kashif Ali, and Tae Hyun Kim. Idf: Iterative dynamic filtering networks for generalizable image denoising. InICCV, 2025. 1

work page 2025
[22]

Continuous degradation modeling via latent flow matching for real-world super-resolution

Hyeonjae Kim, Dongjin Kim, Eugene Jin, and Tae Hyun Kim. Continuous degradation modeling via latent flow matching for real-world super-resolution. InAAAI, 2026. 1, 7

work page 2026
[23]

Variational diffusion models.NeurIPS, 2021

Diederik Kingma, Tim Salimans, Ben Poole, and Jonathan Ho. Variational diffusion models.NeurIPS, 2021. 3

work page 2021
[24]

Adam: A Method for Stochastic Optimization

Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization.arXiv preprint arXiv:1412.6980,

work page internal anchor Pith review Pith/arXiv arXiv
[25]

Act-diffusion: Efficient adversarial consistency training for one-step diffusion models

Fei Kong, Jinhao Duan, Lichao Sun, Hao Cheng, Renjing Xu, Hengtao Shen, Xiaofeng Zhu, Xiaoshuang Shi, and Kaidi Xu. Act-diffusion: Efficient adversarial consistency training for one-step diffusion models. InCVPR, 2024. 3

work page 2024
[26]

Modeling srgb camera noise with normalizing flows

Shayan Kousha, Ali Maleky, Michael S Brown, and Marcus A Brubaker. Modeling srgb camera noise with normalizing flows. InCVPR, 2022. 1, 2, 6, 5

work page 2022
[27]

Ap- bsn: Self-supervised denoising for real-world images via asymmetric pd and blind-spot network

Wooseok Lee, Sanghyun Son, and Kyoung Mu Lee. Ap- bsn: Self-supervised denoising for real-world images via asymmetric pd and blind-spot network. InCVPR, 2022. 2

work page 2022
[28]

Promptcir: blind compressed image restoration with prompt learning

Bingchen Li, Xin Li, Yiting Lu, Ruoyu Feng, Mengxi Guo, Shijie Zhao, Li Zhang, and Zhibo Chen. Promptcir: blind compressed image restoration with prompt learning. In CVPRW, 2024. 2

work page 2024
[29]

Ucip: A universal framework for compressed image super-resolution using dynamic prompt

Xin Li, Bingchen Li, Yeying Jin, Cuiling Lan, Hanxin Zhu, Yulin Ren, and Zhibo Chen. Ucip: A universal framework for compressed image super-resolution using dynamic prompt. InECCV, 2024

work page 2024
[30]

Prompt-in-prompt learning for universal image restoration.arXiv preprint arXiv:2312.05038, 2023

Zilong Li, Yiming Lei, Chenglong Ma, Junping Zhang, and Hongming Shan. Prompt-in-prompt learning for universal image restoration.arXiv preprint arXiv:2312.05038, 2023. 2

work page arXiv 2023
[31]

Diffbir: Toward blind image restoration with generative diffusion prior

Xinqi Lin, Jingwen He, Ziyan Chen, Zhaoyang Lyu, Bo Dai, Fanghua Yu, Yu Qiao, Wanli Ouyang, and Chao Dong. Diffbir: Toward blind image restoration with generative diffusion prior. InECCV, 2024. 3

work page 2024
[32]

On the variance of the adaptive learning rate and beyond

Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, and Jiawei Han. On the variance of the adaptive learning rate and beyond. InICLR, 2020. 5

work page 2020
[33]

SGDR: Stochastic Gradient Descent with Warm Restarts

Ilya Loshchilov and Frank Hutter. Sgdr: Stochastic gradient descent with warm restarts.arXiv preprint arXiv:1608.03983,

work page internal anchor Pith review Pith/arXiv arXiv
[34]

Cosine normalization: Using cosine simi- larity instead of dot product in neural networks

Chunjie Luo, Jianfeng Zhan, Xiaohe Xue, Lei Wang, Rui Ren, and Qiang Yang. Cosine normalization: Using cosine simi- larity instead of dot product in neural networks. InICANN,

work page
[35]

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Simian Luo, Yiqin Tan, Longbo Huang, Jian Li, and Hang Zhao. Latent consistency models: Synthesizing high- resolution images with few-step inference.arXiv preprint arXiv:2310.04378, 2023. 2, 3

work page internal anchor Pith review Pith/arXiv arXiv 2023
[36]

A holistic approach to cross-channel image noise modeling and its application to image denoising

Seonghyeon Nam, Youngbae Hwang, Yasuyuki Matsushita, and Seon Joo Kim. A holistic approach to cross-channel image noise modeling and its application to image denoising. InCVPR, 2016. 1, 6

work page 2016
[37]

Random sub-samples generation for self- supervised real image denoising

Yizhong Pan, Xiao Liu, Xiangyu Liao, Yuanzhouhan Cao, and Chao Ren. Random sub-samples generation for self- supervised real image denoising. InICCV, 2023. 2

work page 2023
[38]

Learning controllable degradation for real-world super- resolution via constrained flows

Seobin Park, Dongjin Kim, Sungyong Baik, and Tae Hyun Kim. Learning controllable degradation for real-world super- resolution via constrained flows. InICML, 2023. 1, 7

work page 2023
[39]

Scalable diffusion models with transformers

William Peebles and Saining Xie. Scalable diffusion models with transformers. InICCV, 2023. 2, 5, 1, 3

work page 2023
[40]

Film: Visual reasoning with a general conditioning layer

Ethan Perez, Florian Strub, Harm De Vries, Vincent Du- moulin, and Aaron Courville. Film: Visual reasoning with a general conditioning layer. InAAAI, 2018. 1

work page 2018
[41]

Benchmarking denoising algo- rithms with real photographs

Tobias Plotz and Stefan Roth. Benchmarking denoising algo- rithms with real photographs. InCVPR, 2017. 1

work page 2017
[42]

PromptIR: Prompting for all-in-one image restoration

Vaishnav Potlapalli, Syed Waqas Zamir, Salman Khan, and Fahad Khan. PromptIR: Prompting for all-in-one image restoration. InNeurIPS, 2023. 2

work page 2023
[43]

High-resolution image synthesis with latent diffusion models

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Bj ¨orn Ommer. High-resolution image synthesis with latent diffusion models. InCVPR, 2022. 3

work page 2022
[44]

High-resolution image synthesis with latent diffusion models

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Bj ¨orn Ommer. High-resolution image synthesis with latent diffusion models. InCVPR, 2022. 5

work page 2022
[45]

U-net: Convolutional networks for biomedical image segmentation

Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. InMICCAI, 2015. 3

work page 2015
[46]

Exploiting cloze-questions for few-shot text classification and natural language inference

Timo Schick and Hinrich Sch¨utze. Exploiting cloze-questions for few-shot text classification and natural language inference. InEACL, 2021. 2

work page 2021
[47]

Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network

Wenzhe Shi, Jose Caballero, Ferenc Husz´ar, Johannes Totz, Andrew P Aitken, Rob Bishop, Daniel Rueckert, and Zehan Wang. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In CVPR, 2016. 5

work page 2016
[48]

Logan IV , Eric Wal- lace, and Sameer Singh

Taylor Shin, Yasaman Razeghi, Robert L. Logan IV , Eric Wal- lace, and Sameer Singh. AutoPrompt: Eliciting knowledge from language models with automatically generated prompts. InEMNLP, 2020. 2

work page 2020
[49]

Deep unsupervised learning using nonequilibrium thermodynamics

Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. InICML, 2015. 3

work page 2015
[50]

Improved techniques for training consistency models

Yang Song and Prafulla Dhariwal. Improved techniques for training consistency models. InICLR, 2024. 3, 5, 2

work page 2024
[51]

Generative modeling by estimating gradients of the data distribution.NeurIPS, 2019

Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution.NeurIPS, 2019. 3

work page 2019
[52]

Improved techniques for training score-based generative models.NeurIPS, 2020

Yang Song and Stefano Ermon. Improved techniques for training score-based generative models.NeurIPS, 2020

work page 2020
[53]

Score-based generative modeling through stochastic differential equations

Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Ab- hishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. InICLR, 2021. 3

work page 2021
[54]

Consistency models

Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. Consistency models. InICML, 2023. 2, 3

work page 2023
[55]

Attention is all you need.NeurIPS, 2017

A Vaswani. Attention is all you need.NeurIPS, 2017. 1

work page 2017
[56]

Promptre- storer: A prompting image restoration method with degrada- tion perception.NeurIPS, 2024

Cong Wang, Jinshan Pan, Wei Wang, Jiangxin Dong, Mengzhu Wang, Yakun Ju, and Junyang Chen. Promptre- storer: A prompting image restoration method with degrada- tion perception.NeurIPS, 2024. 2

work page 2024
[57]

Promptrr: Diffusion models as prompt generators for single image reflection removal.arXiv preprint arXiv:2402.02374,

Tao Wang, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Tae- Kyun Kim, Tong Lu, Hongdong Li, and Ming-Hsuan Yang. Promptrr: Diffusion models as prompt generators for single image reflection removal.arXiv preprint arXiv:2402.02374,

work page arXiv
[58]

Real-esrgan: Training real-world blind super-resolution with pure synthetic data

Xintao Wang, Liangbin Xie, Chao Dong, and Ying Shan. Real-esrgan: Training real-world blind super-resolution with pure synthetic data. InICCV, 2021. 7

work page 2021
[59]

Lg-bpn: Local and global blind-patch network for self-supervised real- world denoising

Zichun Wang, Ying Fu, Ji Liu, and Yulun Zhang. Lg-bpn: Local and global blind-patch network for self-supervised real- world denoising. InCVPR, 2023. 2, 4

work page 2023
[60]

Realistic noise synthesis with diffusion models.arXiv preprint arXiv:2305.14022, 2023

Qi Wu, Mingyan Han, Ting Jiang, Haoqiang Fan, Bing Zeng, and Shuaicheng Liu. Realistic noise synthesis with diffusion models.arXiv preprint arXiv:2305.14022, 2023. 2

work page arXiv 2023
[61]

One-step effective diffusion network for real-world image super-resolution.NeurIPS, 2024

Rongyuan Wu, Lingchen Sun, Zhiyuan Ma, and Lei Zhang. One-step effective diffusion network for real-world image super-resolution.NeurIPS, 2024. 3

work page 2024
[62]

Seesr: Towards semantics-aware real-world image super-resolution

Rongyuan Wu, Tao Yang, Lingchen Sun, Zhengqiang Zhang, Shuai Li, and Lei Zhang. Seesr: Towards semantics-aware real-world image super-resolution. InCVPR, 2024. 3

work page 2024
[63]

Freprompter: Frequency self-prompt for all-in-one image restoration.Pattern Recognition, 2025

Zhijian Wu, Wenhui Liu, Jingchao Wang, Jun Li, and Dingjiang Huang. Freprompter: Frequency self-prompt for all-in-one image restoration.Pattern Recognition, 2025. 2

work page 2025
[64]

Diffir: Efficient diffusion model for image restoration

Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, and Luc Van Gool. Diffir: Efficient diffusion model for image restoration. InICCV, pages 13095–13105, 2023. 3

work page 2023
[65]

Real-world Noisy Image Denoising: A New Benchmark

Jun Xu, Hui Li, Zhetong Liang, David Zhang, and Lei Zhang. Real-world noisy image denoising: A new benchmark.arXiv preprint arXiv:1804.02603, 2018. 1, 6

work page internal anchor Pith review Pith/arXiv arXiv 2018
[66]

Synthesizing realistic image restoration training pairs: A diffusion approach.arXiv preprint arXiv:2303.06994, 2023

Tao Yang, Peiran Ren, Lei Zhang, et al. Synthesizing realistic image restoration training pairs: A diffusion approach.arXiv preprint arXiv:2303.06994, 2023. 7

work page arXiv 2023
[67]

Dual adversarial network: Toward real-world noise removal and noise generation

Zongsheng Yue, Qian Zhao, Lei Zhang, and Deyu Meng. Dual adversarial network: Toward real-world noise removal and noise generation. InECCV, 2020. 6

work page 2020
[68]

Cycleisp: Real image restoration via improved data synthesis

Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, and Ling Shao. Cycleisp: Real image restoration via improved data synthesis. InCVPR, 2020. 1, 2

work page 2020
[69]

Learning enriched features for real image restoration and enhancement

Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, and Ling Shao. Learning enriched features for real image restoration and enhancement. InECCV, 2020. 1

work page 2020
[70]

Restormer: Efficient transformer for high-resolution image restoration

Syed Waqas Zamir, Aditya Arora, Salman Khan, Mu- nawar Hayat, Fahad Shahbaz Khan, and Ming-Hsuan Yang. Restormer: Efficient transformer for high-resolution image restoration. InCVPR, 2022. 1

work page 2022
[71]

Mm-bsn: Self-supervised image denoising for real-world with multi-mask based on blind-spot network

Dan Zhang, Fangfang Zhou, Yuwen Jiang, and Zhengming Fu. Mm-bsn: Self-supervised image denoising for real-world with multi-mask based on blind-spot network. InCVPRW,

work page
[72]

Beyond a gaussian denoiser: Residual learning of deep cnn for image denoising.IEEE TIP, 2017

Kai Zhang, Wangmeng Zuo, Yunjin Chen, Deyu Meng, and Lei Zhang. Beyond a gaussian denoiser: Residual learning of deep cnn for image denoising.IEEE TIP, 2017. 5

work page 2017
[73]

Learning to prompt for vision-language models.IJCV,

Kaiyang Zhou, Jingkang Yang, Chen Change Loy, and Ziwei Liu. Learning to prompt for vision-language models.IJCV,

work page
[74]

Seg- prompt: Boosting open-world segmentation via category- level prompt learning

Muzhi Zhu, Hengtao Li, Hao Chen, Chengxiang Fan, Weian Mao, Chenchen Jing, Yifan Liu, and Chunhua Shen. Seg- prompt: Boosting open-world segmentation via category- level prompt learning. InICCV, 2023. 2

work page 2023
[75]

Iterative denoiser and noise estimator for self-supervised image denoising

Yunhao Zou, Chenggang Yan, and Ying Fu. Iterative denoiser and noise estimator for self-supervised image denoising. In ICCV, 2023. 2 Diffusion-Based sRGB Real Noise Generation via Prompt-Driven Noise Representation Learning Supplementary Material Contents

work page 2023

[1] Abdelrahman Abdelhamed, Stephen Lin, and Michael S Brown. A high-quality denoising dataset for smartphone cameras. In CVPR, 2018. 1, 6

[2] Abdelrahman Abdelhamed, Marcus A Brubaker, and Michael S Brown. Noise flow: Noise modeling with conditional normalizing flows. In ICCV, 2019. 1, 2, 5

[3] Abdelrahman Abdelhamed, Mahmoud Afifi, Radu Timofte, and Michael S Brown. Ntire 2020 challenge on real image denoising: Dataset, methods and results. In CVPR, 2020. 1, 6

[4] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. NeurIPS, 2020. 2

[5] Jianrui Cai, Hui Zeng, Hongwei Yong, Zisheng Cao, and Lei Zhang. Toward real-world single image super-resolution: A new benchmark and a new model. In ICCV, 2019. 7

[6] Liangyu Chen, Xin Lu, Jie Zhang, Xiaojie Chu, and Chengpeng Chen. Hinet: Half instance normalization network for image restoration. In CVPR, 2021. 1

[7] Liangyu Chen, Xiaojie Chu, Xiangyu Zhang, and Jian Sun. Simple baselines for image restoration. In ECCV, 2022. 1

[8] Hamadi Chihaoui and Paolo Favaro. Masked and shuffled blind spot denoising for real-world images. In CVPR, 2024. 2

[9] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In CVPR, 2009. 5, 6

[10] Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, et al. Scaling rectified flow transformers for high-resolution image synthesis. In ICML,

[11] Zixuan Fu, Lanqing Guo, and Bihan Wen. srgb real noise synthesizing with neighboring correlation-aware noise model. In CVPR, 2023. 1, 2, 5, 6, 3

[12] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, 2016. 4

[13] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. NeurIPS, 2020. 3

[14] Aapo Hyvärinen and Peter Dayan. Estimation of non-normalized statistical models by score matching. JMLR, 2005. 3

[15] Andrey Ignatov, Kim Byeoung-su, Radu Timofte, and Angeline Pouget. Fast camera image denoising on mobile gpus with deep learning, mobile ai 2021 challenge: Report. In CVPRW, 2021. 7

[16] Geonwoon Jang, Wooseok Lee, Sanghyun Son, and Kyoung Mu Lee. C2n: Practical generative noise modeling for real-world denoising. In ICCV, 2021. 2, 6

[17] Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of gans for improved quality, stability, and variation. In ICLR, 2018. 2

[18] Tero Karras, Miika Aittala, Timo Aila, and Samuli Laine. Elucidating the design space of diffusion-based generative models. NeurIPS, 2022. 3, 2

[19] Tero Karras, Miika Aittala, Jaakko Lehtinen, Janne Hellsten, Timo Aila, and Samuli Laine. Analyzing and improving the training dynamics of diffusion models. In CVPR, 2024. 2, 3

[20] Dongjin Kim, Donggoo Jung, Sungyong Baik, and Tae Hyun Kim. srgb real noise modeling via noise-aware sampling with normalizing flows. In ICLR, 2024. 1, 2, 5, 6, 3

[21] Dongjin Kim, Jaekyun Ko, Muhammad Kashif Ali, and Tae Hyun Kim. Idf: Iterative dynamic filtering networks for generalizable image denoising. In ICCV, 2025. 1

[22] Hyeonjae Kim, Dongjin Kim, Eugene Jin, and Tae Hyun Kim. Continuous degradation modeling via latent flow matching for real-world super-resolution. In AAAI, 2026. 1, 7

[23] Diederik Kingma, Tim Salimans, Ben Poole, and Jonathan Ho. Variational diffusion models. NeurIPS, 2021. 3

[24] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980,

[25] Fei Kong, Jinhao Duan, Lichao Sun, Hao Cheng, Renjing Xu, Hengtao Shen, Xiaofeng Zhu, Xiaoshuang Shi, and Kaidi Xu. Act-diffusion: Efficient adversarial consistency training for one-step diffusion models. In CVPR, 2024. 3

[26] Shayan Kousha, Ali Maleky, Michael S Brown, and Marcus A Brubaker. Modeling srgb camera noise with normalizing flows. In CVPR, 2022. 1, 2, 6, 5

[27] Wooseok Lee, Sanghyun Son, and Kyoung Mu Lee. Ap-bsn: Self-supervised denoising for real-world images via asymmetric pd and blind-spot network. In CVPR, 2022. 2

[28] Bingchen Li, Xin Li, Yiting Lu, Ruoyu Feng, Mengxi Guo, Shijie Zhao, Li Zhang, and Zhibo Chen. Promptcir: blind compressed image restoration with prompt learning. In CVPRW, 2024. 2

[29] Xin Li, Bingchen Li, Yeying Jin, Cuiling Lan, Hanxin Zhu, Yulin Ren, and Zhibo Chen. Ucip: A universal framework for compressed image super-resolution using dynamic prompt. In ECCV, 2024.

[30] Zilong Li, Yiming Lei, Chenglong Ma, Junping Zhang, and Hongming Shan. Prompt-in-prompt learning for universal image restoration. arXiv preprint arXiv:2312.05038, 2023. 2

[31] Xinqi Lin, Jingwen He, Ziyan Chen, Zhaoyang Lyu, Bo Dai, Fanghua Yu, Yu Qiao, Wanli Ouyang, and Chao Dong. Diffbir: Toward blind image restoration with generative diffusion prior. In ECCV, 2024. 3

[32] Liyuan Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, and Jiawei Han. On the variance of the adaptive learning rate and beyond. In ICLR, 2020. 5

[33] Ilya Loshchilov and Frank Hutter. Sgdr: Stochastic gradient descent with warm restarts. arXiv preprint arXiv:1608.03983,

[34] Chunjie Luo, Jianfeng Zhan, Xiaohe Xue, Lei Wang, Rui Ren, and Qiang Yang. Cosine normalization: Using cosine similarity instead of dot product in neural networks. In ICANN,

[35] Simian Luo, Yiqin Tan, Longbo Huang, Jian Li, and Hang Zhao. Latent consistency models: Synthesizing high-resolution images with few-step inference. arXiv preprint arXiv:2310.04378, 2023. 2, 3

[36] Seonghyeon Nam, Youngbae Hwang, Yasuyuki Matsushita, and Seon Joo Kim. A holistic approach to cross-channel image noise modeling and its application to image denoising. In CVPR, 2016. 1, 6

[37] Yizhong Pan, Xiao Liu, Xiangyu Liao, Yuanzhouhan Cao, and Chao Ren. Random sub-samples generation for self-supervised real image denoising. In ICCV, 2023. 2

[38] Seobin Park, Dongjin Kim, Sungyong Baik, and Tae Hyun Kim. Learning controllable degradation for real-world super-resolution via constrained flows. In ICML, 2023. 1, 7

[39] William Peebles and Saining Xie. Scalable diffusion models with transformers. In ICCV, 2023. 2, 5, 1, 3

[40] Ethan Perez, Florian Strub, Harm De Vries, Vincent Dumoulin, and Aaron Courville. Film: Visual reasoning with a general conditioning layer. In AAAI, 2018. 1

[41] Tobias Plotz and Stefan Roth. Benchmarking denoising algorithms with real photographs. In CVPR, 2017. 1

[42] Vaishnav Potlapalli, Syed Waqas Zamir, Salman Khan, and Fahad Khan. PromptIR: Prompting for all-in-one image restoration. In NeurIPS, 2023. 2

[43] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In CVPR, 2022. 3

[44] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In CVPR, 2022. 5

[45] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. In MICCAI, 2015. 3

[46] Timo Schick and Hinrich Schütze. Exploiting cloze-questions for few-shot text classification and natural language inference. In EACL, 2021. 2

[47] Wenzhe Shi, Jose Caballero, Ferenc Huszár, Johannes Totz, Andrew P Aitken, Rob Bishop, Daniel Rueckert, and Zehan Wang. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In CVPR, 2016. 5

[48] Taylor Shin, Yasaman Razeghi, Robert L. Logan IV, Eric Wallace, and Sameer Singh. AutoPrompt: Eliciting knowledge from language models with automatically generated prompts. In EMNLP, 2020. 2

[49] Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. Deep unsupervised learning using nonequilibrium thermodynamics. In ICML, 2015. 3

[50] Yang Song and Prafulla Dhariwal. Improved techniques for training consistency models. In ICLR, 2024. 3, 5, 2

[51] Yang Song and Stefano Ermon. Generative modeling by estimating gradients of the data distribution. NeurIPS, 2019. 3

[52] Yang Song and Stefano Ermon. Improved techniques for training score-based generative models. NeurIPS, 2020.

[53] Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. In ICLR, 2021. 3

[54] Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. Consistency models. In ICML, 2023. 2, 3

[55] A Vaswani. Attention is all you need. NeurIPS, 2017. 1

[56] Cong Wang, Jinshan Pan, Wei Wang, Jiangxin Dong, Mengzhu Wang, Yakun Ju, and Junyang Chen. Promptrestorer: A prompting image restoration method with degradation perception. NeurIPS, 2024. 2

[57] Tao Wang, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Tae-Kyun Kim, Tong Lu, Hongdong Li, and Ming-Hsuan Yang. Promptrr: Diffusion models as prompt generators for single image reflection removal. arXiv preprint arXiv:2402.02374,

[58] Xintao Wang, Liangbin Xie, Chao Dong, and Ying Shan. Real-esrgan: Training real-world blind super-resolution with pure synthetic data. In ICCV, 2021. 7

[59] Zichun Wang, Ying Fu, Ji Liu, and Yulun Zhang. Lg-bpn: Local and global blind-patch network for self-supervised real-world denoising. In CVPR, 2023. 2, 4

[60] Qi Wu, Mingyan Han, Ting Jiang, Haoqiang Fan, Bing Zeng, and Shuaicheng Liu. Realistic noise synthesis with diffusion models. arXiv preprint arXiv:2305.14022, 2023. 2

[61] Rongyuan Wu, Lingchen Sun, Zhiyuan Ma, and Lei Zhang. One-step effective diffusion network for real-world image super-resolution. NeurIPS, 2024. 3

[62] Rongyuan Wu, Tao Yang, Lingchen Sun, Zhengqiang Zhang, Shuai Li, and Lei Zhang. Seesr: Towards semantics-aware real-world image super-resolution. In CVPR, 2024. 3

[63] Zhijian Wu, Wenhui Liu, Jingchao Wang, Jun Li, and Dingjiang Huang. Freprompter: Frequency self-prompt for all-in-one image restoration. Pattern Recognition, 2025. 2

[64] Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, and Luc Van Gool. Diffir: Efficient diffusion model for image restoration. In ICCV, pages 13095–13105, 2023. 3

[65] Jun Xu, Hui Li, Zhetong Liang, David Zhang, and Lei Zhang. Real-world noisy image denoising: A new benchmark. arXiv preprint arXiv:1804.02603, 2018. 1, 6

[66] Tao Yang, Peiran Ren, Lei Zhang, et al. Synthesizing realistic image restoration training pairs: A diffusion approach. arXiv preprint arXiv:2303.06994, 2023. 7

[67] Zongsheng Yue, Qian Zhao, Lei Zhang, and Deyu Meng. Dual adversarial network: Toward real-world noise removal and noise generation. In ECCV, 2020. 6

[68] Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, and Ling Shao. Cycleisp: Real image restoration via improved data synthesis. In CVPR, 2020. 1, 2

[69] Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, and Ling Shao. Learning enriched features for real image restoration and enhancement. In ECCV, 2020. 1

[70] Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, and Ming-Hsuan Yang. Restormer: Efficient transformer for high-resolution image restoration. In CVPR, 2022. 1

[71] Dan Zhang, Fangfang Zhou, Yuwen Jiang, and Zhengming Fu. Mm-bsn: Self-supervised image denoising for real-world with multi-mask based on blind-spot network. In CVPRW,

[72] Kai Zhang, Wangmeng Zuo, Yunjin Chen, Deyu Meng, and Lei Zhang. Beyond a gaussian denoiser: Residual learning of deep cnn for image denoising. IEEE TIP, 2017. 5

[73] Kaiyang Zhou, Jingkang Yang, Chen Change Loy, and Ziwei Liu. Learning to prompt for vision-language models. IJCV,

[74] Muzhi Zhu, Hengtao Li, Hao Chen, Chengxiang Fan, Weian Mao, Chenchen Jing, Yifan Liu, and Chunhua Shen. Segprompt: Boosting open-world segmentation via category-level prompt learning. In ICCV, 2023. 2

[75] Yunhao Zou, Chenggang Yan, and Ying Fu. Iterative denoiser and noise estimator for self-supervised image denoising. In ICCV, 2023. 2
