Recognition: no theorem link
LPNSR: Optimal Noise-Guided Diffusion Image Super-Resolution Via Learnable Noise Prediction
Pith reviewed 2026-05-15 07:26 UTC · model grok-4.3
The pith
Diffusion super-resolution models achieve more stable results across sampling runs by replacing randomly sampled Gaussian noise with a learned predictor of the optimal intermediate noise.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We establish a theoretical framework that derives the closed-form analytical solution for optimal intermediate noise in diffusion models from a maximum likelihood estimation perspective, revealing a consistent conditional dependence structure that generalizes across diffusion paradigms. We instantiate this framework under the residual-shifting diffusion paradigm and accordingly design an LR-guided multi-input-aware noise predictor to replace random Gaussian noise. We further mitigate initialization bias with a high-quality pre-upsampling network. The compact 4-step trajectory uniquely enables end-to-end optimization of the entire reverse chain.
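Read mechanically, the claim swaps the stochastic draw in each reverse step for a network output. A minimal sketch of that control flow, assuming a residual-shifting style update and using `predict_x0` and `predict_noise` as hypothetical stand-ins for the paper's denoiser and noise predictor (neither is the paper's actual architecture):

```python
import numpy as np

def reverse_chain(x_T, lr_up, predict_x0, predict_noise, etas, kappa=1.0):
    """Compact reverse trajectory sketch: each step re-noises the current
    x0 estimate toward a residual-shifting style marginal, but the injected
    noise comes from a learned predictor instead of np.random.randn."""
    x = x_T
    for t in range(len(etas) - 1, 0, -1):
        x0_hat = predict_x0(x, lr_up, t)
        eps = predict_noise(x, lr_up, t)  # deterministic, LR-guided
        x = x0_hat + etas[t - 1] * (lr_up - x0_hat) + kappa * np.sqrt(etas[t - 1]) * eps
    return x

# Toy stand-ins: the denoiser returns a fixed guess, the predictor zero noise.
lr = np.full((4, 4), 0.5)
hr_guess = np.full((4, 4), 0.8)
etas = [0.0, 0.3, 0.6, 0.9, 1.0]  # 4 steps; eta_0 = 0 lands exactly on x0_hat
out = reverse_chain(lr.copy(), lr, lambda x, y, t: hr_guess,
                    lambda x, y, t: np.zeros_like(x), etas)
assert np.allclose(out, hr_guess)  # deterministic noise -> run-to-run stable
```

With only four steps, the whole loop is short enough to backpropagate through end to end, which is the practical point of the compact trajectory.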
What carries the argument
The LR-guided multi-input-aware noise predictor, which instantiates the derived conditional dependence structure to generate optimal noise at each diffusion step.
Load-bearing premise
The derived conditional dependence structure for optimal noise generalizes across diffusion paradigms and can be instantiated in the residual-shifting setup without introducing fitting biases that undermine the solution.
What would settle it
Comparing perceptual quality metrics and output variance between the proposed model and a variant using standard random Gaussian noise on the same datasets would show whether the learned predictor delivers the claimed improvement and stability.
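The variance half of that test is easy to state concretely. A toy sketch, with `toy_sample` as a hypothetical stand-in for any short-trajectory sampler: run it many times with random noise versus a deterministic predictor and compare the spread of outputs across runs.

```python
import numpy as np

def toy_sample(lr, noise_fn, steps=4, rng=None):
    """Toy refinement loop standing in for a diffusion SR sampler; noise_fn
    is a hypothetical hook for whatever supplies the intermediate noise."""
    x = lr.copy()
    for t in range(steps):
        x = 0.5 * (x + lr) + 0.1 * noise_fn(x, t, rng)
    return x

lr = np.ones(16)
random_noise = lambda x, t, r: r.standard_normal(x.shape)   # baseline sampling
learned_noise = lambda x, t, r: np.zeros_like(x)            # predictor stand-in

runs_rand = np.stack([toy_sample(lr, random_noise, rng=np.random.default_rng(s))
                      for s in range(20)])
runs_pred = np.stack([toy_sample(lr, learned_noise) for _ in range(20)])

spread_rand = runs_rand.std(axis=0).mean()  # varies across sampling runs
spread_pred = runs_pred.std(axis=0).mean()  # identical every run -> exactly 0
assert spread_pred == 0.0 and spread_rand > 0.0
```

A deterministic predictor trivially zeroes run-to-run variance; the open question the experiment must answer is whether it also matches or beats the random baseline on perceptual metrics.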
Original abstract
Diffusion-based image super-resolution (SR) aims to reconstruct high-resolution (HR) images from low-resolution (LR) observations. However, the inherent randomness injected during the reverse diffusion process causes the performance of diffusion-based SR models to vary significantly across different sampling runs, particularly when the sampling trajectory is compressed into a limited number of steps. A critical yet underexplored question is: what is the optimal noise to inject at each intermediate diffusion step? In this paper, we establish a theoretical framework that derives the closed-form analytical solution for optimal intermediate noise in diffusion models from a maximum likelihood estimation perspective, revealing a consistent conditional dependence structure that generalizes across diffusion paradigms. We instantiate this framework under the residual-shifting diffusion paradigm and accordingly design an LR-guided multi-input-aware noise predictor to replace random Gaussian noise. We further mitigate initialization bias with a high-quality pre-upsampling network. The compact 4-step trajectory uniquely enables end-to-end optimization of the entire reverse chain, which is computationally prohibitive for conventional long-trajectory diffusion models. Extensive experiments demonstrate that LPNSR achieves state-of-the-art perceptual performance on both synthetic and real-world datasets, without relying on any large-scale text-to-image priors. The source code of our method can be found at https://github.com/Faze-Hsw/LPNSR.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to establish a theoretical framework deriving a closed-form analytical solution for optimal intermediate noise in diffusion models via maximum likelihood estimation, revealing a conditional dependence structure that generalizes across paradigms; it instantiates this under residual-shifting diffusion with an LR-guided multi-input-aware noise predictor, adds a high-quality pre-upsampling network to mitigate initialization bias, and demonstrates SOTA perceptual performance on synthetic and real-world SR datasets using a compact 4-step trajectory without large text-to-image priors.
Significance. If the MLE derivation is sound and the learnable predictor faithfully implements the claimed optimal noise without introducing unaccounted biases, the work would provide a principled way to reduce variance and sampling steps in diffusion SR while maintaining quality, offering an alternative to random Gaussian noise that could improve efficiency and consistency in low-step regimes.
major comments (2)
- [Abstract / Theoretical Framework] Abstract and theoretical framework section: the claim of a closed-form MLE solution yielding an invariant conditional dependence structure requires explicit algebraic steps (e.g., the derivation from the likelihood objective through the residual-shifting forward process) to confirm that assumptions such as noise-LR independence survive instantiation; without these steps the generalization claim rests on an unverified transition from derivation to learnable predictor.
- [Experiments] Experiments section: SOTA perceptual claims are made for the 4-step trajectory, yet the abstract (and visible summary) provides no error bars, ablation results on the noise predictor components, or quantitative baseline tables; these omissions make it impossible to assess whether the reported gains are robust or attributable to the derived noise structure versus the pre-upsampler.
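One hedged way to make the missing algebra concrete: under a residual-shifting style forward marginal (the published ResShift form), $q(x_t \mid x_0, y) = \mathcal{N}\big(x_0 + \eta_t (y - x_0),\; \kappa^2 \eta_t I\big)$, the noise realization consistent with a given pair $(x_t, x_0)$ and LR image $y$ is

```latex
% Reviewer-side reconstruction of what a closed-form solution could look
% like under the ResShift marginal -- not the paper's actual equation.
\epsilon_t^{*} \;=\; \frac{x_t - x_0 - \eta_t\,(y - x_0)}{\kappa\,\sqrt{\eta_t}}
```

which depends jointly on $x_0$ and $y$, consistent with the conditional dependence structure the predictor is said to instantiate; whether the paper's MLE objective reduces to exactly this form is what the requested derivation must establish.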
minor comments (1)
- [Method] The manuscript should clarify the exact architecture of the LR-guided multi-input-aware noise predictor (input channels, conditioning mechanism) and confirm that the 4-step end-to-end optimization does not inadvertently overfit to the training distribution used for the predictor parameters.
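The requested architectural clarification can be phrased as a question about input layout. A hypothetical sketch of one possible conditioning scheme (channel concatenation of the current state, the upsampled LR image, and a timestep plane); the real predictor's design is not specified in the visible summary:

```python
import numpy as np

def build_predictor_input(x_t, lr_up, t, T=4):
    """Hypothetical input layout for an LR-guided, multi-input-aware noise
    predictor: stack the current state, the upsampled LR image, and a
    constant timestep plane along the channel axis. Illustrative only."""
    assert x_t.shape == lr_up.shape  # both (C, H, W)
    t_plane = np.full((1,) + x_t.shape[1:], t / T)
    return np.concatenate([x_t, lr_up, t_plane], axis=0)

inp = build_predictor_input(np.zeros((3, 8, 8)), np.ones((3, 8, 8)), t=2)
assert inp.shape == (7, 8, 8)  # 3 + 3 image channels + 1 timestep channel
```

Pinning down the actual channel count and conditioning mechanism in the manuscript would let readers check that the predictor sees exactly the inputs the derived dependence structure requires.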
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We will revise the manuscript to address the points on the theoretical derivation and experimental reporting, as detailed below.
Point-by-point responses
-
Referee: [Abstract / Theoretical Framework] Abstract and theoretical framework section: the claim of a closed-form MLE solution yielding an invariant conditional dependence structure requires explicit algebraic steps (e.g., the derivation from the likelihood objective through the residual-shifting forward process) to confirm that assumptions such as noise-LR independence survive instantiation; without these steps the generalization claim rests on an unverified transition from derivation to learnable predictor.
Authors: We agree that explicit algebraic steps will strengthen the presentation. In the revised manuscript, we will expand the theoretical framework section with the full derivation: starting from the maximum likelihood estimation objective, proceeding step-by-step through the residual-shifting forward process, and explicitly verifying that the noise-LR independence assumption holds under the model. This will also clarify the transition from the closed-form optimal noise to the LR-guided multi-input-aware predictor and support the generalization claim across paradigms. revision: yes
-
Referee: [Experiments] Experiments section: SOTA perceptual claims are made for the 4-step trajectory, yet the abstract (and visible summary) provides no error bars, ablation results on the noise predictor components, or quantitative baseline tables; these omissions make it impossible to assess whether the reported gains are robust or attributable to the derived noise structure versus the pre-upsampler.
Authors: We acknowledge that the current experimental reporting lacks sufficient detail for full assessment. In the revised version, we will add error bars (computed over multiple independent runs) to all quantitative metrics, include ablation studies isolating the contributions of the noise predictor components (LR-guidance and multi-input awareness), and provide expanded quantitative baseline tables. These additions will demonstrate robustness and help attribute gains to the derived noise structure versus the pre-upsampler. revision: yes
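The promised error bars amount to reporting mean and sample standard deviation of each metric over independent seeds. A sketch with placeholder numbers (not results from the paper):

```python
import numpy as np

# Hypothetical per-seed values of a perceptual metric (e.g. LPIPS, lower
# is better) over five independent sampling runs -- placeholder data.
metric_per_run = np.array([0.312, 0.308, 0.315, 0.310, 0.309])

mean = metric_per_run.mean()
std = metric_per_run.std(ddof=1)  # sample std, the usual error-bar choice
print(f"metric: {mean:.4f} ± {std:.4f}")
```

Reporting this per method and per ablation (noise predictor on/off, pre-upsampler on/off) is what would let readers attribute the gains.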
Circularity Check
Theoretical MLE derivation of optimal noise stands independent of learnable instantiation
Full rationale
The paper first presents a theoretical framework deriving a closed-form analytical solution for optimal intermediate noise via maximum likelihood estimation, revealing a conditional dependence structure claimed to generalize across paradigms. This derivation is positioned prior to and separate from the subsequent instantiation under the residual-shifting paradigm and the design of an LR-guided multi-input-aware noise predictor. No equations or steps reduce the MLE result to fitted parameters by construction, nor rely on self-citations, ansatz smuggling, or renaming of known results. The end-to-end optimization of the 4-step chain is a practical engineering choice enabled by the short trajectory, but does not make the initial analytical derivation circular. The central claim therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- parameters of the LR-guided noise predictor
axioms (1)
- Domain assumption: the conditional dependence structure derived from MLE generalizes across diffusion paradigms.
invented entities (1)
- LR-guided multi-input-aware noise predictor (no independent evidence)
Reference graph
Works this paper leans on
- [1] Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David J. Fleet, and Mohammad Norouzi. Image super-resolution via iterative refinement. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(4):4713–4726, 2023.
- [2] Zongsheng Yue, Jianyi Wang, and Chen Change Loy. ResShift: Efficient diffusion model for image super-resolution by residual shifting. In Advances in Neural Information Processing Systems, volume 36, pages 13294–13307, 2023.
- [3] Jianyi Wang, Zongsheng Yue, Shangchen Zhou, Kelvin C.K. Chan, and Chen Change Loy. Exploiting diffusion prior for real-world image super-resolution. International Journal of Computer Vision, 132(12):5929–5949, 2024.
- [4] Zongsheng Yue, Kang Liao, and Chen Change Loy. Arbitrary-steps image super-resolution via diffusion inversion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 23153–23163, 2025.
- [5] Bahjat Kawar, Michael Elad, Stefano Ermon, and Jiaming Song. Denoising diffusion restoration models. In Advances in Neural Information Processing Systems, volume 35, pages 23593–23606, 2022.
- [6] Hyungjin Chung, Byeongsu Sim, and Jong Chul Ye. Come-closer-diffuse-faster: Accelerating conditional diffusion models for inverse problems through stochastic contraction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12413–12422, 2022.
- [7] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 10684–10695, 2022.
- [8] Rongyuan Wu, Lingchen Sun, Zhiyuan Ma, and Lei Zhang. One-step effective diffusion network for real-world image super-resolution. Advances in Neural Information Processing Systems, 37:92529–92553, 2024.
- [9] Rongyuan Wu, Tao Yang, Lingchen Sun, Zhengqiang Zhang, Shuai Li, and Lei Zhang. SeeSR: Towards semantics-aware real-world image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 25456–25467, 2024.
- [10] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. In Advances in Neural Information Processing Systems, volume 33, pages 6840–6851, 2020.
- [11] Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, and Jun Zhu. DPM-Solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps. Advances in Neural Information Processing Systems, 35:5775–5787, 2022.
- [12] Alexander Quinn Nichol and Prafulla Dhariwal. Improved denoising diffusion probabilistic models. In International Conference on Machine Learning, pages 8162–8171. PMLR, 2021.
- [13] Jiaming Song, Chenlin Meng, and Stefano Ermon. Denoising diffusion implicit models. arXiv preprint arXiv:2010.02502, 2020.
- [14] Yufei Wang, Wenhan Yang, Xinyuan Chen, Yaohui Wang, Lanqing Guo, Lap-Pui Chau, Ziwei Liu, Yu Qiao, Alex C. Kot, and Bihan Wen. SinSR: Diffusion-based image super-resolution in a single step. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 25796–25805, 2024.
- [15] Hyungjin Chung, Byeongsu Sim, Dohoon Ryu, and Jong Chul Ye. Improving diffusion models for inverse problems using manifold constraints. Advances in Neural Information Processing Systems, 35:25683–25696, 2022.
- [16] Hyungjin Chung, Jeongsol Kim, Michael T. Mccann, Marc L. Klasky, and Jong Chul Ye. Diffusion posterior sampling for general noisy inverse problems. arXiv preprint arXiv:2209.14687, 2022.
- [17] Ben Fei, Zhaoyang Lyu, Liang Pan, Junzhe Zhang, Weidong Yang, Tianyue Luo, Bo Zhang, and Bo Dai. Generative diffusion prior for unified image restoration and enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 9935–9946, 2023.
- [18] Jiaming Song, Arash Vahdat, Morteza Mardani, and Jan Kautz. Pseudoinverse-guided diffusion models for inverse problems. In International Conference on Learning Representations, 2023.
- [19] Jie Xiao, Ruili Feng, Han Zhang, Zhiheng Liu, Zhantao Yang, Yurui Zhu, Xueyang Fu, Kai Zhu, Yu Liu, and Zheng-Jun Zha. DreamClean: Restoring clean image using deep diffusion prior. In The Twelfth International Conference on Learning Representations, 2024.
- [20] Zongsheng Yue and Chen Change Loy. DifFace: Blind face restoration with diffused error contraction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(12):9991–10004, 2024.
- [21] Chao Dong, Chen Change Loy, Kaiming He, and Xiaoou Tang. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2):295–307, 2015.
- [22] Pablo Rojas Sedó. Deep learning for image super resolution. B.S. thesis, Universitat Politècnica de Catalunya, 2022.
- [23] Namhyuk Ahn, Byungkon Kang, and Kyung-Ah Sohn. Image super-resolution via progressive cascading residual network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 791–799, 2018.
- [24] Jiwon Kim, Jung Kwon Lee, and Kyoung Mu Lee. Accurate image super-resolution using very deep convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1646–1654, 2016.
- [25] Zhaowen Wang, Ding Liu, Jianchao Yang, Wei Han, and Thomas Huang. Deep networks for image super-resolution with sparse prior. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 370–378, 2015.
- [26] Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 4681–4690, 2017.
- [27] Sachit Menon, Alexandru Damian, Shijia Hu, Nikhil Ravi, and Cynthia Rudin. PULSE: Self-supervised photo upsampling via latent space exploration of generative models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 2437–2445, 2020.
- [28] Mehdi S.M. Sajjadi, Bernhard Scholkopf, and Michael Hirsch. EnhanceNet: Single image super-resolution through automated texture synthesis. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 4491–4500, 2017.
- [29] Ryan Dahl, Mohammad Norouzi, and Jonathon Shlens. Pixel recursive super resolution. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 5439–5448, 2017.
- [30] Jacob Menick and Nal Kalchbrenner. Generating high fidelity images with subscale pixel networks and multidimensional upscaling. arXiv preprint arXiv:1812.01608, 2018.
- [31] Aaron Van den Oord, Nal Kalchbrenner, Lasse Espeholt, Oriol Vinyals, Alex Graves, et al. Conditional image generation with PixelCNN decoders. Advances in Neural Information Processing Systems, 29, 2016.
- [32] Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Lukasz Kaiser, Noam Shazeer, Alexander Ku, and Dustin Tran. Image transformer. In International Conference on Machine Learning, pages 4055–4064. PMLR, 2018.
- [33] Baisong Guo, Xiaoyun Zhang, Haoning Wu, Yu Wang, Ya Zhang, and Yan-Feng Wang. LAR-SR: A local autoregressive model for image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1909–1918, 2022.
- [34] Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. arXiv preprint arXiv:1710.10196, 2017.
- [35] Jooyoung Choi, Sungwon Kim, Yonghyun Jeong, Youngjune Gwon, and Sungroh Yoon. ILVR: Conditioning method for denoising diffusion probabilistic models. arXiv preprint arXiv:2108.02938, 2021.
- [36] Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, and Daniel Cohen-Or. An image is worth one word: Personalizing text-to-image generation using textual inversion. arXiv preprint arXiv:2208.01618, 2022.
- [37] Ron Mokady, Amir Hertz, Kfir Aberman, Yael Pritch, and Daniel Cohen-Or. Null-text inversion for editing real images using guided diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 6038–6047, 2023.
- [38] Daiki Miyake, Akihiro Iohara, Yu Saito, and Toshiyuki Tanaka. Negative-prompt inversion: Fast image inversion for editing with text-guided diffusion models. In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 2063–2072, 2025.
- [39] Thao Nguyen, Yuheng Li, Utkarsh Ojha, and Yong Jae Lee. Visual instruction inversion: Image editing via image prompting. Advances in Neural Information Processing Systems, 36:9598–9613, 2023.
- [40] Xuan Ju, Ailing Zeng, Yuxuan Bian, Shaoteng Liu, and Qiang Xu. Direct inversion: Boosting diffusion-based editing with 3 lines of code. arXiv preprint arXiv:2310.01506, 2023.
- [41] Wonjun Kang, Kevin Galim, and Hyung Il Koo. Eta inversion: Designing an optimal eta function for diffusion-based real image editing. In European Conference on Computer Vision (ECCV), pages 90–106. Springer, 2024.
- [42] Barak Meiri, Dvir Samuel, Nir Darshan, Gal Chechik, Shai Avidan, and Rami Ben-Ari. Fixed-point inversion for text-to-image diffusion models. CoRR, 2023.
- [43] Bram Wallace, Akash Gokul, and Nikhil Naik. EDICT: Exact diffusion inversion via coupled transformations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 22532–22541, 2023.
- [44] Yiyang Ma, Huan Yang, Wenhan Yang, Jianlong Fu, and Jiaying Liu. Solving diffusion ODEs with optimal boundary conditions for better image super-resolution. arXiv preprint arXiv:2305.15357, 2023.
- [45] Jingyun Liang, Jiezhang Cao, Guolei Sun, Kai Zhang, Luc Van Gool, and Radu Timofte. SwinIR: Image restoration using Swin Transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 1833–1844, 2021.
- [46] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241. Springer, 2015.
- [47] Patrick Esser, Robin Rombach, and Björn Ommer. Taming transformers for high-resolution image synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12873–12883, 2021.
- [48] Xintao Wang, Liangbin Xie, Chao Dong, and Ying Shan. Real-ESRGAN: Training real-world blind super-resolution with pure synthetic data. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 1905–1914, 2021.
- [49] Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 586–595, 2018.
- [50] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.
- [51] Kai Zhang, Jingyun Liang, Luc Van Gool, and Radu Timofte. Designing a practical degradation model for deep blind image super-resolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 4791–4800, 2021.
- [52] Yawei Li, Kai Zhang, Jingyun Liang, Jiezhang Cao, Ce Liu, Rui Gong, Yulun Zhang, Hao Tang, Yun Liu, Denis Demandolx, et al. LSDIR: A large scale dataset for image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1775–1787, 2023.
- [53] Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 4401–4410, 2019.
- [54] Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.
- [55] Ilya Loshchilov and Frank Hutter. SGDR: Stochastic gradient descent with warm restarts. arXiv preprint arXiv:1608.03983, 2016.
- [56] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE, 2009.
- [57] Jianrui Cai, Hui Zeng, Hongwei Yong, Zisheng Cao, and Lei Zhang. Toward real-world single image super-resolution: A new benchmark and a new model. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 3086–3095, 2019.
- [58] Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.
- [59] Anish Mittal, Rajiv Soundararajan, and Alan C. Bovik. Making a "completely blind" image quality analyzer. IEEE Signal Processing Letters, 20(3):209–212, 2012.
- [60] Yochai Blau, Roey Mechrez, Radu Timofte, Tomer Michaeli, and Lihi Zelnik-Manor. The 2018 PIRM challenge on perceptual image super-resolution. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops, 2018.
- [61] Junjie Ke, Qifei Wang, Yilin Wang, Peyman Milanfar, and Feng Yang. MUSIQ: Multi-scale image quality transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 5148–5157, 2021.
- [62] Jianyi Wang, Kelvin C.K. Chan, and Chen Change Loy. Exploring CLIP for assessing the look and feel of images. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 2555–2563, 2023.