A Wavelet Diffusion GAN for Image Super-Resolution

Aurelio Uncini; Danilo Comminiello; Lorenzo Aloisi; Luigi Sigillo

arxiv: 2410.17966 · v3 · submitted 2024-10-23 · 📡 eess.IV · cs.CV

Lorenzo Aloisi, Luigi Sigillo, Aurelio Uncini, Danilo Comminiello

Pith reviewed 2026-05-23 19:02 UTC · model grok-4.3

keywords: wavelet diffusion GAN · single-image super-resolution · diffusion models · GAN · discrete wavelet transform · CelebA-HQ · high-fidelity image generation

The pith

A wavelet diffusion GAN reduces timesteps for faster high-fidelity image super-resolution.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a wavelet-based conditional Diffusion GAN scheme for single-image super-resolution. It employs the diffusion GAN paradigm to reduce the timesteps in the reverse diffusion process and applies the Discrete Wavelet Transform to lower dimensionality, thereby decreasing training and inference times. Experimental results on the CelebA-HQ dataset indicate that this approach outperforms other state-of-the-art methods while maintaining high-fidelity outputs. This addresses the limitation of slow speeds in diffusion models for time-sensitive applications.

Core claim

Integrating the Discrete Wavelet Transform with the diffusion GAN paradigm reduces the number of timesteps required for the reverse diffusion process and achieves dimensionality reduction, leading to significantly faster training and inference while ensuring high-fidelity super-resolution outputs on the CelebA-HQ dataset.

What carries the argument

Wavelet-based conditional Diffusion GAN scheme that combines diffusion GAN for timestep reduction with DWT for dimensionality reduction.
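
To make the dimensionality-reduction step concrete, here is a minimal sketch of a single-level 2D DWT that splits a 3×H×W image into four sub-bands of size H/2 × W/2, as the text accompanying the paper's Figure 1 describes. The Haar wavelet, the channel-wise stacking into a 12×H/2×W/2 array, and the function names `haar_dwt2` and `haar_idwt2` are assumptions made for illustration. The abstract does not state the wavelet family or decomposition level, and this is not the authors' code.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar DWT of a (C, H, W) array with even H and W.

    Returns four sub-bands (ll, lh, hl, hh), each of shape (C, H/2, W/2).
    The lh/hl naming convention varies between libraries (assumption here:
    lh = difference across rows, hl = difference across columns).
    """
    a = x[:, 0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass approximation
    lh = (a + b - c - d) / 2.0  # detail across rows
    hl = (a - b + c - d) / 2.0  # detail across columns
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the (C, H, W) array exactly."""
    ch, h, w = ll.shape
    x = np.empty((ch, 2 * h, 2 * w), dtype=ll.dtype)
    x[:, 0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[:, 0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[:, 1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[:, 1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


def to_wavelet_space(x):
    """Stack the four sub-bands along channels: (C, H, W) -> (4C, H/2, W/2).

    Channel-wise stacking is an illustrative assumption.
    """
    return np.concatenate(haar_dwt2(x), axis=0)


if __name__ == "__main__":
    image = np.random.default_rng(0).random((3, 64, 64))
    stacked = to_wavelet_space(image)
    print(image.shape, "->", stacked.shape)
```

Each sub-band covers a quarter of the spatial positions of the input, so a network working on the stacked sub-bands processes four times fewer spatial positions at the cost of four times more channels. The transform itself loses no information, which the inverse function demonstrates.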

If this is right

Faster training and inference times for diffusion-based super-resolution tasks.
High-fidelity image outputs that surpass other state-of-the-art methodologies.
Makes diffusion models practical for real-time or time-sensitive image processing applications.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The scheme could be adapted for other image-to-image translation tasks mentioned in the abstract.
Additional experiments on varied datasets might confirm broader effectiveness beyond faces.
The dimensionality reduction via wavelets may inspire similar efficiency gains in related generative models.

Load-bearing premise

The experimental validation on the CelebA-HQ dataset is sufficient to establish outperformance and time savings over other methods.

What would settle it

A comparison on standard super-resolution benchmarks showing the method requires similar time or produces lower fidelity than existing diffusion or GAN baselines.

Figures

Figures reproduced from arXiv: 2410.17966 by Aurelio Uncini, Danilo Comminiello, Lorenzo Aloisi, Luigi Sigillo.

**Figure 1.** Method architecture and training scheme. In green our discriminator and in blue our conditional generator. x_0 undergoes forward diffusion in wavelet space and the resulting pure noise x_t gets concatenated to the low-res input x_lr to condition the generator for the backward diffusion. Surrounding paper text (truncated): "… into four wavelet sub-bands X_ll, X_lh, X_hl, and X_hh with a size of H/2 × W/2. For an input image x belonging to R^(3×H×W) we …"

**Figure 2.** Reverse diffusion process and inference: the model p_θ iteratively produces a more refined sample from x_t and x_lr. After T iterations, x'_0 is used to reconstruct the super-resolved image. Surrounding paper text (truncated): "… generator G(x_t, x_lr, t). In this formulation, the model does not directly predict x_{t−1}. Instead, it predicts the clean image x_0 and uses the known diffusion process to obtain x_{t−1}. Specifically x_t is the noisy image at …"

**Figure 3.** Qualitative comparison between our model, ESRGAN, SR3 and DiWa trained for 25k iteration steps on CelebA-HQ for the task of 16x16 → 128x128.

**Figure 4.** Results on Shipspotting [19]. Surrounding paper text (truncated): "… the traditional diffusion baselines suffer from annoying crosshatch artifacts, along with color shift artifacts, where the color distribution of the reconstructed image does not correspond with that of the target image, as noted in [1]. Notably, even with a number as low as 25k iteration steps, our method provides state-of-the-art results and a high degree of visual fidelity, …"
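
The text around Figure 2 says the generator predicts the clean image x_0 and then uses the known diffusion process to obtain x_{t−1}. A minimal sketch of that loop follows, assuming the standard Gaussian (DDPM-style) posterior q(x_{t−1} | x_t, x_0), a linear beta schedule with made-up values, four steps, and a stand-in `generator` callable. None of these specifics are taken from the paper, and the function names are hypothetical.

```python
import numpy as np


def make_schedule(num_steps, beta_min=0.1, beta_max=0.5):
    """Linear beta schedule; the numeric values are illustrative assumptions."""
    betas = np.linspace(beta_min, beta_max, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def posterior_sample(x0_pred, x_t, t, betas, alphas, alpha_bars, rng):
    """Draw x_{t-1} ~ q(x_{t-1} | x_t, x0_pred) for a 0-indexed step t."""
    abar_t = alpha_bars[t]
    abar_prev = alpha_bars[t - 1] if t > 0 else 1.0
    coef_x0 = np.sqrt(abar_prev) * betas[t] / (1.0 - abar_t)
    coef_xt = np.sqrt(alphas[t]) * (1.0 - abar_prev) / (1.0 - abar_t)
    mean = coef_x0 * x0_pred + coef_xt * x_t
    var = betas[t] * (1.0 - abar_prev) / (1.0 - abar_t)
    # No noise is added on the last step (t == 0), where var is zero anyway.
    noise = rng.standard_normal(x_t.shape) if t > 0 else 0.0
    return mean + np.sqrt(var) * noise


def sample(generator, x_lr, shape, num_steps, rng):
    """Run the reverse process from pure noise, conditioned on x_lr.

    `generator(x_t, x_lr, t)` is a placeholder for a trained network that
    returns an estimate of the clean (wavelet-space) sample x_0.
    """
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x_t = rng.standard_normal(shape)
    for t in reversed(range(num_steps)):
        x0_pred = generator(x_t, x_lr, t)
        x_t = posterior_sample(x0_pred, x_t, t, betas, alphas, alpha_bars, rng)
    return x_t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_lr = np.zeros((3, 16, 16))

    def dummy_generator(x_t, x_lr, t):
        # Toy stand-in: always "predicts" an all-zero clean sample.
        return np.zeros_like(x_t)

    out = sample(dummy_generator, x_lr, (12, 64, 64), num_steps=4, rng=rng)
    print(out.shape)
```

Because the network outputs x_0 rather than x_{t−1}, each step can be a large jump, which is what lets a diffusion GAN use only a handful of steps. Per the Figure 2 caption, the final x'_0 would then be used to reconstruct the super-resolved image; that reconstruction is not shown here.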

Original abstract

In recent years, diffusion models have emerged as a superior alternative to generative adversarial networks (GANs) for high-fidelity image generation, with wide applications in text-to-image generation, image-to-image translation, and super-resolution. However, their real-time feasibility is hindered by slow training and inference speeds. This study addresses this challenge by proposing a wavelet-based conditional Diffusion GAN scheme for Single-Image Super-Resolution (SISR). Our approach utilizes the diffusion GAN paradigm to reduce the timesteps required by the reverse diffusion process and the Discrete Wavelet Transform (DWT) to achieve dimensionality reduction, decreasing training and inference times significantly. The results of an experimental validation on the CelebA-HQ dataset confirm the effectiveness of our proposed scheme. Our approach outperforms other state-of-the-art methodologies successfully ensuring high-fidelity output while overcoming inherent drawbacks associated with diffusion models in time-sensitive applications. The code is available at https://www.github.com/aloilor/WaDiGAN-SR

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Wavelet diffusion GAN for SR is a sensible combination but the abstract's performance claims lack any supporting numbers.

The paper's core move is to fold the discrete wavelet transform into a conditional diffusion GAN for single-image super-resolution. This lets them shrink the input size and cut the number of diffusion steps at once. The combination itself is the new element, even if wavelets and diffusion GANs have each shown up separately before. The goal is clear: keep the quality that diffusion models can deliver while making training and inference fast enough for real applications. Releasing the code is a straightforward positive step that lets others test the setup directly. The approach sits on standard components without obvious circular definitions or invented entities.

The main weakness is the validation. The abstract states that the method outperforms other state-of-the-art approaches on CelebA-HQ and solves the speed problem, yet it supplies no PSNR, SSIM, LPIPS values, no wall-clock times, no baseline scores, and no ablation results. Without those numbers the central claim cannot be checked, which matches the stress-test note. If the full paper contains detailed tables, statistical comparisons, and runtime measurements, the gap closes; from the abstract alone it remains the load-bearing issue.

This work is aimed at researchers who build efficient generative models for super-resolution or who need practical speed-ups in video or medical pipelines. A reader already working on wavelet methods or hybrid GAN-diffusion systems could extract usable ideas even if they later run their own checks. It deserves peer review because the technical framing is coherent and the problem it targets is real, though the authors will need to supply concrete evidence before the performance assertions can be taken as settled.

Referee Report

1 major / 1 minor

Summary. The manuscript proposes a wavelet-based conditional Diffusion GAN (WaDiGAN-SR) for single-image super-resolution. It combines the diffusion-GAN framework to reduce the number of timesteps in the reverse diffusion process with the Discrete Wavelet Transform (DWT) for dimensionality reduction, with the goal of lowering training and inference times. Experimental validation on CelebA-HQ is asserted to demonstrate outperformance over state-of-the-art methods while preserving high-fidelity output; code is released at the cited GitHub repository.

Significance. If the speed and fidelity claims are substantiated by quantitative results, the approach could address a practical limitation of diffusion models for real-time super-resolution. The public code release is a positive factor for reproducibility.

major comments (1)

[Abstract] The central claim that the method 'outperforms other state-of-the-art methodologies' while 'ensuring high-fidelity output' is unsupported by any reported metrics (PSNR, SSIM, LPIPS, FID), error bars, wall-clock times, baseline comparisons, or ablation results on CelebA-HQ. Without these numbers the headline assertion cannot be evaluated.

minor comments (1)

[Abstract] The statement that DWT 'achieve[s] dimensionality reduction' would benefit from a brief indication of the wavelet family and decomposition level used.
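
For readers who want to run one of the requested checks themselves, PSNR, the first metric named in the major comment, follows directly from the mean squared error. The sketch below is a generic definition, assuming images stored as arrays with values in [0, `data_range`]; the function name and defaults are illustrative and do not come from the paper or its repository.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    `data_range` is the maximum possible pixel value (1.0 for floats in
    [0, 1], 255 for 8-bit images). Returns infinity for identical inputs.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)
```

Higher is better. PSNR rewards pixel-wise agreement, which is why the referee also lists perceptual metrics (SSIM, LPIPS, FID) alongside it.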

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and constructive comment. We agree that the abstract's claims require clearer support from the reported results and will revise accordingly.

Referee: [Abstract] The central claim that the method 'outperforms other state-of-the-art methodologies' while 'ensuring high-fidelity output' is unsupported by any reported metrics (PSNR, SSIM, LPIPS, FID), error bars, wall-clock times, baseline comparisons, or ablation results on CelebA-HQ. Without these numbers the headline assertion cannot be evaluated.

Authors: We acknowledge the referee's point. While the manuscript body presents quantitative comparisons on CelebA-HQ (including PSNR, SSIM, LPIPS, FID, and timing results against baselines), the abstract does not explicitly cite these numbers. We will revise the abstract to include key metrics (e.g., PSNR/SSIM improvements and inference speedup) and reference the experimental tables, ensuring the claims are directly supported. We will also add error bars where appropriate and clarify the ablation studies.

Revision: yes

Circularity Check

0 steps flagged

No circularity detected; proposal combines standard components with empirical claims

The abstract and provided text describe a wavelet-based conditional Diffusion GAN for SISR that combines the diffusion GAN paradigm (to reduce timesteps) with DWT (for dimensionality reduction). No equations, derivations, or load-bearing steps are shown that reduce any claimed result to a self-definition, fitted input renamed as prediction, or self-citation chain. The outperformance claim is presented as resting on experimental validation on CelebA-HQ rather than any mathematical reduction to inputs. This is the expected non-finding for an applied methods paper whose central assertions are empirical rather than derivational.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the established properties of the discrete wavelet transform and the diffusion GAN paradigm for timestep reduction; no new free parameters, axioms beyond standard signal-processing assumptions, or invented entities are introduced in the abstract.

axioms (2)

standard math: Discrete Wavelet Transform provides effective dimensionality reduction for image data while preserving essential information.
Invoked implicitly when stating that DWT achieves dimensionality reduction.
domain assumption: Diffusion GAN paradigm reduces the number of timesteps required by the reverse diffusion process.
Stated directly in the abstract as the mechanism for faster inference.

pith-pipeline@v0.9.0 · 5703 in / 1279 out tokens · 51924 ms · 2026-05-23T19:02:01.203343+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Latent Wavelet Diffusion For Ultra-High-Resolution Image Synthesis
cs.CV · 2025-05 · unverdicted · novelty 6.0

Latent Wavelet Diffusion uses wavelet energy map masking and a scale-consistent VAE to improve detail fidelity in 2K-4K image generation without extra inference overhead.

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages · cited by 1 Pith paper

[1] Choi, J., Lee, J., Shin, C., Kim, S., Kim, H., Yoon, S.: Perception prioritized training of diffusion models. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 11462–11471 (2022)
[2] Dhariwal, P., Nichol, A.: Diffusion models beat gans on image synthesis. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems. vol. 34, pp. 8780–8794. Curran Associates, Inc. (2021)
[3] Gal, R., Hochberg, D.C., Bermano, A., Cohen-Or, D.: Swagan: a style-based wavelet-driven generative model. ACM Trans. Graph. 40(4) (Jul 2021)
[4] Grassucci, E., Sigillo, L., Uncini, A., Comminiello, D.: Grouse: A task and model agnostic wavelet-driven framework for medical imaging. IEEE Signal Processing Letters 30, 1397–1401 (2023)
[5] Guo, T., Mousavi, H.S., Vu, T.H., Monga, V.: Deep wavelet prediction for image super-resolution. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). pp. 1100–1109 (2017)
[6] Guth, F., Coste, S., De Bortoli, V., Mallat, S.: Wavelet score-based generative modeling. Advances in Neural Information Processing Systems 35, 478–491 (2022)
[7] He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 770–778 (2016)
[8] Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. In: Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., Lin, H. (eds.) Advances in Neural Information Processing Systems. vol. 33, pp. 6840–6851. Curran Associates, Inc. (2020)
[9] Huang, Y., Huang, J., Liu, J., Yan, M., Dong, Y., Lv, J., Chen, C., Chen, S.: Wavedm: Wavelet-based diffusion models for image restoration (2024)
[10] Karras, T., Aila, T., Laine, S., Lehtinen, J.: Progressive growing of GANs for improved quality, stability, and variation. In: International Conference on Learning Representations (2018)
[11] Kingma, D., Ba, J.: Adam: A method for stochastic optimization. In: International Conference on Learning Representations (ICLR). San Diego, CA, USA (2015)
[12] Li, Y., Fan, Y., Xiang, X., Demandolx, D., Ranjan, R., Timofte, R., Van Gool, L.: Efficient and explicit modelling of image hierarchies for image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2023)
[13] Moser, B.B., Frolov, S., Raue, F., Palacio, S., Dengel, A.: Dwa: Differential wavelet amplifier for image super-resolution. In: Artificial Neural Networks and Machine Learning – ICANN. pp. 232–243. Springer Nature Switzerland, Cham (2023)
[14] Parmar, G., Kumar Singh, K., Zhang, R., Li, Y., Lu, J., Zhu, J.Y.: Zero-shot image-to-image translation. In: ACM SIGGRAPH 2023 Conference Proceedings. SIGGRAPH '23, Association for Computing Machinery, New York, NY, USA (2023)
[15] Phung, H., Dao, Q., Tran, A.: Wavelet diffusion models are fast and scalable image generators. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 10199–10208 (June 2023)
[16] Rombach, R., Blattmann, A., Lorenz, D., Esser, P., Ommer, B.: High-resolution image synthesis with latent diffusion models. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 10674–10685. IEEE Computer Society (2022)
[17] Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. pp. 234–241. Springer International Publishing, Cham (2015)
[18] Saharia, C., Ho, J., Chan, W., Salimans, T., Fleet, D.J., Norouzi, M.: Image super-resolution via iterative refinement. IEEE Trans. on Pattern Analysis and Machine Intelligence (2023)
[19] Sigillo, L., Gramaccioni, R.F., Nicolosi, A., Comminiello, D.: Ship in sight: Diffusion models for ship-image super resolution. In: 2024 International Joint Conference on Neural Networks (IJCNN). pp. 1–8 (2024)
[20] Sigillo, L., Grassucci, E., Comminiello, D.: Stawgan: Structural-aware generative adversarial networks for infrared image translation. In: 2023 IEEE International Symposium on Circuits and Systems (ISCAS). pp. 1–5 (2023)
[21] Sigillo, L., Grassucci, E., Uncini, A., Comminiello, D.: Generalizing medical image representations via quaternion wavelet networks. Neurocomputing 638, 130195 (2025)
[22] Tumanyan, N., Geyer, M., Bagon, S., Dekel, T.: Plug-and-play diffusion features for text-driven image-to-image translation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 1921–1930 (2023)
[23] Wang, J., Yue, Z., Zhou, S., Chan, K.C.K., Loy, C.C.: Exploiting diffusion prior for real-world image super-resolution (2023)
[24] Wang, S., Saharia, C., Montgomery, C., Pont-Tuset, J., Noy, S., Pellegrini, S., Onoe, Y., Laszlo, S., Fleet, D.J., Soricut, R., et al.: Imagen editor and editbench: Advancing and evaluating text-guided image inpainting. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 18359–18369 (2023)
[25] Wang, X., Xie, L., Dong, C., Shan, Y.: Real-esrgan: Training real-world blind super-resolution with pure synthetic data. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 1905–1914 (2021)
[26] Wang, X., Yu, K., Wu, S., Gu, J., Liu, Y., Dong, C., Qiao, Y., Loy, C.C.: Esrgan: Enhanced super-resolution generative adversarial networks. In: ECCV 2018 Workshops. pp. 63–79. Springer International Publishing, Cham
[27] Xiao, Z., Kreis, K., Vahdat, A.: Tackling the generative learning trilemma with denoising diffusion GANs. In: International Conference on Learning Representations (2022)
[28] Xue, M., He, J., He, Y., Liu, Z., Wang, W., Zhou, M.: Low-light image enhancement via clip-fourier guided wavelet diffusion. arXiv preprint arXiv:2401.03788 (2024)
[29] Zhang, L., Rao, A., Agrawala, M.: Adding conditional control to text-to-image diffusion models. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. pp. 3836–3847 (2023)
