arxiv: 1907.11880 · v1 · pith:YYQC5FPOnew · submitted 2019-07-27 · 📡 eess.IV · cs.CV

Blind Deblurring Using GANs

Pith reviewed 2026-05-24 14:47 UTC · model grok-4.3

classification 📡 eess.IV cs.CV
keywords blind deblurring · GAN · non-local block · residual connection · image restoration · attention mechanism · supervised learning

The pith

Non-local attention blocks and residual connections in GANs improve blind deblurring by supplying global image perception.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests modifications to GAN architectures for restoring sharp images from blurred ones when the blur kernel is unknown. Standard convolutions give only local information, which falls short for non-uniform blur that varies across an image. The authors insert non-local blocks to compute dependencies over the whole image and add residual connections to pass features from early layers to later ones. They also combine several loss terms with the usual adversarial loss and try edge maps plus feedback modules. These steps are meant to produce clearer output images in the supervised setting where blurred-sharp pairs are available.

Core claim

The paper claims that inserting non-local attention modules into the GAN encoder-decoder gives the global perception required for non-uniform blur, that residual connections improve results by combining lower-layer features with upper layers, and that adding L1, L2, and perceptual losses alongside the adversarial loss aids training stability for better image restoration.
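The composite objective described above can be sketched as a weighted sum of the four terms. This is a minimal illustration, not the authors' implementation: the weights, the non-saturating form of the adversarial term, and the `horizontal_gradient` stand-in for a pretrained feature extractor are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def l1_loss(restored, sharp):
    # Mean absolute pixel error.
    return float(np.mean(np.abs(restored - sharp)))

def l2_loss(restored, sharp):
    # Mean squared pixel error.
    return float(np.mean((restored - sharp) ** 2))

def horizontal_gradient(image):
    # Hypothetical stand-in for a fixed pretrained feature extractor.
    return np.diff(image, axis=-1)

def perceptual_loss(restored, sharp, features):
    # Squared error measured in feature space instead of pixel space.
    return float(np.mean((features(restored) - features(sharp)) ** 2))

def adversarial_loss(disc_prob_on_restored, eps=1e-12):
    # Assumed non-saturating generator loss: -log D(G(blurred)).
    return float(-np.mean(np.log(disc_prob_on_restored + eps)))

def generator_loss(restored, sharp, disc_prob, features,
                   w_adv=1.0, w_l1=1.0, w_l2=1.0, w_perc=1.0):
    # Weighted sum of adversarial, L1, L2, and perceptual terms.
    # All weights are illustrative defaults, not values from the paper.
    return (w_adv * adversarial_loss(disc_prob)
            + w_l1 * l1_loss(restored, sharp)
            + w_l2 * l2_loss(restored, sharp)
            + w_perc * perceptual_loss(restored, sharp, features))
```

The pixel and feature terms give the generator a dense gradient even when the discriminator signal is uninformative, which is one common explanation for the stability benefit the paper reports.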

What carries the argument

The non-local block, an attention module that models long-range dependencies across the entire image, placed inside the GAN to overcome the local-only view of convolutions.
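A minimal sketch of such a block, assuming the embedded-Gaussian (softmax) form of non-local attention with a residual output, may help make the mechanism concrete. The weight matrices play the role of 1x1 convolutions; their names and shapes are illustrative and are not taken from the paper.

```python
import numpy as np

def softmax(scores, axis=-1):
    # Numerically stable softmax.
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def non_local_block(x, w_theta, w_phi, w_g, w_out):
    """Apply one non-local (self-attention) block to a feature map.

    x:       feature map of shape (C, H, W)
    w_theta, w_phi, w_g: (C, C_inner) projections (1x1 convolutions)
    w_out:   (C_inner, C) output projection
    Returns the output map (C, H, W) and the (HW, HW) attention matrix.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w).T              # (N, C), one row per position
    theta = flat @ w_theta                    # queries, (N, C_inner)
    phi = flat @ w_phi                        # keys,    (N, C_inner)
    g = flat @ w_g                            # values,  (N, C_inner)
    attn = softmax(theta @ phi.T, axis=-1)    # every position attends to all
    y = attn @ g                              # aggregate over the whole image
    z = y @ w_out + flat                      # residual connection
    return z.T.reshape(c, h, w), attn
```

Each output position is a weighted mix of features from every other position, which is the global view a stack of small convolutions only reaches after many layers. With `w_out` set to zeros the block reduces to the identity, so it can be inserted into an existing network without changing its initial behaviour.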

If this is right

  • The model becomes able to restore images whose blur changes from one region to another.
  • Feature reuse across layers produces sharper results than the unmodified encoder-decoder.
  • Training becomes more stable when the adversarial objective is joined with L1, L2, and perceptual terms.
  • Edge maps and feedback loops can be added without changing the core architecture.
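Two of the points above, feature reuse through residual addition and edge maps as an extra input, can be sketched in a few lines. This is an illustration under stated assumptions: the edge detector here is a plain gradient magnitude used as a stand-in (the paper cites Canny edge detection in its references), and the function names are hypothetical.

```python
import numpy as np

def edge_map(gray):
    # Stand-in edge detector: normalised gradient magnitude.
    # Assumption: a real pipeline would likely use a proper detector.
    gy, gx = np.gradient(gray.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    peak = magnitude.max()
    return magnitude / peak if peak > 0 else magnitude

def concat_edges(image):
    # image: (C, H, W). Returns (C + 1, H, W) with the edge map appended
    # as one extra input channel, leaving the generator body unchanged.
    gray = image.mean(axis=0)
    return np.concatenate([image, edge_map(gray)[None]], axis=0)

def residual_add(lower, upper):
    # Add features from an earlier (lower) layer to a later (upper) layer.
    if lower.shape != upper.shape:
        raise ValueError("residual addition needs matching shapes")
    return lower + upper
```

Because the edge map only widens the input by one channel, only the first convolution has to change, which is consistent with the claim that the core architecture stays the same.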

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same non-local additions could be tried in other image-to-image tasks that need whole-image context.
  • Attention layers might let deblurring models stay shallow and fast while still seeing the full scene.
  • The supervised gains might carry over to unpaired data if the attention mechanism learns blur patterns independently of exact pairs.

Load-bearing premise

That adding non-local blocks will give the needed global view of the image without creating training convergence problems or slowing inference.

What would settle it

A side-by-side test on a standard paired deblurring dataset where the version with non-local blocks shows no gain in sharpness metrics or visual quality over the same GAN without them.

Figures

Figures reproduced from arXiv: 1907.11880 by Anubha Pandey, Anurag Mittal, Manoj Kumar Lenka.

Figure 1: Representation of Self Attention, the symbols mean the same as in …
Figure 2: Representation of Channel Attention, the symbols mean the same as …
Figure 3: Representation of how a feedback network works.
Figure 4: Each encode block consists of a convolution with stride 2 and padding …
Figure 6: Representation of DeblurGAN [8]

4.3 Metrics

4.3.1 PSNR

PSNR can be thought of as the reciprocal of MSE. MSE can be calculated as

$$\mathrm{MSE} = \frac{\sum_{H,W} (S - R)^2}{H \times W} \tag{8}$$

where H, W are the size of the dimensions of the image. S and R are the sharp and restored image respectively. Given MSE, PSNR can be calculated using

$$\mathrm{PSNR} = \frac{m^2}{\mathrm{MSE}} \tag{9}$$

where m is the maximum possible intensity value, since we are using 8-bit int…
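The two metric formulas in the excerpt above can be checked numerically with a short sketch. Equation (9) appears in the excerpt as a plain ratio, so `psnr_ratio` follows it literally; `psnr_db` is the widely used decibel form, 10 log10(m^2 / MSE), included for comparison. The default m = 255 assumes 8-bit images, and all function names are illustrative.

```python
import numpy as np

def mse(sharp, restored):
    # Equation (8): mean squared error over all pixels.
    # Cast first so that 8-bit inputs do not wrap around on subtraction.
    s = np.asarray(sharp, dtype=np.float64)
    r = np.asarray(restored, dtype=np.float64)
    return float(np.mean((s - r) ** 2))

def psnr_ratio(sharp, restored, m=255.0):
    # Equation (9) as it appears in the excerpt: m^2 / MSE.
    error = mse(sharp, restored)
    return float("inf") if error == 0 else m ** 2 / error

def psnr_db(sharp, restored, m=255.0):
    # Common decibel form of PSNR: 10 * log10(m^2 / MSE).
    ratio = psnr_ratio(sharp, restored, m)
    return float("inf") if ratio == float("inf") else 10.0 * np.log10(ratio)
```

Both forms rise as MSE falls, which matches the excerpt's description of PSNR as behaving like the reciprocal of MSE; published deblurring results are usually quoted in decibels.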
Original abstract

Deblurring is the task of restoring a blurred image to a sharp one, retrieving the information lost due to the blur. In blind deblurring we have no information regarding the blur kernel. As deblurring can be considered as an image to image translation task, deep learning based solutions, including the ones which use GAN (Generative Adversarial Network), have been proven effective for deblurring. Most of them have an encoder-decoder structure. Our objective is to try different GAN structures and improve its performance through various modifications to the existing structure for supervised deblurring. In supervised deblurring we have pairs of blurred and their corresponding sharp images, while in the unsupervised case we have a set of blurred and sharp images but their is no correspondence between them. Modifications to the structures is done to improve the global perception of the model. As blur is non-uniform in nature, for deblurring we require global information of the entire image, whereas convolution used in CNN is able to provide only local perception. Deep models can be used to improve global perception but due to large number of parameters it becomes difficult for it to converge and inference time increases, to solve this we propose the use of attention module (non-local block) which was previously used in language translation and other image to image translation tasks in deblurring. Use of residual connection also improves the performance of deblurring as features from the lower layers are added to the upper layers of the model. It has been found that classical losses like L1, L2, and perceptual loss also help in training of GANs when added together with adversarial loss. We also concatenate edge information of the image to observe its effects on deblurring. We also use feedback modules to retain long term dependencies

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes modifications to GAN architectures for supervised blind deblurring, including non-local attention blocks to improve global perception for non-uniform blur, residual connections, composite losses (adversarial combined with L1, L2, and perceptual), edge information concatenation, and feedback modules. These changes are motivated as addressing limitations of local convolutions and deep-model convergence/inference issues while enhancing performance on paired blurred-sharp image data.

Significance. If the modifications yield measurable gains on standard paired deblurring benchmarks without exacerbating runtime or training issues, the work could contribute an incremental architectural recipe for GAN-based image restoration. The explicit combination of non-local blocks with residual paths and multi-term losses is a concrete proposal that could be tested for reproducibility, though the abstract supplies no metrics, ablations, or timing data to ground the claimed benefits.

major comments (2)
  1. [Abstract] Abstract: the central motivation asserts that non-local blocks solve the convergence and inference-time problems of deep models while supplying global perception. This is load-bearing for the proposed architecture, yet non-local attention computes pairwise similarities over all spatial positions and therefore incurs O((HW)^2) complexity per layer, directly increasing rather than mitigating the stated computational burdens.
  2. [Abstract] Abstract: performance improvements are claimed for residual connections, composite losses, edge concatenation, and feedback modules, but the provided text contains no quantitative results, ablation tables, or comparisons against baselines on representative paired datasets, rendering the empirical claims unevaluable.
minor comments (2)
  1. [Abstract] Abstract: grammatical error 'their is no correspondence' should read 'there is no correspondence'.
  2. [Abstract] Abstract: the final sentence on feedback modules is truncated and does not specify how long-term dependencies are retained or integrated into the GAN.
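The quadratic cost raised in the first major comment is easy to quantify. The sketch below estimates the size of the attention matrix for one non-local layer; the float32 storage and the FLOP count (two dense matrix products, two operations per multiply-add) are simplifying assumptions, and the function name is hypothetical.

```python
def attention_cost(height, width, channels, bytes_per_value=4):
    # Number of spatial positions that attend to each other.
    positions = height * width
    # One similarity score per ordered pair of positions: (HW)^2 entries.
    entries = positions * positions
    return {
        "positions": positions,
        "attention_entries": entries,
        # Memory for the attention matrix alone, assuming float32 values.
        "attention_bytes": entries * bytes_per_value,
        # Rough estimate: scores (N x N x C) plus aggregation (N x N x C),
        # counting each multiply-add as two operations.
        "matmul_flops": 4 * entries * channels,
    }

small = attention_cost(64, 64, 32)
large = attention_cost(128, 128, 32)
# Doubling both spatial dimensions multiplies the attention matrix by 16.
growth = large["attention_entries"] // small["attention_entries"]
```

For a 64 x 64 feature map the attention matrix already holds about 16.8 million entries (64 MiB in float32), which is why such blocks are typically placed at low-resolution stages of an encoder-decoder.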

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the major comments point by point below, proposing revisions to the abstract where the concerns are valid.

  1. Referee: [Abstract] Abstract: the central motivation asserts that non-local blocks solve the convergence and inference-time problems of deep models while supplying global perception. This is load-bearing for the proposed architecture, yet non-local attention computes pairwise similarities over all spatial positions and therefore incurs O((HW)^2) complexity per layer, directly increasing rather than mitigating the stated computational burdens.

    Authors: We acknowledge the validity of this observation. Non-local attention does incur quadratic complexity, which can increase computational cost. The intent in the manuscript was to use non-local blocks to capture long-range dependencies for non-uniform blur without relying solely on deeper convolutional stacks, but the abstract wording overstates the resolution of inference-time issues. We will revise the abstract to provide a more precise motivation that notes the trade-off in complexity. revision: yes

  2. Referee: [Abstract] Abstract: performance improvements are claimed for residual connections, composite losses, edge concatenation, and feedback modules, but the provided text contains no quantitative results, ablation tables, or comparisons against baselines on representative paired datasets, rendering the empirical claims unevaluable.

    Authors: The abstract is a high-level summary of the proposed modifications. The full manuscript contains the experimental results, including comparisons on paired deblurring datasets. To improve evaluability from the abstract alone, we will add key quantitative metrics and baseline comparisons to the revised abstract. revision: yes

Circularity Check

0 steps flagged

No circularity: architectural proposals rest on empirical modifications without self-referential derivations

The paper proposes empirical modifications to GAN architectures for supervised blind deblurring (non-local attention blocks for global perception, residual connections, composite losses including L1/L2/perceptual, edge concatenation, and feedback modules). No equations, fitted parameters, or predictions are defined or derived in the provided text. Claims rely on prior uses of non-local blocks in other domains and standard observations about CNN limitations, without any self-citation chains, ansatzes smuggled via citation, or reductions where outputs equal inputs by construction. The derivation chain is self-contained as a set of design choices evaluated on paired data.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The central claim implicitly assumes that global context is the dominant missing factor in prior CNN-based deblurring and that attention modules will supply it without side effects.

pith-pipeline@v0.9.0 · 5856 in / 1052 out tokens · 21172 ms · 2026-05-24T14:47:51.684862+00:00 · methodology

Reference graph

Works this paper leans on

20 extracted references · 20 canonical work pages · 4 internal anchors

  1. [1] J. Canny. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6):679–698, Nov 1986.

  2. [2] Dong Gong, Jie Yang, Lingqiao Liu, Yanning Zhang, Ian Reid, Chunhua Shen, Anton van den Hengel, and Qinfeng Shi. From motion blur to motion flow: A deep learning solution for removing heterogeneous motion blur. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.

  3. [3] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 27, pages 2672–2680. Curran Associates, Inc., 2014.

  4. [4] Jie Hu, Li Qin Shen, and Gang Sun. Squeeze-and-excitation networks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7132–7141, 2018.

  5. [5] Gao Huang, Zhuang Liu, Laurens van der Maaten, and Kilian Q. Weinberger. Densely connected convolutional networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.

  6. [6] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. Image-to-image translation with conditional adversarial networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.

  7. [7] Justin Johnson, Alexandre Alahi, and Li Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision, 2016.

  8. [8] Orest Kupyn, Volodymyr Budzan, Mykola Mykhailych, Dmytro Mishkin, and Jiří Matas. Deblurgan: Blind motion deblurring using conditional adversarial networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.

  9. [9] Takeru Miyato, Toshiki Kataoka, Masanori Koyama, and Yuichi Yoshida. Spectral normalization for generative adversarial networks. In International Conference on Learning Representations, 2018.

  10. [10] Seungjun Nah, Tae Hyun Kim, and Kyoung Mu Lee. Deep multi-scale convolutional neural network for dynamic scene deblurring. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.

  11. [11] Mehdi Noroozi, Paramanand Chandramouli, and Paolo Favaro. Motion deblurring in the wild. CoRR, abs/1701.01486, 2017.

  12. [12] Sainandan Ramakrishnan, Shubham Pachori, Aalok Gangopadhyay, and Shanmuganathan Raman. Deep generative filter for motion deblurring. CoRR, abs/1709.03481, 2017.

  13. [13] O. Ronneberger, P. Fischer, and T. Brox. U-net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI), volume 9351 of LNCS, pages 234–241. Springer, 2015. (available on arXiv:1505.04597 [cs.CV])

  14. [14] Jian Sun, Wenfei Cao, Zongben Xu, and Jean Ponce. Learning a convolutional neural network for non-uniform motion blur removal. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.

  15. [15] Xin Tao, Hongyun Gao, Xiaoyong Shen, Jue Wang, and Jiaya Jia. Scale-recurrent network for deep image deblurring. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.

  16. [16] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 5998–6008. Curran Associates, Inc., 2017.

  17. [17] Xiaolong Wang, Ross B. Girshick, Abhinav Gupta, and Kaiming He. Non-local neural networks. CoRR, abs/1711.07971, 2017.

  18. [18] Han Zhang, Ian Goodfellow, Dimitris Metaxas, and Augustus Odena. Self-attention generative adversarial networks. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 7354–7363, Long Beach, California, USA, 09–15 Jun

  19. [19] Jiawei Zhang, Jinshan Pan, Jimmy Ren, Yibing Song, Linchao Bao, Rynson W.H. Lau, and Ming-Hsuan Yang. Dynamic scene deblurring using spatially variant recurrent neural networks. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.

  20. [20] Yulun Zhang, Kunpeng Li, Kai Li, Lichen Wang, Bineng Zhong, and Yun Fu. Image super-resolution using very deep residual channel attention networks. In The European Conference on Computer Vision (ECCV), September 2018.

Indian Academy of Sciences Summer Research Fellowship Program

Appendices

A Abbreviations

Below are the abbreviations used: GAN...