arxiv: 1907.03128 · v1 · pith:YOHXSMQ4new · submitted 2019-07-06 · 💻 cs.CV · eess.IV

Multi-level Wavelet Convolutional Neural Networks

Pith reviewed 2026-05-25 01:33 UTC · model grok-4.3

classification 💻 cs.CV eess.IV
keywords wavelet transform · convolutional neural network · image restoration · receptive field · U-Net · denoising · super-resolution · pooling

The pith

Multi-level wavelet transforms embedded in CNNs allow larger receptive fields with less information loss than pooling or dilated convolutions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a multi-level wavelet CNN that inserts wavelet decomposition into the network to downsample features while expanding the receptive field. This is intended to overcome the information loss from pooling and the gridding artifacts from dilated filters. The architecture uses a U-Net base with inverse wavelet transforms to rebuild high-resolution maps for image restoration tasks. A sympathetic reader would care because it promises a more efficient way to handle large context in convolutional networks for vision problems. The model is also positioned as a generalization of average pooling applicable to classification.
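The gridding artifact mentioned above can be illustrated with a small 1-D sketch. It is not taken from the paper; the function name `dilated_support` and the choice of three layers with kernel size 3 are assumptions made for illustration. The function lists which input offsets can influence one output position after stacking dilated convolutions.

```python
def dilated_support(num_layers, kernel_size=3, dilation=2):
    """Input offsets (relative to one output position) reachable through
    num_layers stacked 1-D convolutions with the given kernel size and
    dilation. Hypothetical helper, for illustration only."""
    half = kernel_size // 2
    taps = [dilation * k for k in range(-half, half + 1)]
    support = {0}
    for _ in range(num_layers):
        # Each layer lets every reachable offset spread to its taps.
        support = {s + t for s in support for t in taps}
    return sorted(support)


# Rate 2: the window spans 13 positions but only the 7 even offsets are used.
print(dilated_support(3, dilation=2))  # [-6, -4, -2, 0, 2, 4, 6]
# Rate 1 (ordinary convolution): every offset in the window is used.
print(dilated_support(3, dilation=1))  # [-3, -2, -1, 0, 1, 2, 3]
```

With rate 2 the neighbouring output position sees only the odd offsets, so two adjacent outputs share no input samples. That sparse pattern is what the text calls gridding.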

Core claim

By embedding the wavelet transform into the CNN, the MWCNN reduces the resolution of feature maps to increase the receptive field size while preserving information better than pooling, and uses inverse wavelet transform to reconstruct the high resolution feature maps from the decomposed versions. This provides a better trade-off between receptive field and computational efficiency, and can replace pooling operations in CNNs.
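A minimal NumPy sketch of the decompose-then-reconstruct idea, assuming a one-level 2-D Haar transform (the simulated rebuttal on this page names Haar; the abstract does not). The function names `haar_dwt2` and `haar_iwt2`, and the sign and ordering convention for the sub-bands, are illustrative assumptions rather than the authors' code.

```python
import numpy as np


def haar_dwt2(x):
    """One level of a 2-D Haar DWT on an (H, W) array with even H and W.
    Returns four sub-bands (LL, LH, HL, HH), each of shape (H/2, W/2)."""
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0
    lh = (-a - b + c + d) / 2.0
    hl = (-a + b - c + d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh


def haar_iwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the (2H, 2W) array from four sub-bands."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll - lh - hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll + lh + hl + hh) / 2.0
    return x


x = np.arange(16, dtype=np.float64).reshape(4, 4)
bands = haar_dwt2(x)
x_rec = haar_iwt2(*bands)
print([b.shape for b in bands])  # four (2, 2) sub-bands
print(np.allclose(x, x_rec))     # True
```

Each sub-band has half the spatial resolution, yet the four together hold as many values as the input, which is why the inverse can restore it exactly in this sketch.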

What carries the argument

The multi-level wavelet transform (with inverse) embedded at multiple levels in the U-Net architecture to decompose and reconstruct feature maps.
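To make the multi-level bookkeeping concrete, the sketch below applies a Haar decomposition to a multi-channel feature map and concatenates the sub-bands along the channel axis, twice in a row. It assumes Haar filters and omits the convolution blocks that the architecture places between levels; `dwt_concat` is a hypothetical helper name.

```python
import numpy as np


def dwt_concat(feat):
    """Haar DWT of a (C, H, W) feature map with even H and W.
    The four sub-bands are stacked on the channel axis: (4C, H/2, W/2)."""
    a = feat[:, 0::2, 0::2]
    b = feat[:, 0::2, 1::2]
    c = feat[:, 1::2, 0::2]
    d = feat[:, 1::2, 1::2]
    ll = (a + b + c + d) / 2.0
    lh = (-a - b + c + d) / 2.0
    hl = (-a + b - c + d) / 2.0
    hh = (a - b - c + d) / 2.0
    return np.concatenate([ll, lh, hl, hh], axis=0)


feat = np.random.default_rng(0).standard_normal((16, 32, 32))
level1 = dwt_concat(feat)    # shape (64, 16, 16)
level2 = dwt_concat(level1)  # shape (256, 8, 8)
print(level1.shape, level2.shape)
# The element count is unchanged at every level: resolution is traded for channels.
print(feat.size == level1.size == level2.size)  # True
```

In a real network the convolution layers after each decomposition would typically change the channel count, so the shapes here only show what the transform itself does.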

If this is right

  • Improved results on image denoising, single image super-resolution, and JPEG artifact removal compared to prior methods.
  • Effective replacement for pooling in any CNN that requires downsampling operations.
  • Generalization of average pooling and improvement over dilated filters without checkerboard patterns.
  • Extension to object classification tasks with maintained efficiency.
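The "generalization of average pooling" point can be checked numerically under the Haar assumption: the low-frequency (LL) sub-band is a scaled 2×2 average pool, and the other three sub-bands carry what pooling would discard. The helper names below are illustrative only.

```python
import numpy as np


def avg_pool2x2(x):
    """Non-overlapping 2x2 average pooling on an (H, W) array, H and W even."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def haar_ll(x):
    """Low-frequency (LL) sub-band of a one-level orthonormal 2-D Haar DWT."""
    return (x[0::2, 0::2] + x[0::2, 1::2] + x[1::2, 0::2] + x[1::2, 1::2]) / 2.0


x = np.random.default_rng(0).standard_normal((8, 8))
# LL is the block sum divided by 2, i.e. twice the block average.
print(np.allclose(haar_ll(x), 2.0 * avg_pool2x2(x)))  # True
```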

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Wavelet-based downsampling might preserve frequency information better in other signal processing tasks beyond images.
  • If the reconstruction is artifact-free, similar embeddings could be tested in video or 3D CNNs for temporal or volumetric data.
  • The approach could lead to parameter-free ways to control receptive field growth in network design.

Load-bearing premise

The inverse wavelet transform can reconstruct high-resolution feature maps from the low-resolution wavelet coefficients without adding significant artifacts or losing the efficiency advantage.

What would settle it

A direct comparison experiment where MWCNN is applied to a standard benchmark like BSD68 for denoising and shows no improvement in PSNR or SSIM over a baseline U-Net with strided convolutions or pooling at equivalent computational cost.
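For reference, PSNR in such a comparison is usually computed from the mean squared error against the clean image. The sketch below uses the standard definition with an assumed 8-bit data range of 255; it is not the paper's evaluation code, and SSIM is left out.

```python
import numpy as np


def psnr(reference, estimate, data_range=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape.
    data_range is the maximum possible pixel value (255 assumed for 8-bit)."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)


clean = np.full((4, 4), 100.0)
noisy = clean + 1.0          # constant error of one grey level
print(psnr(clean, noisy))    # 10 * log10(255**2), roughly 48.13 dB
```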

Figures

Figures reproduced from arXiv: 1907.03128 by Hongzhi Zhang, Pengju Liu, Wangmeng Zuo, Wei Lian.

Figure 1: The running time vs. PSNR value of representative …
Figure 2: From WPT to MWCNN. Intuitively, WPT can be seen …
Figure 3: Multi-level wavelet-CNN architecture. It consists of two parts: the contracting and expanding subnetworks. Each solid …
Figure 4: Illustration of average pooling, dilated filter and the proposed MWCNN. Take one CNN block as an example: (a) sum-pooling with factor 2 leads to the most significant information loss, which is not suitable for image restoration; (b) dilated filtering with rate 2 is equal to shared-parameter convolution on sub-images; (c) the proposed MWCNN first decomposes an image into 4 sub-bands and then concatenates the…
Figure 5: Illustration of the gridding effect. Taking 3-layer CNNs as an example: (a) the dilated filtering with rate 2 suffers from a large amount of information loss, (b) two neighboring pixels are based on information from totally non-overlapping locations, and (c) our MWCNN can perfectly avoid these drawbacks. Compared with dilated filtering, MWCNN can also avoid the gridding effect. With the increase of dep…
Figure 6: Image denoising results of “Test044” (Set68) with noise level of 50.
Table II: Average PSNR (dB) / SSIM results of the competing methods for SISR with scale factors S = 2, 3 and 4 on datasets Set5, Set14, BSD100 and Urban100. Red color indicates the best performance. Columns: Dataset, S, VDSR [2], DnCNN [5], RED30 [20], SRResNet [11], LapSRN [3], DRRN [17], MemNet [19], WaveResNet [45], SRMDNF [44], MWCNN(P), MWCNN. Set5 ×2 37.…
Figure 7: Single image super-resolution: result of “253027” (BSD100) with upscaling factor of ×4.
… network. For qualitative comparisons, we use source codes of nine CNN-based methods, including VDSR [2], DnCNN [5], RED30 [20], SRResNet [11], LapSRN [3], DRRN [17], MemNet [19], WaveResNet [45] and SRMDNF [44]. Since the source code of SRResNet is not released, their results as shown in Table II are incomplete. And th…
Figure 8: JPEG image artifacts removal: visual results of “carnivaldolls” (LIVE1) with quality factor of 10.
… the JPEG encoder. In our experiments, MWCNN is compared to four competing methods, i.e., ARCNN [35], TNRD [26], DnCNN [5], and MemNet [19]. The results of MemNet [19] and TNRD [26] are incomplete according to their paper and released source codes. Table III shows the average PSNR/SSIM results of the competing…
Original abstract

In computer vision, convolutional networks (CNNs) often adopt pooling to enlarge the receptive field, which has the advantage of low computational complexity. However, pooling can cause information loss and is thus detrimental to further operations such as feature extraction and analysis. Recently, the dilated filter has been proposed to trade off between receptive field size and efficiency, but the accompanying gridding effect can cause a sparse sampling of input images with checkerboard patterns. To address this problem, in this paper, we propose a novel multi-level wavelet CNN (MWCNN) model to achieve a better trade-off between receptive field size and computational efficiency. The core idea is to embed the wavelet transform into the CNN architecture to reduce the resolution of feature maps while at the same time increasing the receptive field. Specifically, MWCNN for image restoration is based on the U-Net architecture, and the inverse wavelet transform (IWT) is deployed to reconstruct the high-resolution (HR) feature maps. The proposed MWCNN can also be viewed as an improvement of the dilated filter and a generalization of average pooling, and can be applied to not only image restoration tasks, but also any CNNs requiring a pooling operation. The experimental results demonstrate the effectiveness of the proposed MWCNN for tasks such as image denoising, single image super-resolution, JPEG image artifact removal and object classification.
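The receptive-field trade-off described in the abstract can be made concrete with the usual layer-by-layer recurrence. This is a sketch with illustrative layer stacks, not the paper's actual configuration. The Haar transform is modeled as a 2×2, stride-2 operation, and `receptive_field` is a hypothetical helper.

```python
def receptive_field(layers):
    """Receptive field (along one axis) of a stack of layers.
    Each layer is a tuple (kernel_size, stride, dilation)."""
    rf, jump = 1, 1  # jump = spacing of adjacent outputs measured in input pixels
    for kernel_size, stride, dilation in layers:
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf


plain = [(3, 1, 1)] * 4                 # four ordinary 3x3 convolutions
dilated = [(3, 1, 2)] * 4               # four 3x3 convolutions with rate 2
wavelet = [(2, 2, 1)] + [(3, 1, 1)] * 4  # 2x2 stride-2 transform, then four 3x3 convs

print(receptive_field(plain))    # 9
print(receptive_field(dilated))  # 17
print(receptive_field(wavelet))  # 18
```

In this toy setting the stride-2 transform roughly doubles the receptive field of the same four convolutions while they run on feature maps with a quarter of the spatial positions. The rate-2 stack reaches a similar size only by sampling its window sparsely.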

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a novel multi-level wavelet CNN (MWCNN) model that embeds the wavelet transform into CNN architectures to achieve a better trade-off between receptive field size and computational efficiency. It uses discrete wavelet transform (DWT) to reduce the resolution of feature maps and inverse wavelet transform (IWT) to reconstruct high-resolution feature maps within a U-Net architecture for image restoration tasks. The model is presented as an improvement over dilated filters and a generalization of average pooling, with claimed applicability to various CNNs and experimental effectiveness on image denoising, single image super-resolution, JPEG artifact removal, and object classification.

Significance. If the results hold, this approach could offer an efficient alternative to pooling and dilated convolutions in CNN design by leveraging wavelet properties for resolution reduction and receptive field expansion without gridding effects or information loss. The conceptual framing as a generalization of pooling is a positive aspect.

major comments (2)
  1. [Abstract] The central claim of experimental effectiveness on four tasks rests on an assertion without any accompanying metrics, baselines, ablation details, or error analysis, which is load-bearing since the soundness of the proposal depends on demonstrated performance gains.
  2. [Abstract] The description of deploying IWT to reconstruct HR feature maps provides no implementation specifics such as the wavelet family, boundary handling, or how differentiability is ensured for backpropagation, leaving the assumption of artifact-free integration unverified for CNN feature maps.
minor comments (1)
  1. [Abstract] The statement that MWCNN 'can be applied to not only image restoration tasks, but also any CNNs requiring a pooling operation' would benefit from more precise examples of such CNNs or operations.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment below, noting that the abstract serves as a concise summary while the full manuscript provides supporting details and experiments.

Point-by-point responses
  1. Referee: [Abstract] The central claim of experimental effectiveness on four tasks rests on an assertion without any accompanying metrics, baselines, ablation details, or error analysis, which is load-bearing since the soundness of the proposal depends on demonstrated performance gains.

    Authors: The abstract provides a high-level overview of the contributions. Quantitative metrics (e.g., PSNR/SSIM gains), baselines (DnCNN, VDSR, etc.), ablation studies on wavelet decomposition levels, and error analyses are presented in detail in Sections 4.1–4.4 of the manuscript for all four tasks. We can partially revise the abstract to include one or two representative performance numbers if space allows, to better highlight the gains without altering its summary nature.

    Revision: partial

  2. Referee: [Abstract] The description of deploying IWT to reconstruct HR feature maps provides no implementation specifics such as the wavelet family, boundary handling, or how differentiability is ensured for backpropagation, leaving the assumption of artifact-free integration unverified for CNN feature maps.

    Authors: The abstract is intentionally brief. Full implementation details appear in Section 3: we employ the Haar wavelet (orthogonal with perfect reconstruction), use symmetric extension for boundary handling, and note that DWT/IWT are fixed linear operations and thus differentiable, enabling seamless end-to-end backpropagation. Experiments in Section 4 confirm artifact-free integration through visual and quantitative results. No change to the abstract is needed, as these specifics belong in the methods section.

    Revision: no
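The claim that DWT/IWT are fixed linear, orthogonal operations can be sketched on a single 2×2 block. The matrix below is one common Haar convention, assumed here for illustration; the row order and signs are not confirmed by the source. Because the matrix is orthogonal, its transpose serves both as the inverse transform and as the backward pass.

```python
import numpy as np

# Rows map a flattened 2x2 block [a, b, c, d] to [LL, LH, HL, HH].
# The ordering and signs are an assumed convention, not taken from the paper.
H = 0.5 * np.array([
    [ 1,  1,  1,  1],
    [-1, -1,  1,  1],
    [-1,  1, -1,  1],
    [ 1, -1, -1,  1],
], dtype=np.float64)

# Orthogonality: H @ H.T is the identity, so H.T inverts H exactly.
print(np.allclose(H @ H.T, np.eye(4)))  # True

block = np.array([3.0, 1.0, 4.0, 1.0])
coeffs = H @ block       # forward transform (DWT on one block)
restored = H.T @ coeffs  # inverse transform (IWT on one block)
print(np.allclose(restored, block))     # True

# Backward pass of a linear layer y = H @ x: dL/dx = H.T @ dL/dy.
grad_out = np.array([1.0, 0.0, 0.0, 0.0])  # gradient arriving at LL only
grad_in = H.T @ grad_out
print(grad_in)  # [0.5 0.5 0.5 0.5]
```

No parameters are learned in this step, so gradients simply pass through a constant matrix.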

Circularity Check

0 steps flagged

No circularity: architectural proposal without load-bearing derivations

Full rationale

The paper proposes an MWCNN architecture that embeds discrete wavelet transform (DWT) and inverse wavelet transform (IWT) into a U-Net backbone for image restoration tasks. No equations, predictions, or first-principles derivations are presented that reduce to fitted parameters, self-definitions, or self-citation chains. The core claim is an engineering suggestion (wavelet-based downsampling as an alternative to pooling or dilation), supported by experimental results on standard tasks rather than any mathematical equivalence to its inputs. Self-citations, if present in the full text, are not load-bearing for the central architectural idea. This is a standard non-circular model proposal.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; the central claim rests on the unstated premise that wavelet transforms integrate cleanly into CNN feature maps.

pith-pipeline@v0.9.0 · 5757 in / 1094 out tokens · 21455 ms · 2026-05-25T01:33:29.653982+00:00 · methodology

Reference graph

Works this paper leans on

73 extracted references · 73 canonical work pages · 5 internal anchors

  [1] C. Dong, C. C. Loy, K. He, and X. Tang. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2):295–307, 2016
  [2] J. Kim, J. K. Lee, and K. M. Lee. Accurate image super-resolution using very deep convolutional networks. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1646–1654, 2016
  [3] W.-S. Lai, J.-B. Huang, N. Ahuja, and M.-H. Yang. Deep Laplacian pyramid networks for fast and accurate super-resolution. In IEEE Conference on Computer Vision and Pattern Recognition, 2017
  [4] W. Shi, J. Caballero, F. Huszár, J. Totz, A. P. Aitken, R. Bishop, D. Rueckert, and Z. Wang. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1874–1883, 2016
  [5] K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Transactions on Image Processing, PP(99):1–1, 2016
  [6] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012
  [7] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014
  [8] F. Yu and V. Koltun. Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122, 2015
  [9] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016
  [10] K. He, X. Zhang, S. Ren, and J. Sun. Identity mappings in deep residual networks. In European Conference on Computer Vision, pages 630–645. Springer, 2016
  [11] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, Z. Wang, et al. Photo-realistic single image super-resolution using a generative adversarial network. In IEEE Conference on Computer Vision and Pattern Recognition, 2017
  [12] K. Zhang, W. Zuo, S. Gu, and L. Zhang. Learning deep CNN denoiser prior for image restoration. In IEEE Conference on Computer Vision and Pattern Recognition, pages 3929–3938, 2017
  [13] A. Adler, D. Boublil, M. Elad, and M. Zibulevsky. A deep learning approach to block-based compressed sensing of images. arXiv preprint arXiv:1606.01519, 2016
  [14] Y. Romano, M. Elad, and P. Milanfar. The little engine that could: Regularization by denoising (RED). arXiv preprint arXiv:1611.02862, 2016
  [15] S. Yan, X. Xu, D. Xu, S. Lin, and X. Li. Image classification with densely sampled image windows and generalized adaptive multiple kernel learning. IEEE Transactions on Cybernetics, 45(3):381–390, 2015
  [16] P. Wang, P. Chen, Y. Yuan, D. Liu, Z. Huang, X. Hou, and G. Cottrell. Understanding convolution for semantic segmentation. arXiv preprint arXiv:1702.08502, 2017
  [17] Y. Tai, J. Yang, and X. Liu. Image super-resolution via deep recursive residual network. In IEEE Conference on Computer Vision and Pattern Recognition, 2017
  [18] C. Dong, C. L. Chen, and X. Tang. Accelerating the super-resolution convolutional neural network. In European Conference on Computer Vision, pages 391–407, 2016
  [19] Y. Tai, J. Yang, X. Liu, and C. Xu. MemNet: A persistent memory network for image restoration. In IEEE International Conference on Computer Vision, 2017
  [20] X. Mao, C. Shen, and Y. Yang. Image restoration using very deep convolutional encoder-decoder networks with symmetric skip connections. In Advances in Neural Information Processing Systems, pages 2802–2810, 2016
  [21] I. Daubechies. The wavelet transform, time-frequency localization and signal analysis. IEEE Transactions on Information Theory, 36(5):961–1005, 1990
  [22] I. Daubechies. Ten lectures on wavelets. SIAM, 1992
  [23] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 234–241, 2015
  [24] P. Liu, H. Zhang, K. Zhang, L. Lin, and W. Zuo. Multi-level wavelet-CNN for image restoration. In IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 773–782, 2018
  [25] M. R. Banham and A. K. Katsaggelos. Digital image restoration. IEEE Signal Processing Magazine, 14(2):24–41, 1997
  [26] Y. Chen and T. Pock. Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration. IEEE Transactions on Pattern Analysis and Machine Intelligence, PP(99):1–1, 2015
  [27] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian. Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Transactions on Image Processing, 16(8):2080–2095, 2007
  [28] S. Gu, L. Zhang, W. Zuo, and X. Feng. Weighted nuclear norm minimization with application to image denoising. In IEEE Conference on Computer Vision and Pattern Recognition, pages 2862–2869, 2014
  [29] U. Schmidt and S. Roth. Shrinkage fields for effective image restoration. In IEEE Conference on Computer Vision and Pattern Recognition, pages 2774–2781, 2014
  [30] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma. Robust face recognition via sparse representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(2):210–227, 2009
  [31] F. Agostinelli, M. R. Anderson, and H. Lee. Robust image denoising with multi-column deep neural networks. In Advances in Neural Information Processing Systems, pages 1493–1501, 2013
  [32] V. Jain and S. Seung. Natural image denoising with convolutional networks. In Advances in Neural Information Processing Systems, pages 769–776, 2009
  [33] J. Xie, L. Xu, and E. Chen. Image denoising and inpainting with deep neural networks. In International Conference on Neural Information Processing Systems, pages 341–349, 2012
  [34] H. C. Burger, C. J. Schuler, and S. Harmeling. Image denoising: Can plain neural networks compete with BM3D? In IEEE Conference on Computer Vision and Pattern Recognition, pages 2392–2399, 2012
  [35] C. Dong, Y. Deng, C. Change Loy, and X. Tang. Compression artifacts reduction by a deep convolutional network. In IEEE International Conference on Computer Vision, pages 576–584, 2015
  [36] B. Lim, S. Son, H. Kim, S. Nah, and K. M. Lee. Enhanced deep residual networks for single image super-resolution. In IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 1132–1140, 2017
  [37] Y. Zhang, K. Li, K. Li, L. Wang, B. Zhong, and Y. Fu. Image super-resolution using very deep residual channel attention networks. In European Conference on Computer Vision, 2018
  [38] J. Kim, J. Kwon Lee, and K. Mu Lee. Deeply-recursive convolutional network for image super-resolution. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1637–1645, 2016
  [39] V. Santhanam, V. I. Morariu, and L. S. Davis. Generalized deep image to image regression. In IEEE Conference on Computer Vision and Pattern Recognition, pages 5609–5619, 2017
  [40] K. Zhang, W. Zuo, and L. Zhang. FFDNet: Toward a fast and flexible solution for CNN based image denoising. IEEE Transactions on Image Processing, 2018
  [41] S. Guo, Z. Yan, K. Zhang, W. Zuo, and L. Zhang. Toward convolutional blind denoising of real photographs. In IEEE Conference on Computer Vision and Pattern Recognition, 2016
  [42] J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision, pages 694–711. Springer, 2016
  [43] G. Riegler, S. Schulter, M. Ruther, and H. Bischof. Conditioned regression models for non-blind single image super-resolution. In IEEE International Conference on Computer Vision, 2015
  [44] K. Zhang, W. Zuo, and L. Zhang. Learning a single convolutional super-resolution network for multiple degradations. In IEEE Conference on Computer Vision and Pattern Recognition, 2018
  [45] W. Bae, J. Yoo, and J. C. Ye. Beyond deep residual learning for image restoration: Persistent homology-guided manifold simplification. In IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 1141–1149, 2017
  [46] T. Guo, H. S. Mousavi, T. H. Vu, and V. Monga. Deep wavelet prediction for image super-resolution. In IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017
  [47] Y. Han and J. C. Ye. Framing U-Net via deep convolutional framelets: Application to sparse-view CT. IEEE Transactions on Medical Imaging, pages 1418–1429, 2018
  [48] J. C. Ye and Y. S. Han. Deep convolutional framelets: A general deep learning for inverse problems. Society for Industrial and Applied Mathematics, 2018
  [49] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1–9, 2015
  [50] D. Han, J. Kim, and J. Kim. Deep pyramidal residual networks. In IEEE Conference on Computer Vision and Pattern Recognition, pages 6307–6315, 2017
  [51] S. Zhai, Y. Cheng, Z. M. Zhang, and W. Lu. Doubly convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1082–1090, 2016
  [52] G. Huang, Z. Liu, L. Van Der Maaten, and K. Q. Weinberger. Densely connected convolutional networks. In 2017 IEEE Conference on Computer Vision and Pattern Recognition, pages 2261–2269, 2017
  [53] S. Zagoruyko and N. Komodakis. Wide residual networks. In British Machine Vision Conference, 2016
  [54] A. Takeki, D. Ikami, G. Irie, and K. Aizawa. Parallel grid pooling for data augmentation. In European Conference on Computer Vision, 2018
  [55] Q. Wang, Z. Gao, J. Xie, W. Zuo, and P. Li. Global gated mixture of second-order pooling for improving deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1277–1286, 2018
  [56] S. G. Mallat. A theory for multiresolution signal decomposition: the wavelet representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11(7):674–693, 1989
  [57] A. N. Akansu and R. A. Haddad. Multiresolution signal decomposition: transforms, subbands, and wavelets. Academic Press, 2001
  [58] A. S. Lewis and G. Knowles. Image compression using the 2-D wavelet transform. IEEE Transactions on Image Processing, 1(2):244–250, 1992
  [59] S. G. Chang, B. Yu, and M. Vetterli. Adaptive wavelet thresholding for image denoising and compression. IEEE Transactions on Image Processing, 9(9):1532–1546, 2000
  [60] D. Kingma and J. Ba. Adam: A method for stochastic optimization. In International Conference for Learning Representations, 2015
  [61] S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He. Aggregated residual transformations for deep neural networks. In IEEE Conference on Computer Vision and Pattern Recognition, pages 1492–1500, 2017
  [62] E. Agustsson and R. Timofte. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 1122–113, 2017
  [63] D. Martin, C. Fowlkes, D. Tal, and J. Malik. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In IEEE International Conference on Computer Vision, volume 2, pages 416–423, 2001
  [64] J.-B. Huang, A. Singh, and N. Ahuja. Single image super-resolution from transformed self-exemplars. In IEEE Conference on Computer Vision and Pattern Recognition, pages 5197–5206, 2015
  [65] M. Bevilacqua, A. Roumy, C. Guillemot, and M. L. Alberi-Morel. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. 2012
  [66] R. Zeyde, M. Elad, and M. Protter. On single image scale-up using sparse-representations. In International Conference on Curves and Surfaces, pages 711–730. Springer, 2010
  [67] A. K. Moorthy and A. C. Bovik. Visual importance pooling for image quality assessment. IEEE Journal of Selected Topics in Signal Processing, 3(2):193–201, 2009
  [68] A. Vedaldi and K. Lenc. MatConvNet: Convolutional neural networks for MATLAB. In the 23rd ACM international conference on Multimedia, pages 689–692, 2015
  [69] A. Krizhevsky and G. Hinton. Learning multiple layers of features from tiny images. Technical Report, Citeseer, 2009
  [70] Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu, and A. Y. Ng. Reading digits in natural images with unsupervised feature learning. In NIPS Workshop, volume 2011, page 5, 2011
  [71] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998
  [72] B. Zhou, A. Lapedriza, A. Khosla, A. Oliva, and A. Torralba. Places: A 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017
  [73] J. T. Springenberg, A. Dosovitskiy, T. Brox, and M. Riedmiller. Striving for simplicity: The all convolutional net. In International Conference on Learning Representations Workshop, 2015