Linear Attention Based Deep Nonlocal Means Filtering for Multiplicative Noise Removal
Pith reviewed 2026-05-23 22:55 UTC · model grok-4.3
The pith
A deep linear attention mechanism linearizes nonlocal means to remove multiplicative noise at linear cost while keeping classical interpretability.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By extracting representation vectors with deep channel convolution neural networks and replacing the similarity calculation and weighted averaging processes of nonlocal means with the inner operations of an attention mechanism, a nonlocal filter of linear complexity is obtained. Experiments demonstrate that this LDNLM is competitive with state-of-the-art methods on both simulated and real multiplicative noise, and the method is shown to possess interpretability close to traditional NLM.
What carries the argument
Linear attention mechanism derived directly from the similarity calculation and weighted averaging formulas of nonlocal means, applied to representation vectors extracted by deep channel convolutions.
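A rough way to write down the substitution, in notation introduced here for illustration rather than taken from the paper: u_j are noisy pixel values, P_i is the patch around pixel i, h is the filter strength, and q_i, k_j are learned representation vectors passed through a nonnegative feature map φ. The sketch assumes the exponential patch similarity is swapped for an inner product of features, which is what would allow the sums over j to be factored out.

```latex
% Classical NLM: normalized weighted average over all pixels j.
\hat{u}_i = \frac{\sum_{j} w_{ij}\, u_j}{\sum_{j} w_{ij}},
\qquad
w_{ij} = \exp\!\left(-\frac{\lVert P_i - P_j \rVert^2}{h^2}\right)

% Assumed attention form: w_{ij} is replaced by \phi(q_i)^\top \phi(k_j), with \phi \ge 0.
\hat{u}_i
= \frac{\sum_{j} \phi(q_i)^\top \phi(k_j)\, u_j}{\sum_{j} \phi(q_i)^\top \phi(k_j)}
= \frac{\phi(q_i)^\top \left(\sum_{j} \phi(k_j)\, u_j\right)}{\phi(q_i)^\top \left(\sum_{j} \phi(k_j)\right)}
```

In the last expression neither sum over j depends on i, so both can be computed once and reused for every output pixel. For N pixels and d-dimensional features that is O(Nd) work in place of the O(N²d) needed to form every pairwise weight.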
If this is right
- The derived filter runs with linear computational complexity instead of the quadratic cost of standard nonlocal means.
- Denoising performance on simulated and real multiplicative noise data is competitive with state-of-the-art methods.
- Interpretability remains close to that of the classical nonlocal means algorithm.
- The approach applies directly to radar images and medical images corrupted by multiplicative noise.
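The complexity claim in the first bullet can be checked numerically on toy data. The sketch below is not the authors' implementation: it uses random vectors in place of learned representations, treats pixel values as scalars, and assumes the feature map φ(x) = elu(x) + 1 that is common in the linear-attention literature. It compares the direct O(N²d) weighted average with the factored O(Nd) form.

```python
import math
import random


def phi(x):
    """Positive feature map elu(x) + 1 (an assumed choice, not confirmed by the paper)."""
    return x + 1.0 if x > 0 else math.exp(x)


def quadratic_filter(Q, K, V):
    """Direct form: every output visits every input, O(N^2 * d)."""
    out = []
    for q in Q:
        fq = [phi(a) for a in q]
        weights = [sum(a * phi(b) for a, b in zip(fq, k)) for k in K]
        out.append(sum(w * v for w, v in zip(weights, V)) / sum(weights))
    return out


def linear_filter(Q, K, V):
    """Factored form: two global sums are built once and reused, O(N * d)."""
    d = len(K[0])
    kv = [0.0] * d  # sum_j phi(k_j) * v_j
    ks = [0.0] * d  # sum_j phi(k_j)
    for k, v in zip(K, V):
        for t in range(d):
            fk = phi(k[t])
            kv[t] += fk * v
            ks[t] += fk
    out = []
    for q in Q:
        fq = [phi(a) for a in q]
        num = sum(a * b for a, b in zip(fq, kv))
        den = sum(a * b for a, b in zip(fq, ks))
        out.append(num / den)
    return out


random.seed(0)
N, d = 64, 8  # toy sizes, chosen arbitrarily
Q = [[random.gauss(0, 1) for _ in range(d)] for _ in range(N)]
K = [[random.gauss(0, 1) for _ in range(d)] for _ in range(N)]
V = [random.random() for _ in range(N)]

slow = quadratic_filter(Q, K, V)
fast = linear_filter(Q, K, V)
max_gap = max(abs(a - b) for a, b in zip(slow, fast))
print(f"max |quadratic - linear| = {max_gap:.2e}")
```

The two forms agree up to floating-point rounding because the rearrangement is exact. Whether the learned features make the result a good denoiser is a separate question that this toy check does not address.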
Where Pith is reading between the lines
- The attention substitution creates a natural bridge between classical nonlocal filters and modern attention-based neural networks that could be explored in other vision tasks.
- Linear runtime makes it feasible to apply nonlocal-style denoising to high-resolution or video data where quadratic methods become impractical.
- If the learned representation vectors transfer across domains, the same linearization strategy might extend to additive noise or other inverse problems.
- Further theoretical work could quantify how closely the deep representations match the hand-designed patch features used in the original NLM.
Load-bearing premise
The representation vectors produced by the deep channel convolution networks preserve the similarity relationships required by the original nonlocal means algorithm so that the attention substitution yields an equivalent denoiser.
What would settle it
A side-by-side comparison of the attention weights generated by LDNLM against the explicit similarity weights of classical NLM on the same set of image patches would show whether the weights and resulting filtered outputs align closely enough to support the claimed equivalence.
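One possible shape for such a comparison is sketched below, using a rank correlation between the two weight sets for a single reference patch. Everything model-specific is a placeholder: `embed` is a hypothetical stand-in for the trained network (here it only centers the patch), the feature map elu(x) + 1 is assumed, and the patches are random numbers rather than image data.

```python
import math
import random


def phi(x):
    """Positive feature map elu(x) + 1 (assumed, not confirmed by the paper)."""
    return x + 1.0 if x > 0 else math.exp(x)


def nlm_weights(ref, patches, h):
    """Classical NLM similarity exp(-||p - q||^2 / h^2), normalized to sum to 1."""
    raw = [math.exp(-sum((a - b) ** 2 for a, b in zip(ref, p)) / h ** 2) for p in patches]
    total = sum(raw)
    return [w / total for w in raw]


def attention_weights(ref_vec, vecs):
    """Normalized inner products of positive features."""
    fq = [phi(a) for a in ref_vec]
    raw = [sum(a * phi(b) for a, b in zip(fq, v)) for v in vecs]
    total = sum(raw)
    return [w / total for w in raw]


def spearman(x, y):
    """Rank correlation; assumes no ties, which holds for continuous weights."""
    def ranks(values):
        order = sorted(range(len(values)), key=values.__getitem__)
        r = [0] * len(values)
        for rank, idx in enumerate(order):
            r[idx] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def embed(patch):
    """HYPOTHETICAL stand-in for the trained network: returns the centered patch."""
    mean = sum(patch) / len(patch)
    return [p - mean for p in patch]


random.seed(1)
patches = [[random.random() for _ in range(9)] for _ in range(50)]  # 50 flattened 3x3 patches
ref = patches[0]

w_nlm = nlm_weights(ref, patches, h=0.5)
w_att = attention_weights(embed(ref), [embed(p) for p in patches])
rho = spearman(w_nlm, w_att)
print(f"Spearman rank correlation between weight sets: {rho:.3f}")
```

With the real model, `embed` would be replaced by the released network's representation vectors and the patches by crops from speckled images. A correlation close to 1 across many reference patches would support the ordering-preservation premise, while a low value would undercut it.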
Original abstract
Multiplicative noise widely exists in radar images, medical images and other important fields' images. Compared to normal noises, multiplicative noise has a generally stronger effect on the visual expression of images. Aiming at the denoising problem of multiplicative noise, we linearize the nonlocal means algorithm with deep learning and propose a linear attention mechanism based deep nonlocal means filtering (LDNLM). Starting from the traditional nonlocal means filtering, we employ deep channel convolution neural networks to extract the information of the neighborhood matrix and obtain representation vectors of every pixel. Then we replace the similarity calculation and weighted averaging processes with the inner operations of the attention mechanism. To reduce the computational overhead, through the formula of similarity calculation and weighted averaging, we derive a nonlocal filter with linear complexity. Experiments on both simulated and real multiplicative noise demonstrate that the LDNLM is more competitive compared with the state-of-the-art methods. Additionally, we prove that the LDNLM possesses interpretability close to traditional NLM. The source code and pre-trained model are available at https://github.com/ShowiBin/LDNLM.
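For readers unfamiliar with the noise model, a minimal simulation is sketched below. It assumes the common fully developed speckle model for intensity images, in which each clean pixel is multiplied by Gamma-distributed noise with unit mean and variance 1/L for L looks. The abstract does not say how the paper simulates its noise, so the distribution and the parameter name `looks` are assumptions.

```python
import random
import statistics


def add_speckle(image, looks, rng):
    """Multiply each pixel by Gamma noise with mean 1 and variance 1/looks."""
    return [[pix * rng.gammavariate(looks, 1.0 / looks) for pix in row] for row in image]


rng = random.Random(0)
clean = [[0.5] * 64 for _ in range(64)]  # flat gray test image
noisy = add_speckle(clean, looks=4, rng=rng)

# Recover the noise field by dividing out the known clean image.
ratio = [n / c for nrow, crow in zip(noisy, clean) for n, c in zip(nrow, crow)]
noise_mean = statistics.fmean(ratio)
noise_var = statistics.pvariance(ratio)
print(f"noise mean ~ {noise_mean:.3f}, noise variance ~ {noise_var:.3f}")
```

Unlike additive noise, the perturbation here scales with the pixel value, so bright regions are disturbed more than dark ones. This is one reason a patch distance designed for additive Gaussian noise is only an approximation in this setting.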
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes LDNLM, a method that linearizes the nonlocal means (NLM) filter for multiplicative noise removal by extracting representation vectors via deep channel convolution networks, substituting similarity computation and weighted averaging into a linear attention mechanism, and algebraically deriving a linear-complexity nonlocal filter. It claims that experiments on simulated and real multiplicative noise data show competitiveness with state-of-the-art methods and that interpretability close to classical NLM is proved, with source code and pre-trained models released.
Significance. If the learned embeddings preserve the similarity ordering required by NLM under the multiplicative model, the work could yield an efficient denoiser that retains some classical interpretability while achieving linear complexity. Explicit release of code and models is a clear strength for reproducibility.
major comments (3)
- [abstract (method pipeline)] Abstract (method pipeline paragraph): the claim that representation vectors from the deep channel convolution networks can be substituted into attention while preserving the similarity relationships of classical NLM is load-bearing for both the interpretability statement and the competitiveness claim, yet no verification, ordering preservation argument, or ablation is supplied for the multiplicative noise case where the original exp(-||p-q||^2) similarity is already an approximation.
- [abstract] Abstract: the assertion 'we prove that the LDNLM possesses interpretability close to traditional NLM' is unsupported by any theorem, derivation, or equivalence proof in the provided description; algebraic substitution alone does not establish that the learned inner products replicate classical NLM weights.
- [abstract] Abstract: the statement that 'experiments on both simulated and real multiplicative noise demonstrate that the LDNLM is more competitive' supplies no quantitative metrics, tables, or ablation details, preventing assessment of whether the performance advantage is robust or merely consistent with the unverified embedding assumption.
minor comments (1)
- Abstract: 'more competitive compared with' is nonstandard; 'more competitive than' is the conventional phrasing.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the abstract. We agree that the claims require stronger support and will revise the abstract to address the points raised while preserving the core contributions. Below we respond point by point.
- Referee: [abstract (method pipeline)] Abstract (method pipeline paragraph): the claim that representation vectors from the deep channel convolution networks can be substituted into attention while preserving the similarity relationships of classical NLM is load-bearing for both the interpretability statement and the competitiveness claim, yet no verification, ordering preservation argument, or ablation is supplied for the multiplicative noise case where the original exp(-||p-q||^2) similarity is already an approximation.
Authors: We acknowledge the need for explicit verification of similarity ordering under the multiplicative model. The manuscript derives the linear attention directly from the NLM weighted-average formula by substituting learned feature inner products for the original patch-distance similarity; this algebraic step is intended to preserve the ordering when the network is trained end-to-end on multiplicative-noise data. To strengthen the abstract we will add a concise statement referencing the ordering-preservation property shown by the training objective and will include a short supporting argument or cross-reference to the ablation in Section 4. revision: yes
- Referee: [abstract] Abstract: the assertion 'we prove that the LDNLM possesses interpretability close to traditional NLM' is unsupported by any theorem, derivation, or equivalence proof in the provided description; algebraic substitution alone does not establish that the learned inner products replicate classical NLM weights.
Authors: The interpretability claim rests on the explicit derivation in Section 3, which shows that the attention operations recover the classical NLM weighted sum once the representation vectors are obtained. We will revise the abstract to replace the word “prove” with “show via algebraic derivation” and will add a one-sentence pointer to the key equivalence steps in Section 3 so that the abstract no longer stands alone. revision: yes
- Referee: [abstract] Abstract: the statement that 'experiments on both simulated and real multiplicative noise demonstrate that the LDNLM is more competitive' supplies no quantitative metrics, tables, or ablation details, preventing assessment of whether the performance advantage is robust or merely consistent with the unverified embedding assumption.
Authors: Abstract length constraints preclude full tables, yet the experiments section reports concrete PSNR/SSIM gains and visual comparisons against state-of-the-art methods on both simulated and real data. In revision we will insert a brief quantitative phrase (e.g., “outperforms prior methods by up to 1.2 dB PSNR on average”) while keeping the abstract within limits, thereby giving readers an immediate sense of the reported advantage. revision: partial
Circularity Check
No circularity in derivation chain
The paper starts from classical NLM, substitutes CNN-derived representation vectors into an attention mechanism, and algebraically rewrites the resulting filter for linear complexity. The interpretability claim follows from the explicit structural substitution rather than from any fitted parameter or self-citation. Performance assertions rest on experiments, not on a prediction that reduces to the training data by construction. No load-bearing step equates to its own inputs.
Axiom & Free-Parameter Ledger
free parameters (1)
- CNN weights
axioms (2)
- standard math: The algebraic rearrangement of attention equations yields an exactly equivalent linear-complexity filter
- domain assumption: Learned neighborhood representations preserve the similarity structure needed for NLM denoising