pith. machine review for the scientific record. sign in

arxiv: 2605.08376 · v1 · submitted 2026-05-08 · 💻 cs.CV

Recognition: no theorem link

UIESNN: A Scale-Aware Spiking Network for Underwater Image Enhancement

Authors on Pith no claims yet

Pith reviewed 2026-05-12 02:04 UTC · model grok-4.3

classification 💻 cs.CV
keywords spiking neural networksunderwater image enhancementmulti-scale poolingleaky integrate-and-fireimage restorationenergy-efficient vision
0
0 comments X

The pith

A scale-aware spiking network corrects large-scale color distortions in underwater images by injecting multi-scale pooling into neuron membrane dynamics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Underwater images are degraded by wavelength-dependent color casts and scattering that affect broad regions, which local spiking operations struggle to fix without losing detail. The paper proposes UIESNN, whose core Multi-scale Pooling LIF Block feeds hierarchical pooling responses directly into the membrane potential update of leaky integrate-and-fire neurons. This enlarges the effective receptive field, produces scale-dependent activations, and supports a fully spike-driven residual architecture with frequency decomposition and attention refinement. Experiments on the EUVP and LSUI benchmarks show better color fidelity and spatial coherence than prior SNN methods at comparable energy cost. If the mechanism works as described, spiking networks could become practical for power-limited underwater vision tasks.

Core claim

The paper establishes that the Multi-scale Pooling LIF Block enlarges the receptive field of spiking neurons for underwater enhancement by injecting hierarchical multi-scale pooling responses into membrane dynamics, thereby enabling global correction of low-frequency degradations while preserving fine-grained details and generating heterogeneous scale-dependent activations, all within a fully spike-driven pipeline that integrates frequency decomposition and attention-based refinement.

What carries the argument

The Multi-scale Pooling LIF Block (MPLB), which injects hierarchical multi-scale pooling responses into leaky integrate-and-fire membrane dynamics to enlarge the effective receptive field and induce scale-dependent spiking activations.

If this is right

  • UIESNN delivers improved colour fidelity and spatial coherence on underwater images compared with earlier SNN designs.
  • The network maintains competitive energy cost while operating entirely in the spike domain.
  • Frequency decomposition and attention refinement are integrated without breaking the spike-driven constraint.
  • State-of-the-art results among SNN-based methods are achieved on the EUVP and LSUI benchmarks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same multi-scale injection into membrane dynamics could be tested on terrestrial dehazing or low-light restoration where low-frequency biases also dominate.
  • Because the design avoids very deep layers, it may allow direct deployment on battery-powered underwater vehicles with limited compute.
  • Temporal consistency on video could be checked by feeding consecutive frames through the same scale-aware blocks.

Load-bearing premise

That hierarchical multi-scale pooling responses can be injected into spiking membrane dynamics to expand receptive field for global corrections without losing fine detail or spiking efficiency.

What would settle it

Evaluating UIESNN on the EUVP or LSUI benchmarks and finding no measurable gain in color fidelity metrics such as PSNR, SSIM or UIQM over prior SNN methods, or finding substantially higher energy consumption.

Figures

Figures reproduced from arXiv: 2605.08376 by Amir Atapour-Abarghouei, Farshad Arvin, Ronald Thenius, Ruochen Li, Shuang Chen, Zihan Zhu.

Figure 1
Figure 1. Figure 1: Visualisation of complex degradation in the underwater scenario. [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Overview of the proposed framework. The top panel presents the overall pipeline. The right panel illustrates the proposed Spiking Residual Network [PITH_FULL_IMAGE:figures/full_fig_p002_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: The illustration of the proposed Multi-scale Pooling LIF Block. NI-LIF is illustrated in Fig. 4. [PITH_FULL_IMAGE:figures/full_fig_p003_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: The illustration of the NI-LIF [7] [PITH_FULL_IMAGE:figures/full_fig_p003_4.png] view at source ↗
Figure 7
Figure 7. Figure 7: Qualitative comparison on underwater image enhancement. The top [PITH_FULL_IMAGE:figures/full_fig_p005_7.png] view at source ↗
read the original abstract

Underwater image enhancement (UIE) is a practically important yet underexplored application of spiking neural networks (SNNs), where the dominant degradations are large-scale and low-frequency, such as wavelength-dependent colour casts and scattering-induced veiling. Existing SNN restoration designs rely on locally bounded spiking perception, which can limit global correction and lead to saturated or inconsistent representations. To address these challenges, we propose a scale-aware SNN framework for UIE named UIESNN. At its core is a Multi-scale Pooling LIF Block (MPLB) that injects hierarchical multi-scale pooling responses into membrane dynamics, thereby enlarging the effective receptive field while preserving fine-grained details and inducing heterogeneous scale-dependent activations. Building on MPLB, we design a spiking residual architecture that integrates frequency decomposition and attention-based refinement in a fully spike-driven pipeline. Extensive experiments on the EUVP and LSUI benchmarks demonstrate that UIESNN achieves state-of-the-art performance among SNN-based methods, delivering improved colour fidelity and spatial coherence with competitive energy cost.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes UIESNN, a spiking neural network for underwater image enhancement (UIE) that introduces the Multi-scale Pooling LIF Block (MPLB). MPLB injects hierarchical multi-scale pooling responses into LIF membrane dynamics to enlarge the effective receptive field for large-scale low-frequency degradations (e.g., color casts and veiling) while preserving fine details and producing scale-dependent activations. The architecture builds a fully spike-driven residual pipeline with frequency decomposition and attention refinement. It claims state-of-the-art results among SNN-based methods on the EUVP and LSUI benchmarks, with gains in color fidelity and spatial coherence at competitive energy cost.

Significance. If the central performance claims and the claimed mechanism hold under scrutiny, this would represent a meaningful extension of SNNs into a practically relevant restoration domain where global context matters. The work supplies a concrete architectural innovation (MPLB) and reports energy-aware metrics, which are strengths for low-power vision applications. However, the absence of tabulated quantitative results, baseline comparisons, ablations, or error bars in the abstract leaves the SOTA claim unverified at present.

major comments (3)
  1. [§3.2] §3.2 (MPLB definition): The paper states that MPLB 'injects hierarchical multi-scale pooling responses into membrane dynamics' but supplies no explicit update equation showing how the pooled features modify the LIF membrane potential V(t), threshold, or reset. Without this rule it is impossible to determine whether the operation remains a genuine spike-driven scale-aware mechanism or reduces to non-spiking feature concatenation that could be replicated in an ANN.
  2. [Experimental section] Experimental section (results on EUVP/LSUI): The abstract asserts SOTA performance among SNN methods with improved color fidelity and spatial coherence, yet no tables, PSNR/SSIM values, baseline comparisons (e.g., against other SNN or ANN UIE models), ablation studies on the MPLB components, or error bars are referenced. These data are load-bearing for the central claim and must be supplied with statistical detail.
  3. [§4] §4 (architecture overview): The claim that the spiking residual pipeline with frequency decomposition remains 'fully spike-driven' while incorporating attention-based refinement requires an explicit statement of which operations are converted to spike-compatible forms and which are not; otherwise the energy-cost comparison with non-spiking methods cannot be evaluated.
minor comments (2)
  1. [§3] Notation for the LIF neuron and pooling operations should be introduced once with consistent symbols (e.g., define τ, V_th, and the pooling kernel sizes) rather than appearing piecemeal.
  2. [Figures] Figure captions for the network diagram and qualitative results should explicitly label the input degradation types and the scale levels used in MPLB.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript to improve clarity and completeness while preserving the core contributions.

read point-by-point responses
  1. Referee: [§3.2] §3.2 (MPLB definition): The paper states that MPLB 'injects hierarchical multi-scale pooling responses into membrane dynamics' but supplies no explicit update equation showing how the pooled features modify the LIF membrane potential V(t), threshold, or reset. Without this rule it is impossible to determine whether the operation remains a genuine spike-driven scale-aware mechanism or reduces to non-spiking feature concatenation that could be replicated in an ANN.

    Authors: We agree that the absence of an explicit update rule hinders verification of the spiking nature of the mechanism. In the revised version, we will add the precise mathematical formulation in §3.2, showing how the hierarchical multi-scale pooling outputs are integrated into the LIF membrane potential update (specifically, as an additive term to the input current I(t) before the standard LIF integration step), while keeping the threshold and reset unchanged and ensuring the entire block remains event-driven with no non-spiking concatenation. revision: yes

  2. Referee: [Experimental section] Experimental section (results on EUVP/LSUI): The abstract asserts SOTA performance among SNN methods with improved color fidelity and spatial coherence, yet no tables, PSNR/SSIM values, baseline comparisons (e.g., against other SNN or ANN UIE models), ablation studies on the MPLB components, or error bars are referenced. These data are load-bearing for the central claim and must be supplied with statistical detail.

    Authors: We acknowledge that the experimental results require more explicit presentation and statistical rigor to support the SOTA claim. The revised manuscript will include comprehensive tables reporting PSNR, SSIM, and additional perceptual metrics on both EUVP and LSUI, direct comparisons against prior SNN-based UIE methods as well as representative ANN UIE baselines, detailed ablations isolating each MPLB component, and error bars derived from multiple independent runs with standard deviation and statistical significance tests. revision: yes

  3. Referee: [§4] §4 (architecture overview): The claim that the spiking residual pipeline with frequency decomposition remains 'fully spike-driven' while incorporating attention-based refinement requires an explicit statement of which operations are converted to spike-compatible forms and which are not; otherwise the energy-cost comparison with non-spiking methods cannot be evaluated.

    Authors: We will expand §4 with a dedicated subsection that enumerates every operation: frequency decomposition is realized via spike-rate-based bandpass filtering, the residual connections and MPLB blocks are purely spike-driven, and the attention refinement is implemented using a spike-compatible attention module (operating on binary spike trains and membrane potentials). No non-spiking floating-point operations remain in the forward pass except for the final non-spiking reconstruction layer required for image output; this breakdown will enable direct energy-cost comparisons. revision: yes

Circularity Check

0 steps flagged

No circularity: architecture and performance claims are independent design and experimental results

full rationale

The paper introduces UIESNN as a novel scale-aware SNN with MPLB that injects multi-scale pooling into LIF membrane dynamics, followed by residual blocks, frequency decomposition and attention. Performance is reported from experiments on EUVP and LSUI benchmarks rather than any derivation that reduces to fitted inputs or self-referential definitions. No equations are presented that equate a 'prediction' to a parameter fit by construction, and no load-bearing uniqueness theorem or ansatz is smuggled via self-citation. The central claim rests on the empirical outcome of the proposed architecture, which is self-contained against external benchmarks and does not collapse to renaming or redefinition of its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The paper rests on standard SNN assumptions and introduces one new architectural entity; no free parameters or additional axioms are visible from the abstract.

axioms (1)
  • domain assumption Leaky integrate-and-fire neurons with local spiking perception are the appropriate base model for image restoration tasks
    Invoked when the paper states that existing SNN designs are limited by locally bounded perception.
invented entities (1)
  • Multi-scale Pooling LIF Block (MPLB) no independent evidence
    purpose: Inject hierarchical multi-scale pooling responses into membrane dynamics to enlarge receptive field while preserving details
    New component introduced to address the stated limitation of prior SNNs

pith-pipeline@v0.9.0 · 5502 in / 1260 out tokens · 48440 ms · 2026-05-12T02:04:59.260941+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages

  1. [1]

    Networks of spiking neurons: the third generation of neural network models,

    W. Maass, “Networks of spiking neurons: the third generation of neural network models,”Neural networks, vol. 10, no. 9, pp. 1659–1671, 1997

  2. [2]

    Efficient speech command recog- nition leveraging spiking neural networks and progressive time-scaled curriculum distillation,

    J. Wang, L. Yu, L. Huang, C. Zhou, H. Zhang, Z. Song, H. Liu, M. Zhang, Z. Ma, and Z. Zhang, “Efficient speech command recog- nition leveraging spiking neural networks and progressive time-scaled curriculum distillation,”Neural Networks, vol. 195, p. 108253, 2026

  3. [3]

    A spiking neural network for image segmentation,

    K. Patel, E. Hunsberger, S. Batir, and C. Eliasmith, “A spiking neural network for image segmentation,”arXiv preprint arXiv:2106.08921, 2021

  4. [4]

    Integer-valued training and spike-driven inference spiking neural network for high-performance and energy-efficient object detection,

    X. Luo, M. Yao, Y . Chou, B. Xu, and G. Li, “Integer-valued training and spike-driven inference spiking neural network for high-performance and energy-efficient object detection,” inEuropean Conference on Computer Vision. Springer, 2024, pp. 253–272

  5. [5]

    Learning a spiking neural network for efficient image deraining,

    T. Song, G. Jin, P. Li, K. Jiang, X. Chen, and J. Jin, “Learning a spiking neural network for efficient image deraining,” inIJCAI, 2024

  6. [6]

    Exploring the potentials of spiking neural networks for image deraining,

    S. Chen, T. Krajnik, F. Arvin, and A. Atapour-Abarghouei, “Exploring the potentials of spiking neural networks for image deraining,”arXiv preprint arXiv:2512.02258, 2025

  7. [7]

    Spike2former: Efficient spiking transformer for high-performance image segmentation,

    Z. Lei, M. Yao, J. Hu, X. Luo, Y . Lu, B. Xu, and G. Li, “Spike2former: Efficient spiking transformer for high-performance image segmentation,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 2, 2025, pp. 1364–1372

  8. [8]

    Underwater image enhancement by wavelength compensation and dehazing,

    J. Y . Chiang and Y .-C. Chen, “Underwater image enhancement by wavelength compensation and dehazing,”IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 1756–1769, 2011

  9. [9]

    Underwater image restoration based on a parallel convolutional neural network,

    K. Wang, Y . Hu, J. Chen, X. Wu, X. Zhao, and Y . Li, “Underwater image restoration based on a parallel convolutional neural network,”Remote sensing, vol. 11, no. 13, p. 1591, 2019

  10. [10]

    Underwater scene prior inspired deep underwater image and video enhancement,

    C. Li, S. Anwar, and F. Porikli, “Underwater scene prior inspired deep underwater image and video enhancement,”Pattern Recognition, vol. 98, p. 107038, 2020

  11. [11]

    Pugan: Physical model-guided underwater image enhance- ment using gan with dual-discriminators,

    R. Cong, W. Yang, W. Zhang, C. Li, C.-L. Guo, Q. Huang, and S. Kwong, “Pugan: Physical model-guided underwater image enhance- ment using gan with dual-discriminators,”IEEE Transactions on Image Processing, 2023

  12. [12]

    U-shape transformer for underwater image enhancement,

    L. Peng, C. Zhu, and L. Bian, “U-shape transformer for underwater image enhancement,”IEEE Transactions on Image Processing, 2023

  13. [13]

    Phaseformer: Phase-based attention mechanism for under- water image restoration and beyond,

    M. Khan, A. Negi, A. Kulkarni, S. S. Phutke, S. K. Vipparthi, and S. Murala, “Phaseformer: Phase-based attention mechanism for under- water image restoration and beyond,”arXiv preprint arXiv:2412.01456, 2024

  14. [14]

    Wavelet-based fourier infor- mation interaction with frequency diffusion adjustment for underwater image restoration,

    C. Zhao, W. Cai, C. Dong, and C. Hu, “Wavelet-based fourier infor- mation interaction with frequency diffusion adjustment for underwater image restoration,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 8281–8291

  15. [15]

    Underwater image restoration via polymorphic large kernel cnns,

    X. Guo, Y . Dong, X. Chen, W. Chen, Z. Li, F. Zheng, and C.-M. Pun, “Underwater image restoration via polymorphic large kernel cnns,”arXiv preprint arXiv:2412.18459, 2024

  16. [16]

    Deep-sea: Deep-learning enhancement for environmental perception in submerged aquatics,

    S. Chen, R. Thenius, F. Arvin, and A. Atapour-Abarghouei, “Deep-sea: Deep-learning enhancement for environmental perception in submerged aquatics,”arXiv preprint arXiv:2508.12824, 2025

  17. [17]

    Underwater image enhancement by convolutional spiking neural networks,

    V . Sudevan, F. Zayer, R. Kausar, S. Javed, H. Karki, G. De Masi, and J. Dias, “Underwater image enhancement by convolutional spiking neural networks,”arXiv preprint arXiv:2503.20485, 2025

  18. [18]

    Lamsnn: Learnable adaptive modula- tion for artifact suppression in spiking underwater image enhancement networks,

    J. Shao, H. Zhang, and J. Miao, “Lamsnn: Learnable adaptive modula- tion for artifact suppression in spiking underwater image enhancement networks,”Neural Networks, p. 108210, 2025

  19. [19]

    S$ˆ2$m-former: Spiking symmetric mixing branchformer for brain auditory attention detection,

    J. Wang, Z. Ma, X. Shen, C. Zhou, L. Zhao, H. Zhang, Y . Zhong, S. Cai, Z. Song, and Z. Zhang, “S$ˆ2$m-former: Spiking symmetric mixing branchformer for brain auditory attention detection,” inThe Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. [Online]. Available: https://openreview.net/forum?id=WtMuGdHvh6

  20. [20]

    Spikcommander: A high-performance spiking transformer with multi-view learning for efficient speech command recognition,

    J. Wang, L. Yu, X. Shen, S. Guo, C. Zhou, L. Zhao, Y . Zhong, Z. Zhang, and Z. Ma, “Spikcommander: A high-performance spiking transformer with multi-view learning for efficient speech command recognition,” arXiv preprint arXiv:2511.07883, 2025

  21. [21]

    Spiking deep convolutional neural networks for energy-efficient object recognition,

    Y . Cao, Y . Chen, and D. Khosla, “Spiking deep convolutional neural networks for energy-efficient object recognition,”IJCV, vol. 113, pp. 54–66, 2015

  22. [22]

    Surrogate gradient learning in spiking neural networks: Bringing the power of gradient-based opti- mization to spiking neural networks,

    E. O. Neftci, H. Mostafa, and F. Zenke, “Surrogate gradient learning in spiking neural networks: Bringing the power of gradient-based opti- mization to spiking neural networks,”IEEE Signal Processing Magazine, vol. 36, no. 6, pp. 51–63, 2019

  23. [23]

    Snnsir: A simple spiking neural network for stereo image restoration,

    R. Xu, J. Xie, J. Nie, J. Cao, and Y . Pang, “Snnsir: A simple spiking neural network for stereo image restoration,”arXiv preprint arXiv:2508.12271, 2025

  24. [24]

    Attention spiking neural networks,

    M. Yao, G. Zhao, H. Zhang, Y . Hu, L. Deng, Y . Tian, B. Xu, and G. Li, “Attention spiking neural networks,”IEEE transactions on pattern analysis and machine intelligence, vol. 45, no. 8, pp. 9393–9410, 2023

  25. [25]

    Twin adversarial contrastive learning for underwater image enhancement and beyond,

    R. Liu, Z. Jiang, S. Yang, and X. Fan, “Twin adversarial contrastive learning for underwater image enhancement and beyond,”IEEE Trans- actions on Image Processing, vol. 31, pp. 4922–4936, 2022

  26. [26]

    A wavelet-based dual-stream network for underwater image enhancement,

    Z. Ma and C. Oh, “A wavelet-based dual-stream network for underwater image enhancement,” inICASSP, 2022, pp. 2769–2773

  27. [27]

    Fast underwater image enhancement for improved visual perception,

    M. J. Islam, Y . Xia, and J. Sattar, “Fast underwater image enhancement for improved visual perception,”IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 3227–3234, 2020

  28. [28]

    1.1 computing’s energy problem (and what we can do about it),

    M. Horowitz, “1.1 computing’s energy problem (and what we can do about it),” in2014 IEEE international solid-state circuits conference digest of technical papers (ISSCC). IEEE, 2014, pp. 10–14