UIESNN: A Scale-Aware Spiking Network for Underwater Image Enhancement
Pith reviewed 2026-05-12 02:04 UTC · model grok-4.3
The pith
A scale-aware spiking network corrects large-scale color distortions in underwater images by injecting multi-scale pooling into neuron membrane dynamics.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that the Multi-scale Pooling LIF Block enlarges the receptive field of spiking neurons for underwater enhancement by injecting hierarchical multi-scale pooling responses into membrane dynamics. This enables global correction of low-frequency degradations while preserving fine-grained details and generating heterogeneous scale-dependent activations, all within a fully spike-driven pipeline that integrates frequency decomposition and attention-based refinement.
What carries the argument
The Multi-scale Pooling LIF Block (MPLB), which injects hierarchical multi-scale pooling responses into leaky integrate-and-fire membrane dynamics to enlarge the effective receptive field and induce scale-dependent spiking activations.
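The paper's exact update rule is not reproduced here, but the mechanism can be made concrete: pooled context at several scales is added to the neuron's input current before the usual leaky integration, threshold, and reset. A minimal NumPy sketch, in which the additive injection, the per-scale weights, and the nearest-neighbour upsampling are all our own assumptions rather than the published MPLB:

```python
import numpy as np

def avg_pool_broadcast(x, k):
    """Average-pool an (H, W) map with a k x k kernel, then broadcast the
    coarse map back to (H, W) by nearest-neighbour repetition.
    Assumes H and W are divisible by k."""
    H, W = x.shape
    assert H % k == 0 and W % k == 0
    pooled = x.reshape(H // k, k, W // k, k).mean(axis=(1, 3))
    return np.repeat(np.repeat(pooled, k, axis=0), k, axis=1)

def mplb_lif_step(v, x, scales=(1, 2, 4), tau=2.0, v_th=1.0, weights=None):
    """One LIF update with a multi-scale pooled context term added to the
    input current before leaky integration. The additive injection, the
    per-scale weights, and the upsampling scheme are assumptions, not the
    paper's published MPLB rule."""
    weights = weights or [1.0 / len(scales)] * len(scales)
    context = sum(w * avg_pool_broadcast(x, s) for w, s in zip(weights, scales))
    v = v + (1.0 / tau) * (-v + x + context)      # leaky integration
    spikes = (v >= v_th).astype(x.dtype)          # fire on threshold crossing
    v = v * (1.0 - spikes)                        # hard reset where spiked
    return spikes, v
```

Because the pooled term enters before thresholding, coarse scales can push a neuron over threshold even when its local input alone would not, which is the intended receptive-field enlargement.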
If this is right
- UIESNN delivers improved colour fidelity and spatial coherence on underwater images compared with earlier SNN designs.
- The network maintains competitive energy cost while operating entirely in the spike domain.
- Frequency decomposition and attention refinement are integrated without breaking the spike-driven constraint.
- State-of-the-art results among SNN-based methods are achieved on the EUVP and LSUI benchmarks.
Where Pith is reading between the lines
- The same multi-scale injection into membrane dynamics could be tested on terrestrial dehazing or low-light restoration where low-frequency biases also dominate.
- Because the design avoids very deep layers, it may allow direct deployment on battery-powered underwater vehicles with limited compute.
- Temporal consistency on video could be checked by feeding consecutive frames through the same scale-aware blocks.
Load-bearing premise
That hierarchical multi-scale pooling responses can be injected into spiking membrane dynamics to expand receptive field for global corrections without losing fine detail or spiking efficiency.
What would settle it
Evaluating UIESNN on the EUVP or LSUI benchmarks and finding no measurable gain in colour-fidelity metrics such as PSNR, SSIM, or UIQM over prior SNN methods, or finding substantially higher energy consumption.
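Of the fidelity metrics named above, PSNR is the simplest to make concrete; a minimal sketch (SSIM needs windowed local statistics and UIQM needs colourfulness/sharpness terms, both beyond this scope):

```python
import numpy as np

def psnr(reference, restored, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means the restored image is
    closer to the reference. Identical images give infinity."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)
```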
Original abstract
Underwater image enhancement (UIE) is a practically important yet underexplored application of spiking neural networks (SNNs), where the dominant degradations are large-scale and low-frequency, such as wavelength-dependent colour casts and scattering-induced veiling. Existing SNN restoration designs rely on locally bounded spiking perception, which can limit global correction and lead to saturated or inconsistent representations. To address these challenges, we propose a scale-aware SNN framework for UIE named UIESNN. At its core is a Multi-scale Pooling LIF Block (MPLB) that injects hierarchical multi-scale pooling responses into membrane dynamics, thereby enlarging the effective receptive field while preserving fine-grained details and inducing heterogeneous scale-dependent activations. Building on MPLB, we design a spiking residual architecture that integrates frequency decomposition and attention-based refinement in a fully spike-driven pipeline. Extensive experiments on the EUVP and LSUI benchmarks demonstrate that UIESNN achieves state-of-the-art performance among SNN-based methods, delivering improved colour fidelity and spatial coherence with competitive energy cost.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes UIESNN, a spiking neural network for underwater image enhancement (UIE) that introduces the Multi-scale Pooling LIF Block (MPLB). MPLB injects hierarchical multi-scale pooling responses into LIF membrane dynamics to enlarge the effective receptive field for large-scale low-frequency degradations (e.g., color casts and veiling) while preserving fine details and producing scale-dependent activations. The architecture builds a fully spike-driven residual pipeline with frequency decomposition and attention refinement. It claims state-of-the-art results among SNN-based methods on the EUVP and LSUI benchmarks, with gains in color fidelity and spatial coherence at competitive energy cost.
Significance. If the central performance claims and the claimed mechanism hold under scrutiny, this would represent a meaningful extension of SNNs into a practically relevant restoration domain where global context matters. The work supplies a concrete architectural innovation (MPLB) and reports energy-aware metrics, which are strengths for low-power vision applications. However, the absence of tabulated quantitative results, baseline comparisons, ablations, or error bars in the abstract leaves the SOTA claim unverified at present.
Major comments (3)
- [§3.2] MPLB definition: The paper states that MPLB 'injects hierarchical multi-scale pooling responses into membrane dynamics' but supplies no explicit update equation showing how the pooled features modify the LIF membrane potential V(t), the threshold, or the reset. Without this rule it is impossible to determine whether the operation remains a genuine spike-driven scale-aware mechanism or reduces to non-spiking feature concatenation that could be replicated in an ANN.
- [Experiments] Results on EUVP/LSUI: The abstract asserts SOTA performance among SNN methods with improved color fidelity and spatial coherence, yet no tables, PSNR/SSIM values, baseline comparisons (e.g., against other SNN or ANN UIE models), ablation studies on the MPLB components, or error bars are referenced. These data are load-bearing for the central claim and must be supplied with statistical detail.
- [§4] Architecture overview: The claim that the spiking residual pipeline with frequency decomposition remains 'fully spike-driven' while incorporating attention-based refinement requires an explicit statement of which operations are converted to spike-compatible forms and which are not; otherwise the energy-cost comparison with non-spiking methods cannot be evaluated.
Minor comments (2)
- [§3] Notation for the LIF neuron and pooling operations should be introduced once with consistent symbols (e.g., define τ, V_th, and the pooling kernel sizes) rather than appearing piecemeal.
- [Figures] Figure captions for the network diagram and qualitative results should explicitly label the input degradation types and the scale levels used in MPLB.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript to improve clarity and completeness while preserving the core contributions.
Point-by-point responses
-
Referee: [§3.2] MPLB definition: The paper states that MPLB 'injects hierarchical multi-scale pooling responses into membrane dynamics' but supplies no explicit update equation showing how the pooled features modify the LIF membrane potential V(t), the threshold, or the reset. Without this rule it is impossible to determine whether the operation remains a genuine spike-driven scale-aware mechanism or reduces to non-spiking feature concatenation that could be replicated in an ANN.
Authors: We agree that the absence of an explicit update rule hinders verification of the spiking nature of the mechanism. In the revised version, we will add the precise mathematical formulation in §3.2, showing how the hierarchical multi-scale pooling outputs are integrated into the LIF membrane potential update (specifically, as an additive term to the input current I(t) before the standard LIF integration step), while keeping the threshold and reset unchanged and ensuring the entire block remains event-driven with no non-spiking concatenation. revision: yes
-
Referee: [Experiments] Results on EUVP/LSUI: The abstract asserts SOTA performance among SNN methods with improved color fidelity and spatial coherence, yet no tables, PSNR/SSIM values, baseline comparisons (e.g., against other SNN or ANN UIE models), ablation studies on the MPLB components, or error bars are referenced. These data are load-bearing for the central claim and must be supplied with statistical detail.
Authors: We acknowledge that the experimental results require more explicit presentation and statistical rigor to support the SOTA claim. The revised manuscript will include comprehensive tables reporting PSNR, SSIM, and additional perceptual metrics on both EUVP and LSUI, direct comparisons against prior SNN-based UIE methods as well as representative ANN UIE baselines, detailed ablations isolating each MPLB component, and error bars derived from multiple independent runs with standard deviation and statistical significance tests. revision: yes
-
Referee: [§4] Architecture overview: The claim that the spiking residual pipeline with frequency decomposition remains 'fully spike-driven' while incorporating attention-based refinement requires an explicit statement of which operations are converted to spike-compatible forms and which are not; otherwise the energy-cost comparison with non-spiking methods cannot be evaluated.
Authors: We will expand §4 with a dedicated subsection that enumerates every operation: frequency decomposition is realized via spike-rate-based bandpass filtering, the residual connections and MPLB blocks are purely spike-driven, and the attention refinement is implemented using a spike-compatible attention module (operating on binary spike trains and membrane potentials). No non-spiking floating-point operations remain in the forward pass except for the final non-spiking reconstruction layer required for image output; this breakdown will enable direct energy-cost comparisons. revision: yes
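The rebuttal's "spike-compatible attention module (operating on binary spike trains and membrane potentials)" is not specified further; one common way such modules stay multiply-light is to derive a binary gate from firing rates and apply it to the spike tensor. A toy illustration of that idea, entirely our construction rather than the paper's module:

```python
import numpy as np

def spike_channel_attention(spikes):
    """Toy spike-compatible channel attention: per-channel firing rates are
    thresholded into a binary gate that masks the spike tensor, so the output
    stays binary and no floating-point attention map is needed. Our
    illustration only; the paper's refinement module is not specified here."""
    rates = spikes.mean(axis=(1, 2))                     # firing rate per channel (C,)
    gate = (rates >= rates.mean()).astype(spikes.dtype)  # binary channel gate
    return spikes * gate[:, None, None]                  # gated spikes, still binary
```

Gating with a binary mask preserves the spike-domain constraint the referee asks about: the output contains only values already present in the input spike tensor.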
Circularity Check
No circularity: the architectural design and the performance claims rest on independent grounds (design choices versus benchmark experiments).
Full rationale
The paper introduces UIESNN as a novel scale-aware SNN with MPLB that injects multi-scale pooling into LIF membrane dynamics, followed by residual blocks, frequency decomposition, and attention. Performance is reported from experiments on the EUVP and LSUI benchmarks rather than from any derivation that reduces to fitted inputs or self-referential definitions. No equations are presented that equate a 'prediction' to a parameter fit by construction, and no load-bearing uniqueness theorem or ansatz is smuggled in via self-citation. The central claim rests on the empirical outcome of the proposed architecture, which is evaluated against external benchmarks and does not collapse into a renaming or redefinition of its own inputs.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: Leaky integrate-and-fire neurons with local spiking perception are the appropriate base model for image restoration tasks.
Invented entities (1)
- Multi-scale Pooling LIF Block (MPLB): no independent evidence
Reference graph
Works this paper leans on
- [1] W. Maass, "Networks of spiking neurons: the third generation of neural network models," Neural Networks, vol. 10, no. 9, pp. 1659–1671, 1997.
- [2] J. Wang, L. Yu, L. Huang, C. Zhou, H. Zhang, Z. Song, H. Liu, M. Zhang, Z. Ma, and Z. Zhang, "Efficient speech command recognition leveraging spiking neural networks and progressive time-scaled curriculum distillation," Neural Networks, vol. 195, p. 108253, 2026.
- [3] K. Patel, E. Hunsberger, S. Batir, and C. Eliasmith, "A spiking neural network for image segmentation," arXiv preprint arXiv:2106.08921, 2021.
- [4] X. Luo, M. Yao, Y. Chou, B. Xu, and G. Li, "Integer-valued training and spike-driven inference spiking neural network for high-performance and energy-efficient object detection," in European Conference on Computer Vision. Springer, 2024, pp. 253–272.
- [5] T. Song, G. Jin, P. Li, K. Jiang, X. Chen, and J. Jin, "Learning a spiking neural network for efficient image deraining," in IJCAI, 2024.
- [6] S. Chen, T. Krajnik, F. Arvin, and A. Atapour-Abarghouei, "Exploring the potentials of spiking neural networks for image deraining," arXiv preprint arXiv:2512.02258, 2025.
- [7] Z. Lei, M. Yao, J. Hu, X. Luo, Y. Lu, B. Xu, and G. Li, "Spike2former: Efficient spiking transformer for high-performance image segmentation," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 2, 2025, pp. 1364–1372.
- [8] J. Y. Chiang and Y.-C. Chen, "Underwater image enhancement by wavelength compensation and dehazing," IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 1756–1769, 2011.
- [9] K. Wang, Y. Hu, J. Chen, X. Wu, X. Zhao, and Y. Li, "Underwater image restoration based on a parallel convolutional neural network," Remote Sensing, vol. 11, no. 13, p. 1591, 2019.
- [10] C. Li, S. Anwar, and F. Porikli, "Underwater scene prior inspired deep underwater image and video enhancement," Pattern Recognition, vol. 98, p. 107038, 2020.
- [11] R. Cong, W. Yang, W. Zhang, C. Li, C.-L. Guo, Q. Huang, and S. Kwong, "PUGAN: Physical model-guided underwater image enhancement using GAN with dual-discriminators," IEEE Transactions on Image Processing, 2023.
- [12] L. Peng, C. Zhu, and L. Bian, "U-shape transformer for underwater image enhancement," IEEE Transactions on Image Processing, 2023.
- [13] M. Khan, A. Negi, A. Kulkarni, S. S. Phutke, S. K. Vipparthi, and S. Murala, "Phaseformer: Phase-based attention mechanism for underwater image restoration and beyond," arXiv preprint arXiv:2412.01456, 2024.
- [14] C. Zhao, W. Cai, C. Dong, and C. Hu, "Wavelet-based Fourier information interaction with frequency diffusion adjustment for underwater image restoration," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 8281–8291.
- [15] X. Guo, Y. Dong, X. Chen, W. Chen, Z. Li, F. Zheng, and C.-M. Pun, "Underwater image restoration via polymorphic large kernel CNNs," arXiv preprint arXiv:2412.18459, 2024.
- [16] S. Chen, R. Thenius, F. Arvin, and A. Atapour-Abarghouei, "Deep-sea: Deep-learning enhancement for environmental perception in submerged aquatics," arXiv preprint arXiv:2508.12824, 2025.
- [17] V. Sudevan, F. Zayer, R. Kausar, S. Javed, H. Karki, G. De Masi, and J. Dias, "Underwater image enhancement by convolutional spiking neural networks," arXiv preprint arXiv:2503.20485, 2025.
- [18] J. Shao, H. Zhang, and J. Miao, "LAMSNN: Learnable adaptive modulation for artifact suppression in spiking underwater image enhancement networks," Neural Networks, p. 108210, 2025.
- [19] J. Wang, Z. Ma, X. Shen, C. Zhou, L. Zhao, H. Zhang, Y. Zhong, S. Cai, Z. Song, and Z. Zhang, "S^2M-Former: Spiking symmetric mixing branchformer for brain auditory attention detection," in The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. [Online]. Available: https://openreview.net/forum?id=WtMuGdHvh6
- [20] J. Wang, L. Yu, X. Shen, S. Guo, C. Zhou, L. Zhao, Y. Zhong, Z. Zhang, and Z. Ma, "Spikcommander: A high-performance spiking transformer with multi-view learning for efficient speech command recognition," arXiv preprint arXiv:2511.07883, 2025.
- [21] Y. Cao, Y. Chen, and D. Khosla, "Spiking deep convolutional neural networks for energy-efficient object recognition," IJCV, vol. 113, pp. 54–66, 2015.
- [22] E. O. Neftci, H. Mostafa, and F. Zenke, "Surrogate gradient learning in spiking neural networks: Bringing the power of gradient-based optimization to spiking neural networks," IEEE Signal Processing Magazine, vol. 36, no. 6, pp. 51–63, 2019.
- [23] R. Xu, J. Xie, J. Nie, J. Cao, and Y. Pang, "SNNSIR: A simple spiking neural network for stereo image restoration," arXiv preprint arXiv:2508.12271, 2025.
- [24] M. Yao, G. Zhao, H. Zhang, Y. Hu, L. Deng, Y. Tian, B. Xu, and G. Li, "Attention spiking neural networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 8, pp. 9393–9410, 2023.
- [25] R. Liu, Z. Jiang, S. Yang, and X. Fan, "Twin adversarial contrastive learning for underwater image enhancement and beyond," IEEE Transactions on Image Processing, vol. 31, pp. 4922–4936, 2022.
- [26] Z. Ma and C. Oh, "A wavelet-based dual-stream network for underwater image enhancement," in ICASSP, 2022, pp. 2769–2773.
- [27] M. J. Islam, Y. Xia, and J. Sattar, "Fast underwater image enhancement for improved visual perception," IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 3227–3234, 2020.
- [28] M. Horowitz, "1.1 Computing's energy problem (and what we can do about it)," in 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC). IEEE, 2014, pp. 10–14.