Aquatic Neuromorphic Optical Flow
Pith reviewed 2026-05-14 21:21 UTC · model grok-4.3
The pith
A self-supervised spiking neural network estimates per-pixel optical flow from underwater event streams without any labeled data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A spiking neural network trained in a fully self-supervised manner on raw event streams can recover accurate per-pixel optical flow fields in underwater scenes, matching leading supervised methods in visual and quantitative quality while consuming far less power and computation.
What carries the argument
A self-supervised spiking neural network that learns motion from asynchronous event streams by exploiting temporal consistency in the event data itself.
If this is right
- Real-time per-pixel motion estimation becomes practical on low-power underwater edge devices.
- Labeled underwater optical-flow datasets are no longer required for training.
- Neuromorphic pipelines can operate continuously in turbid or low-light aquatic conditions where frame cameras fail.
- Computational budgets for underwater vehicles drop enough to allow simultaneous execution of other perception tasks.
- Lightweight autonomous systems gain a pathway to agile navigation without heavy GPU hardware.
Where Pith is reading between the lines
- The same self-supervised event-to-flow pipeline could transfer to other data-scarce environments such as deep-sea or space-based robotics.
- Combining this network with spiking depth or segmentation heads would test whether a single neuromorphic stack can handle multiple underwater perception tasks.
- Long-term deployment tests on actual AUVs would reveal whether the efficiency gains translate into measurable increases in mission duration.
- If event noise characteristics differ sharply across water types, the method may need only minor recalibration rather than full retraining.
Load-bearing premise
Underwater event streams contain enough inherent structure that a spiking network can learn accurate optical flow without labeled examples or any domain-specific tuning.
What would settle it
The claim would fail if, on real underwater event sequences with independent ground-truth flow, the self-supervised network produced flow fields whose endpoint error exceeded that of a standard supervised baseline by more than 30 percent.
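The 30 percent criterion can be made concrete with a small check. The sketch below assumes dense flow fields stored as NumPy arrays of shape (H, W, 2) and uses the usual definition of average endpoint error (mean Euclidean distance between predicted and ground-truth flow vectors). The function names `average_endpoint_error` and `exceeds_margin` are hypothetical and do not come from the paper.

```python
import numpy as np

def average_endpoint_error(flow_pred, flow_gt, valid_mask=None):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    flow_pred, flow_gt: arrays of shape (H, W, 2) holding (u, v) per pixel.
    valid_mask: optional boolean (H, W) array selecting pixels with ground truth.
    """
    err = np.linalg.norm(flow_pred - flow_gt, axis=-1)
    if valid_mask is not None:
        err = err[valid_mask]
    return float(err.mean())

def exceeds_margin(aee_self_supervised, aee_supervised, margin=0.30):
    """True when the self-supervised error is more than `margin` above the baseline."""
    return aee_self_supervised > (1.0 + margin) * aee_supervised

# Toy data (assumption): zero ground-truth flow and two constant-offset predictions.
gt = np.zeros((4, 4, 2))
aee_sup = average_endpoint_error(gt + np.array([0.3, 0.4]), gt)
aee_ssl = average_endpoint_error(gt + np.array([0.6, 0.8]), gt)
settled_against = exceeds_margin(aee_ssl, aee_sup)
```

In practice the mask matters: underwater ground truth is likely to be sparse, so the error should be averaged only over pixels where a reference flow exists.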
Original abstract
Underwater environments impose severe constraints on conventional imaging systems and demand solutions that balance high-quality sensing with strict resource efficiency. While emerging event cameras offer a promising alternative, their potential in aquatic scenarios remains largely unexplored. Through the lens of neuromorphic vision, this work pioneers the investigation of motion fields that serve as key media for agile underwater perception. Built upon spiking neural networks, we introduce a self-supervised framework to estimate per-pixel optical flow from asynchronous event streams, elegantly bypassing the long-standing bottleneck of underwater data scarcity. Extensive evaluations demonstrate that our method achieves competitive visual and quantitative results against leading techniques while operating with superior computational efficiency. By bridging neuromorphic sensing and aquatic intelligence, this work opens new frontiers for lightweight, real-time, and low-cost perception on resource-constrained underwater edge platforms.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces a self-supervised framework based on spiking neural networks (SNNs) to estimate per-pixel optical flow directly from asynchronous event camera streams in underwater environments. It claims this approach bypasses the need for labeled aquatic datasets, achieves competitive visual and quantitative performance against existing methods, and delivers superior computational efficiency suitable for resource-constrained underwater edge platforms.
Significance. If validated, the work would meaningfully extend neuromorphic vision to challenging aquatic domains where conventional cameras fail due to scattering and attenuation. The self-supervised formulation is a notable strength, as it directly addresses data scarcity without requiring domain-specific labeled data or heavy adaptations, potentially enabling lightweight real-time perception on underwater vehicles.
Major comments (2)
- [Abstract] Abstract: the central claim that the method 'achieves competitive visual and quantitative results against leading techniques' cannot be evaluated because no datasets, metrics (e.g., average endpoint error, F1 score), baselines, or error analysis are presented; this directly undermines the assertion of bypassing underwater data scarcity.
- [Framework] Framework / loss definition (inferred from abstract description of self-supervised SNN): the approach relies on an implicit self-supervised loss (likely contrast maximization or time-surface consistency) without explicit terms for underwater scattering or low event density; if event rates drop below the threshold needed for stable gradients, the optimization can converge to trivial or noisy flow fields, violating the assumption that event streams contain sufficient non-degenerate structure.
Minor comments (2)
- The abstract refers to 'extensive evaluations' yet provides no reference to figures, tables, or supplementary material containing the quantitative results.
- Notation for the event stream representation and SNN spiking mechanism should be defined explicitly to support reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major point below and have revised the manuscript to improve clarity and completeness where appropriate.
- Referee: [Abstract] Abstract: the central claim that the method 'achieves competitive visual and quantitative results against leading techniques' cannot be evaluated because no datasets, metrics (e.g., average endpoint error, F1 score), baselines, or error analysis are presented; this directly undermines the assertion of bypassing underwater data scarcity.
  Authors: The abstract is a concise summary; the full manuscript details the evaluations in Section 4, including specific underwater-adapted event datasets, metrics such as average endpoint error and F1 score, baselines from prior event-based methods, and an error analysis. To address the concern directly, we have revised the abstract to briefly reference these elements and to emphasize the self-supervised approach on unlabeled data.
  Revision: yes
- Referee: [Framework] Framework / loss definition (inferred from abstract description of self-supervised SNN): the approach relies on an implicit self-supervised loss (likely contrast maximization or time-surface consistency) without explicit terms for underwater scattering or low event density; if event rates drop below the threshold needed for stable gradients, the optimization can converge to trivial or noisy flow fields, violating the assumption that event streams contain sufficient non-degenerate structure.
  Authors: Section 3 explicitly defines the self-supervised loss as a contrast-maximization objective integrated with the spiking network. No dedicated scattering term is present because the event-driven formulation prioritizes motion-induced changes, which are resilient to attenuation. We acknowledge the low-event-density concern and have added a discussion plus ablation results on event-rate thresholds to show that network regularization and sparsity handling avoid trivial solutions.
  Revision: partial
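For readers unfamiliar with contrast maximization, a minimal sketch may help. It assumes the variance-of-warped-events form of the objective from the contrast-maximization literature [29], [30]; the paper's exact loss is not reproduced on this page, so this is an illustration and not the authors' implementation. All function names are hypothetical, and nearest-neighbour voting is used for brevity where a trainable version would need a differentiable (for example bilinear) accumulation.

```python
import numpy as np

def warp_events(xs, ys, ts, flow, t_ref):
    """Transport each event along its pixel's flow vector to time t_ref.

    xs, ys: integer pixel coordinates; ts: timestamps in seconds.
    flow: (H, W, 2) array of (u, v) in pixels per second.
    """
    u = flow[ys, xs, 0]
    v = flow[ys, xs, 1]
    dt = t_ref - ts
    return xs + dt * u, ys + dt * v

def image_of_warped_events(xw, yw, shape):
    """Count warped events per pixel (nearest-neighbour voting, an assumption)."""
    h, w = shape
    xi = np.rint(xw).astype(int)
    yi = np.rint(yw).astype(int)
    inside = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    iwe = np.zeros(shape, dtype=float)
    np.add.at(iwe, (yi[inside], xi[inside]), 1.0)  # accumulates repeated indices
    return iwe

def contrast_loss(xs, ys, ts, flow, t_ref=0.0):
    """Negative variance of the image of warped events; lower means sharper."""
    xw, yw = warp_events(xs, ys, ts, flow, t_ref)
    iwe = image_of_warped_events(xw, yw, flow.shape[:2])
    return -float(iwe.var())

# Toy scene (assumption): one point moving right at 10 px/s along row 4.
ts = np.arange(6) * 0.1
xs = 2 + np.arange(6)
ys = np.full(6, 4)
true_flow = np.zeros((8, 10, 2))
true_flow[..., 0] = 10.0
zero_flow = np.zeros((8, 10, 2))
loss_true = contrast_loss(xs, ys, ts, true_flow)
loss_zero = contrast_loss(xs, ys, ts, zero_flow)
```

With the true flow, all six events collapse onto one pixel and the loss is lower than with zero flow, where the events stay smeared along the trajectory. The referee's low-density concern shows up here as well: with very few events the variance is small under any flow, so the objective gives little signal.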
Circularity Check
No significant circularity in self-supervised event-based optical flow derivation
The paper introduces a self-supervised spiking neural network framework for per-pixel optical flow estimation directly from asynchronous underwater event streams. No equations, loss definitions, or training procedures in the abstract or description reduce the output flow field to a fitted parameter or input by construction. The self-supervision claim relies on standard contrast or consistency objectives applied to the event data itself rather than any self-referential definition or prior self-citation that would force the result. The derivation chain remains independent of the target underwater results and does not import uniqueness theorems or ansatzes from the authors' prior work.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "We exploit contrast maximization (CM) to estimate per-pixel flow fields from event streams in a self-supervised fashion [29], [30]. ... L_CM(t′) = ..."
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean, LogicNat_induction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "the membrane potential u_i[t] of a postsynaptic neuron i at a timestep t follows u_i[t] = ζ u_i[t−1] + (1−ζ)(...)"
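The quoted membrane-potential update is the leaky integrate-and-fire recurrence, and a short sketch shows how it behaves over time. The input term is elided as "(...)" in the passage, so the code takes it as a generic `input_current` argument; the firing threshold and the hard reset to zero are likewise assumptions, not details confirmed by the paper.

```python
import numpy as np

def lif_step(u_prev, input_current, zeta=0.9, threshold=1.0):
    """One leaky integrate-and-fire update for a layer of neurons.

    u_prev: membrane potentials at timestep t-1.
    input_current: stand-in for the elided '(...)' term (assumption).
    zeta: leak factor from the quoted recurrence.
    """
    u = zeta * u_prev + (1.0 - zeta) * input_current
    spikes = (u >= threshold).astype(float)
    u = u * (1.0 - spikes)  # hard reset to zero after a spike (assumption)
    return u, spikes

def run_lif(inputs, zeta=0.9, threshold=1.0):
    """Run the recurrence over inputs of shape (T, N); returns spikes of shape (T, N)."""
    inputs = np.asarray(inputs, dtype=float)
    u = np.zeros_like(inputs[0])
    spike_train = []
    for x in inputs:
        u, s = lif_step(u, x, zeta, threshold)
        spike_train.append(s)
    return np.stack(spike_train)

# Toy drive (assumption): one neuron receiving a constant input of 1.5.
spike_train = run_lif(np.full((4, 1), 1.5), zeta=0.5, threshold=1.0)
```

The thresholding step is not differentiable, so training such a network with gradient descent typically relies on a surrogate gradient; that part is omitted here.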
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] S. P. González-Sabbagh and A. Robles-Kelly, “A survey on underwater computer vision,” ACM Computing Surveys, vol. 55, no. 13s, pp. 1–39, 2023.
- [2] Y. Zhu and L. F. Tadesse, “SpectroGen: a physically informed generative artificial intelligence for accelerated cross-modality spectroscopic materials characterization,” Matter, vol. 9, no. 1, p. 102434, 2026.
- [3] I. Schiopu and R. C. Bilcu, “Lossless compression of event camera frames,” IEEE Signal Processing Letters, vol. 29, pp. 1779–1783, 2022.
- [4] P. Zhang, Z. Ge, L. Song, and E. Y. Lam, “Neuromorphic imaging with density-based spatiotemporal denoising,” IEEE Transactions on Computational Imaging, vol. 9, pp. 530–541, 2023.
- [5] Y. Wang, C. Jiang, X. Jia, Y. Guo, and L. Yu, “Event-based shutter unrolling and motion deblurring in dynamic scenes,” IEEE Signal Processing Letters, vol. 31, pp. 1069–1073, 2024.
- [6] C. Wang, S. Zhu, P. Zhang, K. Wang, J. Huang, and E. Y. Lam, “Angle-based neuromorphic wave normal sensing,” Laser & Photonics Reviews, vol. 19, no. 4, p. 2400647, 2025.
- [7] P. Zhang, S. Zhu, C. Wang, Y. Zhao, and E. Y. Lam, “Neuromorphic imaging with super-resolution,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 2, pp. 1715–1727, 2025.
- [8] D. Gehrig and D. Scaramuzza, “Low-latency automotive vision with event cameras,” Nature, vol. 629, pp. 1034–1040, 2024.
- [9] S. Zhu, Q. Yin, C. Wang, J. Huang, and E. Y. Lam, “Ultrafast dynamic defect inspection with computational neuromorphic imaging,” Advanced Science, vol. 12, no. 44, p. e10338, 2025.
- [10] W. Chen, Y. Zhang, X. Sun, and F. Wu, “Event-based stereo depth estimation by temporal-spatial context learning,” IEEE Signal Processing Letters, vol. 31, pp. 1429–1433, 2024.
- [11] R. Chen, C. Wang, Y. Li, Y. Cao, S. Zhu, and E. Y. Lam, “Self-calibrated neuromorphic hyperspectral derivative imaging,” Optica, vol. 13, no. 4, pp. 587–590, 2026.
- [12] S. Shiba, Y. Aoki, and G. Gallego, “Fast event-based optical flow estimation by triplet matching,” IEEE Signal Processing Letters, vol. 29, pp. 2712–2716, 2023.
- [13] J. Wu, P. Duan, Z. Wang, C. Wang, B. Shi, and E. Y. Lam, “Dark-EvGS: event camera as an eye for radiance field in the dark,” IEEE Transactions on Image Processing, vol. 35, pp. 3172–3185, 2026.
- [14] Y. Peng, Y. Hong, Z. Hong, A. P.-Y. Chui, and J. Wu, “AquaticVision: benchmarking visual SLAM in underwater environment with events and frames,” arXiv preprint arXiv:2505.03448, 2025.
- [15] J. H. Klasson, B. Sorensen, K. Brummenaes, and G. B. Ellingsen, “Event-dataset for underwater SLAM,” https://github.com/OsloMet-OceanLab/underwater_event_dataset, 2023.
- [16] F. Zhang, Y. Zhong, L. Chen, and Z. Wang, “Event-based circular detection for AUV docking based on spiking neural network,” Frontiers in Neurorobotics, vol. 15, p. 815144, 2022.
- [17] S. Takatsuka, N. Miyamoto, H. Sato, Y. Morino, Y. Kurita, A. Yabuki, C. Chen, and S. Kawagucci, “Millisecond-scale behaviours of plankton quantified in vitro and in situ using the event-based vision sensor,” Ecology and Evolution, vol. 14, no. 8, p. e70150, 2024.
- [18] C. Luo, J. Wu, S. Sun, and P. Ren, “TransCODNet: underwater transparently camouflaged object detection via RGB and event frames collaboration,” IEEE Robotics and Automation Letters, vol. 9, no. 2, pp. 1444–1451, 2024.
- [19] X. Bi, P. Wang, T. Wu, F. Zha, and P. Xu, “Non-uniform illumination underwater image enhancement via events and frame fusion,” Applied Optics, vol. 61, no. 29, pp. 8826–8832, 2022.
- [20] X. Bi, P. Wang, W. Guo, F. Zha, and L. Sun, “RGB/event signal fusion framework for multi-degraded underwater image enhancement,” Frontiers in Marine Science, vol. 11, p. 1366815, 2024.
- [21] A. Z. Zhu, L. Yuan, K. Chaney, and K. Daniilidis, “EV-FlowNet: self-supervised optical flow estimation for event-based cameras,” in Proceedings of Robotics: Science and Systems, 2018.
- [22] J. Hagenaars, F. Paredes-Vallés, and G. de Croon, “Self-supervised learning of event-based optical flow with spiking neural networks,” Advances in Neural Information Processing Systems, vol. 34, pp. 7167–7179, 2021.
- [23] F. Paredes-Vallés, K. Y. Scheper, C. De Wagter, and G. C. De Croon, “Taming contrast maximization for learning sequential, low-latency, event-based optical flow,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 9661–9671.
- [24] W. Maass, “Networks of spiking neurons: the third generation of neural network models,” Neural Networks, vol. 10, no. 9, pp. 1659–1671, 1997.
- [25] A. K. Kosta and K. Roy, “Adaptive-SpikeNet: event-based optical flow estimation using spiking neural networks with learnable neuronal dynamics,” in IEEE International Conference on Robotics and Automation (ICRA), 2023, pp. 6021–6027.
- [26] C. Li, W. Liu, G. Gong, X. Ding, and X. Zhong, “SU-YOLO: spiking neural network for efficient underwater object detection,” Neurocomputing, vol. 644, p. 130310, 2025.
- [27] V. Sudevan, F. Zayer, R. Kausar, S. Javed, H. Karki, G. De Masi, and J. Dias, “Underwater image enhancement by convolutional spiking neural networks,” arXiv preprint arXiv:2503.20485, 2025.
- [28] T. G. Mayerhöfer, S. Pahlow, and J. Popp, “The Bouguer–Beer–Lambert law: shining light on the obscure,” ChemPhysChem, vol. 21, no. 18, pp. 2029–2046, 2020.
- [29] G. Gallego, H. Rebecq, and D. Scaramuzza, “A unifying contrast maximization framework for event cameras, with applications to motion, depth, and optical flow estimation,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 3867–3876.
- [30] G. Gallego, M. Gehrig, and D. Scaramuzza, “Focus is all you need: loss functions for event-based vision,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 12272–12281.
- [31] P. Zhang, H. Liu, Z. Ge, C. Wang, and E. Y. Lam, “Neuromorphic imaging with joint image deblurring and event denoising,” IEEE Transactions on Image Processing, vol. 33, pp. 2318–2333, 2024.
- [32] A. Z. Zhu, L. Yuan, K. Chaney, and K. Daniilidis, “Unsupervised event-based learning of optical flow, depth, and egomotion,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 989–997.
- [33] L. F. Abbott, “Lapicque’s introduction of the integrate-and-fire model neuron (1907),” Brain Research Bulletin, vol. 50, no. 5–6, pp. 303–304, 1999.
- [34] C. Scheerlinck, H. Rebecq, D. Gehrig, N. Barnes, R. Mahony, and D. Scaramuzza, “Fast image reconstruction with an event camera,” in IEEE Winter Conference on Applications of Computer Vision (WACV), 2020, pp. 156–163.
- [35] N. Ballas, L. Yao, C. Pal, and A. Courville, “Delving deeper into convolutional networks for learning video representations,” arXiv preprint arXiv:1511.06432, 2015.
- [36] D. P. Kingma and J. Ba, “Adam: a method for stochastic optimization,” in International Conference on Learning Representations (ICLR), 2015.
- [37] Z. Teed and J. Deng, “RAFT: recurrent all-pairs field transforms for optical flow,” in European Conference on Computer Vision (ECCV), 2020, pp. 402–419.
- [38] T. Stoffregen, C. Scheerlinck, D. Scaramuzza, T. Drummond, N. Barnes, L. Kleeman, and R. Mahony, “Reducing the sim-to-real gap for event cameras,” in European Conference on Computer Vision (ECCV), 2020, pp. 534–549.
- [39] Z. Zhou, Y. Zhu, C. He, Y. Wang, S. Yan, Y. Tian, and L. Yuan, “Spikformer: when spiking neural network meets transformer,” in International Conference on Learning Representations (ICLR), 2023.