A Retinomorphic Optical Spiking Neuron for Camouflaged Object Detection
Pith reviewed 2026-06-28 17:34 UTC · model grok-4.3
The pith
An optical spiking neuron emulates retinal center-surround fields and boosts camouflaged object detection accuracy in spiking networks by up to 28 percent.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim has two parts. First, a Hodgkin-Huxley-based optical spiking neuron incorporating a two-dimensional anti-ambipolar phototransistor operated in the subthreshold regime can emulate three retinal functions: antagonistic center-surround receptive fields (CSRFs) at a single wavelength with varying intensities, visual adaptation at 480 nm, and L-M cone opponency. Second, a CSRF-augmented spiking neural network built on this neuron improves accuracy over conventional spiking networks by 4.4 percent on FMNIST, 10.4 percent on COD10K, and 28.4 percent on synthetic camouflaged datasets, while consuming between 0.9 and 24.5 pJ per spike.
What carries the argument
The argument rests on the optical spiking neuron (OSHN), which pairs a Hodgkin-Huxley model with a two-dimensional anti-ambipolar phototransistor in subthreshold operation to generate wavelength- and intensity-sensitive spikes and to perform concurrent spectral-spatial retinal preprocessing.
If this is right
- The OSHN enables concurrent spectral-spatial processing and event-driven vision at the edge with response times between 4.2 microseconds and 1.25 milliseconds.
- The CSRF-augmented SNN outperforms both conventional spiking networks and existing photoactive spiking architectures on camouflaged object detection tasks.
- Energy consumption remains below 25 pJ per spike across dark, 480 nm, and 800 nm conditions, supporting low-power intelligent edge systems.
- The device supports visual adaptation that prevents saturation and L-M opponency that mimics midget ganglion cells.
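To put the per-spike energies in context, the following minimal sketch converts them into average spiking power using power = energy per spike × spike rate. The energies come from the abstract. Pairing every condition with the 2 kHz top of the reported rate range is an assumption made here for illustration, since the summary does not state which rate was measured under which illumination.

```python
# Energy per spike reported in the abstract, in joules.
ENERGY_PER_SPIKE_J = {"dark": 0.9e-12, "480 nm": 2.0e-12, "800 nm": 24.5e-12}

# Assumption: every condition is evaluated at the top of the reported 0-2 kHz range.
MAX_SPIKE_RATE_HZ = 2_000.0


def average_power_w(energy_per_spike_j: float, spike_rate_hz: float) -> float:
    """Average spiking power = energy per spike x spikes per second."""
    return energy_per_spike_j * spike_rate_hz


def power_bounds_nw(rate_hz: float = MAX_SPIKE_RATE_HZ) -> dict:
    """Per-condition spiking power in nanowatts at the given spike rate."""
    return {
        condition: average_power_w(energy, rate_hz) * 1e9
        for condition, energy in ENERGY_PER_SPIKE_J.items()
    }


if __name__ == "__main__":
    for condition, power_nw in power_bounds_nw().items():
        print(f"{condition}: {power_nw:.1f} nW")
```

Under that assumption the 800 nm case works out to roughly 49 nW of spiking power per neuron. Static bias and readout power are not included, so this is not a full device power figure.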
Where Pith is reading between the lines
- The same retinal preprocessing could be tested on dynamic video streams where motion and camouflage interact, an extension the paper does not perform.
- Hardware integration of multiple OSHNs into arrays might allow direct comparison of power draw against digital preprocessing pipelines on the same camouflaged scenes.
- If the wavelength-sensitive spike encoding scales to additional spectral bands, the approach could address multispectral camouflage detection without separate sensor channels.
Load-bearing premise
The accuracy gains in the camouflaged object detection task arise specifically from the retinal CSRF, adaptation, and opponency emulation rather than from unstated differences in network training, hyperparameters, or dataset handling.
What would settle it
Running the same spiking neural network architecture with identical training procedures and hyperparameters on the COD10K and synthetic camouflaged datasets once with the CSRF augmentation and once without it, then finding no accuracy difference, would falsify the central claim.
Original abstract
Advanced vision systems require retinomorphic, energy-efficient spike-based preprocessing of dynamic visual scenes. Here, we demonstrate multiple retinal preprocessing functionalities by leveraging a Hodgkin-Huxley-based optical spiking neuron (OSHN) that incorporates a two-dimensional anti-ambipolar phototransistor operated in the subthreshold regime to minimize power consumption. OSHN exhibits wavelength- and intensity-sensitive spike encoding with energy consumption per spike of 0.9 pJ under dark, 2 pJ at 480 nm (mid wavelength, M), and 24.5 pJ at 800 nm (long wavelength, L). The low (biological)-to-high spiking rate (0 - 2 kHz) with substantially faster response times (4.2 $\mu$s - 1.25 ms) than the human retina (30 ms - 60 ms), reveal OSHN's fast decision-making capability. OSHN facilitates concurrent spectral-spatial processing by emulating retinal antagonistic center-surround receptive fields (CSRFs) at a single wavelength (480 nm or 800 nm) with varying intensities, visual adaptation (at 480 nm) to prevent system saturation, and L-M cone opponency in midget ganglion cells. Finally, a CSRF-augmented spiking neural network (SNN) has been developed for camouflaged object detection, achieving 4.4%, 10.4%, and 28.4% improvements in accuracy over conventional SNN on FMNIST, COD10K, and synthetic camouflaged datasets, outperforming existing photoactive spiking architectures while enabling event-driven intelligent edge vision systems.
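As a software analogue of the antagonistic center-surround receptive field mentioned in the abstract, the minimal sketch below subtracts the mean of a pixel's eight neighbours from the pixel itself. This is a generic ON-center filter chosen for illustration. The function name, the 3×3 neighbourhood, and the edge handling are assumptions, and the code does not model the device physics or the authors' network.

```python
from typing import List

Image = List[List[float]]


def center_surround(image: Image, surround_weight: float = 1.0) -> Image:
    """ON-center response: each pixel minus the weighted mean of its 8 neighbours.

    Border pixels are handled by replicating the nearest edge value.
    """
    rows, cols = len(image), len(image[0])

    def pixel(r: int, c: int) -> float:
        # Clamp indices so the border reuses the nearest valid pixel.
        return image[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    response = []
    for r in range(rows):
        row = []
        for c in range(cols):
            surround = [
                pixel(r + dr, c + dc)
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            ]
            row.append(image[r][c] - surround_weight * sum(surround) / 8.0)
        response.append(row)
    return response


if __name__ == "__main__":
    spot = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    for row in center_surround(spot):
        print(row)
```

A uniform patch produces zero response, while a spot that differs from its surroundings produces a strong one. This is the property that makes center-surround preprocessing a plausible aid when object and background have similar mean intensity.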
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents an optical spiking neuron (OSHN) fabricated from a 2D anti-ambipolar phototransistor operated in the subthreshold regime, which emulates retinal center-surround receptive fields (CSRFs), visual adaptation, and L-M cone opponency via Hodgkin-Huxley dynamics. It reports wavelength- and intensity-dependent spiking with energies of 0.9 pJ (dark), 2 pJ (480 nm), and 24.5 pJ (800 nm), response times from 4.2 μs to 1.25 ms, and integrates the device into a CSRF-augmented spiking neural network that achieves accuracy gains of 4.4%, 10.4%, and 28.4% over a conventional SNN on FMNIST, COD10K, and synthetic camouflaged datasets.
Significance. If the accuracy improvements are shown to arise specifically from the retinomorphic CSRF emulation rather than uncontrolled differences in network configuration, the work would advance hardware implementations of biological preprocessing for low-power, event-driven edge vision. The explicit energy-per-spike figures and direct comparison to biological timescales provide concrete, falsifiable benchmarks.
major comments (1)
- [SNN results section] The central claim attributes the reported accuracy gains (4.4% on FMNIST, 10.4% on COD10K, 28.4% on synthetic data) to the OSHN's emulation of CSRFs, adaptation, and opponency. No evidence is supplied that the conventional SNN baseline employed identical spike encoding, training protocol, hyperparameters, loss function, initialization, or data preprocessing; any deviation in these factors could produce the observed deltas without the retinal features being causal.
minor comments (2)
- [Abstract] The parenthetical phrasing 'low (biological)-to-high spiking rate (0 - 2 kHz)' is unclear and should be reworded for precision.
- Figure captions and legends throughout should explicitly state error bars, number of devices measured, and any statistical tests used for the reported energies and response times.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. The single major comment raises a valid methodological concern about the SNN baseline. We address it directly below and commit to revisions that strengthen the causal attribution of the reported accuracy gains.
Point-by-point responses
- Referee: [SNN results section] The central claim attributes the reported accuracy gains (4.4% on FMNIST, 10.4% on COD10K, 28.4% on synthetic data) to the OSHN's emulation of CSRFs, adaptation, and opponency. No evidence is supplied that the conventional SNN baseline employed identical spike encoding, training protocol, hyperparameters, loss function, initialization, or data preprocessing; any deviation in these factors could produce the observed deltas without the retinal features being causal.
Authors: We agree that explicit documentation of identical experimental conditions is required to support the claim that the accuracy improvements arise specifically from the retinomorphic CSRF emulation. The submitted manuscript described the CSRF-augmented network architecture but did not include a side-by-side specification of all other variables. In the revised version we will add a dedicated subsection (and accompanying table) that lists, for both networks: (i) identical spike encoding scheme and rate normalization, (ii) the same training protocol, optimizer, learning-rate schedule, and loss function, (iii) identical hyperparameter values and random-seed initialization, and (iv) the same data-preprocessing pipeline. This addition will make the only controlled difference the insertion of the OSHN-derived CSRF, adaptation, and opponency operations, thereby addressing the referee’s concern.
Revision: yes
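One way to make the promised side-by-side specification checkable is to record each run as a configuration object and verify that the CSRF switch is the only field that differs. The sketch below is illustrative: the class `RunConfig`, its field names, and the default values are hypothetical placeholders, not settings taken from the manuscript.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RunConfig:
    # All field names and values are hypothetical placeholders.
    encoding: str = "rate"
    optimizer: str = "adam"
    learning_rate: float = 1e-3
    loss: str = "cross_entropy"
    seed: int = 0
    preprocessing: str = "normalize"
    use_csrf: bool = False


def differing_fields(a: RunConfig, b: RunConfig) -> list:
    """Names of the fields whose values differ between two runs."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


def is_controlled_ablation(baseline: RunConfig, augmented: RunConfig) -> bool:
    """True only if the CSRF switch is the single difference."""
    return differing_fields(baseline, augmented) == ["use_csrf"]


if __name__ == "__main__":
    baseline = RunConfig()
    augmented = RunConfig(use_csrf=True)
    print(is_controlled_ablation(baseline, augmented))
```

A run pair that also changes, say, the learning rate would fail this check, which is exactly the kind of confound the referee comment is concerned with.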
Circularity Check
No circularity; experimental hardware demonstration with reported empirical metrics
Full rationale
The paper describes fabrication and testing of an optical spiking neuron device followed by network accuracy measurements on standard datasets. No equations, parameter fitting, or derivation steps are present that would reduce any claimed result to its own inputs by construction. The accuracy deltas are presented as direct experimental outcomes rather than predictions derived from prior fitted values or self-referential definitions. Self-citations, if any, are not load-bearing for the central empirical claims.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: the Hodgkin-Huxley model accurately describes the dynamics of the optical spiking neuron.
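For reference, the classic Hodgkin-Huxley membrane equations take the form below. This is the textbook formulation. How the phototransistor current enters the OSHN circuit (for example as the drive term $I_{\mathrm{ext}}$ or through a light-dependent conductance) is not specified in this summary, so any such mapping is an assumption.

```latex
C_m \frac{dV}{dt} = I_{\mathrm{ext}}
  - \bar{g}_{\mathrm{Na}}\, m^{3} h \,(V - E_{\mathrm{Na}})
  - \bar{g}_{\mathrm{K}}\, n^{4} \,(V - E_{\mathrm{K}})
  - g_{\mathrm{L}} \,(V - E_{\mathrm{L}}),
\qquad
\frac{dx}{dt} = \alpha_x(V)\,(1 - x) - \beta_x(V)\, x,
\quad x \in \{m, h, n\}.
```

Here $V$ is the membrane potential, $C_m$ the membrane capacitance, $m$, $h$, and $n$ the gating variables, and $E_{\mathrm{Na}}$, $E_{\mathrm{K}}$, and $E_{\mathrm{L}}$ the reversal potentials of the sodium, potassium, and leak channels.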