arxiv: 2606.00818 · v1 · pith:J4IGTGXZnew · submitted 2026-05-30 · ⚛️ physics.app-ph · quant-ph

A Retinomorphic Optical Spiking Neuron for Camouflaged Object Detection

Pith reviewed 2026-06-28 17:34 UTC · model grok-4.3

classification ⚛️ physics.app-ph quant-ph
keywords retinomorphic · optical spiking neuron · camouflaged object detection · center-surround receptive fields · spiking neural network · phototransistor · retinal emulation · event-driven vision

The pith

An optical spiking neuron emulates retinal center-surround fields and boosts camouflaged object detection accuracy in spiking networks by up to 28 percent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a Hodgkin-Huxley-based optical spiking neuron that uses a two-dimensional anti-ambipolar phototransistor in the subthreshold regime to perform retinal-like preprocessing of visual scenes. It demonstrates that this device can encode spikes with wavelength and intensity sensitivity while emulating antagonistic center-surround receptive fields, visual adaptation, and cone opponency at low energy per spike. When these retinal features augment a spiking neural network, the system shows measurable accuracy gains on standard and camouflaged image datasets compared with conventional spiking networks. A sympathetic reader would care because the work points to hardware that could enable energy-efficient, event-driven vision at the edge without relying on conventional frame-based cameras.

Core claim

The central claim has two parts. First, a Hodgkin-Huxley-based optical spiking neuron incorporating a two-dimensional anti-ambipolar phototransistor operated in the subthreshold regime can emulate three retinal functions: antagonistic center-surround receptive fields (CSRFs) at a single wavelength with varying intensities, visual adaptation at 480 nm, and L-M cone opponency. Second, a CSRF-augmented spiking neural network built on this neuron achieves accuracy improvements of 4.4 percent on FMNIST, 10.4 percent on COD10K, and 28.4 percent on synthetic camouflaged datasets over conventional spiking networks, while the neuron consumes between 0.9 and 24.5 pJ per spike.

What carries the argument

The argument rests on the optical spiking neuron (OSHN), which uses a Hodgkin-Huxley model with a two-dimensional anti-ambipolar phototransistor in subthreshold operation to generate wavelength- and intensity-sensitive spikes and to perform concurrent spectral-spatial retinal preprocessing.
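
For readers unfamiliar with antagonistic center-surround receptive fields, the idea can be illustrated in software with a difference-of-Gaussians (DoG) filter, a common textbook model of retinal receptive fields. This is only a sketch under assumptions: the paper realizes the CSRF in device hardware, and the abstract does not say that a DoG kernel is used. The function names `dog_kernel` and `respond`, the kernel size, and both sigma values are hypothetical choices made for illustration.

```python
import math

def dog_kernel(size=9, sigma_center=1.0, sigma_surround=3.0):
    """Difference-of-Gaussians: narrow excitatory center minus wide inhibitory surround.

    All defaults are illustrative assumptions, not values from the paper.
    """
    half = size // 2

    def unit_gaussian(sigma):
        g = [[math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
              for x in range(size)] for y in range(size)]
        total = sum(map(sum, g))
        # Normalize to unit sum so that center and surround cancel on uniform input.
        return [[v / total for v in row] for row in g]

    center = unit_gaussian(sigma_center)
    surround = unit_gaussian(sigma_surround)
    return [[c - s for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(center, surround)]

def respond(image, kernel):
    """Valid-mode 2D correlation of a list-of-lists image with the kernel."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [[sum(kernel[i][j] * image[y + i][x + j]
                 for i in range(k) for j in range(k))
             for x in range(w - k + 1)]
            for y in range(h - k + 1)]

# A uniform field produces (numerically) no response; an isolated bright
# spot excites units centered on it and inhibits units that see it only
# in their surround.
uniform = [[0.5] * 15 for _ in range(15)]
spot = [[0.0] * 15 for _ in range(15)]
spot[7][7] = 1.0
uniform_out = respond(uniform, dog_kernel())
spot_out = respond(spot, dog_kernel())
```

The zero response to uniform input is the property that makes such a filter relevant to camouflage: it suppresses regions where object and background share the same mean intensity and responds only to local contrast.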

If this is right

  • The OSHN enables concurrent spectral-spatial processing and event-driven vision at the edge with response times between 4.2 microseconds and 1.25 milliseconds.
  • The CSRF-augmented SNN outperforms both conventional spiking networks and existing photoactive spiking architectures on camouflaged object detection tasks.
  • Energy consumption remains below 25 pJ per spike across dark, 480 nm, and 800 nm conditions, supporting low-power intelligent edge systems.
  • The device supports visual adaptation that prevents saturation and L-M opponency that mimics midget ganglion cells.
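
The energy-per-spike figures above can be turned into a rough average-power estimate by multiplying by a spike rate. The sketch below does this arithmetic at the top of the reported 0 to 2 kHz range. It assumes that the maximum rate can occur under each illumination condition, which the abstract does not state, and it ignores any static or readout power, so the results are illustrative upper bounds rather than figures from the paper. The function name `average_power_w` is hypothetical.

```python
# Energy per spike quoted in the abstract, converted to joules.
ENERGY_PER_SPIKE_J = {"dark": 0.9e-12, "480 nm": 2e-12, "800 nm": 24.5e-12}

# Upper end of the reported 0 to 2 kHz spiking range (assumed to apply
# to every condition; the abstract does not pair rates with wavelengths).
MAX_RATE_HZ = 2_000

def average_power_w(energy_per_spike_j, rate_hz):
    """Average spiking power = energy per spike x spikes per second."""
    return energy_per_spike_j * rate_hz

for condition, energy in ENERGY_PER_SPIKE_J.items():
    power_nw = average_power_w(energy, MAX_RATE_HZ) * 1e9
    print(f"{condition}: {power_nw:.1f} nW")
# dark: 1.8 nW
# 480 nm: 4.0 nW
# 800 nm: 49.0 nW
```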

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same retinal preprocessing could be tested on dynamic video streams where motion and camouflage interact, an extension the paper does not perform.
  • Hardware integration of multiple OSHNs into arrays might allow direct comparison of power draw against digital preprocessing pipelines on the same camouflaged scenes.
  • If the wavelength-sensitive spike encoding scales to additional spectral bands, the approach could address multispectral camouflage detection without separate sensor channels.

Load-bearing premise

The accuracy gains in the camouflaged object detection task arise specifically from the retinal CSRF, adaptation, and opponency emulation rather than from unstated differences in network training, hyperparameters, or dataset handling.

What would settle it

Running the same spiking neural network architecture with identical training procedures and hyperparameters on the COD10K and synthetic camouflaged datasets once with the CSRF augmentation and once without it, then finding no accuracy difference, would falsify the central claim.
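
The controlled comparison described here can be sketched as a paired ablation harness in which the only thing that changes between two runs is a CSRF flag. This is a minimal sketch, not the authors' protocol: `train_and_eval`, `use_csrf`, the config keys, and the stand-in `toy_train_and_eval` are hypothetical names, and the toy function returns made-up numbers purely so the harness can be exercised.

```python
import random
from statistics import mean, stdev

def paired_ablation(train_and_eval, config, seeds):
    """Run each seed twice, toggling only use_csrf, and return accuracy deltas."""
    deltas = []
    for seed in seeds:
        # Copy the config so neither run can mutate the other's settings.
        baseline = train_and_eval(dict(config), use_csrf=False, seed=seed)
        augmented = train_and_eval(dict(config), use_csrf=True, seed=seed)
        deltas.append(augmented - baseline)
    return deltas

def summarize(deltas):
    """Mean and sample standard deviation of the per-seed deltas."""
    return mean(deltas), (stdev(deltas) if len(deltas) > 1 else 0.0)

def toy_train_and_eval(config, use_csrf, seed):
    """Hypothetical stand-in for real training; NOT results from the paper."""
    rng = random.Random(seed)  # same seed gives the same baseline noise
    accuracy = 0.70 + rng.uniform(-0.01, 0.01)
    return accuracy + (config["toy_csrf_effect"] if use_csrf else 0.0)

# Placeholder hyperparameters shared by both arms of the comparison.
config = {"optimizer": "adam", "learning_rate": 1e-3, "epochs": 10,
          "toy_csrf_effect": 0.0}
deltas = paired_ablation(toy_train_and_eval, config, seeds=[0, 1, 2])
print(summarize(deltas))
```

With `toy_csrf_effect` set to zero the harness reproduces the falsifying outcome (no accuracy difference). In a real study, the spread across seeds is what shows whether a reported gain exceeds run-to-run noise.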

Original abstract

Advanced vision systems require retinomorphic, energy-efficient spike-based preprocessing of dynamic visual scenes. Here, we demonstrate multiple retinal preprocessing functionalities by leveraging a Hodgkin-Huxley-based optical spiking neuron (OSHN) that incorporates a two-dimensional anti-ambipolar phototransistor operated in the subthreshold regime to minimize power consumption. OSHN exhibits wavelength- and intensity-sensitive spike encoding with energy consumption per spike of 0.9 pJ under dark, 2 pJ at 480 nm (mid wavelength, M), and 24.5 pJ at 800 nm (long wavelength, L). The low (biological)-to-high spiking rate (0 - 2 kHz) with substantially faster response times (4.2 $\mu$s - 1.25 ms) than the human retina (30 ms - 60 ms), reveal OSHN's fast decision-making capability. OSHN facilitates concurrent spectral-spatial processing by emulating retinal antagonistic center-surround receptive fields (CSRFs) at a single wavelength (480 nm or 800 nm) with varying intensities, visual adaptation (at 480 nm) to prevent system saturation, and L-M cone opponency in midget ganglion cells. Finally, a CSRF-augmented spiking neural network (SNN) has been developed for camouflaged object detection, achieving 4.4%, 10.4%, and 28.4% improvements in accuracy over conventional SNN on FMNIST, COD10K, and synthetic camouflaged datasets, outperforming existing photoactive spiking architectures while enabling event-driven intelligent edge vision systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript presents an optical spiking neuron (OSHN) fabricated from a 2D anti-ambipolar phototransistor operated in the subthreshold regime, which emulates retinal center-surround receptive fields (CSRFs), visual adaptation, and L-M cone opponency via Hodgkin-Huxley dynamics. It reports wavelength- and intensity-dependent spiking with energies of 0.9 pJ (dark), 2 pJ (480 nm), and 24.5 pJ (800 nm), response times from 4.2 μs to 1.25 ms, and integrates the device into a CSRF-augmented spiking neural network that achieves accuracy gains of 4.4%, 10.4%, and 28.4% over a conventional SNN on FMNIST, COD10K, and synthetic camouflaged datasets.

Significance. If the accuracy improvements are shown to arise specifically from the retinomorphic CSRF emulation rather than uncontrolled differences in network configuration, the work would advance hardware implementations of biological preprocessing for low-power, event-driven edge vision. The explicit energy-per-spike figures and direct comparison to biological timescales provide concrete, falsifiable benchmarks.

major comments (1)
  1. [SNN results section] The central claim attributes the reported accuracy gains (4.4% on FMNIST, 10.4% on COD10K, 28.4% on synthetic data) to the OSHN's emulation of CSRFs, adaptation, and opponency. No evidence is supplied that the conventional SNN baseline employed identical spike encoding, training protocol, hyperparameters, loss function, initialization, or data preprocessing; any deviation in these factors could produce the observed deltas without the retinal features being causal.
minor comments (2)
  1. [Abstract] The parenthetical phrasing 'low (biological)-to-high spiking rate (0 - 2 kHz)' is unclear and should be reworded for precision.
  2. Figure captions and legends throughout should explicitly state error bars, number of devices measured, and any statistical tests used for the reported energies and response times.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed and constructive review. The single major comment raises a valid methodological concern about the SNN baseline. We address it directly below and commit to revisions that strengthen the causal attribution of the reported accuracy gains.

Point-by-point responses
  1. Referee: [SNN results section] The central claim attributes the reported accuracy gains (4.4% on FMNIST, 10.4% on COD10K, 28.4% on synthetic data) to the OSHN's emulation of CSRFs, adaptation, and opponency. No evidence is supplied that the conventional SNN baseline employed identical spike encoding, training protocol, hyperparameters, loss function, initialization, or data preprocessing; any deviation in these factors could produce the observed deltas without the retinal features being causal.

    Authors: We agree that explicit documentation of identical experimental conditions is required to support the claim that the accuracy improvements arise specifically from the retinomorphic CSRF emulation. The submitted manuscript described the CSRF-augmented network architecture but did not include a side-by-side specification of all other variables. In the revised version we will add a dedicated subsection (and accompanying table) that lists, for both networks: (i) identical spike encoding scheme and rate normalization, (ii) the same training protocol, optimizer, learning-rate schedule, and loss function, (iii) identical hyperparameter values and random-seed initialization, and (iv) the same data-preprocessing pipeline. This addition will make the only controlled difference the insertion of the OSHN-derived CSRF, adaptation, and opponency operations, thereby addressing the referee’s concern.

    revision: yes

Circularity Check

0 steps flagged

No circularity; experimental hardware demonstration with reported empirical metrics

Full rationale

The paper describes fabrication and testing of an optical spiking neuron device followed by network accuracy measurements on standard datasets. No equations, parameter fitting, or derivation steps are present that would reduce any claimed result to its own inputs by construction. The accuracy deltas are presented as direct experimental outcomes rather than predictions derived from prior fitted values or self-referential definitions. Self-citations, if any, are not load-bearing for the central empirical claims.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review provides insufficient detail to enumerate free parameters or invented entities; the Hodgkin-Huxley model is referenced as the basis for the neuron but treated as standard background.

axioms (1)
  • Domain assumption: the Hodgkin-Huxley model accurately describes the dynamics of the optical spiking neuron.
    Invoked to justify the spike encoding behavior of the OSHN.
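
To make this axiom concrete, the classic Hodgkin-Huxley equations can be integrated numerically to show how a larger drive current yields more spikes, which is the kind of intensity-to-rate encoding the OSHN is said to exhibit. This is a sketch under stated assumptions: it uses the textbook squid-axon constants and a constant input current as a stand-in for the light-dependent drive. The paper's phototransistor circuit, its parameters, and its kHz-range rates are not reproduced here, and the function names are hypothetical.

```python
import math

# Textbook squid-axon constants (assumed; NOT the paper's device parameters).
C_M = 1.0                               # membrane capacitance, uF/cm^2
G_NA, G_K, G_L = 120.0, 36.0, 0.3       # maximal conductances, mS/cm^2
E_NA, E_K, E_L = 50.0, -77.0, -54.387   # reversal potentials, mV

def _x_over_1_minus_exp(x, y):
    """x / (1 - exp(-x / y)), returning the limit y as x approaches 0."""
    return y if abs(x) < 1e-7 else x / (1.0 - math.exp(-x / y))

def rates(v):
    """Voltage-dependent opening/closing rates for gates m, h, n (1/ms)."""
    a_m = 0.1 * _x_over_1_minus_exp(v + 40.0, 10.0)
    b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
    a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    a_n = 0.01 * _x_over_1_minus_exp(v + 55.0, 10.0)
    b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return a_m, b_m, a_h, b_h, a_n, b_n

def count_spikes(i_ext, t_ms=100.0, dt=0.01):
    """Forward-Euler integration; counts upward crossings of 0 mV.

    i_ext is a constant drive in uA/cm^2, standing in for light input.
    """
    v = -65.0
    a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
    # Start the gating variables at their resting steady-state values.
    m, h, n = a_m / (a_m + b_m), a_h / (a_h + b_h), a_n / (a_n + b_n)
    spikes = 0
    for _ in range(int(t_ms / dt)):
        a_m, b_m, a_h, b_h, a_n, b_n = rates(v)
        i_ion = (G_NA * m ** 3 * h * (v - E_NA)
                 + G_K * n ** 4 * (v - E_K)
                 + G_L * (v - E_L))
        v_next = v + dt * (i_ext - i_ion) / C_M
        m += dt * (a_m * (1.0 - m) - b_m * m)
        h += dt * (a_h * (1.0 - h) - b_h * h)
        n += dt * (a_n * (1.0 - n) - b_n * n)
        if v < 0.0 <= v_next:
            spikes += 1
        v = v_next
    return spikes

for drive in (0.0, 10.0, 20.0):
    print(f"drive {drive:>4.1f} uA/cm^2 -> {count_spikes(drive)} spikes in 100 ms")
```

Whether the fabricated device follows these dynamics closely enough for the analogy to hold is exactly what the axiom assumes.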

pith-pipeline@v0.9.1-grok · 5844 in / 1188 out tokens · 23026 ms · 2026-06-28T17:34:23.324337+00:00 · methodology
