pith. machine review for the scientific record.

arxiv: 2605.00146 · v1 · submitted 2026-04-30 · 💻 cs.CV

Recognition: unknown

Real-Time Frame- and Event-based Object Detection with Spiking Neural Networks on Edge Neuromorphic Hardware: Design, Deployment and Benchmark

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 20:13 UTC · model grok-4.3

classification 💻 cs.CV
keywords: spiking neural networks · neuromorphic hardware · object detection · energy efficiency · real-time detection · edge computing · Loihi 2 · ANN-to-SNN conversion

The pith

Spiking neural networks on Loihi 2 perform real-time object detection with the lowest energy per inference among tested platforms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows how to design spiking neural networks for object detection that run on neuromorphic hardware such as the Intel Loihi 2 processor. The networks handle both regular video frames and event-based camera data in real time. A distillation method that trains the spiking networks from conventional ones recovers nearly all of the original accuracy while using far less energy than GPU-based systems. This matters for battery-powered robots and drones that need to see and react without quickly draining power.

Core claim

The central discovery is that spiking neural networks deployed on the Loihi 2 neuromorphic processor can achieve real-time object detection on both frame-based and event-based datasets. With ANN-to-SNN distillation-aware training, these networks recover 87-100% of the detection accuracy of their artificial neural network counterparts. They also exhibit the lowest per-inference dynamic energy consumption and lower overall power draw than ANNs running on NVIDIA Jetson and Apple M2 platforms, although ANNs on the Jetson Orin Nano achieve higher inference rates.

What carries the argument

ANN-to-SNN distillation-aware training combined with deployment adaptations for the Loihi 2 processor, which converts standard detection models into energy-efficient spiking versions while preserving accuracy.
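To make the load-bearing mechanism concrete, here is a minimal PyTorch sketch of feature-map distillation in the style the paper's Figure 5 describes: tapped SNN feature maps are projected with learnable 1x1 convolutions, averaged over the time dimension, and matched to the corresponding ANN teacher maps. The tensor layout, channel counts, and MSE matching loss are assumptions of this sketch, not the authors' exact implementation.

```python
import torch.nn as nn
import torch.nn.functional as F

class FeatureDistillLoss(nn.Module):
    """Illustrative ANN-to-SNN feature-map distillation (cf. Figure 5).

    Assumes SNN feature maps carry a trailing time dimension,
    (B, C, H, W, T), as in lava-dl-style layouts; ANN teacher maps are
    (B, C', H, W). Layout and channel counts are sketch assumptions.
    """

    def __init__(self, snn_channels, ann_channels):
        super().__init__()
        # One learnable 1x1 projection per tapped layer (e.g., layers 3 and 9).
        self.projections = nn.ModuleList(
            nn.Conv2d(c_snn, c_ann, kernel_size=1)
            for c_snn, c_ann in zip(snn_channels, ann_channels)
        )

    def forward(self, snn_feats, ann_feats):
        loss = 0.0
        for proj, f_snn, f_ann in zip(self.projections, snn_feats, ann_feats):
            f_snn = proj(f_snn.mean(dim=-1))   # average over T, match channels
            loss = loss + F.mse_loss(f_snn, f_ann.detach())  # frozen teacher
        return loss

# Hypothetical use alongside the detection loss (T = 7 in the paper):
#   distill = FeatureDistillLoss(snn_channels=[64, 256], ann_channels=[64, 256])
#   total = det_loss + lambda_distill * distill(snn_taps, ann_taps)
```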

If this is right

  • Real-time detection becomes viable on severely power-limited edge devices like UAVs and mobile robots.
  • Event-based inputs pair efficiently with SNNs, potentially reducing data processing overhead.
  • Distillation training closes much of the accuracy gap between spiking and non-spiking networks for detection tasks.
  • Neuromorphic platforms like Loihi 2 offer superior power efficiency for vision at the edge despite lower peak speeds.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar approaches might apply to other real-time tasks such as tracking or segmentation on neuromorphic hardware.
  • The energy savings could enable longer operation times in autonomous systems without increasing hardware size.
  • Further optimizations in model size or hardware mapping might push inference rates closer to those of GPUs while retaining energy benefits.

Load-bearing premise

That energy consumption and latency figures measured across Loihi 2, Jetson, and Apple platforms can be compared directly without hidden variations in measurement methods, clock speeds, or how the detection workload is represented.

What would settle it

Conducting identical object detection experiments on the same input data using both the Loihi 2 SNN implementation and an ANN on the Jetson Orin Nano, with standardized power and timing measurements.
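A minimal sketch of what such a standardized measurement could look like, assuming each platform exposes total-power samples at a fixed rate: subtract idle (static) power, integrate over the run, and divide by completed inferences. The helper name and its conventions (rectangle-rule integration, clamping at zero) are illustrative assumptions, not the paper's protocol.

```python
def dynamic_energy_per_inference(power_samples_w, sample_period_s,
                                 idle_power_w, num_inferences):
    """Per-inference dynamic energy (J): integrate (P_total - P_idle)
    over the benchmark run, then divide by inferences completed."""
    dyn_power_w = [max(p - idle_power_w, 0.0) for p in power_samples_w]
    dyn_energy_j = sum(dyn_power_w) * sample_period_s  # rectangle rule
    return dyn_energy_j / num_inferences

# Example: a 10 s run sampled at 10 Hz (100 samples), 1.2 W idle,
# 1600 inferences completed; result reported in millijoules:
#   mj = 1e3 * dynamic_energy_per_inference(samples, 0.1, 1.2, 1600)
```

Pinning down the inference start/end boundaries and the idle baseline on every platform is exactly the detail the referee report below asks for.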

Figures

Figures reproduced from arXiv: 2605.00146 by Cesar Cadena, Matteo Fumagalli, Silvia Tolu, Udayanga G.W.K.N. Gamage, Yan Zeng.

Figure 1. Overall workflow for SNN model development, training, and deployment on Loihi 2 …
Figure 2. Proposed SNN model backbones for (a) Model-1, (b) Model-2, (c) Model-3. Each …
Figure 3. Anchor-free detection head.
Figure 4. Event-based data encoding for event-based object detection: (a) positive (red) and negative (blue) events triggered within a 30 ms time period; (b) a voxel-grid representation with three bins (B0, B1, and B2), each with a temporal length of 10 ms, illustrating how the value of bin B0 is computed using weights based on temporal distances from its center (C0); (c) a 2D two-channel event histogram, where all … (see the encoding sketch after the figure list)
Figure 5. ANN-to-SNN distillation for Model-1 on the Prophesee GEN1 dataset, which contains two object classes: SNN feature maps from the 3rd and 9th layers are convolved with learnable 1x1 convolutions and then averaged across the time dimension T (T = 7 in our case); the feature-map distillation loss (L_feat-distill) is then computed against the corresponding ANN feature maps. Since GEN1 has two ob…
Figure 6. Sample images from the tunnel inspection dataset with bounding box–annotated defects.
Figure 7. (a) mAP@0.5 improvements of Model-1 and Model-2 on four benchmark datasets; (b) mAP@0.5:0.95 improvements of Model-1 and Model-2 on the same datasets. Error bars represent the standard deviation of detection metrics for distilled SNN models; for each dataset, distillation-aware training was performed five times using different random seeds with Kaiming normal initialization …
Figure 8. (a) mAP@0.5 vs. inference rate, and (b) mAP@0.5 vs. dynamic energy per sample (mJ), comparing the single-chip Loihi 2 system on the Intel Oheo Gulch platform [34], the Jetson Nano B01 edge GPU, and the Jetson Orin Nano edge GPU (batch size = 1). Results are reported for Model-1, Model-2, and Model-3 across event-based datasets.
Figure 9. (a) mAP@0.5 vs. inference rate, and (b) mAP@0.5 vs. dynamic energy per sample (mJ), comparing the single-chip Loihi 2 system on the Intel Oheo Gulch platform [34], the Jetson Nano B01 edge GPU, and the Jetson Orin Nano edge GPU (batch size = 1). Results are reported for Model-1, Model-2, and Model-3 across frame-based datasets.
Figure 10. Qualitative visualization of sample detections with Model-2 SNN on the UAV-based tunnel dataset …
Figure 11. (a) Number of SOP operations vs. inference rate (throughput); (b) number of …
Figure A.1. Qualitative visualization of sample detections on (a) evCIVIL-ev, (b) GEN1, (c) evCIVIL-fr, and (d) PASCAL VOC datasets. For each dataset, the first row shows the correct detections …
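To ground the two event encodings of Figure 4, here is a minimal NumPy sketch of the two-channel histogram (cf. [17]) and a bilinearly weighted voxel grid (cf. [18]); the polarity convention, normalization, and array layout are assumptions of this sketch and may differ from the paper's implementation.

```python
import numpy as np

def two_channel_histogram(xs, ys, ps, h, w):
    """2D two-channel event histogram (Figure 4c): channel 0 counts
    negative events per pixel, channel 1 positive events."""
    hist = np.zeros((2, h, w), dtype=np.float32)
    np.add.at(hist, (ps.astype(int), ys, xs), 1.0)  # ps in {0, 1}
    return hist

def voxel_grid(xs, ys, ts, ps, h, w, num_bins=3):
    """Voxel-grid encoding (Figure 4b): each event is split across the
    two nearest temporal bins with weights based on its distance to the
    bin centers; polarity enters with sign."""
    grid = np.zeros((num_bins, h, w), dtype=np.float32)
    t0, t1 = float(ts.min()), float(ts.max())
    # Fractional bin coordinate in [0, num_bins - 1].
    tn = (ts - t0) / max(t1 - t0, 1e-9) * (num_bins - 1)
    lo = np.floor(tn).astype(int)
    frac = (tn - lo).astype(np.float32)
    pol = np.where(ps > 0, 1.0, -1.0).astype(np.float32)
    np.add.at(grid, (lo, ys, xs), pol * (1.0 - frac))
    hi = np.clip(lo + 1, 0, num_bins - 1)
    np.add.at(grid, (hi, ys, xs), pol * frac)  # zero weight when frac == 0
    return grid
```

For a 30 ms event slice with num_bins=3, this mirrors the B0/B1/B2 layout of Figure 4(b), up to the normalization choice.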
Original abstract

Real-time object detection on energy-constrained platforms is critical for applications such as UAV-based inspection, autonomous navigation, and mobile robotics. Spiking neural networks (SNNs) on neuromorphic hardware are believed to be significantly more energy-efficient than conventional artificial neural networks (ANNs). In this work, we present a comprehensive methodology for designing general SNN detection architectures targeting neuromorphic platforms, along with the engineering adaptations required to deploy them on the state-of-the-art Neuromorphic processor, Intel Loihi 2. We benchmark SNN-based object detection on Loihi 2 using both frame-based and event-based datasets, comparing performance with ANN-based detection on the NVIDIA Jetson Orin Nano, NVIDIA Jetson Nano B01, and the Apple M2 CPU. Our results show that SNNs on Loihi 2 can perform real-time detection while achieving the lowest per-inference dynamic energy among all platforms. Also, Loihi 2 outperforms the other platforms in terms of power consumption, though ANNs on Jetson Orin Nano achieve higher inference rates. Furthermore, our ANN-to-SNN distillation-aware training enables SNNs to recover 87-100% of the detection accuracy of their ANN counterparts while maintaining lower inference latency; without distillation, SNNs exhibit an 11-27% accuracy drop. These results highlight the potential of neuromorphic systems for energy-efficient, real-time object detection at the edge.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a methodology for designing and deploying spiking neural networks (SNNs) for real-time object detection on Intel Loihi 2 neuromorphic hardware, including adaptations for frame-based and event-based inputs and ANN-to-SNN distillation-aware training. It benchmarks SNN performance on Loihi 2 against ANN-based detectors on NVIDIA Jetson Orin Nano, Jetson Nano B01, and Apple M2, claiming that SNNs achieve the lowest per-inference dynamic energy and power consumption while maintaining real-time operation and recovering 87-100% of ANN accuracy (versus an 11-27% drop without distillation).

Significance. If the cross-platform energy and latency comparisons are shown to be methodologically equivalent, the work provides a valuable practical demonstration of SNN deployment for energy-efficient edge object detection in applications such as UAV inspection and robotics. The engineering details on Loihi 2 deployment, use of both input modalities, and distillation results constitute useful implementation contributions for the neuromorphic vision community.

major comments (2)
  1. [Experimental Setup / Benchmarking section] The section on experimental setup and hardware measurement protocols does not specify the exact definition of 'per-inference dynamic energy' (e.g., inference start/end boundaries, subtraction of static/leakage power, inclusion of spike encoding or DMA overhead) nor provide a common normalization (such as energy per output bounding box or per effective MAC) that would allow direct comparison between Loihi 2 on-chip counters, Jetson nvpmodel/external meters, and M2 powermetrics. This directly undermines the central claim that Loihi 2 achieves the lowest dynamic energy while remaining real-time.
  2. [Results section] The results reporting the 87-100% accuracy recovery with distillation and the 11-27% drop without it lack accompanying details on the precise model architectures (e.g., backbone, detection head), the exact datasets used for the frame-based and event-based cases, training hyperparameters, and any statistical tests or variance across runs. Without these, the quantitative claims cannot be independently assessed or generalized.
minor comments (2)
  1. [Figures and Tables] Figure captions and tables should explicitly state the input modality (frame vs. event) and the exact hardware configuration for each reported metric to improve readability.
  2. [Abstract] The abstract would benefit from naming the specific datasets and model sizes used to support the quantitative claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments that help strengthen the clarity and reproducibility of our work. We address each major point below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Experimental Setup / Benchmarking section] The section on experimental setup and hardware measurement protocols does not specify the exact definition of 'per-inference dynamic energy' (e.g., inference start/end boundaries, subtraction of static/leakage power, inclusion of spike encoding or DMA overhead) nor provide a common normalization (such as energy per output bounding box or per effective MAC) that would allow direct comparison between Loihi 2 on-chip counters, Jetson nvpmodel/external meters, and M2 powermetrics. This directly undermines the central claim that Loihi 2 achieves the lowest dynamic energy while remaining real-time.

    Authors: We agree that precise definitions are required to support the energy-efficiency claims. In the revised manuscript we will explicitly state the start and end boundaries used for each per-inference measurement, describe how static/leakage power is subtracted on every platform, and clarify the treatment of spike encoding and DMA overheads. We will also add a normalized metric (energy per output bounding box) alongside the raw per-inference figures to enable direct cross-platform comparison. These additions will be placed in the Experimental Setup section and referenced in the Results. revision: yes

  2. Referee: [Results section] The results reporting the 87-100% accuracy recovery with distillation and the 11-27% drop without it lack accompanying details on the precise model architectures (e.g., backbone, detection head), the exact datasets used for the frame-based and event-based cases, training hyperparameters, and any statistical tests or variance across runs. Without these, the quantitative claims cannot be independently assessed or generalized.

    Authors: We acknowledge the need for fuller documentation. The revised Results section will include: (i) exact backbone and detection-head architectures for both frame-based and event-based models, (ii) the precise datasets and splits used in each case, (iii) all training hyperparameters (learning rates, epochs, distillation temperature, etc.), and (iv) accuracy statistics (mean and standard deviation) across at least three independent runs together with the statistical test employed. These details will be presented in a new table or expanded text to allow independent assessment and generalization of the distillation results. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical implementation and hardware benchmarking paper with no derivations or self-referential predictions.

full rationale

The paper presents a design methodology for SNN object detection architectures, engineering adaptations for Loihi 2 deployment, and direct benchmark comparisons of energy, latency, power, and accuracy against ANN baselines on Jetson and Apple platforms. All reported results (real-time detection, lowest dynamic energy on Loihi 2, 87-100% accuracy recovery via distillation) derive from implementation, training runs, and platform-specific measurements rather than from equations, fitted parameters renamed as predictions, or self-citation chains. No load-bearing steps reduce by construction to inputs within the paper; the results are grounded in external hardware measurements.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper is an applied systems and benchmarking study; its claims rest on empirical training procedures and hardware measurements rather than new mathematical axioms, free parameters, or postulated entities.

pith-pipeline@v0.9.0 · 5584 in / 1189 out tokens · 80110 ms · 2026-05-09T20:13:05.669164+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

69 extracted references · 32 canonical work pages · 2 internal anchors

  1. [2] N. M. Alahdal, F. Abukhodair, L. H. Meftah, A. Cherif, Real-time object detection in autonomous vehicles with YOLO, Procedia Computer Science 246 (2024) 2792–2801, 28th International Conference on Knowledge Based and Intelligent Information and Engineering Systems (KES 2024). doi:10.1016/j.procs.2024.09.392

  2. [3] K. Luo, X. Kong, J. Zhang, J. Hu, J. Li, H. Tang, Computer vision-based bridge inspection and monitoring: A review, Sensors 23 (18) (2023). doi:10.3390/s23187863

  3. [4] M. T. Rasheed, D. Shi, H. Khan, A comprehensive experiment-based review of low-light image enhancement methods and benchmarking low-light image quality assessment, Signal Processing 204 (2023) 108821. doi:10.1016/j.sigpro.2022.108821

  4. [5] G. Gallego, T. Delbrück, G. Orchard, C. Bartolozzi, B. Taba, A. Censi, S. Leutenegger, A. J. Davison, J. Conradt, K. Daniilidis, D. Scaramuzza, Event-based vision: A survey, IEEE Transactions on Pattern Analysis and Machine Intelligence 44 (1) (2022) 154–180. doi:10.1109/TPAMI.2020.3008413

  5. [6] J. Redmon, A. Farhadi, YOLOv3: An incremental improvement, arXiv preprint arXiv:1804.02767 (2018)

  6. [7] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, A. C. Berg, SSD: Single shot multibox detector, in: Computer Vision – ECCV 2016, Springer International Publishing, Cham, 2016, pp. 21–37

  7. [8] D. Ngo, H.-C. Park, B. Kang, Edge intelligence: A review of deep neural network inference in resource-limited environments, Electronics 14 (12) (2025)

  8. [9] P. Mittal, A comprehensive survey of deep learning-based lightweight object detection models for edge devices, Artificial Intelligence Review 57 (9) (2024) 242. doi:10.1007/s10462-024-10877-1

  9. [10] J. Chen, X. Ran, Deep learning with edge computing: A review, Proceedings of the IEEE 107 (8) (2019) 1655–1674. doi:10.1109/JPROC.2019.2921977

  10. [11] S. Liu, J. Zha, J. Sun, Z. Li, G. Wang, EdgeYOLO: An edge-real-time object detector, in: 2023 42nd Chinese Control Conference (CCC), Tianjin, China, 2023, pp. 7507–7512. doi:10.23919/CCC58697.2023.10239786

  11. [12] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, L.-C. Chen, MobileNetV2: Inverted residuals and linear bottlenecks, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 4510–4520

  12. [13] F. Paredes-Vallés, J. J. Hagenaars, J. Dupeyroux, S. Stroobants, Y. Xu, G. C. H. E. de Croon, Fully neuromorphic vision and control for autonomous drone flight, Science Robotics 9 (90) (2024) eadi0591; erratum: Science Robotics 9 (91) (2024) eadr0223. doi:10.1126/scirobotics.adi0591

  13. [14] A. Joshi, S. Sanyal, K. Roy, Real-time neuromorphic navigation: Integrating event-based vision and physics-driven planning on a Parrot Bebop2 quadrotor, arXiv abs/2407.00931 (2024)

  14. [16] Y. Hu, S.-C. Liu, T. Delbruck, v2e: From video frames to realistic DVS events, in: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Virtual, 2021, pp. 1312–1321. doi:10.1109/CVPRW53098.2021.00144

  15. [17] A. Sironi, M. Brambilla, N. Bourdis, X. Lagorce, R. Benosman, HATS: Histograms of averaged time surfaces for robust event-based object classification, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 2018, pp. 1731–1740. doi:10.1109/CVPR.2018.00186

  16. [18] D. Gehrig, A. Loquercio, K. G. Derpanis, D. Scaramuzza, End-to-end learning of representations for asynchronous event-based data, in: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, South Korea, 2019, pp. 5633–5643. doi:10.1109/ICCV.2019.00573

  17. [19] L. Cordone, B. Miramond, P. Thierion, Object detection with spiking neural networks on automotive event data, in: 2022 International Joint Conference on Neural Networks (IJCNN), Padua, Italy, 2022, pp. 1–8. doi:10.1109/IJCNN55064.2022.9892618

  18. [20] J. D. Nunes, M. Carvalho, D. Carneiro, J. S. Cardoso, Spiking neural networks: A survey, IEEE Access 10 (2022) 60738–60764. doi:10.1109/ACCESS.2022.3179968

  19. [21] W. Fang, Z. Yu, Y. Chen, T. Masquelier, T. Huang, Y. Tian, Incorporating learnable membrane time constant to enhance learning of spiking neural networks, in: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 2641–2651. doi:10.1109/ICCV48922.2021.00265

  20. [22] K. Roy, A. R. Jaiswal, P. Panda, Towards spike-based machine intelligence with neuromorphic computing, Nature 575 (7784) (2019) 607–617. doi:10.1038/s41586-019-1677-2

  21. [23] M. Dampfhoffer, T. Mesquida, A. Valentian, L. Anghel, Backpropagation-based learning techniques for deep spiking neural networks: A survey, IEEE Transactions on Neural Networks and Learning Systems 35 (9) (2024) 11906–11921. doi:10.1109/TNNLS.2023.3263008

  22. [24] S. B. Shrestha, G. Orchard, SLAYER: Spike layer error reassignment in time, in: Advances in Neural Information Processing Systems (NeurIPS), Vol. 31, 2018

  23. [25] P. Panda, Spiking neural networks, ESWEEK 2021 Tutorial, Yale University, accessed: 2025-04-28 (2021). URL https://www.youtube.com/watch?v=7TybETlCslM

  24. [26] IBM, What is neuromorphic computing?, accessed: 2025-05-01 (2025). URL https://www.ibm.com/think/topics/neuromorphic-computing

  25. [27] BrainChip Inc., Akida — BrainChip neuromorphic chip overview, accessed: Dec. 7, 2025 (2025). URL https://brainchip.com/neuromorphic-chip-maker-takes-aim-at-the-edge

  26. [28] SynSense, DYNAP-CNN — event-driven neuromorphic AI processor for vision processing, accessed: Dec. 7, 2025 (2025). URL https://open-neuromorphic.org/neuromorphic-computing/hardware

  27. [29] S. Furber, F. Galluppi, S. Temple, L. A. Plana, SpiNNaker: A spiking neural network architecture, Journal of Neural Engineering 11 (5) (2014) 056021

  28. [30] Intel Corporation, Loihi neuromorphic research chip, accessed: Dec. 7, 2025 (2018). URL https://open-neuromorphic.org/neuromorphic-computing/hardware

  29. [31] Intel Corporation, Neuromorphic computing: Loihi 2 technology brief, Tech. rep., Intel Corporation, accessed: 2025-05-01 (2025). URL https://www.intel.com/content/www/us/en/research/neuromorphic-computing-loihi-2-technology-brief.html

  30. [32] A. Wild, M. Richter, Lava: Open-source framework for neuro-inspired applications, online video, accessed: 2025-05-01 (2023). URL https://www.youtube.com/watch?v=vXZukQ6A79k

  31. [33] Lava Neuromorphic Computing Community, lava-dl: Deep learning library for Lava, GitHub repository, accessed: 2025-05-01 (2025). URL https://github.com/lava-nc/lava-dl

  32. [34] Intel Corporation, Intel advances neuromorphic with Loihi 2, new Lava software framework and new partners, Intel Newsroom press release, accessed: 2025-12-08 (2021). URL https://www.intel.com/content/www/us/en/newsroom/news/intel-advances-neuromorphic-loihi-2-lava.html

  33. [35] MIT 6.S965, Lecture 05 – Quantization (Part I), accessed: May 1, 2025 (2020). URL https://www.youtube.com/watch?v=AlASZb93rrc

  34. [36] MIT HAN Lab, Lecture 10 – Knowledge distillation — MIT 6.S965, accessed: May 1, 2025 (2020). URL https://www.youtube.com/watch?v=tT9Lnt6stwA

  35. [37] A. Vicente-Sola, D. L. Manna, P. Kirkland, G. Di Caterina, T. J. Bihl, Spiking neural networks for event-based action recognition: A new task to understand their advantage, Neurocomputing 611 (2025) 128657. doi:10.1016/j.neucom.2024.128657

  36. [38] S. Barchid, J. Mennesson, J. Eshraghian, C. Djéraba, M. Bennamoun, Spiking neural networks for frame-based and event-based single object localization, Neurocomputing 559 (2023) 126805. doi:10.1016/j.neucom.2023.126805

  37. [39] U. K. N. G. W. Gamage, L. Zanatta, M. Fumagalli, C. Cadena, S. Tolu, Event-based classification of defects in civil infrastructures with artificial and spiking neural networks, in: Advances in Computational Intelligence, Springer Nature Switzerland, Cham, 2023, pp. 629–640

  38. [40] S. Kim, S. Park, B. Na, S. Yoon, Spiking-YOLO: Spiking neural network for energy-efficient object detection, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 34, 2020, pp. 11270–11277. doi:10.1609/aaai.v34i07.6787

  39. [41] X. Luo, M. Yao, Y. Chou, B. Xu, G. Li, Integer-valued training and spike-driven inference spiking neural network for high-performance and energy-efficient object detection, in: Proceedings of the European Conference on Computer Vision (ECCV), 2024, Best Paper Candidate

  40. [42] A. Kugele, T. Pfeil, M. Pfeiffer, E. Chicca, Hybrid SNN-ANN: Energy-efficient classification and object detection for event-based vision, in: Pattern Recognition, Springer International Publishing, Cham, 2021, pp. 297–312

  41. [43] Z. Wang, Z. Wang, H. Li, L. Qin, R. Jiang, D. Ma, H. Tang, EAS-SNN: End-to-end adaptive sampling and representation for event-based detection with recurrent spiking neural networks, in: Computer Vision – ECCV 2024, Springer Nature Switzerland, Cham, 2025, pp. 310–328

  42. [44] A. Viale, A. Marchisio, M. Martina, G. Masera, M. Shafique, CarSNN: An efficient spiking neural network for event-based autonomous cars on the Loihi neuromorphic research processor, in: 2021 International Joint Conference on Neural Networks (IJCNN), Virtual (Shenzhen, China), 2021, pp. 1–10. doi:10.1109/IJCNN52387.2021.9533738

  43. [45] A. Viale, A. Marchisio, M. Martina, G. Masera, M. Shafique, LaneSNNs: Spiking neural networks for lane detection on the Loihi neuromorphic processor, in: 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, 2022, pp. 79–86. doi:10.1109/IROS47612.2022.9981034

  44. [46] M. Brehove, S. A. Tumpa, E. Kyubwa, N. Menon, N. Vijaykrishnan, Sigma-delta neural network conversion on Loihi 2, arXiv preprint arXiv:2505.06417 (2025)

  45. [47] J. Moosmann, P. Bonazzi, Y. Li, S. Bian, P. Mayer, L. Benini, M. Magno, Ultra-efficient on-device object detection on AI-integrated smart glasses with TinyissimoYOLO, in: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024. doi:10.1109/ICASSP48485.2024.10446059

  46. [48] B. Rueckauer, Y. Wen, K. Liu, Y. Shi, G. Cauwenberghs, Going deeper in spiking neural networks: VGG and residual architectures, Frontiers in Neuroscience 13 (2019) 95

  47. [49] A. Balaji, A. Das, Compiling spiking neural networks to mitigate neuromorphic hardware constraints, in: Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2020. doi:10.1145/3400302.3415694

  48. [50] Intel Neuromorphic Research Community, Deep learning applications for Loihi 2, https://intel-ncl.atlassian.net/wiki/spaces/INRC/pages/1969225744/, accessed: May 1, 2025 (2023)

  49. [51] G. Jocher, A. Chaurasia, J. Qiu, YOLOv8: Ultralytics real-time object detection (2023). URL https://github.com/ultralytics/ultralytics

  50. [52] X. Ding, X. Zhang, N. Ma, J. Han, G. Ding, J. Sun, RepVGG: Making VGG-style ConvNets great again, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 13733–13742

  51. [53] T. Kong, F. Sun, H. Liu, Y. Jiang, L. Li, J. Shi, FoveaBox: Beyond anchor-based object detector, IEEE Transactions on Image Processing 29 (2020) 7389–7398. doi:10.1109/TIP.2020.3002345

  52. [54] H. Rebecq, R. Ranftl, V. Koltun, D. Scaramuzza, High speed and high dynamic range video with an event camera, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) 43 (6) (2021) 1964–1980

  53. [55] S. Ioffe, C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, in: Proceedings of the 32nd International Conference on Machine Learning (ICML), 2015, pp. 448–456

  54. [56] X. Li, W. Wang, L. Wu, S. Chen, X. Hu, J. Li, J. Tang, J. Yang, Generalized focal loss: Learning qualified and distributed bounding boxes for dense object detection, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 12839–12848

  55. [57] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, C. L. Zitnick, Microsoft COCO: Common objects in context, in: Computer Vision – ECCV 2014, Springer International Publishing, Cham, Switzerland, 2014, pp. 740–755. doi:10.1007/978-3-319-10602-1_48

  56. [58] lava-nc contributors, lava-dl: Deep Learning Library for Lava — PilotNet SNN benchmark notebook, GitHub repository, accessed: 26-Nov-2025 (2025). URL https://github.com/lava-nc/lava-dl/blob/main/tutorials/lava/lib/dl/netx/pilotnet_snn/benchmark.ipynb

  57. [59] U. G. W. K. N. Gamage, X. Huo, L. Zanatta, T. Delbrück, C. Cadena, M. Fumagalli, S. Tolu, Event-based civil infrastructure visual defect detection: ev-CIVIL dataset and benchmark, arXiv preprint arXiv:2504.05679 (2025)

  58. [60] P. de Tournemire, D. Nitti, E. Perot, D. Migliore, A. Sironi, A large scale event-based detection dataset for automotive, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2020, pp. 346–347. doi:10.1109/CVPRW50498.2020.00173

  59. [61] M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, A. Zisserman, The PASCAL visual object classes (VOC) challenge, International Journal of Computer Vision 88 (2) (2010) 303–338

  60. [62] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al., PyTorch: An imperative style, high-performance deep learning library, in: Advances in Neural Information Processing Systems (NeurIPS), 2019

  61. [63] W. Fang, Y. Chen, J. Ding, D. Chen, Z. Yu, T. Masquelier, Y. Chen, T. Huang, Y. Tian, SpikingJelly, https://github.com/fangwei123456/spikingjelly, accessed: May 28, 2025 (2020)

  62. [64] Z. Teed, J. Deng, RAFT: Recurrent all-pairs field transforms for optical flow, in: Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK (Virtual), 2020, pp. 402–419

  63. [65] C. Li, L. Li, H. Jiang, K. Weng, Y. Geng, L. Li, Z. Ke, Q. Li, M. Cheng, W. Nie, et al., YOLOv6: A single-stage object detection framework for industrial applications, arXiv preprint arXiv:2209.02976 (2022)

  64. [66] lava-nc contributors, lava-dl: PilotNet SDNN inference tutorial, accessed: Dec. 7, 2025 (2025). URL https://github.com/lava-nc/lava-dl/tree/main/tutorials/lava/lib/dl/netx/pilotnet_sdnn

  65. [67] Yahboom, Jetson Nano 4GB B01 SUB developer kit, accessed: 2025-12-07 (2025). URL https://category.yahboom.net/products/jetson-nano-sub

  66. [68] NVIDIA Corporation, Jetson Orin Nano Super Developer Kit, https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/nano-super-developer-kit/, accessed: 2026-03-29 (2024)

  67. [69] MacRumors, M2 chip guide, accessed: 2025-06-12 (2022). URL https://www.macrumors.com/guide/m2/

  68. [70] D. Cooney, Intel unveils 1.15bn neuron neuromorphic system Hala Point, https://www.datacenterdynamics.com/en/news/intel-unveils-115bn-neuron-neuromorphic-system-hala-point/, accessed: 2026-03-29 (Apr 2024)

  69. [71] MIT HAN Lab, Lecture 07 – Neural Architecture Search (Part I) — MIT 6.S965, YouTube video (2020). URL https://www.youtube.com/watch?v=NQj5TkqX48Q