Event-based Civil Infrastructure Visual Defect Detection: ev-CIVIL Dataset and Benchmark

Cesar Cadena; Luca Zanatta; Matteo Fumagalli; Silvia Tolu; T Delbruck; Udayanga G.W.K.N. Gamage; Xuanni Huo

arxiv: 2504.05679 · v2 · submitted 2025-04-08 · 💻 cs.CV

Udayanga G.W.K.N. Gamage, Xuanni Huo, Luca Zanatta, T Delbruck, Cesar Cadena, Matteo Fumagalli, Silvia Tolu

Pith reviewed 2026-05-22 20:48 UTC · model grok-4.3

classification 💻 cs.CV

keywords: event-based vision, civil infrastructure, defect detection, dynamic vision sensor, crack detection, spalling, UAV inspection, dataset

The pith

Event-based cameras support reliable detection of cracks and spalling on civil structures even under rapidly changing light.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates the first dedicated dataset of event streams from dynamic vision sensors for finding cracks and spalling on infrastructure. Data were gathered with a DAVIS346 sensor that records both event streams and simultaneous grayscale frames, across hundreds of field and laboratory sequences. Standard real-time object detectors trained on these streams achieve usable performance where conventional cameras lose detail due to blur or saturation. A sympathetic reader would care because UAV inspections of bridges and buildings currently lose effectiveness whenever lighting shifts, and event sensors avoid that loss by design. If the results hold, maintenance teams could shift to lower-power, higher-reliability sensors without changing the rest of their detection pipeline.

Core claim

The central claim is that dynamic vision sensors produce event streams sufficient for real-time object detection of civil defects. The ev-CIVIL dataset supplies 680 recording sequences containing 678 cracks and 429 spalling instances, each captured simultaneously as events and APS frames. Four detection models trained on the event data demonstrate applicability under the same lighting conditions that degrade frame-based methods.

What carries the argument

The ev-CIVIL dataset of paired event streams and intensity frames recorded with the DAVIS346 camera, focused on cracks and spalling in field and laboratory settings.

If this is right

DVS data can be fed directly to existing real-time detectors without requiring new hardware beyond the sensor itself.
Inspections remain possible during dawn, dusk, or under moving shadows where frame cameras lose contrast.
Power consumption per inspection can decrease because event cameras transmit data only on change.
Separate training on event streams and on APS frames allows direct comparison of the two modalities on identical scenes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same event streams could support tracking of defect growth over repeated flights without storing full video.
Combining event and frame data in one model might improve robustness beyond either modality alone.
Extension to additional defect types such as corrosion or joint separation would require only new labels on similar recordings.

Load-bearing premise

The specific sequences collected with one camera model and the defects labeled in them represent the range of real-world civil infrastructure surfaces and lighting variations.

What would settle it

A new set of recordings on previously unseen structures under lighting conditions outside the collected range where event-based detectors drop below usable precision while frame-based detectors remain usable.

Figures

Figures reproduced from arXiv: 2504.05679 by Cesar Cadena, Luca Zanatta, Matteo Fumagalli, Silvia Tolu, T Delbruck, Udayanga G.W.K.N. Gamage, Xuanni Huo.

**Figure 2.** Field and laboratory data examples. … structures, including roads, pavements, tunnels, buildings, and walls containing 458 unique crack instances and 121 spalling instances. Examples of these data samples are visualized in fig. 2a. The figure displays grayscale image frames captured by the APS sensor, denoted as 'fr' in the corresponding columns. Data captured by the DVS sensor are represented as 2D event his…

**Figure 3.** Number of sequences collected under different lighting conditions. Diagram text: events.h5 (timestamp, x, y, p), frames.h5 (timestamp, image_array), label.npy (timestamp, <class_id>, bbox_xlow, bbox_ylow, bbox_width, bbox_height). (a) Template outlining the composition of files events.h5, frames.h5 … Sample rows: 14015274 0 2.5 13.0 65 30.6; 14015272 308 144 0; 14015272 167 16 1; 14015373 78 60 …

**Figure 4.** Structure of a recording sequence in the ev-CIVIL dataset: each event is characterized by a timestamp in µs, pixel coordinates x and y within the 346×260 DAVIS346 spatial resolution, and a polarity value p. The polarity p indicates the type of event: 1 for an increase in pixel intensity and 0 for a decrease. … Data Collection: For our study, which evaluates event-based defect detection in comparison to …
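The event layout in this caption can be made concrete with a small sketch. It assumes events arrive as (timestamp, x, y, p) tuples; the `Event` type, its field names, and the `parse_events` and `polarity_counts` helpers are hypothetical and are not the dataset's actual loader or on-disk schema. The two sample rows echo values visible in the file-template figure text, read here as (timestamp, x, y, p), which is itself an assumption.

```python
from typing import Iterable, List, NamedTuple, Tuple

WIDTH, HEIGHT = 346, 260  # DAVIS346 spatial resolution stated in the caption


class Event(NamedTuple):
    """One DVS event; field names are illustrative, not the dataset's schema."""
    t_us: int  # timestamp in microseconds
    x: int     # pixel column, 0 <= x < WIDTH
    y: int     # pixel row, 0 <= y < HEIGHT
    p: int     # polarity: 1 = intensity increase, 0 = intensity decrease


def parse_events(rows: Iterable[Tuple[int, int, int, int]]) -> List[Event]:
    """Validate raw (t, x, y, p) rows against the sensor geometry."""
    events = []
    for row in rows:
        ev = Event(*row)
        if not (0 <= ev.x < WIDTH and 0 <= ev.y < HEIGHT):
            raise ValueError(f"event outside the {WIDTH}x{HEIGHT} array: {ev}")
        if ev.p not in (0, 1):
            raise ValueError(f"polarity must be 0 or 1: {ev}")
        events.append(ev)
    return events


def polarity_counts(events: List[Event]) -> Tuple[int, int]:
    """Return (number of increase events, number of decrease events)."""
    on = sum(1 for ev in events if ev.p == 1)
    return on, len(events) - on


sample = parse_events([(14015272, 308, 144, 0), (14015272, 167, 16, 1)])
print(polarity_counts(sample))
```

Validating coordinates up front keeps later per-pixel accumulation free of index errors.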

**Figure 5.** Data collection setup. Plot text: 3D trajectory with axes X (m), Y (m), Z (m), panel (a) Z-shaped camera trajectory with smooth, flowing bends; velocity magnitude (m/s) versus time (s), panel (b) instant velocity magnitude throughout the trajectory; a further panel plotted against time (s) …

**Figure 6.** An illustration of the DAVIS346 camera's 'Z'-shaped trajectory with smooth, flowing bends is shown in (a), depicting the trajectory's two horizontal segments at different distances from the object, connected by a diagonal segment. The trajectory spans a range of 0.2 to 0.7 m, effectively covering this distance. Additionally, the corresponding instantaneous velocity magnitude profile (b), the variation in c…

**Figure 7.** Overview of the data collection process, consisting of preparing the DAVIS346 camera, capturing grayscale images and DVS events simultaneously, and transferring the data to a PC after each recording sequence. … to capture any meaningful information due to insufficient ambient lighting. To configure the DAVIS346 for data collection, we adjusted the bias values, optimized focus settings, and performed calibrati…

**Figure 8.** The process of preparing and integrating an IR laser as an external illuminator with the DAVIS346 camera for scenarios requiring external illumination. (a) The IR laser projector of the Intel RealSense D435 is covered with a thin plastic strip containing tiny holes, where the hole diameter is smaller than that of the structured dot patterns. The covered laser projector is then attached to the handheld moun…

**Figure 9.** Comparison of fixed-time-length-based 2D event histogram formation with our 2D event histogram formation method illustrated in algorithm 1: defect areas (first row: "crack" defect; second row: "spalling" defect) are localized by drawing bounding boxes. … extracted feature maps to detect objects efficiently in real-time applications. YOLOv6 Architecture Variants: In our evaluations, we used the YOLOv6m¹⁶ and…

**Figure 10.** SSD300 architecture. … various object shapes and sizes, particularly in real-time applications where computational efficiency is crucial. SSD300 Architecture and Model Variants: In SSD300-ResNet50, as shown in fig. 10, the backbone uses ResNet50¹⁹, a deep residual network known for its ability to capture high-level, semantic features. ResNet50 generates feature maps at different layers, which are then utiliz…

**Figure 11.** YOLOv6 architecture⁷¹. … $AP(c) = \frac{1}{n}\sum_{i=1}^{n} P(IOU_i)$ (3), where $n$ is the number of IoU thresholds (more specifically, for COCO AP0.5:0.95 these IoU thresholds range from 0.5 to 0.95 with a step size of 0.05, and for COCO AP0.5 $n = 1$, as it is calculated for the IoU threshold 0.5) and $P(IOU_i)$ is the precision calculated at IoU threshold $IOU_i$ as in eq. (5). $F1_{iou0.5}$ score: the $F1_{iou0.5}$ metric, which combines precision and …
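Equation (3) in this caption averages a per-threshold precision over n IoU thresholds. The sketch below illustrates that averaging under stated assumptions: boxes use the (x_low, y_low, width, height) form listed in the label template, and `precision_at` uses a simple greedy one-to-one matching as a stand-in for the paper's eq. (5), which is not reproduced on this page. All function names are hypothetical, and this is not the authors' evaluation code.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_low, y_low, width, height)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def precision_at(preds, truths, thr):
    """TP / (TP + FP), matching each prediction to its best unmatched truth box.

    Assumed stand-in for eq. (5); the paper's exact definition may differ.
    """
    unmatched = list(truths)
    true_positives = 0
    for pred in preds:
        best = max(unmatched, key=lambda t: iou(pred, t), default=None)
        if best is not None and iou(pred, best) >= thr:
            unmatched.remove(best)
            true_positives += 1
    return true_positives / len(preds) if preds else 0.0


def average_precision(preds, truths, thresholds):
    """Eq. (3): mean of the per-threshold precision over the given IoU thresholds."""
    return sum(precision_at(preds, truths, t) for t in thresholds) / len(thresholds)


# IoU thresholds 0.50, 0.55, ..., 0.95 for the AP0.5:0.95 variant (n = 10).
COCO_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]
```

For the AP0.5 variant, the same function would be called with `thresholds=[0.5]`, which matches the n = 1 case described in the caption.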

**Figure 12.** Extraction of grayscale images and event-based data from the ev-CIVIL dataset for benchmarking crack and spalling detection. First, 10-15 samples are obtained from each recording sequence. For each sample, 2D event histograms are generated from the corresponding events. The extracted grayscale image frames are then preprocessed. These preprocessed grayscale images, 2D event histograms, and extracted annota…

**Figure 13.** Qualitative visualization of event-based and frame-based crack and spalling detection results of four detection models on the Adequately-Illuminated Test Set.

**Figure 14.** Qualitative visualization of the event-based and image-based crack and spalling detection results for the YOLOv6m and YOLOv6lite-s models on the Low-Light Test Set: "Other scenarios" refers to conditions involving saturated and dynamic lighting, in addition to dimly lit environments. … formation method outperformed the fixed-temporal-length-based formulations when fixed temporal lengths were set to 25 ms, 3…

**Figure 15.** Percentage reduction of mAP@0.5 and $F1_{iou0.5}$ metrics of defect detections with (a) YOLOv6m, (b) YOLOv6lite-s, (c) SSD300 with ResNet backbone, and (d) SSD300 with MobileNetV2 backbone, when those models were trained without laboratory data. The results are displayed with respect to the two test sets, that is, the adequate-lighting test set and the low/dynamic-lighting test set. Plot text: T_10, T_15, T_20, T_25, Ours …
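The caption does not spell out how the percentage reduction is computed. A plausible reading, given here purely as an assumption, is the relative drop of a metric when laboratory data is left out of training. The function name and the example values are hypothetical and are not results from the paper.

```python
def percentage_reduction(with_lab: float, without_lab: float) -> float:
    """Relative drop (in percent) of a metric when laboratory data is removed.

    Assumed definition: 100 * (with_lab - without_lab) / with_lab.
    """
    if with_lab <= 0:
        raise ValueError("the reference metric must be positive")
    return 100.0 * (with_lab - without_lab) / with_lab


# Made-up values for illustration only: a metric falling from 0.5 to 0.25.
print(percentage_reduction(0.5, 0.25))
```

A negative result under this definition would indicate that the metric improved without the laboratory data.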

**Figure 16.** Comparison of our 2-channel histogram method explained in algorithm 1 (Ours) with the fixed-temporal-length-based histogram formation method. For adequately illuminated data, the fixed temporal lengths are 10 ms (T_10), 15 ms (T_15), 20 ms (T_20), and 25 ms (T_25). For low-illuminated data, the fixed temporal lengths are 25 ms (T_25), 30 ms (T_30), 35 ms (T_35), and 40 ms (T_40). Performance is evaluated in t…
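The fixed-temporal-length baseline named in this caption can be sketched as follows: events inside a window of T ms are counted per pixel, with one channel per polarity. This covers the baseline only; the authors' algorithm 1 is not reproduced on this page. The function name, the channel order, and the half-open window convention are assumptions.

```python
WIDTH, HEIGHT = 346, 260  # DAVIS346 spatial resolution


def fixed_window_histogram(events, t_start_us, window_ms):
    """Count events per pixel within [t_start_us, t_start_us + window) in 2 channels.

    events: iterable of (timestamp_us, x, y, polarity), polarity in {0, 1}.
    Returns hist[channel][y][x], where channel 0 holds decrease events and
    channel 1 holds increase events (this ordering is an assumption).
    """
    t_end_us = t_start_us + int(window_ms * 1000)
    hist = [[[0] * WIDTH for _ in range(HEIGHT)] for _ in range(2)]
    for t, x, y, p in events:
        if t_start_us <= t < t_end_us:
            hist[p][y][x] += 1
    return hist


# Toy events for illustration only (not taken from the dataset).
toy_events = [
    (0, 10, 20, 1),
    (5000, 10, 20, 1),
    (9999, 3, 4, 0),
    (10000, 3, 4, 0),
    (20000, 1, 1, 1),
]
hist_t10 = fixed_window_histogram(toy_events, t_start_us=0, window_ms=10)
hist_t15 = fixed_window_histogram(toy_events, t_start_us=0, window_ms=15)
```

A longer window collects more events per histogram, which is one reason the caption lists longer windows for the low-illumination data.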

**Figure 17.** Variation in classification accuracy among ResNet34, VGG16, and MobileNetV2 models for frame-based and event-based classification tasks across different input spatial resolutions (32×32, 64×64, 128×128, 224×224) in both Adequately-Illuminated and Low-Light test datasets. … 128×128. For the same Low-Light Test Set, the highest event-based classification accuracy of 93% was also achieved with the EfficientNet-…

**Figure 18.** Image-based and event-based detection errors.

**Figure 19.** Issues involved with event-based detections: (a) partial detections due to directionality of DVS, (b) blur in nighttime event-based 2D histograms as camera movement increases.

Original abstract

Small unmanned aerial vehicle (UAV)-based visual inspections are a more efficient alternative to manual methods for examining civil structural defects, offering safe access to hazardous areas and significant cost savings by reducing labor requirements. However, traditional frame-based cameras, widely used in UAV-based inspections, often struggle to capture defects under low or dynamic lighting conditions. In contrast, dynamic vision sensors (DVS), or event-based cameras, excel in such scenarios by minimizing motion blur, enhancing power efficiency, and maintaining high-quality imaging across diverse lighting conditions without saturation or information loss. Despite these advantages, existing research lacks studies exploring the feasibility of using DVS for detecting civil structural defects. Moreover, there is no dedicated event-based dataset tailored for this purpose. Addressing this gap, this study introduces the first event-based civil infrastructure defect detection dataset, capturing defective surfaces as a spatio-temporal event stream using DVS. In addition to event-based data, the dataset includes grayscale intensity image frames captured simultaneously using an active pixel sensor (APS). Both data types were collected using the DAVIS346 camera, which integrates DVS and APS sensors. The dataset focuses on two types of defects, cracks and spalling, and includes data from both field and laboratory environments. The field dataset comprises 318 recording sequences, documenting 458 distinct cracks and 121 distinct spalling instances. The laboratory dataset includes 362 recording sequences, covering 220 distinct cracks and 308 spalling instances. We evaluated the dataset using four real-time object detection models. The results demonstrate the applicability of DVS cameras for robust detection of civil infrastructure defects under challenging lighting conditions.
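The per-environment counts in the abstract can be tallied to check the totals quoted elsewhere on this page (680 sequences, 678 cracks, 429 spalling instances). The dictionary layout and the `totals` helper are only an illustration.

```python
# Counts as stated in the abstract.
COUNTS = {
    "field": {"sequences": 318, "cracks": 458, "spalling": 121},
    "laboratory": {"sequences": 362, "cracks": 220, "spalling": 308},
}


def totals(counts):
    """Sum each quantity over the field and laboratory subsets."""
    keys = ("sequences", "cracks", "spalling")
    return {key: sum(env[key] for env in counts.values()) for key in keys}


print(totals(COUNTS))  # {'sequences': 680, 'cracks': 678, 'spalling': 429}
```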

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The new ev-CIVIL dataset is the actual contribution here; the abstract gives no numbers or lighting details to back the claim that DVS handles challenging conditions better than frames.

The paper's real addition is the first dedicated event-based dataset for detecting cracks and spalling on civil infrastructure. It records 318 field sequences and 362 lab sequences with a DAVIS346, capturing both event streams and simultaneous APS frames, then tests four standard object detectors on the event data. That fills a gap the cited literature apparently left open, and the split between field and lab plus the two defect types is a reasonable start for an applied benchmark. The collection numbers themselves look like straightforward empirical work with no obvious fitting or self-reference issues.

The soft spot is the central claim about robust performance under challenging lighting. The abstract states the advantage of DVS over frame cameras in low or dynamic light but supplies no lux values, no protocol for creating those conditions, no APS-versus-DVS comparison results, and no quantitative detection metrics at all. Without those, it is hard to tell whether the recorded sequences actually test the regimes where standard cameras fail. If the full paper contains the missing comparisons and numbers, the claim strengthens; on the visible evidence it remains under-supported.

This work is mainly for groups already doing event-based vision or UAV inspection who need a starting dataset. A reader looking for a ready-to-use benchmark in that niche would get something concrete from it. The paper deserves a serious referee because the dataset is new and the application is practical, even if the lighting-robustness part needs more evidence before publication.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces the ev-CIVIL dataset, the first event-based dataset for civil infrastructure visual defect detection. It collects spatio-temporal event streams and simultaneous APS grayscale frames using a DAVIS346 camera across 318 field sequences (458 cracks, 121 spalling) and 362 laboratory sequences (220 cracks, 308 spalling). Four real-time object detection models are evaluated on the data, with the central claim that the results demonstrate the applicability of DVS cameras for robust detection of cracks and spalling under challenging lighting conditions.

Significance. If the dataset collection protocols and model evaluations establish a clear performance advantage for event data over frame-based imaging specifically in low or dynamic lighting, the work would be significant as the first dedicated benchmark in this application domain. The dual field/lab collection and inclusion of both DVS and APS modalities provide a useful resource for future UAV-based inspection research.

major comments (3)

[Abstract and Dataset section] The claim that DVS enables 'robust detection ... under challenging lighting conditions' is not supported by any reported lux ranges, dynamic lighting protocols, or quantitative APS-vs-DVS performance differentials. Without these, the central robustness claim cannot be evaluated.
[Experiments section] No quantitative performance numbers (mAP, precision-recall, or error analysis) are supplied for the four object detection models, nor any baseline comparison against frame-based methods on the same sequences. This leaves the 'results demonstrate' statement without empirical grounding.
[Dataset section] The representativeness of the 680 total sequences for real-world civil infrastructure under conditions where frame-based cameras fail is asserted but not demonstrated; no details on lighting variability, motion speeds, or failure cases of APS are provided to substantiate the weakest assumption.

minor comments (2)

[Dataset section] Clarify the exact train/validation/test splits and labeling protocol (e.g., how event streams were annotated) to improve reproducibility.
[Experiments section] Add a table summarizing the four models, their input modalities (events vs. APS), and key hyperparameters.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript introducing the ev-CIVIL dataset. We have reviewed each major comment and provide our responses below, along with plans for revision.

Referee: [Abstract and Dataset section] The claim that DVS enables 'robust detection ... under challenging lighting conditions' is not supported by any reported lux ranges, dynamic lighting protocols, or quantitative APS-vs-DVS performance differentials. Without these, the central robustness claim cannot be evaluated.

Authors: We acknowledge this point and agree that additional details are needed to support the claim. In the revised manuscript, we will include measured lux ranges for the field and laboratory sequences, describe the dynamic lighting protocols used during collection, and provide quantitative performance comparisons between the DVS event data and the simultaneous APS frames for the detection models. This will be incorporated into both the Dataset and Experiments sections. revision: yes
Referee: [Experiments section] No quantitative performance numbers (mAP, precision-recall, or error analysis) are supplied for the four object detection models, nor any baseline comparison against frame-based methods on the same sequences. This leaves the 'results demonstrate' statement without empirical grounding.

Authors: We agree that the manuscript lacks sufficient quantitative details. In the revision, we will supply the mAP, precision-recall, and error analysis numbers for the four models, as well as include baseline comparisons against frame-based methods using the APS data on the same sequences to provide empirical grounding for the results. revision: yes
Referee: [Dataset section] The representativeness of the 680 total sequences for real-world civil infrastructure under conditions where frame-based cameras fail is asserted but not demonstrated; no details on lighting variability, motion speeds, or failure cases of APS are provided to substantiate the weakest assumption.

Authors: To address this, we will expand the Dataset section with specific details on lighting variability across the sequences, typical UAV motion speeds during recording, and documented cases where APS frames exhibited failures such as motion blur or saturation, while the corresponding event streams enabled successful defect capture. This will better demonstrate the real-world applicability. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical dataset collection and standard model benchmarking

The paper introduces the ev-CIVIL dataset collected with a DAVIS346 camera and evaluates four off-the-shelf real-time object detection models on event streams and APS frames for crack and spalling detection. No derivations, equations, or parameter-fitting steps appear in the provided text. The central claim rests on new field and laboratory recordings plus standard benchmark results rather than any self-referential reduction of outputs to inputs defined by the authors. Self-citations, if present, are not load-bearing for the empirical demonstration.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper relies primarily on standard practices in event camera data collection and off-the-shelf object detection models rather than introducing new free parameters, axioms, or invented entities.

axioms (1)

standard math: Standard assumptions about event generation and calibration in DAVIS346 DVS/APS sensors hold for the collected sequences.
This is implicit in the use of the integrated camera for simultaneous event and intensity data capture.

pith-pipeline@v0.9.0 · 5847 in / 1304 out tokens · 52982 ms · 2026-05-22T20:48:51.018306+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Real-Time Frame- and Event-based Object Detection with Spiking Neural Networks on Edge Neuromorphic Hardware: Design, Deployment and Benchmark
cs.CV · 2026-04 · unverdicted · novelty 4.0

SNNs deployed on Loihi 2 achieve real-time object detection with the lowest dynamic energy per inference and recover 87-100% of ANN accuracy via distillation-aware training.

Reference graph

Works this paper leans on

80 extracted references · 80 canonical work pages · cited by 1 Pith paper · 4 internal anchors

[1] Nooralishahi P, Ibarra-Castanedo C, Deane S et al. Drone-based non-destructive inspection of industrial sites: A review and case studies. Drones 2021; 5(4). DOI:10.3390/drones5040106. URL https://www.mdpi.com/2504-446X/5/4/106
[2] Wallace R, Loffi J, Vance S et al. Pilot visual detection of small unmanned aircraft systems (suas) equipped with strobe lighting. Journal of Aviation Technology and Engineering 2018; 7. DOI:10.7771/2159-6670.1177
[3] Unmanned Systems Technology. Drone spotlights, 2024. URL https://www.unmannedsystemstechnology.com/expo/drone-spotlights/
[4] Pérez-Cutiño M, Eguíluz AG, Dios JMd et al. Event-based human intrusion detection in uas using deep learning. In 2021 International Conference on Unmanned Aircraft Systems (ICUAS). pp. 91–100. DOI:10.1109/ICUAS51884.2021.9476677
[5] Gallego G, Delbrück T, Orchard G et al. Event-based vision: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 2022; 44(1): 154–180. DOI:10.1109/TPAMI.2020.3008413
[6] Bendris B and Cayero Becerra J. Design and experimental evaluation of an aerial solution for visual inspection of tunnel-like infrastructures. Remote Sensing 2022; 14(1). DOI:10.3390/rs14010195. URL https://www.mdpi.com/2072-4292/14/1/195
[7] Basler. Basler ace, Accessed 2024. URL https://www.baslerweb.com/en/shop/aca4112-20um/
[8] Cha YJ, Choi W, Suh G et al. Autonomous structural visual inspection using region-based deep learning for detecting multiple damage types. Computer-Aided Civil and Infrastructure Engineering 2018; 33(9): 731–747
[9] Fang F, Li L, Gu Y et al. A novel hybrid approach for crack detection. Pattern Recognition 2020; 107: 107474
[10] Jahanshahi MR, Kelly JS, Masri SF et al. A survey and evaluation of promising approaches for automatic image-based defect detection of bridge structures. Structure and Infrastructure Engineering 2009; 5(6): 455–486
[11] Dong CZ and Catbas N. A review of computer vision-based structural health monitoring at local and global levels. Structural Health Monitoring 2020; 20: 692 –
[12] URL https://api.semanticscholar.org/CorpusID:225627479
[13] First Principles of Computer Vision, What is an Edge? | Edge Detection, YouTube, 2024, https://www.youtube.com/watch?v=G8yp6f9V_6c. Accessed: Dec. 2, 2024
[14] Augmented AI, Support Vector Machine (SVM) in 7 minutes, YouTube, 2024, https://www.youtube.com/watch?v=Y6RRHw9uN9o. Accessed: Dec. 2, 2024
[15] Alexander Amini, MIT 6.S191: Convolutional Neural Networks, YouTube, May 2024, https://www.youtube.com/watch?v=2xqkSUhmmXU. Accessed: Dec. 2, 2024
[16] Alexander Amini, MIT Introduction to Deep Learning | 6.S191, YouTube, May 2024, https://www.youtube.com/watch?v=ErnWZxJovaM&list=PLtBw6njQRU-rwp5__7C0oIVt26ZgjG9NI. Accessed: Dec. 2, 2024
[17] Meituan. YOLOv6. https://github.com/meituan/YOLOv6. Accessed: March 7, 2024
[18] Liu W, Anguelov D, Erhan D et al. Ssd: Single shot multibox detector. In Leibe B, Matas J, Sebe N et al. (eds.) Computer Vision – ECCV 2016. Cham: Springer International Publishing, pp. 21–37
[19] Simonyan K and Zisserman A. Very deep convolutional networks for large-scale image recognition. CoRR 2014; abs/1409.1556. URL https://api.semanticscholar.org/CorpusID:14124313
[20] He K, Zhang X, Ren S et al. Deep residual learning for image recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2015; : 770–778. URL https://api.semanticscholar.org/CorpusID:206594692
[21] Hamishebahar Y, Guan H, So S et al. A comprehensive review of deep learning-based crack detection approaches. Applied Sciences 2022; 12(3): 1374
[22] Karaaslan E, Bagci U and Catbas FN. Artificial intelligence assisted infrastructure assessment using mixed reality systems. Transportation Research Record 2018; 2673: 413–424. URL https://api.semanticscholar.org/CorpusID:55702295
[23] Gamage UKNGW, Zanatta L, Fumagalli M et al. Event-based classification of defects in civil infrastructures with artificial and spiking neural networks. In Rojas I, Joya G and Catala A (eds.) Advances in Computational Intelligence. Cham: Springer Nature Switzerland. ISBN 978-3-031-43078-7, pp. 629–640
[24] Lichtsteiner P, Posch C and Delbruck T. A 128×128 120 dB 15 µs latency asynchronous temporal contrast vision sensor. IEEE Journal of Solid-State Circuits 2008; 43(2): 566–576. DOI:10.1109/JSSC.2007.914337
[25] Steiner JG, Micev K, Aydin A et al. Measuring diameters and velocities of artificial raindrops with a neuromorphic dynamic vision sensor disdrometer. URL https://api.semanticscholar.org/CorpusID:253707979
[26] Inivation. URL https://inivation.com/
[27] de Tournemire P, Nitti DO, Perot E et al. A large scale event-based detection dataset for automotive. ArXiv 2020; abs/2001.08499. URL https://api.semanticscholar.org/CorpusID:210860813
[28] Perot E, de Tournemire P, Nitti D et al. Learning to detect objects with a 1 megapixel event camera. In Proceedings of the 34th International Conference on Neural Information Processing Systems. NIPS'20, Red Hook, NY, USA: Curran Associates Inc. ISBN 9781713829546
[29]

Pushing the limits of asynchronous graph-based object detection with event cameras

Gehrig D and Scaramuzza D. Pushing the limits of asynchronous graph-based object detection with event cameras. arXiv 2022

work page 2022
[30]

High-temporal-resolution object detection and tracking using images and events

El Shair Z and Rawashdeh SA. High-temporal-resolution object detection and tracking using images and events. Journal of Imaging 2022; 8(8): 210. DOI:10.3390/ jimaging8080210. URL https://www.mdpi.com/ 2313-433X/8/8/210

work page 2022
[31]

Pedro: an event- based dataset for person detection in robotics

Boretti C, Bich P, Pareschi F et al. Pedro: an event- based dataset for person detection in robotics. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

work page
[32]

Sironi A, Brambilla M, Bourdis N et al. Hats: Histograms of averaged time surfaces for robust event-based object classification. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, pp. 1731–1740. DOI:10.1109/CVPR.2018.00186. URL https://doi.ieeecomputersociety.org/10.1109/CVPR.2018.00186

[33]

Gehrig M, Aarents W, Gehrig D et al. Dsec: A stereo event camera dataset for driving scenarios. IEEE Robotics and Automation Letters 2021; DOI:10.1109/LRA.2021.3068942

[34]

Hu Y, Liu S and Delbruck T. v2e: From video frames to realistic dvs events. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). pp. 1312–1321. DOI:10.1109/CVPRW53098.2021.00144. URL https://doi.ieeecomputersociety.org/10.1109/CVPRW53098.2021.00144

[35]

Chaney K, Cladera F, Wang Z et al. M3ed: Multi-robot, multi-sensor, multi-environment event dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. pp. 4015–4022

[36]

Orchard G, Jayawant A, Cohen G et al. Converting static image datasets to spiking neuromorphic datasets using saccades. Frontiers in Neuroscience 2015; 9. URL https://api.semanticscholar.org/CorpusID:940928

[37]

Li H, Liu H, Ji X et al. Cifar10-dvs: An event-stream dataset for object classification. Frontiers in Neuroscience 2017; 11. URL https://api.semanticscholar.org/CorpusID:2406565

[38]

Intel, “Intel RealSense D400 Series Product Family Datasheet,” Available: https://www.intel.com/content/www/us/en/content-details/841984/intel-realsense-d400-series-product-family-datasheet.html. [Accessed: 31-Jan-2025]

[39]

The application of ir laser illuminator in the drones. URL https://www.sz3km.cn/index.php/content/71

[40]

Elachi C and van Zyl J. Introduction to the Physics and Techniques of Remote Sensing. John Wiley & Sons,

[41]

URL https://onlinelibrary.wiley.com/doi/book/10.1002/0471783390

[42]

Intel Corporation. Intel realsense sdk api how-to: Controlling the laser, Year. URL https://github.com/IntelRealSense/librealsense/wiki/API-How-To#controlling-the-laser

[43]

Mundt M, Majumder S, Murali S et al. Meta-learning convolutional neural architectures for multi-target concrete defect classification with the concrete defect bridge image dataset. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 11188–11197. DOI:10.1109/CVPR.2019.01145

[44]

Yang L, Li B, Li W et al. Deep concrete inspection using unmanned aerial vehicle towards cssc database

[45]

Bai Y, Sezen H and Yilmaz A. Detecting cracks and spalling automatically in extreme events by end-to-end deep learning frameworks. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2021; V-2-2021: 161–168. DOI:10.5194/isprs-annals-V-2-2021-161-2021

[46]

SensorsINI. jaer. https://github.com/SensorsINI/jaer, Accessed: Dec 7, 2024

[47]

LabelMe Development Team. Labelme: Image annotation and labeling tool. https://github.com/labelmeai/labelme, ongoing. Accessed: March 7, 2024

[48]

Lin T, Maire M, Belongie SJ et al. Microsoft COCO: common objects in context. CoRR 2014; abs/1405.0312. URL http://arxiv.org/abs/1405.0312

[49]

Maqueda AMI, Loquercio A, Gallego G et al. Event-based vision meets deep learning on steering prediction for self-driving cars. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition 2018: 5419–5427. URL https://api.semanticscholar.org/CorpusID:4610262

[50]

OpenCV. cv::CLAHE Class Reference, n.d. URL https://docs.opencv.org/4.x/d6/db6/classcv_1_1CLAHE.html

[51]

Sandler M, Howard AG, Zhu M et al. Mobilenetv2: Inverted residuals and linear bottlenecks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition 2018: 4510–4520. URL https://api.semanticscholar.org/CorpusID:4555207

[52]

Paszke A, Gross S, Massa F et al. PyTorch: an imperative style, high-performance deep learning library. Red Hook, NY, USA: Curran Associates Inc., 2019

[53]

Loshchilov I and Hutter F. Sgdr: Stochastic gradient descent with warm restarts. ArXiv 2016; abs/1608.03983. URL https://api.semanticscholar.org/CorpusID:15884797

[54]

Tan M and Le QV. Efficientnet: Rethinking model scaling for convolutional neural networks. ArXiv 2019; abs/1905.11946. URL https://api.semanticscholar.org/CorpusID:167217261

[55]

Lin TY, Maire M, Belongie S et al. Microsoft coco: Common objects in context. In Fleet D, Pajdla T, Schiele B et al. (eds.) Computer Vision – ECCV 2014. Cham: Springer International Publishing, pp. 740–755

[56]

Wei W, Mercurial T and Consortium C. pycocotools: Python API for Microsoft COCO. https://github.com/cocodataset/cocoapi, 2017

[57]

Prophesee. Prophesee evk4 event camera, 2024. URL https://www.prophesee.ai/event-camera-evk4/

[58]

University of Zurich, Robotics and Perception Group. ROS package for DVS data processing and applications. Available at https://github.com/uzh-rpg/rpg_dvs_ros, Accessed December 9, 2024

[59]

Silvio Savarese, Lecture 3: Camera Models & Camera Calibration, Computational Vision and Geometry Lab, 2014. https://cvgl.stanford.edu/teaching/cs231a_winter1415/lecture/lecture3_camera_calibration_notes.pdf. Accessed: 2024-12-09

[60]

iniVation AG, “Biasing Dynamic Sensors,” 2024. [Online]. Available: https://docs.inivation.com/hardware/hardware-advanced-usage/biasing.html. [Accessed: Dec. 9, 2024]

[61]

Apps Studio, “Light Meter LM-3000 4+,” 2024. [Online]. Available: https://apps.apple.com/us/app/light-meter-lm-3000/id1554264761. [Accessed: Dec. 9, 2024]

[62]

Thorlabs, Inc., “S120C Photodiode Power Sensor,” 2024. [Online]. Available: https://www.thorlabs.com/thorproduct.cfm?partnumber=S120C. [Accessed: Dec. 9, 2024]

[63]

OpenCV, “Multiple Object Tracking in Real-Time,” OpenCV Blog, Available: https://opencv.org/blog/multiple-object-tracking-in-realtime/. [Accessed: 5-Jan-2025]

[64]

Shih-Chii Liu, Bodo Rueckauer, Enea Ceolini, Adrian Huber, and Tobi Delbruck, “Event-Driven Sensing for Efficient Perception: Vision and Audition Algorithms,” IEEE Signal Processing Magazine, vol. 36, no. 6, pp. 29-37, 2019, doi: 10.1109/MSP.2019.2928127

[65]

Min Liu and Tobi Delbruck, “Adaptive Time-Slice Block-Matching Optical Flow Algorithm for Dynamic Vision Sensors,” in British Machine Vision Conference, 2018. Available at: https://api.semanticscholar.org/CorpusID:52283776

[66]

Xiaohan Ding, X. Zhang, Ningning Ma, Jungong Han, Guiguang Ding, and Jian Sun, “RepVGG: Making VGG-style ConvNets Great Again,” in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13728-13737, 2021. Available at: https://api.semanticscholar.org/CorpusID:231572790

[67]

Kaiheng Weng, Xiangxiang Chu, Xiaoming Xu, Junshi Huang, and Xiaoming Wei, “EfficientRep: An Efficient Repvgg-style ConvNets with Hardware-aware Neural Network Design,” arXiv preprint arXiv:2302.00386, 2023, submitted on 1 Feb

[68]

Available at: https://arxiv.org/abs/2302.00386

[69]

Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh-Hua Wu, Ping-Yang Chen, and Jun-Wei Hsieh, “CSPNet: A New Backbone that can Enhance Learning Capability of CNN,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 1571-1580,

[70]

Available at: https://api.semanticscholar.org/CorpusID:208310312

[71]

Kai Han, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, and Chang Xu, “GhostNet: More Features From Cheap Operations,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1577-1586, 2020. Available at: https://api.semanticscholar.org/CorpusID:208310058

[72]

Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, and Jian Sun, “ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6848-6856,

[73]

Available at: https://api.semanticscholar.org/CorpusID:24982157

[74]

Tsung-Yi Lin, Piotr Dollár, Ross B. Girshick, Kaiming He, Bharath Hariharan, and Serge J. Belongie, “Feature Pyramid Networks for Object Detection,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 936-944, 2017. Available at: https://api.semanticscholar.org/CorpusID:10716717

[75]

Shu Liu, Lu Qi, Haifang Qin, Jianping Shi, and Jiaya Jia, “Path Aggregation Network for Instance Segmentation,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8759-8768, 2018. Available at: https://api.semanticscholar.org/CorpusID:3698141

[76]

Ning Tao, Deng Shengteng, Jia Xiangkun, et al., “RETRACTED: Highway forecasting for weather factor and traffic flow interaction scenarios,” PREPRINT, 11 October 2023, Version 1. Available at: https://doi.org/10.21203/rs.3.rs-3418469/v1

[77]

L. Cordone, B. Miramond, and P. Thierion, “Object Detection with Spiking Neural Networks on Automotive Event Data,” in Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN), 2022, pp. 1-8. doi: 10.1109/IJCNN55064.2022.9892618

[78]

N. Zubic, D. Gehrig, M. Gehrig, and D. Scaramuzza, “From Chaos Comes Order: Ordering Event Representations for Object Recognition and Detection,” in Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 12800-12810. Available: Semantic Scholar

[79]

Z. Zhou, Z. Wu, R. Boutteau, F. Yang, C. Demonceaux, and D. Ginhac, “RGB-Event Fusion for Moving Object Detection in Autonomous Driving,” in Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), 2023, pp. 7808-7815. Available: Semantic Scholar

[80]

C. M. Torrejón, U. G. W. K. N. Gamage, and S. Tolu, “Concurrent Detection of Known Defects and Out-of-Distribution Instances in Building Inspections: Advancements in Deep Classification,” in Proceedings of the 2023 IEEE International Conference on Imaging Systems and Techniques (IST), 2023, pp. 1-6. doi: 10.1109/IST59124.2023.10355664

[28] [28]

Learning to detect objects with a 1 megapixel event camera

Perot E, de Tournemire P, Nitti D et al. Learning to detect objects with a 1 megapixel event camera. In Proceedings of the 34th International Conference on Neural Information Processing Systems . NIPS’20, Red Hook, NY , USA: Curran Associates Inc. ISBN 9781713829546

work page

[29] [29]

Pushing the limits of asynchronous graph-based object detection with event cameras

Gehrig D and Scaramuzza D. Pushing the limits of asynchronous graph-based object detection with event cameras. arXiv 2022

work page 2022

[30] [30]

High-temporal-resolution object detection and tracking using images and events

El Shair Z and Rawashdeh SA. High-temporal-resolution object detection and tracking using images and events. Journal of Imaging 2022; 8(8): 210. DOI:10.3390/ jimaging8080210. URL https://www.mdpi.com/ 2313-433X/8/8/210

work page 2022

[31] [31]

Pedro: an event- based dataset for person detection in robotics

Boretti C, Bich P, Pareschi F et al. Pedro: an event- based dataset for person detection in robotics. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

work page

[32] [32]

MoCoGAN: Decomposing motion and content for video generation

Sironi A, Brambilla M, Bourdis N et al. Hats: His- tograms of averaged time surfaces for robust event-based object classiﬁcation. In 2018 IEEE/CVF Conference on Com- puter Vision and Pattern Recognition (CVPR) . IEEE Com- puter Society, pp. 1731–1740. DOI:10.1109/CVPR.2018. 00186. URL https://doi.ieeecomputersociety. org/10.1109/CVPR.2018.00186

work page doi:10.1109/cvpr.2018 2018

[33] [33]

Dsec: A stereo event camera dataset for driving scenarios

Gehrig M, Aarents W, Gehrig D et al. Dsec: A stereo event camera dataset for driving scenarios. IEEE Robotics and Automation Letters 2021; DOI:10.1109/LRA.2021.3068942

work page doi:10.1109/lra.2021.3068942 2021

[34] [34]

In: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Hu Y , Liu S and Delbruck T. v2e: From video frames to realistic dvs events. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). pp. 1312–1321. DOI:10.1109/CVPRW53098.2021. 00144. URL https://doi.ieeecomputersociety. org/10.1109/CVPRW53098.2021.00144

work page doi:10.1109/cvprw53098.2021 2021

[35] [35]

M3ed: Multi-robot, multi- sensor, multi-environment event dataset

Chaney K, Cladera F, Wang Z et al. M3ed: Multi-robot, multi- sensor, multi-environment event dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. pp. 4015–4022

work page

[36] [36]

Converting static image datasets to spiking neuromorphic datasets using saccades

Orchard G, Jayawant A, Cohen G et al. Converting static image datasets to spiking neuromorphic datasets using saccades. Frontiers in Neuroscience 2015; 9. URL https://api. semanticscholar.org/CorpusID:940928

work page 2015

[37] [37]

Cifar10-dvs: An event-stream dataset for object classiﬁcation

Li H, Liu H, Ji X et al. Cifar10-dvs: An event-stream dataset for object classiﬁcation. Frontiers in Neuroscience 2017; 11. URL https://api.semanticscholar. org/CorpusID:2406565

work page 2017

[38] [38]

Intel RealSense D400 Series Product Family Datasheet,

Intel, “Intel RealSense D400 Series Product Family Datasheet,” Available: https://www.intel.com/ content/www/us/en/content-details/841984/ intel-realsense-d400-series-product-family-datasheet. html. [Accessed: 31-Jan-2025]

work page 2025

[39] [39]

URL https://www.sz3km.cn/index.php/content/71

The application of ir laser illuminator in the drones. URL https://www.sz3km.cn/index.php/content/71. Prepared using sagej.cls 25

work page

[40] [40]

Introduction to the Physics and Techniques of Remote Sensing

Elachi C and van Zyl J. Introduction to the Physics and Techniques of Remote Sensing . John Wiley & Sons,

work page

[41] [41]

URL https://onlinelibrary.wiley.com/ doi/book/10.1002/0471783390

work page doi:10.1002/0471783390

[42] [42]

Intel realsense sdk api how-to: Controlling the laser, Year

Intel Corporation. Intel realsense sdk api how-to: Controlling the laser, Year. URL https://github. com/IntelRealSense/librealsense/wiki/ API-How-To#controlling-the-laser

work page

[43] [43]

Meta-learning convolutional neural architectures for multi-target concrete defect classiﬁcation with the concrete defect bridge image dataset

Mundt M, Majumder S, Murali S et al. Meta-learning convolutional neural architectures for multi-target concrete defect classiﬁcation with the concrete defect bridge image dataset. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) . pp. 11188–11197. DOI: 10.1109/CVPR.2019.01145

work page doi:10.1109/cvpr.2019.01145 2019

[44] [44]

Deep concrete inspection using unmanned aerial vehicle towards cssc database

Yang L, Li B, Li W et al. Deep concrete inspection using unmanned aerial vehicle towards cssc database

work page

[45] [45]

Detecting cracks and spalling automatically in extreme events by end-to-end deep learning frameworks

Bai Y , Sezen H and Yilmaz A. Detecting cracks and spalling automatically in extreme events by end-to-end deep learning frameworks. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences 2021; V-2-2021: 161–168. DOI:10.5194/isprs-annals-V-2-2021-161-2021

work page doi:10.5194/isprs-annals-v-2-2021-161-2021 2021

[46] [46]

SensorsINI. jaer. https://github.com/SensorsINI/ jaer, Accessed: Dec 7, 2024

work page 2024

[47] [47]

Labelme: Image annotation and labeling tool

LabelMe Development Team. Labelme: Image annotation and labeling tool. https://github.com/labelmeai/ labelme, ongoing. Accessed: March 7, 2024

work page 2024

[48] [48]

Microsoft COCO: Common Objects in Context

Lin T, Maire M, Belongie SJ et al. Microsoft COCO: common objects in context. CoRR 2014; abs/1405.0312. URL http: //arxiv.org/abs/1405.0312. 1405.0312

work page internal anchor Pith review Pith/arXiv arXiv 2014

[49] [49]

Event-based vision meets deep learning on steering prediction for self- driving cars

Maqueda AMI, Loquercio A, Gallego G et al. Event-based vision meets deep learning on steering prediction for self- driving cars. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition 2018; : 5419–5427URL https:// api.semanticscholar.org/CorpusID:4610262

work page 2018

[50] [50]

cv::CLAHE Class Reference, n.d

OpenCV. cv::CLAHE Class Reference, n.d. URL https://docs.opencv.org/4.x/d6/db6/ classcv_1_1CLAHE.html

work page

[51] [51]

Mobilenetv2: Inverted residuals and linear bottlenecks

Sandler M, Howard AG, Zhu M et al. Mobilenetv2: Inverted residuals and linear bottlenecks. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition2018; : 4510–4520URL https://api.semanticscholar. org/CorpusID:4555207

work page 2018

[52] [52]

PyTorch: an imperative style, high-performance deep learning library

Paszke A, Gross S, Massa F et al. PyTorch: an imperative style, high-performance deep learning library. Red Hook, NY , USA: Curran Associates Inc., 2019

work page 2019

[53] [53]

SGDR: Stochastic Gradient Descent with Warm Restarts

Loshchilov I and Hutter F. Sgdr: Stochastic gradient descent with restarts. ArXiv 2016; abs/1608.03983. URL https:// api.semanticscholar.org/CorpusID:15884797

work page internal anchor Pith review Pith/arXiv arXiv 2016

[54] [54]

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Tan M and Le QV . Efﬁcientnet: Rethinking model scaling for convolutional neural networks. ArXiv 2019; abs/1905.11946. URL https://api.semanticscholar.org/ CorpusID:167217261

work page internal anchor Pith review Pith/arXiv arXiv 2019

[55] [55]

Microsoft coco: Common objects in context

Lin TY , Maire M, Belongie S et al. Microsoft coco: Common objects in context. In Fleet D, Pajdla T, Schiele B et al. (eds.) Computer Vision – ECCV 2014 . Cham: Springer International Publishing, pp. 740–755

work page 2014

[56] [56]

pycocotools: Python API for Microsoft COCO.https://github.com/ cocodataset/cocoapi, 2017

Wei W, Mercurial T and Consortium C. pycocotools: Python API for Microsoft COCO.https://github.com/ cocodataset/cocoapi, 2017

work page 2017

[57] [57]

Prophesee evk4 event camera, 2024

Prophesee. Prophesee evk4 event camera, 2024. URL https: //www.prophesee.ai/event-camera-evk4/

work page 2024

[58] [58]

ROS package for DVS data processing and applications

University of Zurich, Robotics and Perception Group. ROS package for DVS data processing and applications. Available at https://github.com/uzh-rpg/rpg_dvs_ros, Accessed December 9, 2024

work page 2024

[59] [59]

Accessed: 2024- 12-09

Silvio Savarese, Lecture 3: Camera Models & Camera Calibration, Computational Vision and Geometry Lab, 2014 https://cvgl.stanford.edu/teaching/ cs231a_winter1415/lecture/lecture3_ camera_calibration_notes.pdf. Accessed: 2024- 12-09

work page 2014

[60] [60]

Biasing Dynamic Sensors,

iniVation AG, “Biasing Dynamic Sensors,” 2024. [Online]. Available: https://docs.inivation.com/ hardware/hardware-advanced-usage/biasing. html. [Accessed: Dec. 9, 2024]

work page 2024

[61] [61]

Light Meter LM-3000 4+,

Apps Studio, “Light Meter LM-3000 4+,” 2024. [Online]. Available: https://apps.apple.com/us/app/ light-meter-lm-3000/id1554264761 . [Accessed: Dec. 9, 2024]

work page 2024

[62] [62]

S120C Photodiode Power Sensor,

Thorlabs, Inc., “S120C Photodiode Power Sensor,” 2024. [Online]. Available: https://www.thorlabs.com/ thorproduct.cfm?partnumber=S120C. [Accessed: Dec. 9, 2024]

work page 2024

[63] [63]

[Accessed: 5-Jan-2025]

OpenCV , ”Multiple Object Tracking in Real-Time,” OpenCV Blog, Available: https://opencv.org/blog/ multiple-object-tracking-in-realtime/ . [Accessed: 5-Jan-2025]

work page 2025

[64] [64]

Event-Driven Sensing for Efﬁcient Perception: Vision and Audition Algorithms,

Shih-Chii Liu, Bodo Rueckauer, Enea Ceolini, Adrian Huber, and Tobi Delbruck, “Event-Driven Sensing for Efﬁcient Perception: Vision and Audition Algorithms,” IEEE Signal Processing Magazine , vol. 36, no. 6, pp. 29-37, 2019, doi: 10.1109/MSP.2019.2928127

work page doi:10.1109/msp.2019.2928127 2019

[65] [65]

Adaptive Time-Slice Block- Matching Optical Flow Algorithm for Dynamic Vision Sensors,

Min Liu and Tobi Delbrck, “Adaptive Time-Slice Block- Matching Optical Flow Algorithm for Dynamic Vision Sensors,” in British Machine Vision Conference , 2018. Available at: https://api.semanticscholar.org/ CorpusID:52283776

work page 2018

[66] [66]

RepVGG: Making VGG- style ConvNets Great Again,

Xiaohan Ding, X. Zhang, Ningning Ma, Jungong Han, Guiguang Ding, and Jian Sun, “RepVGG: Making VGG- style ConvNets Great Again,” in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) , pp. 13728-13737, 2021. Available at: https://api. semanticscholar.org/CorpusID:231572790

work page 2021

[67] [67]

EfﬁcientRep: An Efﬁcient Repvgg-style ConvNets with Hardware-aware Neural Network Design,

Kaiheng Weng, Xiangxiang Chu, Xiaoming Xu, Junshi Huang, and Xiaoming Wei, “EfﬁcientRep: An Efﬁcient Repvgg-style ConvNets with Hardware-aware Neural Network Design,” arXiv preprint arXiv:2302.00386 , 2023, submitted on 1 Feb

work page arXiv 2023

[68] [68]

Available at: https://arxiv.org/abs/2302. 00386

work page

[69] [69]

CSPNet: A New Backbone that can Enhance Learning Capability of CNN,

Chien-Yao Wang, Hong-Yuan Mark Liao, I-Hau Yeh, Yueh- Hua Wu, Ping-Yang Chen, and Jun-Wei Hsieh, “CSPNet: A New Backbone that can Enhance Learning Capability of CNN,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 1571-1580,

work page 2020

[70] [70]

org/CorpusID:208310312

Available at: https://api.semanticscholar. org/CorpusID:208310312

work page

[71] [71]

GhostNet: More Features From Cheap Operations,

Kai Han, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, and Chang Xu, “GhostNet: More Features From Cheap Operations,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1577-1586, 2019. Available at: https://api.semanticscholar.org/ CorpusID:208310058

work page 2020

[72] [72]

ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices,

Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, and Jian Sun, “ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6848-6856. Available at: https://api.semanticscholar.org/CorpusID:24982157

work page 2018

[74] [74]

Feature Pyramid Networks for Object Detection,

Tsung-Yi Lin, Piotr Dollár, Ross B. Girshick, Kaiming He, Bharath Hariharan, and Serge J. Belongie, “Feature Pyramid Networks for Object Detection,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 936-944, 2017. Available at: https://api.semanticscholar.org/CorpusID:10716717

work page 2017

[75] [75]

Path Aggregation Network for Instance Segmentation,

Shu Liu, Lu Qi, Haifang Qin, Jianping Shi, and Jiaya Jia, “Path Aggregation Network for Instance Segmentation,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8759-8768, 2018. Available at: https://api.semanticscholar.org/CorpusID:3698141

work page 2018

[76] [76]

RETRACTED: Highway forecasting for weather factor and traffic flow interaction scenarios,

Ning Tao, Deng Shengteng, Jia Xiangkun, et al., “RETRACTED: Highway forecasting for weather factor and traffic flow interaction scenarios,” PREPRINT, 11 October 2023, Version 1. Available at: https://doi.org/10.21203/rs.3.rs-3418469/v1

work page doi:10.21203/rs.3.rs-3418469/v1 2023

[77] [77]

Object Detection with Spiking Neural Networks on Automotive Event Data,

L. Cordone, B. Miramond, and P. Thierion, “Object Detection with Spiking Neural Networks on Automotive Event Data,” in Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN), 2022, pp. 1-8. doi: 10.1109/IJCNN55064.2022.9892618

work page doi:10.1109/ijcnn55064.2022.9892618 2022

[78] [78]

From Chaos Comes Order: Ordering Event Representations for Object Recognition and Detection,

N. Zubic, D. Gehrig, M. Gehrig, and D. Scaramuzza, “From Chaos Comes Order: Ordering Event Representations for Object Recognition and Detection,” in Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 12800-12810. Available: Semantic Scholar

work page 2023

[79] [79]

Z. Zhou, Z. Wu, R. Boutteau, F. Yang, C. Demonceaux, and D. Ginhac, “RGB-Event Fusion for Moving Object Detection in Autonomous Driving,” in Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), 2023, pp. 7808-7815. Available: Semantic Scholar

work page 2023

[80] [80]

C. M. Torrejón, U. G. W. K. N. Gamage, and S. Tolu, “Concurrent Detection of Known Defects and Out-of-Distribution Instances in Building Inspections: Advancements in Deep Classification,” in Proceedings of the 2023 IEEE International Conference on Imaging Systems and Techniques (IST), 2023, pp. 1-6. doi: 10.1109/IST59124.2023.10355664.

work page doi:10.1109/ist59124.2023.10355664 2023