NeuroLiDAR: Adaptive Frame Rate Depth Sensing via Neuromorphic Event-LiDAR Fusion

Archan Misra; Darshana Rathnayake; Dulanga Weerakoon; Meera Radhakrishnan

arxiv: 2605.16805 · v1 · pith:ESC4L3TXnew · submitted 2026-05-16 · 💻 cs.CV

Pith reviewed 2026-05-19 21:11 UTC · model grok-4.3

keywords: event camera · LiDAR fusion · adaptive depth sensing · neuromorphic vision · keyframe detection · depth extrapolation · 3D reconstruction · dynamic frame rate

The pith

NeuroLiDAR fuses event camera streams with sparse LiDAR scans to raise effective depth frame rates to around 66 Hz while cutting reconstruction error by 29 percent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents NeuroLiDAR as a fusion framework that pairs low-rate LiDAR depth measurements with high-temporal-resolution data from neuromorphic event cameras. It introduces event-based keyframe detection to decide when a full LiDAR scan is needed and event-guided extrapolation to synthesize depth at intermediate times. This adaptive control produces effective rates between roughly 28 and 47 Hz, up to a peak near 66 Hz, across indoor and outdoor scenes captured in the new ELiDAR dataset. The result is a measured 29 percent drop in root-mean-square depth error compared with fixed low-rate LiDAR operation. A reader would care because many robotics and mapping tasks need both accurate long-range geometry and quick response to motion without forcing the LiDAR hardware to run continuously at high speed.

Core claim

NeuroLiDAR achieves effective frame rates of up to approximately 66 Hz by fusing temporally sparse LiDAR data with temporally dense inputs from neuromorphic event cameras, integrates event-based keyframe detection and event-guided depth extrapolation to dynamically adjust the sensing rate in response to scene dynamics, and demonstrates a reduction of depth reconstruction error by approximately 29 percent in RMSE together with adaptive frame rates between 27.8 and 47.3 Hz on the ELiDAR dataset.

What carries the argument

Event-based keyframe detection plus event-guided depth extrapolation, which together decide when to trigger a LiDAR scan and then fill the gaps between scans using the dense temporal information from the event camera.
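The control flow described above can be sketched as a simple scheduler. This is only an illustration under stated assumptions: the paper's keyframe detector is a neural network (see Figure 5), whereas the sketch swaps in a plain accumulated-event-count threshold. The names `schedule_scans`, `change_budget`, and `min_gap` are hypothetical and do not come from the paper or its code release.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Schedule:
    keyframe_idx: List[int]  # slots where a fresh LiDAR scan is requested
    extrap_idx: List[int]    # slots filled by event-guided extrapolation


def schedule_scans(event_counts: Sequence[int],
                   change_budget: int,
                   min_gap: int) -> Schedule:
    """Decide, per time slot, whether to trigger the LiDAR or extrapolate.

    event_counts:  events observed in each fixed-length time slot.
    change_budget: hypothetical threshold on events accumulated since the
                   last scan (a stand-in for the learned keyframe detector).
    min_gap:       minimum number of slots between scans, modelling the
                   LiDAR's native rate limit.
    """
    keyframes, extrapolated = [0], []  # slot 0 always starts from a real scan
    accumulated, last_scan = 0, 0
    for i in range(1, len(event_counts)):
        accumulated += event_counts[i]
        if accumulated >= change_budget and i - last_scan >= min_gap:
            keyframes.append(i)            # enough scene change: rescan
            accumulated, last_scan = 0, i
        else:
            extrapolated.append(i)         # quiet scene: synthesize depth
    return Schedule(keyframes, extrapolated)


if __name__ == "__main__":
    counts = [0, 10, 10, 500, 20, 600, 5, 5]
    print(schedule_scans(counts, change_budget=400, min_gap=2))
```

In this toy trace, scans are requested only in the slots where accumulated event activity crosses the budget; every other slot is left to extrapolation.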

If this is right

LiDAR systems can maintain long-range accuracy while responding to motion at much higher effective rates without hardware redesign.
Power or thermal budgets for depth sensing can be reduced by skipping unnecessary full scans when scene change is low.
The same fusion logic extends naturally to other pairs of sparse accurate sensors and dense temporal sensors.
Robotics and autonomous navigation pipelines gain more timely 3D maps for fast decision making.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar event-guided interpolation could be tested on stereo or RGB-D cameras to reduce their frame-rate demands.
If event cameras are unavailable, the same keyframe logic might be driven by optical flow from conventional video at higher compute cost.
The approach suggests a general principle: let a fast but low-accuracy signal schedule a slow but high-accuracy sensor.

Load-bearing premise

Event data can be trusted to pick the right moments for LiDAR scans and to guide accurate depth filling between those scans in many different indoor and outdoor environments.

What would settle it

Record a scene containing sudden fast motion that produces few or ambiguous events; if the extrapolated depths then show RMSE higher than a simple low-rate LiDAR baseline, the adaptive fusion claim fails.
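The comparison proposed above reduces to computing RMSE for two depth estimates against the same ground truth. A minimal sketch follows, assuming depth maps are NumPy arrays in metres and that non-positive or non-finite ground-truth values mark invalid pixels. The masking convention and the function names are assumptions; the paper's exact metric definition and baseline are not stated in this summary.

```python
import numpy as np


def rmse(pred: np.ndarray, gt: np.ndarray) -> float:
    """Root-mean-square error over pixels with valid ground truth."""
    valid = np.isfinite(gt) & (gt > 0)  # assumed validity convention
    if not valid.any():
        raise ValueError("no valid ground-truth pixels")
    diff = pred[valid] - gt[valid]
    return float(np.sqrt(np.mean(diff ** 2)))


def relative_rmse_change(candidate: np.ndarray,
                         baseline: np.ndarray,
                         gt: np.ndarray) -> float:
    """Relative RMSE change of candidate vs. baseline.

    Negative values mean the candidate beats the baseline; for example,
    -0.29 would correspond to a 29% reduction.
    """
    base = rmse(baseline, gt)
    return (rmse(candidate, gt) - base) / base
```

If the relative change for extrapolated depths comes out positive on such a sequence, the extrapolation is doing worse than the baseline it is compared against.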

Figures

Figures reproduced from arXiv: 2605.16805 by Archan Misra, Darshana Rathnayake, Dulanga Weerakoon, Meera Radhakrishnan.

**Figure 1.** Examples from the ELiDAR dataset: (a)–(c) simulated outdoor scenarios, (d)–(f) real-world indoor scenarios.

**Figure 2.** Event and depth changes with time for varying ego

**Figure 4.** […] using an Intel RealSense L515 [22] as the low-frame-rate LiDAR sensor and a Prophesee EVK4 [23] as the event camera. The keyframe detection and depth extrapolation models are deployed on an NVIDIA Jetson AGX Orin [24]. To support real-time operation with reduced energy consumption, both models are quantized to 16-bit floating-point precision and executed using TensorRT.

**Figure 5.** Neural network architectures of keyframe detection and depth extrapolation network with expanded view of ResConv

**Figure 6.** Keyframe detection under different conditions

**Figure 7.** End-to-end latency including keyframe detection, voxel-grid construction, and depth extrapolation
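Figure 7 lists voxel-grid construction as one stage of the pipeline. For readers unfamiliar with that representation, here is a minimal sketch of one common formulation, in which each event's polarity is split linearly between its two nearest temporal bins. Whether NeuroLiDAR uses exactly this weighting is an assumption; the function name and signature are hypothetical.

```python
import numpy as np


def events_to_voxel_grid(x, y, t, p, num_bins, height, width):
    """Accumulate events (x, y, t, polarity) into a (num_bins, H, W) grid.

    Polarity is mapped to +1 / -1 and distributed between the two temporal
    bins nearest to the event's normalised timestamp.
    """
    grid = np.zeros((num_bins, height, width), dtype=np.float64)
    t = np.asarray(t, dtype=np.float64)
    if t.size == 0:
        return grid.astype(np.float32)

    span = max(t.max() - t.min(), 1e-9)         # guard against zero duration
    tn = (t - t.min()) / span * (num_bins - 1)  # time mapped to [0, num_bins - 1]
    lo = np.floor(tn).astype(int)               # earlier neighbouring bin
    hi = np.minimum(lo + 1, num_bins - 1)       # later neighbouring bin
    w_hi = tn - lo                              # weight assigned to the later bin

    pol = np.where(np.asarray(p) > 0, 1.0, -1.0)
    x = np.asarray(x, dtype=int)
    y = np.asarray(y, dtype=int)

    # np.add.at handles repeated indices correctly (unbuffered accumulation).
    np.add.at(grid, (lo, y, x), pol * (1.0 - w_hi))
    np.add.at(grid, (hi, y, x), pol * w_hi)
    return grid.astype(np.float32)
```

The resulting tensor can be fed to a convolutional network as a multi-channel image, which is why this form is popular for event-driven models.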

Original abstract

LiDARs are widely used for 3D depth reconstruction, but their performance is often limited by inherent hardware constraints that impose trade-offs between range, spatial resolution, and frame rate. Many LiDAR systems typically operate at low frame rates (e.g., 5-10 Hz), prioritizing long-range sensing over responsiveness to rapid scene changes. We present NeuroLiDAR, an adaptive depth sensing framework that achieves effective frame rates of up to $\approx$66 Hz by fusing temporally sparse LiDAR data with temporally dense inputs from neuromorphic event cameras. NeuroLiDAR integrates two components: event-based keyframe detection and event-guided depth extrapolation, to dynamically adjust the sensing rate in response to scene dynamics. To evaluate our approach, we introduce ELiDAR, a dataset spanning outdoor and indoor scenarios, and show that NeuroLiDAR reduces depth reconstruction error by $\approx$29% in RMSE while achieving adaptive frame rates between 27.8-47.3 Hz. Our code and dataset are available at https://github.com/darshanakgr/neurolidar.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

NeuroLiDAR gives a practical fusion approach to increase effective LiDAR frame rates with event data and releases a useful new dataset, but the extrapolation reliability could use tighter checks.

NeuroLiDAR shows how event cameras can help LiDAR systems run at higher effective frame rates by selecting keyframes and extrapolating depth, with a new dataset to back it up. The main advance is the adaptive framework that fuses the dense temporal info from events with sparse LiDAR scans.

The paper does a good job laying out the two components: event-based keyframe detection to decide when to capture a full LiDAR frame, and event-guided extrapolation to fill the gaps. They test it on ELiDAR, which covers both indoor and outdoor scenarios, and report solid numbers like up to 66 Hz effective rate and 29% RMSE reduction. Making the code and dataset public is helpful for reproducibility.

One area that could use more work is the extrapolation part. The claim rests on it not introducing large errors during rapid changes, but there's no explicit bound or worst-case analysis shown for when event density is low or motion is extreme. That makes the robustness a bit hard to assess from what's presented.

This paper is for researchers and engineers working on real-time 3D sensing for autonomous systems or robotics. Someone looking for practical ways to boost frame rates without new hardware would find it relevant. I think it should go to peer review. The idea is clear and the resources are there, so referees can dig into the details and suggest improvements on the validation side.

Referee Report

2 major / 2 minor

Summary. The manuscript presents NeuroLiDAR, an adaptive depth sensing framework that fuses temporally sparse LiDAR scans with dense neuromorphic event camera data. It introduces event-based keyframe detection to trigger LiDAR acquisitions and event-guided depth extrapolation to synthesize intermediate frames, claiming effective frame rates up to approximately 66 Hz, adaptive rates in the 27.8-47.3 Hz range, and an approximately 29% reduction in RMSE on the newly introduced ELiDAR dataset covering indoor and outdoor scenes. Code and dataset are released publicly.

Significance. If the empirical gains are robust, the approach could enable more responsive 3D perception in dynamic environments by trading off expensive high-rate LiDAR scanning for event-camera-driven interpolation. The public release of the ELiDAR dataset and implementation code is a clear strength that supports reproducibility and follow-on work in event-LiDAR fusion.

major comments (2)

§4 (Experiments) and abstract: the central claim that event-guided depth extrapolation safely supports effective rates up to ~66 Hz while delivering a 29% RMSE reduction rests on an unverified assumption that accumulated extrapolation error remains bounded across scene dynamics. No Lipschitz analysis, worst-case drift bounds, or failure-mode evaluation on high-speed motion sequences is provided, leaving the adaptive-rate guarantee as an empirical observation without quantified safety margins.
§3 (Method) and §4.2: the event-based keyframe detection module is described at a high level but lacks an explicit statement of the decision threshold, its sensitivity to event density, or an ablation showing how threshold choice affects the reported frame-rate range and error. Without this, it is unclear whether the adaptive behavior generalizes beyond the ELiDAR training distribution.
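To make the first major comment concrete, here is an illustration of the kind of worst-case drift bound it asks for. This is not an analysis taken from the paper. It assumes depth is extrapolated recursively step by step from the last keyframe, that the extrapolator is $L$-Lipschitz in its depth input, that each step contributes at most $\varepsilon$ of new error, and that the error at the keyframe itself is $e_0 = 0$ (LiDAR noise ignored). All symbols are introduced here for illustration only.

```latex
% Per-step recursion for the error e_k after k extrapolation steps:
e_k \le L\, e_{k-1} + \varepsilon, \qquad e_0 = 0

% Unrolling the recursion gives a geometric sum:
e_k \le \varepsilon \sum_{j=0}^{k-1} L^{j}
    = \begin{cases}
        \varepsilon\, \dfrac{L^{k} - 1}{L - 1}, & L \neq 1 \\[6pt]
        k\, \varepsilon, & L = 1
      \end{cases}
```

Under these assumptions the error stays below $\varepsilon / (1 - L)$ for any number of steps when $L < 1$, but grows geometrically in $k$ when $L > 1$. In the latter case the maximum number of extrapolated frames between keyframes would be the quantity that needs a quantified safety margin.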

minor comments (2)

Abstract: the baseline method and exact metric definition (e.g., which depth estimator is used for the reported RMSE) should be stated explicitly so readers can immediately interpret the 29% figure.
Figure 3 or equivalent qualitative results: captions should clarify the color scale, the time interval between shown frames, and whether the visualized depth maps are extrapolated or LiDAR-grounded.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, indicating where we agree and the specific revisions planned for the next version of the manuscript.

Referee: §4 (Experiments) and abstract: the central claim that event-guided depth extrapolation safely supports effective rates up to ~66 Hz while delivering a 29% RMSE reduction rests on an unverified assumption that accumulated extrapolation error remains bounded across scene dynamics. No Lipschitz analysis, worst-case drift bounds, or failure-mode evaluation on high-speed motion sequences is provided, leaving the adaptive-rate guarantee as an empirical observation without quantified safety margins.

Authors: We appreciate the referee's observation that our claims regarding bounded extrapolation error and the resulting adaptive rates rely on empirical validation rather than formal analysis. The reported effective rates (up to approximately 66 Hz in peak cases) and the 29% RMSE reduction are derived from experiments across the diverse indoor and outdoor sequences in the ELiDAR dataset, where event data guides the interpolation to limit drift in practice. We agree that the absence of Lipschitz constants, worst-case bounds, or dedicated high-speed failure cases leaves the safety margins unquantified. In the revised manuscript we will add a dedicated discussion subsection on error accumulation, report empirical drift statistics over longer sequences, and include additional evaluations on high-speed motion subsets to provide clearer margins for the adaptive-rate behavior. revision: yes
Referee: §3 (Method) and §4.2: the event-based keyframe detection module is described at a high level but lacks an explicit statement of the decision threshold, its sensitivity to event density, or an ablation showing how threshold choice affects the reported frame-rate range and error. Without this, it is unclear whether the adaptive behavior generalizes beyond the ELiDAR training distribution.

Authors: We acknowledge that Section 3 presents the keyframe detection module at a high level without specifying the exact decision threshold or providing sensitivity/ablation results. To improve clarity and reproducibility, the revised manuscript will explicitly state the threshold value employed, include a sensitivity analysis with respect to event density, and add an ablation study in Section 4.2 that varies the threshold and reports the resulting frame-rate ranges together with the corresponding RMSE values. These additions will help demonstrate the robustness of the adaptive mechanism beyond the specific ELiDAR distribution. revision: yes
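The threshold ablation promised above could take roughly the following shape. This is a sketch under stated assumptions: the detector is replaced by a hypothetical accumulated-event-count rule, and `scan_fraction` and `sweep` are illustrative names rather than functions from the released code. A real ablation would also report RMSE for each threshold, which needs the actual models and data.

```python
from typing import Dict, Iterable, Sequence


def scan_fraction(event_counts: Sequence[int],
                  threshold: int,
                  min_gap: int = 1) -> float:
    """Fraction of time slots in which a LiDAR scan would be triggered."""
    scans, accumulated, last_scan = 1, 0, 0  # slot 0 is always a scan
    for i in range(1, len(event_counts)):
        accumulated += event_counts[i]
        if accumulated >= threshold and i - last_scan >= min_gap:
            scans += 1
            accumulated, last_scan = 0, i
    return scans / len(event_counts)


def sweep(event_counts: Sequence[int],
          thresholds: Iterable[int],
          min_gap: int = 1) -> Dict[int, float]:
    """Map each candidate threshold to the resulting scan fraction."""
    return {th: scan_fraction(event_counts, th, min_gap) for th in thresholds}


if __name__ == "__main__":
    trace = [0, 100, 100, 100]  # synthetic event counts, not real data
    for th, frac in sweep(trace, [100, 200, 1000]).items():
        print(f"threshold={th:5d}  scan fraction={frac:.2f}")
```

Plotting scan fraction and RMSE against the threshold on held-out sequences would show how sensitive the reported frame-rate range is to this one parameter.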

Circularity Check

0 steps flagged

No circularity: empirical framework evaluated on new dataset

The paper introduces NeuroLiDAR with event-based keyframe detection and event-guided depth extrapolation to adapt LiDAR frame rates, then reports empirical RMSE reductions (~29%) and adaptive rates (27.8-47.3 Hz) on the newly introduced ELiDAR dataset spanning indoor/outdoor scenes. No equations, fitted parameters, or self-citations are shown that would make any reported gain equivalent to its inputs by construction. The derivation chain consists of algorithmic components whose outputs are measured against external ground truth rather than being redefined from the same data or prior self-work. This is self-contained against the introduced benchmark.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the approach appears to rest on standard sensor-fusion assumptions and the new dataset.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "NeuroLiDAR integrates two components: event-based keyframe detection and event-guided depth extrapolation... U-Net-style lightweight autoencoder... voxel grid-based event representation"

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Paper passage: "achieves effective frame rates of up to ≈66 Hz... reduces depth reconstruction error by ≈29% in RMSE"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages

[1] G. Gallego, T. Delbruck, G. Orchard, C. Bartolozzi, B. Taba, A. Censi, S. Leutenegger, A. J. Davison, J. Conradt, K. Daniilidis, and D. Scaramuzza, “Event-based vision: A survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 44, no. 1, pp. 154–180, 2022

[2] L. Sun, C. Sakaridis, J. Liang, P. Sun, K. Zhang, J. Cao, Q. Jiang, K. Wang, and L. Van Gool, “Event-Based Frame Interpolation with Ad-hoc Deblurring.” IEEE, June 2023, pp. 18043–18052

[3] D. Gehrig, M. Rüegg, M. Gehrig, J. Hidalgo-Carrió, and D. Scaramuzza, “Combining events and frames using recurrent asynchronous multimodal networks for monocular depth prediction,” IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 2822–2829, 2021

[4] M. Cui, Y. Zhu, Y. Liu, Y. Liu, G. Chen, and K. Huang, “Dense depth-map estimation based on fusion of event camera and sparse lidar,” IEEE Transactions on Instrumentation and Measurement, vol. 71, pp. 1–11, 2022

[5] Y. Lu, Z. Wang, M. Liu, H. Wang, and L. Wang, “Learning spatial-temporal implicit neural representations for event-guided video super-resolution,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 1557–1567

[6] J. Cuadrado, U. Rançon, B. R. Cottereau, F. Barranco, and T. Masquelier, “Optical flow estimation from event-based cameras and spiking neural networks,” Frontiers in Neuroscience, vol. 17, p. 1160034, 2023

[7] B. Li, H. Meng, Y. Zhu, R. Song, M. Cui, G. Chen, and K. Huang, “Enhancing 3-d lidar point clouds with event-based camera,” IEEE Transactions on Instrumentation and Measurement, vol. 70, pp. 1–12, 2021

[8] V. Brebion, J. Moreau, and F. Davoine, “Learning to estimate two dense depths from lidar and event data,” in Scandinavian Conference on Image Analysis. Springer, 2023, pp. 517–533

[9] O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical image computing and computer-assisted intervention. Springer, 2015, pp. 234–241

[10] A. Dosovitskiy, G. Ros, F. Codevilla, A. Lopez, and V. Koltun, “Carla: An open urban driving simulator,” in Conference on robot learning. PMLR, 2017, pp. 1–16

[11] J. Chen, J. Meng, X. Wang, and J. Yuan, “Dynamic graph cnn for event-camera based gesture recognition,” in 2020 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2020, pp. 1–5

[12] Q. Liu, D. Xing, H. Tang, D. Ma, and G. Pan, “Event-based action recognition using motion information and spiking neural networks.” in IJCAI, 2021, pp. 1743–1749

[13] H. Kim, S. Leutenegger, and A. J. Davison, “Real-time 3d reconstruction and 6-dof tracking with an event camera,” in European conference on computer vision. Springer, 2016, pp. 349–364

[14] W. Guan, P. Chen, H. Zhao, Y. Wang, and P. Lu, “Evi-sam: Robust, real-time, tightly-coupled event–visual–inertial state estimation and 3d dense mapping,” Advanced Intelligent Systems, vol. 6, no. 12, p. 2400243, 2024

[15] X. Wu, W. He, M. Yao, Z. Zhang, Y. Wang, B. Xu, and G. Li, “Event-based depth prediction with deep spiking neural network,” IEEE Transactions on Cognitive and Developmental Systems, 2024

[16] A. Z. Zhu, D. Thakur, T. Özaslan, B. Pfrommer, V. Kumar, and K. Daniilidis, “The multivehicle stereo event camera dataset: An event camera dataset for 3d perception,” IEEE Robotics and Automation Letters, vol. 3, no. 3, pp. 2032–2039, 2018

[17] P. Shi, J. Peng, J. Qiu, X. Ju, F. P. W. Lo, and B. Lo, “Even: An event-based framework for monocular depth estimation at adverse night conditions,” in 2023 IEEE International Conference on Robotics and Biomimetics (ROBIO). IEEE, 2023, pp. 1–7

[18] S. A. Tumpa, A. Devulapally, M. Brehove, E. Kyubwa, and V. Narayanan, “Snn-ann hybrid networks for embedded multimodal monocular depth estimation,” in 2024 IEEE Computer Society Annual Symposium on VLSI (ISVLSI). IEEE, 2024, pp. 198–203

[19] D. G. Javier Hidalgo-Carrio and D. Scaramuzza, “Learning monocular dense depth from events,” IEEE International Conference on 3D Vision (3DV), 2020

[20] Ouster, “Vlp16: mid range lidar sensor,” https://ouster.com/products/hardware/vlp-16, accessed: 2025-09-14

[21] G. Gallego, T. Delbrück, G. Orchard, C. Bartolozzi, B. Taba, A. Censi, S. Leutenegger, A. J. Davison, J. Conradt, K. Daniilidis, et al., “Event-based vision: A survey,” IEEE transactions on pattern analysis and machine intelligence, vol. 44, pp. 154–180, 2020

[22] I. Corporation, “Intel® RealSense™ lidar camera l515,” https://www.intel.com/content/www/us/en/products/sku/201775/intel-realsense-lidar-camera-l515/specifications.html, accessed: 2025-09-14

[23] M. by Prophesee, “Evk4: The ultra high-speed and compact, hd event-based vision evaluation kit built to endure field testing conditions.” https://www.prophesee.ai/event-camera-evk4/, accessed: 2025-09-14

[24] NVIDIA, “Nvidia jetson orin,” https://www.nvidia.com/en-sg/autonomous-machines/embedded-systems/jetson-orin/, accessed: 2025-09-14

[25] L. Papa, P. Russo, and I. Amerini, “Meter: A mobile vision transformer architecture for monocular depth estimation,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 33, no. 10, pp. 5882–5893, 2023

[26] C. Godard, O. Mac Aodha, M. Firman, and G. J. Brostow, “Digging into self-supervised monocular depth estimation,” in Proceedings of the IEEE/CVF international conference on computer vision, 2019, pp. 3828–3838
