Stereo Event Lifetime and Disparity Estimation for Dynamic Vision Sensors
Pith reviewed 2026-05-24 20:23 UTC · model grok-4.3
The pith
The method jointly estimates event lifetimes and disparities in a single shot by matching events across a stereo event-camera pair.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors present a method for single shot event lifetime and disparity estimation in stereo event-cameras, where events from the same brightness change are associated via stereo matching to jointly solve for lifetime and disparity. This produces sharp gradient images of events that serve as input to disparity estimation methods. The approach is shown to be approximately twice as fast and more accurate than estimating lifetimes separately for each sensor and then performing stereo matching. Validation is performed on real-world data from multiple stereo event-camera experiments.
What carries the argument
single shot event lifetime and disparity estimation with association via stereo matching
If this is right
- Sharp gradient images are generated without fixed time interval accumulation of events.
- Depth becomes available for each event from the stereo pair.
- The method supports applications that exploit the asynchronous nature of event cameras.
- Processing runs approximately twice as fast with higher accuracy than separate estimation then matching.
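The first point in the list above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `(x, y, t, lifetime)` tuple layout, and the toy data are assumptions made for illustration. It contrasts rendering only the events that are still within their lifetime against accumulating events over a fixed time window.

```python
import numpy as np

def render_active_events(events, t_query, shape):
    """Mark pixels whose event is still within its lifetime at t_query.

    events: iterable of (x, y, t, lifetime), times in integer microseconds
    (hypothetical layout). An event counts as active while
    t <= t_query < t + lifetime.
    """
    image = np.zeros(shape, dtype=np.uint8)
    for x, y, t, lifetime in events:
        if t <= t_query < t + lifetime:
            image[y, x] = 1
    return image

def render_fixed_window(events, t_query, window, shape):
    """Baseline: accumulate every event from the last `window` microseconds."""
    image = np.zeros(shape, dtype=np.uint8)
    for x, y, t, _ in events:
        if t_query - window < t <= t_query:
            image[y, x] = 1
    return image

# Toy data: an edge moving right along row 0 at one pixel per millisecond,
# so each event is assigned a lifetime of 1000 microseconds.
events = [(x, 0, x * 1000, 1000) for x in range(5)]

sharp = render_active_events(events, t_query=4500, shape=(1, 8))
blurred = render_fixed_window(events, t_query=4500, window=3000, shape=(1, 8))
```

In this toy case the lifetime-based image contains only the current edge position, while the fixed window smears the edge over several pixels. How the lifetimes themselves are obtained is outside the scope of this sketch.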
Where Pith is reading between the lines
- The joint solution may compound accuracy gains when the resulting depth maps are used in downstream tasks such as visual odometry.
- Extending the association step to more than two cameras could allow multi-view lifetime and depth estimation in a single optimization.
- Scenes with very sparse events might expose whether the matching step remains stable when fewer candidate pairs exist.
- The speed advantage could translate directly to higher frame-rate output in real-time robotics pipelines that consume the gradient images.
Load-bearing premise
Events triggered by the same brightness change in the two sensors can be reliably associated through stereo matching without introducing significant matching errors.
What would settle it
A test sequence in which stereo matching frequently pairs unrelated events would make the joint estimates less accurate than separate lifetime estimation followed by matching.
Original abstract
Event-based cameras are biologically inspired sensors that output asynchronous pixel-wise brightness changes in the scene called events. They have a high dynamic range and temporal resolution of a microsecond, as opposed to standard cameras that output frames at fixed frame rates and suffer from motion blur. Forming stereo pairs of such cameras can open novel application possibilities, since for each event depth can be readily estimated; however, to fully exploit the asynchronous nature of the sensor and avoid fixed time interval event accumulation, stereo event lifetime estimation should be employed. In this paper, we propose a novel method for event lifetime estimation of stereo event-cameras, allowing generation of sharp gradient images of events that serve as input to disparity estimation methods. Since a single brightness change triggers events in both event-camera sensors, we propose a method for single shot event lifetime and disparity estimation, with association via stereo matching. The proposed method is approximately twice as fast and more accurate than if lifetimes were estimated separately for each sensor and then stereo matched. Results are validated on real-world data through multiple stereo event-camera experiments.
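The abstract's remark that depth can be estimated for each event follows from the standard relation for a rectified stereo pair, Z = f · B / d. The sketch below assumes rectified cameras; the function name, parameter names, and example values are hypothetical and are not taken from the paper.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for one matched event pair in a rectified stereo rig.

    disparity_px: horizontal offset between the matched events, in pixels.
    focal_px:     focal length in pixels (assumed equal for both cameras).
    baseline_m:   distance between the camera centres, in metres.
    Returns None for non-positive disparity, where depth is undefined.
    """
    if disparity_px <= 0:
        return None
    return focal_px * baseline_m / disparity_px

# Illustrative numbers only: 200 px focal length, 10 cm baseline, 4 px disparity.
example_depth = depth_from_disparity(4.0, focal_px=200.0, baseline_m=0.1)
```

Because depth is inversely proportional to disparity, a one-pixel matching error matters far more for distant points than for near ones.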
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a method for joint single-shot event lifetime and disparity estimation in stereo dynamic vision sensor pairs. Events triggered by the same brightness change are associated via stereo matching to enable simultaneous lifetime estimation and disparity computation, producing sharp gradient images as input to disparity methods. The approach is claimed to be approximately twice as fast and more accurate than the baseline of independent per-sensor lifetime estimation followed by stereo matching, with validation on real-world stereo event-camera experiments.
Significance. If the reported speed and accuracy gains hold under the quantitative comparisons supplied in the full manuscript, the work offers a practical route to exploiting the microsecond temporal resolution of event cameras in stereo depth estimation without fixed-interval accumulation. The explicit treatment of the stereo association step (Section 4) and the provision of real-world timing/accuracy benchmarks against the separate-lifetime baseline constitute concrete strengths that support applicability in robotics and high-speed vision.
minor comments (3)
- [Abstract] The phrase 'approximately twice as fast' would be more informative if it referenced the specific timing metric and dataset from the experiments section.
- [Introduction] The manuscript would benefit from a short statement in the introduction clarifying how the stereo-matching association step avoids introducing systematic bias into the lifetime estimates.
- [Figures] Figure captions could explicitly state the event accumulation window or number of events used to generate the displayed gradient images for reproducibility.
Simulated Author's Rebuttal
We thank the referee for their positive review of the manuscript and their recommendation to accept.
Circularity Check
No significant circularity detected
full rationale
The paper presents an empirical method for joint stereo event lifetime and disparity estimation via event association, with performance claims (speed and accuracy gains) validated on real-world data against a separate-lifetime baseline. No derivation chain reduces to self-definition, fitted inputs renamed as predictions, or load-bearing self-citations; the approach relies on standard stereo matching principles applied to event data without internal circular reductions. The manuscript supplies explicit experimental validation and discussion of the matching step, keeping the central claims self-contained and externally falsifiable.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear.
We propose a method for single shot event lifetime and disparity estimation, with association via stereo matching. The proposed method is approximately twice as fast...
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear.
lifetime τ(p) = (1/n₃) √(n₁² + n₂²) ... plane fitting ... RANSAC
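The lifetime expression in the quoted passage can be sketched as follows, assuming n = (n1, n2, n3) is the normal of a plane fitted to the (x, y, t) coordinates of recent events around a pixel. This is an illustration, not the paper's pipeline: it uses ordinary least squares where the passage mentions RANSAC, it takes the absolute value of n3 so the result does not depend on the sign of the normal, and the function name and toy data are hypothetical.

```python
import numpy as np

def lifetime_from_plane(points):
    """Estimate an event lifetime from a local plane fit.

    points: (N, 3) array-like of (x, y, t) for recent events near a pixel.
    Fits t = a*x + b*y + c by least squares, which corresponds to the
    plane normal n = (a, b, -1), then evaluates sqrt(n1^2 + n2^2) / |n3|.
    The result is the time the edge needs to move by one pixel.
    """
    pts = np.asarray(points, dtype=float)
    design = np.column_stack([pts[:, 0], pts[:, 1], np.ones(len(pts))])
    coeffs, *_ = np.linalg.lstsq(design, pts[:, 2], rcond=None)
    n1, n2, n3 = coeffs[0], coeffs[1], -1.0
    return float(np.sqrt(n1**2 + n2**2) / abs(n3))

# Toy data: a vertical edge moving in +x at 500 px/s, so t = x / 500.
toy_points = [(x, y, x / 500.0) for x in range(3) for y in range(3)]
toy_lifetime = lifetime_from_plane(toy_points)
```

For the toy edge the estimate is about 2 ms per pixel, the inverse of the 500 px/s speed. A robust variant would replace the least-squares step with a RANSAC loop over subsets of the points.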
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.