
arxiv: 1907.05284 · v1 · pith:SN4QTDA6new · submitted 2019-07-02 · 💻 cs.CV · eess.IV

Vision-based Pedestrian Alert Safety System (PASS) for Signalized Intersections

Pith reviewed 2026-05-25 11:35 UTC · model grok-4.3

classification 💻 cs.CV eess.IV
keywords pedestrian detection · deep learning · traffic cameras · connected vehicles · safety alerts · signalized intersections · V2P communication

The pith

A vision-based deep learning system using traffic cameras detects pedestrians and estimates their location and velocity more accurately than DSRC devices to generate real-time safety alerts at signalized intersections.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a Pedestrian Alert Safety System that applies a deep learning model to traffic camera footage to detect pedestrians and produce personal safety messages every 100 milliseconds. This addresses the problem that many pedestrians lack the hand-held DSRC or 5G devices needed for vehicle-to-pedestrian communication. The system claims to deliver more accurate pedestrian position and velocity data than existing DSRC devices. Numerical tests indicate that the resulting alerts meet the accuracy and latency standards required for connected-vehicle safety applications such as the Pedestrian in Signalized Crosswalk Warning.

Core claim

The paper claims that traffic cameras combined with a vision-based deep learning model can detect and locate pedestrians in real time at signalized intersections, generate personal safety messages every 100 milliseconds, and supply pedestrian location and velocity estimates that are more accurate than those from DSRC-enabled hand-held devices while satisfying the performance requirements of pedestrian safety applications in a connected vehicle environment.

What carries the argument

Vision-based deep learning model that processes traffic camera images to detect pedestrians and generate personal safety messages in real time.
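
The step from successive position estimates to a message stream can be sketched as follows. This is a minimal illustration, assuming pedestrian positions have already been projected from image pixels into a local metric frame. The `PedestrianState` record, its field names, and `next_state` are hypothetical; they do not reproduce the SAE J2735 PSM encoding or the authors' implementation. Only the 100 ms interval comes from the paper.

```python
import math
from dataclasses import dataclass

PSM_INTERVAL_S = 0.1  # 100 ms message interval stated in the paper


@dataclass
class PedestrianState:
    """Hypothetical, simplified stand-in for a PSM payload (not the J2735 encoding)."""
    t: float            # timestamp in seconds
    x: float            # east position in metres (assumed local frame)
    y: float            # north position in metres (assumed local frame)
    speed: float        # metres per second
    heading_deg: float  # degrees clockwise from north


def next_state(prev: PedestrianState, x: float, y: float,
               dt: float = PSM_INTERVAL_S) -> PedestrianState:
    """Estimate speed and heading by finite difference between two position fixes."""
    dx, dy = x - prev.x, y - prev.y
    speed = math.hypot(dx, dy) / dt
    if speed > 0:
        # atan2(east, north) gives a compass-style heading.
        heading = math.degrees(math.atan2(dx, dy)) % 360.0
    else:
        # No movement between fixes: keep the previous heading.
        heading = prev.heading_deg
    return PedestrianState(prev.t + dt, x, y, speed, heading)
```

A finite difference over a single 100 ms step amplifies position noise, so a deployed system would likely smooth the estimate over several frames; that choice is outside what this page describes.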

If this is right

  • Pedestrian safety alerts become possible even when pedestrians carry no communication devices.
  • Connected vehicle applications can use infrastructure camera data as a source for pedestrian position and velocity.
  • Alerts generated every 100 milliseconds can provide timely warnings of imminent vehicle-pedestrian conflicts.
  • The approach meets the accuracy and latency thresholds needed for operational pedestrian safety systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Camera-based detection could lessen dependence on universal adoption of personal communication devices for V2P safety.
  • The same camera feeds might support safety monitoring for other road users such as cyclists.
  • Integration with additional sensors could be tested to handle cases where camera visibility is reduced.
  • Data collected by such systems could be used to evaluate and improve intersection design for pedestrian safety.

Load-bearing premise

The deep learning model can accurately detect and locate pedestrians from traffic cameras in real time under operational conditions at signalized intersections.

What would settle it

Field trials at a signalized intersection in which the vision-based estimates of pedestrian location and velocity show higher errors than those obtained from DSRC hand-held devices.
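
The comparison described above comes down to computing a root-mean-square error for each source against surveyed ground truth. The sketch below assumes all positions are expressed as (x, y) pairs in metres in a shared local frame and that the samples are time-aligned; the function name `location_rmse` is illustrative and not taken from the paper.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def location_rmse(estimates: Sequence[Point], truth: Sequence[Point]) -> float:
    """Root-mean-square Euclidean error between estimated and true positions."""
    if len(estimates) != len(truth) or not truth:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [
        (ex - tx) ** 2 + (ey - ty) ** 2
        for (ex, ey), (tx, ty) in zip(estimates, truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Running the same function on the vision-based estimates and on the DSRC device estimates, with the same ground truth, gives two numbers that can be compared directly; the claim would be undercut if the vision-based value were the larger one.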

Figures

Figures reproduced from arXiv: 1907.05284 by Amy Apon, Eshaa Deepak Sood, Gurcan Comert, Mashrur Chowdhury, Mhafuzul Islam, Mizanur Rahman.

Figure 1. Flowchart for the Pedestrian Alert Safety System (PASS) using Personal Safety Messages (PSMs).
3.1 Deep Learning Model. A high pedestrian detection accuracy and a low computational time are the key motivations and challenges for implementing a vision-based deep learning model for safety-critical applications
Figure 2. An example of the YOLOv3 model input and output.
Figure 3. Experimental setup for generating vision-based pedestrian safety alerts.
Figure 4. Location RMSE between our vision-based PASS and DSRC-enabled pedestrian hand-held device compared to the actual pedestrian location.
Figure 5. Average location RMSE between our vision
Figure 6. Velocity RMSE between our vision-based PASS and DSRC-enabled pedestrian device compared to the actual pedestrian velocity.
5. EVALUATION OF PEDESTRIAN SAFETY. In this section, we evaluated our vision-based pedestrian safety with the PSCW application by analyzing pedestrian collision warning or safety alerts. A system level evaluation is necessary for providing a system level justification by measuring the c…
Figure 7. Evaluation scenario for pedestrian safety using pedestrian collision warnings.
5.2 Evaluation Results. We used the Time-to-Collision (TTC) metric to evaluate the PSCW application. We defined TTC as the time required for a vehicle to collide with a pedestrian if both the pedestrian and vehicle continue on their present trajectories (e.g., velocity and direction) without any change in their trajectory (Karamouzas …
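
The TTC definition quoted above can be illustrated with a constant-velocity calculation. The sketch below is one common way to compute it, not necessarily the formula the authors used: it treats the vehicle and the pedestrian as points and reports the first time their separation falls to a collision radius. The `radius` parameter and its 1.0 m default are assumptions introduced for illustration.

```python
import math
from typing import Optional, Tuple

Vec = Tuple[float, float]


def time_to_collision(p_ped: Vec, v_ped: Vec, p_veh: Vec, v_veh: Vec,
                      radius: float = 1.0) -> Optional[float]:
    """Smallest t >= 0 (seconds) at which the two are within `radius` metres.

    Returns None when constant-velocity motion never brings them that close.
    """
    # Work in the vehicle's frame: relative position and relative velocity.
    px, py = p_ped[0] - p_veh[0], p_ped[1] - p_veh[1]
    vx, vy = v_ped[0] - v_veh[0], v_ped[1] - v_veh[1]

    c = px * px + py * py - radius * radius
    if c <= 0:
        return 0.0  # already within the collision radius
    a = vx * vx + vy * vy
    if a == 0:
        return None  # no relative motion, separation never changes
    b = 2.0 * (px * vx + py * vy)
    disc = b * b - 4.0 * a * c
    if b >= 0 or disc < 0:
        return None  # moving apart, or passing at more than `radius`
    # Smaller root of a*t^2 + b*t + c = 0 is the first moment of contact.
    return (-b - math.sqrt(disc)) / (2.0 * a)
```

An alert rule would then compare the returned value with a warning threshold; the threshold used for the PSCW application is not given in the text shown here.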
Figure 8. Distribution of end-to-end latency for the pedestrian collision warnings.
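
A latency distribution like the one in Figure 8 is usually reduced to a few summary numbers before it is checked against a requirement. The sketch below shows one way to do that; `latency_summary` and its `budget_ms` parameter are hypothetical, and the budget must be supplied by the caller because the specific latency threshold is not stated in the text shown here.

```python
from statistics import mean
from typing import Dict, Sequence


def latency_summary(samples_ms: Sequence[float], budget_ms: float) -> Dict[str, float]:
    """Summarise end-to-end latency samples against a caller-supplied budget."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    within = sum(1 for s in samples_ms if s <= budget_ms)
    return {
        "mean_ms": mean(samples_ms),
        "max_ms": max(samples_ms),
        "fraction_within_budget": within / len(samples_ms),
    }
```

For a safety application the maximum and the fraction within budget matter more than the mean, since a single late warning can be the one that counts.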
Original abstract

Although Vehicle-to-Pedestrian (V2P) communication can significantly improve pedestrian safety at a signalized intersection, this safety is hindered as pedestrians often do not carry hand-held devices (e.g., Dedicated short-range communication (DSRC) and 5G enabled cell phone) to communicate with connected vehicles nearby. To overcome this limitation, in this study, traffic cameras at a signalized intersection were used to accurately detect and locate pedestrians via a vision-based deep learning technique to generate safety alerts in real-time about possible conflicts between vehicles and pedestrians. The contribution of this paper lies in the development of a system using a vision-based deep learning model that is able to generate personal safety messages (PSMs) in real-time (every 100 milliseconds). We develop a pedestrian alert safety system (PASS) to generate a safety alert of an imminent pedestrian-vehicle crash using generated PSMs to improve pedestrian safety at a signalized intersection. Our approach estimates the location and velocity of a pedestrian more accurately than existing DSRC-enabled pedestrian hand-held devices. A connected vehicle application, the Pedestrian in Signalized Crosswalk Warning (PSCW), was developed to evaluate the vision-based PASS. Numerical analyses show that our vision-based PASS is able to satisfy the accuracy and latency requirements of pedestrian safety applications in a connected vehicle environment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes a Vision-based Pedestrian Alert Safety System (PASS) that uses traffic cameras and deep learning to detect and locate pedestrians at signalized intersections, generate Personal Safety Messages (PSMs) every 100 ms, and trigger alerts via a Pedestrian in Signalized Crosswalk Warning (PSCW) application in a connected-vehicle setting. The central claims are that the vision-based estimates of pedestrian location and velocity are more accurate than those from DSRC hand-held devices and that numerical analyses confirm the system meets the accuracy and latency requirements of pedestrian safety applications.

Significance. If the quantitative claims hold, the work would address a practical barrier to V2P safety by eliminating the need for pedestrians to carry DSRC/5G devices, leveraging existing intersection cameras instead. The approach is infrastructure-centric and could be relevant to deployment of connected-vehicle safety applications.

major comments (2)
  1. [Abstract] The claim that the vision-based approach 'estimates the location and velocity of a pedestrian more accurately than existing DSRC-enabled pedestrian hand-held devices' is presented without any supporting detection metrics (precision, recall, localization RMSE), timing benchmarks, dataset description, or direct numerical comparison against DSRC error distributions.
  2. [Abstract] The statement that 'Numerical analyses show that our vision-based PASS is able to satisfy the accuracy and latency requirements' supplies no model architecture details, no evaluation on intersection imagery, no latency measurements for the 100 ms PSM generation rate, and no explicit thresholds or results from the PSCW application, so the compliance conclusion cannot be verified.
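
For reference, the detection metrics named in the first comment are computed from counts of matched and unmatched detections. The sketch below assumes detections have already been matched to ground-truth pedestrians by some rule (for example an overlap threshold, which is an assumption here); the function name is illustrative.

```python
from typing import Tuple


def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> Tuple[float, float]:
    """Precision and recall from detection counts; 0.0 when a ratio is undefined."""
    detected = true_pos + false_pos   # everything the detector reported
    actual = true_pos + false_neg     # every pedestrian actually present
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall
```

For a safety alert system, recall is typically the more critical of the two, since a missed pedestrian produces no message at all.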

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments regarding the abstract. We agree that the abstract would benefit from greater self-containment and will revise it to include key quantitative results from the body of the manuscript.

Point-by-point responses
  1. Referee: [Abstract] The claim that the vision-based approach 'estimates the location and velocity of a pedestrian more accurately than existing DSRC-enabled pedestrian hand-held devices' is presented without any supporting detection metrics (precision, recall, localization RMSE), timing benchmarks, dataset description, or direct numerical comparison against DSRC error distributions.

    Authors: The manuscript body reports the supporting metrics, including precision/recall, localization RMSE, timing benchmarks, dataset details, and direct numerical comparisons to DSRC error distributions. We will revise the abstract to summarize these key results (e.g., specific RMSE values and accuracy gains) so the claim is substantiated within the abstract itself.
    revision: yes

  2. Referee: [Abstract] The statement that 'Numerical analyses show that our vision-based PASS is able to satisfy the accuracy and latency requirements' supplies no model architecture details, no evaluation on intersection imagery, no latency measurements for the 100 ms PSM generation rate, and no explicit thresholds or results from the PSCW application, so the compliance conclusion cannot be verified.

    Authors: The full manuscript provides the model architecture, evaluations on intersection imagery, measured latencies meeting the 100 ms PSM rate, explicit accuracy/latency thresholds, and PSCW application results. We will revise the abstract to include concise numerical outcomes and references to these evaluations demonstrating compliance.
    revision: yes

Circularity Check

0 steps flagged

No derivation chain or equations; claims are empirical


The paper presents a system description using standard vision-based deep learning for pedestrian detection and PSM generation at intersections. No equations, mathematical derivations, fitted parameters, or predictions by construction appear in the abstract or described content. Accuracy and latency claims rest on unspecified numerical analyses rather than any self-referential fitting, self-citation chain, or ansatz smuggling. The approach relies on external DL techniques without reducing central claims to inputs by definition, so the work is self-contained with no circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract introduces no free parameters, axioms, or invented entities; the approach rests on standard deep learning for object detection.


