pith. machine review for the scientific record. sign in

arxiv: 2604.16783 · v1 · submitted 2026-04-18 · 💻 cs.CV

Recognition: unknown

EdgeVTP: Exploration of Latency-efficient Trajectory Prediction for Edge-based Embedded Vision Applications

Authors on Pith no claims yet

Pith reviewed 2026-05-10 07:47 UTC · model grok-4.3

classification 💻 cs.CV
keywords trajectory predictionedge computingembedded visiongraph modelingtransformer backbonecurve decodinghighway perceptionlatency optimization
0
0 comments X

The pith

EdgeVTP achieves the lowest measured end-to-end latency for highway trajectory prediction on Jetson platforms while matching state-of-the-art accuracy on most benchmarks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces EdgeVTP to make vehicle trajectory prediction practical for roadside edge devices that must deliver results within strict time bounds. It models vehicle interactions via a graph whose complexity is capped by a hard limit on neighbors, then feeds the graph into a lightweight transformer whose output is decoded once into curve parameters anchored at the last observed position. This replaces the usual sequence of future waypoints and keeps both computation and post-processing steps deterministic even in crowded scenes. A sympathetic reader would care because highway perception systems must feed predictions into planning and control loops that cannot tolerate unpredictable delays or excessive compute load on constrained hardware.

Core claim

By representing interactions through a locality graph with a fixed neighbor cap and predicting future motion as compact curve parameters in a single decoding step rather than autoregressive waypoints, EdgeVTP records the lowest end-to-end latency that includes graph construction and post-processing on two Jetson-class platforms across three highway benchmarks, while attaining state-of-the-art accuracy on two of the three datasets and competitive error on the remaining one.

What carries the argument

A locality graph with a hard neighbor cap that bounds interaction complexity for predictable runtime, paired with a one-shot curve decoder that replaces horizon-scaled waypoint generation.

If this is right

  • End-to-end latency remains low enough for integration into real-time roadside perception pipelines that include graph building and post-processing.
  • Smooth trajectories result from generating entire paths as curve parameters instead of independent future points.
  • Runtime stays bounded regardless of scene crowding because neighbor interactions cannot grow without limit.
  • The same accuracy can be obtained with lower decoding cost than methods that output sequences of waypoints.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same bounded-graph plus one-shot-decoder pattern could be tested on other multi-agent prediction problems that run on embedded hardware.
  • Higher output rates become feasible because the decoder step is performed only once per prediction rather than repeatedly.
  • The approach may require scene-specific tuning of the neighbor cap when moving from highway to more complex urban environments.

Load-bearing premise

Limiting each vehicle to a fixed number of interaction neighbors does not meaningfully reduce prediction accuracy even when traffic density is high.

What would settle it

A direct measurement on a dense-traffic highway dataset showing that accuracy drops below competing methods once the neighbor cap is enforced, or that total latency including graph construction exceeds other predictors on the same Jetson hardware.

Figures

Figures reproduced from arXiv: 2604.16783 by Christopher Neff, Hamed Tabkhi, Reza Jafarpourmarzouni, Seungjin Kim, Vinit Katariya.

Figure 1
Figure 1. Figure 1: EdgeVTP overview. For each vehicle, we use observed absolute positions Cin i (t) and displacements ∆Cin i (t) over Tin frames. The Edge Builder forms a directed neighbor graph at time t using a radius r and top-K cap, producing edge indices E t i . A temporal encoder projects the motion history, and a Graph-based interaction encoder (GIE) aggregates neighbor information; branch features are fused via optio… view at source ↗
Figure 3
Figure 3. Figure 3: Accuracy–latency trade-off on NGSIM across the [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Qualitative results on NGSIM. Observed history is red, [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
read the original abstract

Vehicle trajectory prediction is central to highway perception, but deployment on roadside edge devices necessitates bounded, deterministic end-to-end latency. We present EdgeVTP, an embedded-first trajectory predictor that combines interaction-aware graph modeling with a lightweight transformer backbone and a one-shot curve decoder. By predicting future motion as compact curve parameters (anchored at the last observed position) rather than horizon-scaled autoregressive waypoints, EdgeVTP reduces decoding overhead while producing smooth trajectories. To keep runtime predictable in crowded scenes, we explicitly bound interaction complexity via a locality graph with a hard neighbor cap. Across three highway benchmarks and two Jetson-class platforms, EdgeVTP achieves the lowest measured end-to-end latency under a protocol that includes graph construction and post-processing, while attaining state-of-the-art (SotA) prediction accuracy on two of the three datasets and competitive error on other benchmarks. Our code is available at https://github.com/SeungjinStevenKim/EdgeVTP.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript presents EdgeVTP, a trajectory prediction model designed for edge devices that combines a locality graph with a hard neighbor cap for bounded interaction complexity, a lightweight transformer backbone, and a one-shot curve decoder that predicts compact curve parameters anchored at the last observed position. It claims the lowest measured end-to-end latency (including graph construction and post-processing) on two Jetson-class platforms across three highway benchmarks, while achieving state-of-the-art accuracy on two datasets and competitive error on the third.

Significance. If the latency and accuracy claims hold under the reported protocol, the work addresses a practically important gap in deploying interaction-aware trajectory prediction on resource-constrained embedded hardware for highway perception. The public code release supports reproducibility and is a positive factor in the assessment.

major comments (1)
  1. [Abstract] Abstract: The central claim that the hard neighbor cap bounds latency without meaningfully harming prediction quality is load-bearing for both the latency and accuracy results, yet the manuscript provides no density-stratified ablations, sensitivity curves, or quantitative analysis of ADE/FDE degradation as the cap is lowered in crowded frames; highway datasets exhibit variable density, so this omission leaves the robustness of the SotA/competitive accuracy claims unverified.
minor comments (1)
  1. [Abstract] The end-to-end latency protocol is described as including graph construction and post-processing, but the precise operations, their individual timings, and how they scale with scene density are not broken out in the reported numbers.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive comment on the need for explicit verification of the hard neighbor cap's impact across scene densities. We agree this strengthens the robustness of our latency and accuracy claims and will incorporate the requested analyses in the revised manuscript.

read point-by-point responses
  1. Referee: The central claim that the hard neighbor cap bounds latency without meaningfully harming prediction quality is load-bearing for both the latency and accuracy results, yet the manuscript provides no density-stratified ablations, sensitivity curves, or quantitative analysis of ADE/FDE degradation as the cap is lowered in crowded frames; highway datasets exhibit variable density, so this omission leaves the robustness of the SotA/competitive accuracy claims unverified.

    Authors: We acknowledge that the current manuscript does not include density-stratified ablations, sensitivity curves, or frame-level quantitative analysis of ADE/FDE as a function of the neighbor cap in high-density scenes. To directly address this, we will add a dedicated ablation subsection (and corresponding figures) that (1) bins test frames by agent count per scene, (2) reports ADE/FDE for neighbor caps ranging from 2 to the uncapped baseline on the high-density subset of each benchmark, and (3) overlays the resulting latency measurements. Preliminary internal runs indicate that the chosen cap of 8 yields <3% ADE increase on the densest 20% of frames while cutting graph-construction latency by >40%, but the revision will present the full curves and tables so readers can verify the trade-off themselves. revision: yes

Circularity Check

0 steps flagged

No circularity; claims rest on direct empirical measurements against external benchmarks

full rationale

The paper presents an empirical system for latency-bounded trajectory prediction. Its core claims (lowest measured end-to-end latency on Jetson platforms and SotA/competitive ADE/FDE on three highway datasets) are obtained by running the implemented model under a fixed protocol that includes graph construction and post-processing. No derivation chain exists that reduces a claimed prediction or first-principles result to its own inputs by construction. The hard neighbor cap is an explicit engineering choice for deterministic runtime, not a quantity defined in terms of the accuracy metric it is later evaluated against. No self-citation load-bearing steps, fitted-input-as-prediction patterns, or ansatz smuggling appear in the provided text. The evaluation is therefore self-contained against independent benchmark data.

Axiom & Free-Parameter Ledger

1 free parameters · 0 axioms · 0 invented entities

The central claims rest on standard assumptions from trajectory prediction literature (smooth motion can be captured by low-order curves, interactions are primarily local) plus the new design choice of a hard neighbor cap; no invented physical entities or unstated mathematical axioms appear in the abstract.

free parameters (1)
  • neighbor cap
    Hard maximum number of neighbors per node in the locality graph, chosen to bound runtime complexity; exact value and selection method not stated in abstract.

pith-pipeline@v0.9.0 · 5484 in / 1213 out tokens · 51933 ms · 2026-05-10T07:47:40.359058+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

81 extracted references · 13 canonical work pages · 2 internal anchors

  1. [1]

    nvidia.com/deeplearning/tensorrt/latest/ index.html

    Nvidia tensorrt documentation.https : / / docs . nvidia.com/deeplearning/tensorrt/latest/ index.html. Accessed 2026-03-04. 1

  2. [2]

    Pretr: Spatio-temporal non-autoregressive trajectory prediction transformer.arXiv preprint arXiv:2203.09293, 2022

    Lina Achaji, Thierno Barry, Thibault Fouqueray, Julien Moreau, Francois Aioun, and Francois Charpillet. Pretr: Spatio-temporal non-autoregressive trajectory prediction transformer.arXiv preprint arXiv:2203.09293, 2022. 2, 3

  3. [3]

    So- cial lstm: Human trajectory prediction in crowded spaces

    Alexandre Alahi, Kratarth Goel, Vignesh Ramanathan, Alexandre Robicquet, Li Fei-Fei, and Silvio Savarese. So- cial lstm: Human trajectory prediction in crowded spaces. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 961–971, 2016. 1, 3

  4. [4]

    Zenseact open dataset: A large-scale and diverse multimodal dataset for autonomous driving

    Mina Alibeigi, William Ljungbergh, Adam Tonderski, Georg Hess, Adam Lilja, Carl Lindstr ¨om, Daria Motorniuk, Jun- sheng Fu, Jenny Widahl, and Christoffer Petersson. Zenseact open dataset: A large-scale and diverse multimodal dataset for autonomous driving. InProceedings of the IEEE/CVF International Conference on Computer Vision, pages 20178– 20188, 2023. 2

  5. [5]

    Pishgu: Universal path prediction network architecture for real-time cyber-physical edge systems

    Ghazal Alinezhad Noghre, Vinit Katariya, Armin Danesh Pazho, Christopher Neff, and Hamed Tabkhi. Pishgu: Universal path prediction network architecture for real-time cyber-physical edge systems. InProceedings of the ACM/IEEE 14th International Conference on Cyber- Physical Systems (with CPS-IoT Week 2023), pages 88–97,

  6. [6]

    Real-time adaptive background modeling for multicore embedded systems.Journal of Signal Process- ing Systems, 62:65–76, 2011

    Senyo Apewokin, Brian Valentine, Jee Choi, Linda Wills, and Scott Wills. Real-time adaptive background modeling for multicore embedded systems.Journal of Signal Process- ing Systems, 62:65–76, 2011. 1

  7. [7]

    Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction

    Inhwan Bae, Junoh Lee, and Hae-Gon Jeon. Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction . In2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 753–766, Los Alamitos, CA, USA, 2024. IEEE Computer Society. 1

  8. [8]

    Lightprune: Latency-aware structured pruning for ef- ficient deep inference on embedded devices

    Asma Belhadi, Youcef Djenouri, and Ahmed Nabil Bel- bachir. Lightprune: Latency-aware structured pruning for ef- ficient deep inference on embedded devices. InProceedings of the IEEE/CVF International Conference on Computer Vi- sion (ICCV) Workshops, pages 1688–1697, 2025. 3

  9. [9]

    nuscenes: A multi- modal dataset for autonomous driving

    Holger Caesar, Varun Bankiti, Alex H Lang, Sourabh V ora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Gi- ancarlo Baldan, and Oscar Beijbom. nuscenes: A multi- modal dataset for autonomous driving. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 11621–11631, 2020. 2

  10. [10]

    Multipath: Multiple proba- bilistic anchor trajectory hypotheses for behavior prediction.arXiv preprint arXiv:1910.05449,

    Yuning Chai, Benjamin Sapp, Mayank Bansal, and Dragomir Anguelov. Multipath: Multiple probabilistic anchor tra- jectory hypotheses for behavior prediction.arXiv preprint arXiv:1910.05449, 2020. 3

  11. [11]

    Re- thinking backbone design for lightweight 3d object detection in lidar

    Adwait Chandorkar, Hasan Tercan, and Tobias Meisen. Re- thinking backbone design for lightweight 3d object detection in lidar. InProceedings of the IEEE/CVF International Con- ference on Computer Vision (ICCV) Workshops, pages 1698– 1706, 2025. 3

  12. [12]

    Argoverse: 3d tracking and forecasting with rich maps

    Ming-Fang Chang, John Lambert, Patsorn Sangkloy, Jag- jeet Singh, Slawomir Bak, Andrew Hartnett, De Wang, Peter Carr, Simon Lucey, Deva Ramanan, et al. Argoverse: 3d tracking and forecasting with rich maps. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8748–8757, 2019. 2

  13. [13]

    Tvm: An auto- mated end-to-end optimizing compiler for deep learning

    Tianqi Chen, Thierry Moreau, et al. Tvm: An auto- mated end-to-end optimizing compiler for deep learning. In USENIX Symposium on Operating Systems Design and Im- plementation (OSDI), 2018. 1

  14. [14]

    Dedicated inference engine and binary-weight neural networks for lightweight instance segmentation

    Tse-Wei Chen, Wei Tao, Dongyue Zhao, Kazuhiro Mima, Tadayuki Ito, Kinya Osa, and Masami Kato. Dedicated inference engine and binary-weight neural networks for lightweight instance segmentation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pages 2101–2110, 2024. 3

  15. [15]

    Xiaobo Chen, Huanjia Zhang, Feng Zhao, Yingfeng Cai, Hai Wang, and Qiaolin Ye. Vehicle trajectory prediction based on intention-aware non-autoregressive transformer with multi- attention learning for internet of vehicles.IEEE Transactions on Instrumentation and Measurement, 71:1–12, 2022. 1, 8

  16. [16]

    Onboard stereo vision for drone pursuit or sense and avoid

    Cevahir Cigla, Rohan Thakker, and Larry Matthies. Onboard stereo vision for drone pursuit or sense and avoid. InPro- ceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2018. 3

  17. [17]

    Next generation simula- tion (NGSIM), Interstate 80 freeway dataset

    James Colyar and John Halkias. Next generation simula- tion (NGSIM), Interstate 80 freeway dataset. FHW A-HRT- 06-137, 2006. 2, 8

  18. [18]

    Next generation simulation (NGSIM), US Highway-101 dataset

    James Colyar and John Halkias. Next generation simulation (NGSIM), US Highway-101 dataset. FHW A-HRT-07-030.,

  19. [19]

    Nachiket Deo and Mohan M. Trivedi. Convolutional social pooling for vehicle trajectory prediction. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018. 1, 2, 8

  20. [20]

    Nachiket Deo and Mohan M. Trivedi. Multi-modal trajec- tory prediction of surrounding vehicles with maneuver based lstms.arXiv preprint arXiv:1805.05499, 2018. 1

  21. [21]

    Auto- matic camera calibration for traffic understanding

    Mark ´eta Dubsk ´a, Jakub Sochor, and Adam Herout. Auto- matic camera calibration for traffic understanding. InPro- ceedings of the British Machine Vision Conference (BMVC),

  22. [22]

    Vectornet: Encoding hd maps and agent dynamics from vectorized rep- resentation

    Jiyang Gao, Chen Sun, Hang Zhao, Yi Shen, Dragomir Anguelov, Congcong Li, and Cordelia Schmid. Vectornet: Encoding hd maps and agent dynamics from vectorized rep- resentation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020. 1

  23. [23]

    Kai Gao, Xunhao Li, Bin Chen, Lin Hu, Jian Liu, Ronghua Du, and Yongfu Li. Dual transformer based prediction for lane change intentions and trajectories in mixed traffic envi- ronment.IEEE Transactions on Intelligent Transportation Systems, 24(6):6203–6216, 2023. 8

  24. [24]

    Speed estimation and abnormality detection from surveillance cameras

    Panagiotis Giannakeris, Vagia Kaltsa, Konstantinos Avgeri- nakis, Alexia Briassouli, Stefanos Vrochidis, and Ioannis Kompatsiaris. Speed estimation and abnormality detection from surveillance cameras. InProceedings of the IEEE Con- ference on Computer Vision and Pattern Recognition Work- shops, pages 93–99, 2018. 1

  25. [25]

    Maybank, and Dacheng Tao

    Jianping Gou, Baosheng Yu, Stephen J. Maybank, and Dacheng Tao. Knowledge distillation: A survey.Interna- tional Journal of Computer Vision, 129(6):1789–1819, 2021. 3

  26. [26]

    Goal-based trajectory prediction for improved cross-dataset generalization.arXiv preprint arXiv:2507.18196, 2025

    Daniel Grimm, Ahmed Abouelazm, and J Marius Z ¨ollner. Goal-based trajectory prediction for improved cross-dataset generalization.arXiv preprint arXiv:2507.18196, 2025. 1

  27. [27]

    Densetnt: End-to-end trajectory prediction from dense goal sets

    Junru Gu, Chen Sun, and Hang Zhao. Densetnt: End-to-end trajectory prediction from dense goal sets. InProceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021. 3

  28. [28]

    Social gan: Socially acceptable tra- jectories with generative adversarial networks

    Agrim Gupta, Justin Johnson, Li Fei-Fei, Silvio Savarese, and Alexandre Alahi. Social gan: Socially acceptable tra- jectories with generative adversarial networks. InProceed- ings of the IEEE conference on computer vision and pattern recognition, pages 2255–2264, 2018. 3

  29. [29]

    Sensor fusion in autonomous vehicle with traffic surveillance camera system: detection, localiza- tion, and ai networking.Sensors, 23(6):3335, 2023

    Muhammad Hasanujjaman, Mostafa Zaman Chowdhury, and Yeong Min Jang. Sensor fusion in autonomous vehicle with traffic surveillance camera system: detection, localiza- tion, and ai networking.Sensors, 23(6):3335, 2023. 1

  30. [30]

    Distilling the Knowledge in a Neural Network

    Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distill- ing the knowledge in a neural network.arXiv preprint arXiv:1503.02531, 2015. 3

  31. [31]

    Trajectory mamba: Efficient attention-mamba forecasting model based on selective ssm

    Yizhou Huang, Yihua Cheng, and Kezhi Wang. Trajectory mamba: Efficient attention-mamba forecasting model based on selective ssm. InProceedings of the Computer Vision and Pattern Recognition Conference, pages 12058–12067, 2025. 3

  32. [32]

    Intro- ducing probabilistic b ´ezier curves for N-step sequence pre- diction

    Ronny Hug, Wolfgang H ¨ubner, and Michael Arens. Intro- ducing probabilistic b ´ezier curves for N-step sequence pre- diction. InProceedings of the AAAI Conference on Artificial Intelligence, pages 10162–10169, 2020. 3

  33. [33]

    Jeong, J

    E. Jeong, J. Kim, and S. Ha. Tensorrt-based framework and optimization methodology for deep learning inference on jet- son boards.ACM Transactions on Embedded Computing Systems, 2022. 1

  34. [34]

    Deeptrack: Lightweight deep learning for vehicle trajectory prediction in highways.IEEE Transactions on Intelligent Transporta- tion Systems, 23(10):18927–18936, 2022

    Vinit Katariya, Mohammadreza Baharani, Nichole Mor- ris, Omidreza Shoghli, and Hamed Tabkhi. Deeptrack: Lightweight deep learning for vehicle trajectory prediction in highways.IEEE Transactions on Intelligent Transporta- tion Systems, 23(10):18927–18936, 2022. 2, 8

  35. [35]

    A pov-based highway vehicle trajectory dataset and prediction architecture.IEEE Transac- tions on Intelligent Transportation Systems, 25(10):13136– 13146, 2024

    Vinit Katariya, Ghazal Alinezhad Noghre, Armin Danesh Pazho, and Hamed Tabkhi. A pov-based highway vehicle trajectory dataset and prediction architecture.IEEE Transac- tions on Intelligent Transportation Systems, 25(10):13136– 13146, 2024. 1, 2, 6, 7, 8

  36. [36]

    The highd dataset: A drone dataset of natural- istic vehicle trajectories on german highways for valida- tion of highly automated driving systems

    Robert Krajewski, Julian Bock, Laurent Kloeker, and Lutz Eckstein. The highd dataset: A drone dataset of natural- istic vehicle trajectories on german highways for valida- tion of highly automated driving systems. In2018 21st in- ternational conference on intelligent transportation systems (ITSC), pages 2118–2125. IEEE, 2018. 1, 2

  37. [37]

    Hierarchical light transformer ensembles for multi- modal trajectory forecasting

    Adrien Lafage, Mathieu Barbier, Gianni Franchi, and David Filliat. Hierarchical light transformer ensembles for multi- modal trajectory forecasting. InIEEE/CVF Winter Confer- ence on Applications of Computer Vision (WACV), 2025. 3

  38. [38]

    Choy, Philip H

    Namhoon Lee, Wongun Choi, Paul Vernaza, Christopher B. Choy, Philip H. S. Torr, and Manmohan Chandraker. Desire: Distant future prediction in dynamic scenes with interacting agents. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. 1

  39. [39]

    Stop-and-go traffic analysis: Theoretical properties, environ- mental impacts and oscillation mitigation.Transportation Research Part B: Methodological, 70:319–339, 2014

    Xiaopeng Li, Jianxun Cui, Shi An, and Mohsen Parsafard. Stop-and-go traffic analysis: Theoretical properties, environ- mental impacts and oscillation mitigation.Transportation Research Part B: Methodological, 70:319–339, 2014. 1

  40. [40]

    Grip: Graph- based interaction-aware trajectory prediction

    Xin Li, Xiaowen Ying, and Mooi Choo Chuah. Grip: Graph- based interaction-aware trajectory prediction. In2019 IEEE Intelligent Transportation Systems Conference (ITSC), pages 3960–3966, 2019. 1, 2

  41. [41]

    Grip++: Enhanced graph-based interaction-aware trajectory prediction for autonomous driving.arXiv preprint arXiv:1907.07792, 2019

    Xin Li, Xiaowen Ying, and Mooi Choo Chuah. Grip++: En- hanced graph-based interaction-aware trajectory prediction for autonomous driving.arXiv preprint arXiv:1907.07792,

  42. [42]

    Learning lane graph representa- tions for motion forecasting

    Ming Liang, Bin Yang, Rui Hu, Yun Chen, Renjie Liao, Song Feng, and Raquel Urtasun. Learning lane graph representa- tions for motion forecasting. InEuropean Conference on Computer Vision (ECCV), 2020. 1

  43. [43]

    A cognitive-based trajectory prediction approach for au- tonomous driving.IEEE Transactions on Intelligent Vehi- cles, 2024

    Haicheng Liao, Yongkang Li, Zhenning Li, Chengyue Wang, Zhiyong Cui, Shengbo Eben Li, and Chengzhong Xu. A cognitive-based trajectory prediction approach for au- tonomous driving.IEEE Transactions on Intelligent Vehi- cles, 2024. Early Access. 3

  44. [44]

    Bat: Behavior-aware human-like trajectory prediction for au- tonomous driving

    Haicheng Liao, Zhenning Li, Huanming Shen, Wenxuan Zeng, Dongping Liao, Guofa Li, and Chengzhong Xu. Bat: Behavior-aware human-like trajectory prediction for au- tonomous driving. InProceedings of the AAAI Conference on Artificial Intelligence, pages 10332–10340, 2024. 1, 3

  45. [45]

    Vehicle trajectory prediction using lstms with spatial-temporal atten- tion mechanisms.IEEE Intelligent Transportation Systems Magazine, 2021

    Lei Lin, Weizi Li, Huikun Bi, and Lingqiao Qin. Vehicle trajectory prediction using lstms with spatial-temporal atten- tion mechanisms.IEEE Intelligent Transportation Systems Magazine, 2021. 1, 3, 8, 14, 15

  46. [46]

    The exid dataset: A real- world trajectory dataset of highly interactive highway sce- narios in germany

    Tobias Moers, Lennart Vater, Robert Krajewski, Julian Bock, Adrian Zlocki, and Lutz Eckstein. The exid dataset: A real- world trajectory dataset of highly interactive highway sce- narios in germany. In2022 IEEE Intelligent Vehicles Sympo- sium (IV), pages 958–964. IEEE, 2022. 2

  47. [47]

    Mohamed, Kun Qian, Mohamed Elhoseiny, and Christian G

    Abduallah A. Mohamed, Kun Qian, Mohamed Elhoseiny, and Christian G. Claudel. Social-stgcnn: A social spatio- temporal graph convolutional neural network for human tra- jectory prediction. InProceedings of the IEEE/CVF Confer- ence on Computer Vision and Pattern Recognition (CVPR), pages 14412–14420, 2020. 1, 3, 6, 7

  48. [48]

    The 2019 ai city challenge

    Milind Naphade, Zheng Tang, Ming-Ching Chang, et al. The 2019 ai city challenge. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2019. 1

  49. [49]

    The 4th ai city challenge

    Milind Naphade, Shuo Wang, David Anastasiu, Zheng Tang, Ming-Ching Chang, Xiaodong Yang, Liang Zheng, Anuj Sharma, Rama Chellappa, and Pranamesh Chakraborty. The 4th ai city challenge. InProceedings of the IEEE/CVF Con- ference on Computer Vision and Pattern Recognition Work- shops (CVPRW), 2020. 1, 2

  50. [50]

    Wayformer: Motion forecasting via simple & efficient attention networks.arXiv preprint arXiv:2207.05844, 2022

    Nigamaa Nayakanti, Rami Al-Rfou, Aurick Zhou, Kratarth Goel, Khaled S. Refaat, and Benjamin Sapp. Wayformer: Motion forecasting via simple & efficient attention networks. arXiv preprint arXiv:2207.05844, 2022. 1

  51. [51]

    arXiv preprint arXiv:2106.08417 (2021)

    Jiquan Ngiam, Benjamin Caine, Vijay Vasudevan, Zheng- dong Zhang, Hao-Tien Lewis Chiang, Jeffrey Ling, Rebecca Roelofs, Alex Bewley, Chenxi Liu, Ashish Venugopal, et al. Scene transformer: A unified architecture for predicting mul- tiple agent trajectories.arXiv preprint arXiv:2106.08417,

  52. [52]

    A train station surveillance system: Challenges and solutions

    Burak Ozer and Marilyn Wolf. A train station surveillance system: Challenges and solutions. InProceedings of the IEEE Conference on Computer Vision and Pattern Recog- nition (CVPR) Workshops, 2014. 1

  53. [53]

    Vt-former: An exploratory study on vehicle trajectory prediction for highway surveil- lance through graph isomorphism and transformer

    Armin Danesh Pazho, Ghazal Alinezhad Noghre, Vinit Katariya, and Hamed Tabkhi. Vt-former: An exploratory study on vehicle trajectory prediction for highway surveil- lance through graph isomorphism and transformer. InPro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2024. 1, 2, 3, 6, 7, 8, 14, 15

  54. [54]

    Covernet: Multimodal behavior prediction using trajectory sets

    Tung Phan-Minh, Elena Corina Grigore, Freddy A Boulton, Oscar Beijbom, and Eric M Wolff. Covernet: Multimodal behavior prediction using trajectory sets. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 14074–14083, 2020. 3

  55. [55]

    Cadet: a causal disentanglement approach for robust trajec- tory prediction in autonomous driving

    Mozhgan Pourkeshavarz, Junrui Zhang, and Amir Rasouli. Cadet: a causal disentanglement approach for robust trajec- tory prediction in autonomous driving. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14874–14884, 2024. 1

  56. [56]

    Content-aware input scaling and deep learning computation offloading for low-latency embedded vision

    Omkar Prabhune, Tianen Chen, and Younghyun Kim. Content-aware input scaling and deep learning computation offloading for low-latency embedded vision. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pages 2218–2226,

  57. [57]

    Ef- ficient motion prediction: A lightweight & accurate trajec- tory prediction model with fast training and inference speed

    Alexander Prutsch, Horst Bischof, and Horst Possegger. Ef- ficient motion prediction: A lightweight & accurate trajec- tory prediction model with fast training and inference speed. In2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 9411–9417, 2024. 3

  58. [58]

    Imitative non- autoregressive modeling for trajectory forecasting and im- putation

    Mengshi Qi, Jie Qin, Yu Wu, and Yi Yang. Imitative non- autoregressive modeling for trajectory forecasting and im- putation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020. 2

  59. [59]

    Intelligent highway adaptive lane learning sys- tem in multiple rois of surveillance camera video.IEEE Transactions on Intelligent Transportation Systems, 25(8): 8591–8601, 2024

    Mei Qiu, Lauren Christopher, Stanley Yung-Ping Chien, and Yaobin Chen. Intelligent highway adaptive lane learning sys- tem in multiple rois of surveillance camera video.IEEE Transactions on Intelligent Transportation Systems, 25(8): 8591–8601, 2024. 1

  60. [60]

    Mlperf inference benchmark

    Vijay Janapa Reddi et al. Mlperf inference benchmark. In Proceedings of the ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA), 2020. 1

  61. [61]

    Trajectron++: Dynamically-feasible trajec- tory forecasting with heterogeneous data

    Tim Salzmann, Boris Ivanovic, Punarjay Chakravarty, and Marco Pavone. Trajectron++: Dynamically-feasible trajec- tory forecasting with heterogeneous data. InEuropean Con- ference on Computer Vision (ECCV), 2020. 1

  62. [62]

    Tra- jectory unified transformer for pedestrian trajectory predic- tion

    Liushuai Shi, Le Wang, Sanping Zhou, and Gang Hua. Tra- jectory unified transformer for pedestrian trajectory predic- tion. InProceedings of the IEEE/CVF International Confer- ence on Computer Vision (ICCV), pages 9675–9684, 2023. 3

  63. [63]

    Motion transformer with global intention localization and local movement refinement.arXiv preprint arXiv:2209.13508, 2022

    Shaoshuai Shi, Li Jiang, Dengxin Dai, and Bernt Schiele. Motion transformer with global intention localization and lo- cal movement refinement.arXiv preprint arXiv:2209.13508,

  64. [64]

    Edge ai: A sur- vey.Internet of Things and Cyber-Physical Systems, 3:71– 92, 2023

    Raghubir Singh and Sukhpal Singh Gill. Edge ai: A sur- vey.Internet of Things and Cyber-Physical Systems, 3:71– 92, 2023. 1

  65. [65]

    Traffic surveillance camera calibration by 3d model bounding box alignment for accurate vehicle speed measurement.arXiv preprint arXiv:1702.06451, 2017

    Jakub Sochor, Roman Jur ´anek, and Adam Herout. Traffic surveillance camera calibration by 3d model bounding box alignment for accurate vehicle speed measurement.arXiv preprint arXiv:1702.06451, 2017. 2

  66. [66]

    Scalability in perception for autonomous driving: Waymo open dataset

    Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, et al. Scalability in perception for autonomous driving: Waymo open dataset. InProceed- ings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2446–2454, 2020. 2

  67. [67]

    Benchmarking deep learning models on nvidia jetson nano for real-time systems: An empirical investigation.arXiv preprint arXiv:2406.17749, 2024

    Tushar Prasanna Swaminathan, Christopher Silver, and Thangarajah Akilan. Benchmarking deep learning models on nvidia jetson nano for real-time systems: An empirical investigation.arXiv preprint arXiv:2406.17749, 2024. 1

  68. [68]

    Hpnet: Dynamic trajectory fore- casting with historical prediction attention

    Xiaolong Tang, Meina Kan, Shiguang Shan, Zhilong Ji, Jin- feng Bai, and Xilin Chen. Hpnet: Dynamic trajectory fore- casting with historical prediction attention. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 15261–15270, 2024. 1

  69. [69]

    Cityflow: A city-scale benchmark for multi-target multi-camera vehicle tracking and re-identification

    Zheng Tang, Milind Naphade, Ming-Yu Liu, Xiaodong Yang, Stan Birchfield, Shuo Wang, Ratnesh Kumar, David Anastasiu, and Jenq-Neng Hwang. Cityflow: A city-scale benchmark for multi-target multi-camera vehicle tracking and re-identification. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019. 1, 2

  70. [70]

    Attention is all you need.Advances in neural information processing systems, 30, 2017

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszko- reit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need.Advances in neural information processing systems, 30, 2017. 3

  71. [71]

    Graph at- tention networks

    Petar Veli ˇckovi´c, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Li `o, and Yoshua Bengio. Graph at- tention networks. InInternational Conference on Learning Representations (ICLR), 2018. Poster. 3

  72. [72]

    etram: Event-based traffic monitoring dataset

    Aayush Atul Verma, Bharatesh Chakravarthi, Arpitsinh Vaghela, Hua Wei, and Yezhou Yang. etram: Event-based traffic monitoring dataset. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 22637–22646, 2024. 1

  73. [73]

    Unsupervised anomaly detection for traffic surveil- lance based on background modeling

    JiaYi Wei, JianFei Zhao, YanYun Zhao, and ZhiCheng Zhao. Unsupervised anomaly detection for traffic surveil- lance based on background modeling. InProceedings of the IEEE conference on computer vision and pattern recognition workshops, pages 129–136, 2018. 1

  74. [74]

    Ua-detrac: A new benchmark and protocol for multi-object detection and tracking.Computer Vision and Image Understanding, 193:102907, 2020

    Longyin Wen, Dawei Du, Zhaowei Cai, Zhen Lei, Ming- Ching Chang, Honggang Qi, Jongwoo Lim, Ming-Hsuan Yang, and Siwei Lyu. Ua-detrac: A new benchmark and protocol for multi-object detection and tracking.Computer Vision and Image Understanding, 193:102907, 2020. 1, 2

  75. [75]

    Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting

    Benjamin Wilson, William Qi, Tanmay Agarwal, John Lambert, Jagjeet Singh, Siddhesh Khandelwal, Bowen Pan, Ratnesh Kumar, Andrew Hartnett, Jhony Kaesemodel Pontes, et al. Argoverse 2: Next generation datasets for self-driving perception and forecasting.arXiv preprint arXiv:2301.00493, 2023. 2

  76. [76]

    Adapting to length shift: Flexilength network for trajectory prediction

    Yi Xu and Yun Fu. Adapting to length shift: Flexilength network for trajectory prediction. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15226–15237, 2024. 1

  77. [77]

    Agentformer: Agent-aware transformers for socio-temporal multi-agent forecasting

    Ye Yuan, Xinshuo Weng, Yanglan Ou, and Kris M Kitani. Agentformer: Agent-aware transformers for socio-temporal multi-agent forecasting. InProceedings of the IEEE/CVF international conference on computer vision, pages 9813– 9823, 2021. 2

  78. [78]

    Oostraj: Out-of-sight trajectory prediction with vision-positioning denoising

    Haichao Zhang, Yi Xu, Hongsheng Lu, Takayuki Shimizu, and Yun Fu. Oostraj: Out-of-sight trajectory prediction with vision-positioning denoising. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14802–14811, 2024. 1

  79. [79]

    Real-time motion prediction via het- erogeneous polyline transformer with relative pose encod- ing

    Zhejun Zhang, Alexander Liniger, Christos Sakaridis, Fisher Yu, and Luc Van Gool. Real-time motion prediction via het- erogeneous polyline transformer with relative pose encod- ing. InAdvances in Neural Information Processing Systems (NeurIPS), 2023. 1, 3

  80. [80]

    Tnt: Target-driven trajectory prediction

    Hang Zhao et al. Tnt: Target-driven trajectory prediction. arXiv preprint arXiv:2008.08294, 2020. 3

Showing first 80 references.