
arxiv: 2604.12425 · v1 · submitted 2026-04-14 · 💻 cs.LG

Forecasting the Past: Gradient-Based Distribution Shift Detection in Trajectory Prediction

Pith reviewed 2026-05-10 15:35 UTC · model grok-4.3

classification 💻 cs.LG
keywords prediction · shifts · trajectory · distribution · forecasting · decoder · detection · distributional

The pith

A gradient-based score from a post-hoc decoder trained to forecast the second half of trajectories detects distribution shifts without changing the original prediction model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes training an auxiliary decoder on the self-supervised task of predicting the latter part of observed trajectories from the earlier part. The L2 norm of the gradient of this forecasting loss with respect to the decoder's final layer serves as a score for detecting when the input distribution has shifted. The method leaves the main trajectory prediction model untouched, preserving its performance, and reports better shift detection on the Shifts and Argoverse datasets than prior approaches. It is also applied to early detection of collisions for a deep Q-Network motion planner in the Highway simulator. Readers should care because trajectory predictors in driving systems can fail dangerously under new conditions, and this offers a lightweight way to flag those cases.
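
The mechanics can be illustrated with a small, dependency-free sketch. Everything here is an assumption made for illustration and is not taken from the paper's code: the final layer is a plain linear map, the forecasting loss is mean squared error, the "features" are just the flattened first half of the track, and the weights are hand-set to a constant-velocity extrapolation instead of being trained. The names `split_halves`, `last_layer_grad_norm`, and `shift_score` are hypothetical.

```python
import math

def split_halves(track):
    """Split an observed track into a first-half input and a second-half target."""
    mid = len(track) // 2
    return track[:mid], track[mid:]

def flatten(points):
    """[(x, y), ...] -> [x0, y0, x1, y1, ...]"""
    return [float(c) for p in points for c in p]

def last_layer_grad_norm(features, target, weight, bias):
    """L2 norm of the MSE-loss gradient w.r.t. a linear final layer (weight, bias)."""
    pred = [sum(w * h for w, h in zip(row, features)) + b
            for row, b in zip(weight, bias)]
    n = len(target)
    # dL/dpred_i for L = (1/n) * sum_i (pred_i - y_i)^2
    dpred = [2.0 * (p - y) / n for p, y in zip(pred, target)]
    # dL/dW[i][j] = dpred[i] * features[j]  and  dL/db[i] = dpred[i]
    sq_w = sum((g * h) ** 2 for g in dpred for h in features)
    sq_b = sum(g * g for g in dpred)
    return math.sqrt(sq_w + sq_b)

# Hypothetical stand-in for a trained decoder head: constant-velocity extrapolation.
# Input features are [x0, y0, x1, y1]; outputs are [x2, y2, x3, y3].
WEIGHT = [
    [-1.0, 0.0, 2.0, 0.0],
    [0.0, -1.0, 0.0, 2.0],
    [-2.0, 0.0, 3.0, 0.0],
    [0.0, -2.0, 0.0, 3.0],
]
BIAS = [0.0, 0.0, 0.0, 0.0]

def shift_score(track):
    """Gradient-norm score of one observed track under the toy decoder head."""
    first, second = split_halves(track)
    return last_layer_grad_norm(flatten(first), flatten(second), WEIGHT, BIAS)

straight = [(0, 0), (1, 0), (2, 0), (3, 0)]  # matches the toy decoder exactly
turning = [(0, 0), (1, 0), (1, 1), (1, 2)]   # abrupt turn after the split point
score_straight = shift_score(straight)
score_turning = shift_score(turning)
```

In this toy setup the straight track produces a zero gradient and the turning track a positive one. With a real trained decoder the direction of the signal is an empirical question: the Figure 7 caption notes that on Argoverse, OOD samples often show lower gradient norms than ID samples.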

Core claim

The central claim is that the L2 norm of the gradient of an auxiliary self-supervised forecasting loss with respect to the decoder's final layer provides an effective score for detecting distribution shifts in trajectory prediction tasks, achieving substantial improvements on benchmark datasets while ensuring no interference with the original model's performance.

What carries the argument

The L2 norm of the gradient of the auxiliary forecasting loss with respect to the decoder's final layer, which acts as a distribution shift score.
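
One way to write this score down, using notation assumed here for illustration (the symbols are not taken from the paper): let $x_{1:T}$ be the observed trajectory, $g_\theta$ the auxiliary decoder together with whatever frozen features it consumes, $\theta_L$ the parameters of its final layer, and $\mathcal{L}_{\mathrm{aux}}$ the forecasting loss.

```latex
\[
  \hat{x}_{h+1:T} = g_{\theta}\left(x_{1:h}\right), \qquad h = \lfloor T/2 \rfloor,
\]
\[
  s\left(x_{1:T}\right) = \left\lVert \nabla_{\theta_L}\, \mathcal{L}_{\mathrm{aux}}\left(\hat{x}_{h+1:T},\, x_{h+1:T}\right) \right\rVert_2 .
\]
```

Because the target $x_{h+1:T}$ is itself part of the observation, the score needs no labels at test time.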

Load-bearing premise

The L2 norm of the gradient of the auxiliary forecasting loss reliably indicates distribution shifts that matter for the downstream trajectory prediction task.

What would settle it

If experiments on the Shifts or Argoverse datasets show that the proposed gradient norm score does not outperform existing distribution shift detection methods in terms of detection accuracy or AUROC, the claim would be falsified.
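
For reference, AUROC for a scalar detection score can be computed directly from its pairwise definition. The sketch below is a generic implementation, not the paper's evaluation code; it assumes higher scores mean "more likely shifted", and the score values are made up for illustration.

```python
def auroc(id_scores, ood_scores):
    """Probability that a random OOD sample scores above a random ID sample.

    Ties count as half a win. Quadratic in the number of samples, which is
    fine for a sketch but slow for large evaluation sets.
    """
    wins = 0.0
    for o in ood_scores:
        for i in id_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))

# Made-up detection scores for in-distribution and shifted samples.
id_scores = [0.1, 0.4, 0.35, 0.8]
ood_scores = [0.9, 0.7, 0.3]
value = auroc(id_scores, ood_scores)
```

A value of 0.5 corresponds to chance, and a value below 0.5 indicates that the score separates the two groups in the opposite direction to the one assumed.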

Figures

Figures reproduced from arXiv: 2604.12425 by Julian Wiederer, Michele De Vita, Vasileios Belagiannis.

Figure 1: Instead of (a) reconstructing the entire past trajectory …
Figure 2: Overview of our post-hoc gradient-based distribution …
Figure 3: Qualitative example of our gradient-based distribution shift detection method. The figure shows trajectory samples from in …
Figure 4: Highway simulator scenarios depicting a merge crash, a roundabout crash, and normal roundabout navigation. We mark the start …
Figure 6: Gradient Distribution for In-Distribution vs. Out-Of …
Figure 7: Kernel density estimation (KDE) [8] of the last-layer gradients in the Highway environment [21] for the intersection driving task. We observed very distinct gradients between ID and OOD samples, leading to almost perfect collision detection. … landscape. On Argoverse, OOD samples often show lower gradient norms than ID samples. We hypothesize from our experiments that this failure is due to training failure …
Figure 9: Gradient vs loss-based OOD detection on the Shifts …

Original abstract

Trajectory prediction models often fail in real-world automated driving due to distributional shifts between training and test conditions. Such distributional shifts, whether behavioural or environmental, pose a critical risk by causing the model to make incorrect forecasts in unfamiliar situations. We propose a self-supervised method that trains a decoder in a post-hoc fashion on the self-supervised task of forecasting the second half of observed trajectories from the first half. The L2 norm of the gradient of this forecasting loss with respect to the decoder's final layer defines a score to identify distribution shifts. Our approach, first, does not affect the trajectory prediction model, ensuring no interference with original prediction performance and second, demonstrates substantial improvements on distribution shift detection for trajectory prediction on the Shifts and Argoverse datasets. Moreover, we show that this method can also be used to early detect collisions of a deep Q-Network motion planner in the Highway simulator. Source code is available at https://github.com/Michedev/forecasting-the-past.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces a post-hoc self-supervised approach for distribution shift detection in trajectory prediction models. An auxiliary decoder is trained to forecast the second half of a trajectory from the first half; the L2 norm of the gradient of this auxiliary loss with respect to the decoder's final layer is used as a shift-detection score. The method is asserted to leave the original predictor untouched and to yield substantial improvements on the Shifts and Argoverse datasets, with an additional demonstration on early collision detection for a DQN planner in the Highway simulator.

Significance. If the auxiliary gradient norm reliably flags shifts that degrade the main predictor, the approach would supply a lightweight, non-intrusive monitoring tool for deployed trajectory models in autonomous driving. The post-hoc, self-supervised construction avoids any interference with original performance and re-uses existing trajectory data, which are practical advantages. Source-code release further supports reproducibility.

major comments (3)
  1. [Experiments] The central claim that the L2 gradient norm on the auxiliary forecasting loss detects shifts relevant to the original predictor rests on an untested alignment between auxiliary sensitivity and main-task failure modes. No analysis is provided showing correlation between high detection scores and elevated ADE/FDE on the untouched trajectory model under the reported shifts (Experiments section).
  2. [Section 4] Quantitative results for the claimed 'substantial improvements' on Shifts and Argoverse are not summarized with baselines, error bars, or ablation details on the auxiliary task and chosen gradient layer, making it impossible to assess whether the gains are robust or merely reflect the auxiliary decoder's own sensitivity (Section 4).
  3. [Application to DQN planner] The extension to early collision detection in the DQN Highway planner requires clarification on how the auxiliary decoder and gradient score are adapted to the planner's state representation and what quantitative metric defines 'early' detection (final application paragraph).
minor comments (2)
  1. [Abstract] The phrasing 'first, does not affect... and second, demonstrates' is grammatically awkward and should be restructured for clarity.
  2. [Method] Notation for the auxiliary loss and the exact parameters θ (decoder final layer) should be defined explicitly in the method section to avoid ambiguity when readers reproduce the gradient computation.
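
The analysis requested in major comment 1 could be sketched roughly as follows: compute ADE and FDE of the frozen predictor per sample, then measure the rank correlation between those errors and the detection score. This is a generic, stdlib-only illustration; the helper names and all numeric values are hypothetical and do not come from the paper.

```python
import math

def ade_fde(pred, truth):
    """Average and final displacement error between two equal-length 2D tracks."""
    dists = [math.dist(p, t) for p, t in zip(pred, truth)]
    return sum(dists) / len(dists), dists[-1]

def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(a, b):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

# One made-up sample: the prediction drifts 3 units off at the final step.
ade, fde = ade_fde([(0, 0), (1, 0)], [(0, 0), (1, 3)])

# Made-up per-sample detection scores and main-model ADE values.
scores = [0.2, 0.5, 0.9, 1.4]
errors = [0.3, 0.4, 1.1, 2.0]
rho = spearman(scores, errors)
```

A rank correlation is used here because the gradient norm and the displacement error live on different scales, so only their ordering is compared.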

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and outline revisions to strengthen the presentation of our results and claims.

  1. Referee: [Experiments] The central claim that the L2 gradient norm on the auxiliary forecasting loss detects shifts relevant to the original predictor rests on an untested alignment between auxiliary sensitivity and main-task failure modes. No analysis is provided showing correlation between high detection scores and elevated ADE/FDE on the untouched trajectory model under the reported shifts (Experiments section).

    Authors: We agree that an explicit analysis correlating the auxiliary gradient-norm scores with degradation in the original predictor's ADE/FDE would more directly substantiate the relevance of the detected shifts. The current validation relies on improved AUROC for shift detection on the Shifts and Argoverse benchmarks, where the shifts are known to impact trajectory prediction performance. To address this point, we will add a new analysis (including scatter plots or correlation coefficients) in the Experiments section of the revised manuscript that quantifies the relationship between high detection scores and elevated prediction errors on the frozen main model. revision: yes

  2. Referee: [Section 4] Quantitative results for the claimed 'substantial improvements' on Shifts and Argoverse are not summarized with baselines, error bars, or ablation details on the auxiliary task and chosen gradient layer, making it impossible to assess whether the gains are robust or merely reflect the auxiliary decoder's own sensitivity (Section 4).

    Authors: The results in Section 4 compare our method against multiple distribution-shift detection baselines and report AUROC improvements. However, we acknowledge that the presentation would benefit from error bars across random seeds and expanded ablations on auxiliary-task hyperparameters and gradient-layer selection. In the revision we will include these elements (error bars from at least five runs and additional ablation tables) to demonstrate robustness and rule out sensitivity artifacts. revision: yes

  3. Referee: [Application to DQN planner] The extension to early collision detection in the DQN Highway planner requires clarification on how the auxiliary decoder and gradient score are adapted to the planner's state representation and what quantitative metric defines 'early' detection (final application paragraph).

    Authors: The auxiliary decoder is trained on the same state trajectories used by the DQN planner (position, velocity, and heading sequences extracted from the simulator). The self-supervised forecasting task and gradient-norm computation are applied identically to the trajectory-prediction case. 'Early' detection is quantified by the number of timesteps before a collision at which the score exceeds a threshold, together with the resulting reduction in collision rate when the score triggers a safety intervention. We will expand the final paragraph with these details and add a short table of lead-time and collision-rate metrics in the revised manuscript. revision: partial
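
The lead-time metric described in the third response can be sketched in a few lines. This is an illustrative reading of that description, not the authors' implementation; the function name `lead_time`, the threshold, and the score sequence are all hypothetical.

```python
def lead_time(scores, threshold, collision_step):
    """Timesteps between the first alarm and the collision.

    scores:         per-timestep detection scores for one episode
    threshold:      alarm fires when a score strictly exceeds this value
    collision_step: index of the timestep at which the collision occurs
    Returns None if no alarm fires before the collision.
    """
    for t, s in enumerate(scores[:collision_step]):
        if s > threshold:
            return collision_step - t
    return None

# Made-up episode: the score rises two steps before the collision at step 4.
episode_scores = [0.1, 0.2, 0.9, 1.3, 2.0]
lead = lead_time(episode_scores, threshold=0.8, collision_step=4)
```

Averaging this value over episodes that end in a collision gives one possible summary of how early the score reacts.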

Circularity Check

0 steps flagged

No circularity: auxiliary gradient norm defined independently of main-task performance

full rationale

The paper defines its distribution-shift score directly as the L2 norm of the gradient of a post-hoc auxiliary self-supervised forecasting loss (second-half trajectory from first half) with respect to the decoder's final layer. This construction is independent of the original trajectory predictor's ADE/FDE or any fitted parameters from the main task. No equations reduce the score to a self-referential quantity, a fitted input renamed as prediction, or a self-citation chain. The method is explicitly post-hoc and non-interfering. Evaluation on Shifts and Argoverse is empirical comparison, not a derivation that collapses to the inputs by construction. This is the normal non-circular case.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard machine-learning assumptions about the feasibility of self-supervised training on trajectory splits and the informativeness of gradients for shift detection; no free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption: Observed trajectories can be split into first and second halves that support meaningful self-supervised forecasting.
    Invoked in the definition of the auxiliary task.



Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages · 2 internal anchors

  1. [1]

    Curb Your Attention: Causal Attention Gating for Robust Trajectory Prediction in Autonomous Driving

    Ehsan Ahmadi, Ray Mercurius, Soheil Alizadeh, Kasra Rezaee, and Amir Rasouli. Curb your attention: Causal attention gating for robust trajectory prediction in autonomous driving. arXiv preprint arXiv:2410.07191, 2024.

  2. [2]

    Adapt: Efficient multi-agent trajectory prediction with adaptation

    Görkay Aydemir, Adil Kaan Akan, and Fatma Güney. Adapt: Efficient multi-agent trajectory prediction with adaptation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 8295–8305, 2023.

  3. [3]

    nuscenes: A multimodal dataset for autonomous driving

    Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, and Oscar Beijbom. nuscenes: A multimodal dataset for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020.

  4. [4]

    Structural attention-based recurrent variational autoencoder for highway vehicle anomaly detection

    Neeloy Chakraborty, Aamir Hasan, Shuijing Liu, Tianchen Ji, Weihang Liang, D. Livingston McPherson, and Katherine Driggs-Campbell. Structural attention-based recurrent variational autoencoder for highway vehicle anomaly detection. In Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, page 1125–1134, Richland...

  5. [5]

    Argoverse: 3d tracking and forecasting with rich maps

    Ming-Fang Chang, John Lambert, Patsorn Sangkloy, Jagjeet Singh, Slawomir Bak, Andrew Hartnett, De Wang, Peter Carr, Simon Lucey, Deva Ramanan, and James Hays. Argoverse: 3d tracking and forecasting with rich maps. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

  6. [6]

    Argoverse: 3d tracking and forecasting with rich maps

    Ming-Fang Chang, John W Lambert, Patsorn Sangkloy, Jagjeet Singh, Slawomir Bak, Andrew Hartnett, De Wang, Peter Carr, Simon Lucey, Deva Ramanan, and James Hays. Argoverse: 3d tracking and forecasting with rich maps. In Conference on Computer Vision and Pattern Recognition (CVPR),

  7. [7]

    S2tnet: Spatio-temporal transformer networks for trajectory prediction in autonomous driving

    Weihuang Chen, Fangfang Wang, and Hongbin Sun. S2tnet: Spatio-temporal transformer networks for trajectory prediction in autonomous driving. In Proceedings of The 13th Asian Conference on Machine Learning, pages 454–469. PMLR, 2021.

  8. [8]

    A tutorial on kernel density estimation and recent advances

    Yen-Chi Chen. A tutorial on kernel density estimation and recent advances. Biostatistics & Epidemiology, 1(1):161–187, 2017.

  9. [9]

    Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks

    Zhao Chen, Vijay Badrinarayanan, Chen-Yu Lee, and Andrew Rabinovich. Gradnorm: Gradient normalization for adaptive loss balancing in deep multitask networks. In International conference on machine learning, pages 794–803. PMLR, 2018.

  10. [10]

    End-to-end driving via conditional imitation learning

    Felipe Codevilla, Matthias Müller, Antonio López, Vladlen Koltun, and Alexey Dosovitskiy. End-to-end driving via conditional imitation learning. In 2018 IEEE international conference on robotics and automation (ICRA), pages 4693–

  11. [11]

    Grood: Gradient-aware out-of-distribution detection

    Mostafa ElAraby, Sabyasachi Sahoo, Yann Pequignot, Paul Novello, and Liam Paull. Grood: Gradient-aware out-of-distribution detection. Transactions on Machine Learning Research, 2024.

  12. [12]

    Large scale interactive motion forecasting for autonomous driving: The waymo open motion...

    Scott Ettinger, Shuyang Cheng, Benjamin Caine, Chenxi Liu, Hang Zhao, Sabeek Pradhan, Yuning Chai, Ben Sapp, Charles R. Qi, Yin Zhou, Zoey Yang, Aurélien Chouard, Pei Sun, Jiquan Ngiam, Vijay Vasudevan, Alexander McCauley, Jonathon Shlens, and Dragomir Anguelov. Large scale interactive motion forecasting for autonomous driving: The waymo open motion...

  13. [13]

    Unitraj: A unified framework for scalable vehicle trajectory prediction

    Lan Feng, Mohammadhossein Bahari, Kaouther Messaoud Ben Amor, Éloi Zablocki, Matthieu Cord, and Alexandre Alahi. Unitraj: A unified framework for scalable vehicle trajectory prediction. In Computer Vision – ECCV 2024, pages 106–123, Cham, 2025. Springer Nature Switzerland.

  14. [14]

    Can autonomous vehicles identify, recover from, and adapt to distribution shifts?

    Angelos Filos, Panagiotis Tigas, Rowan McAllister, Nicholas Rhinehart, Sergey Levine, and Yarin Gal. Can autonomous vehicles identify, recover from, and adapt to distribution shifts? In International Conference on Machine Learning (ICML), 2020.

  15. [15]

    Dropout as a bayesian approximation: Representing model uncertainty in deep learning

    Yarin Gal and Zoubin Ghahramani. Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In International conference on machine learning, pages 1050–1059. PMLR, 2016.

  16. [16]

    Multi-transmotion: Pre-trained model for human motion prediction

    Yang Gao, Po-Chien Luan, and Alexandre Alahi. Multi-transmotion: Pre-trained model for human motion prediction. In 8th Annual Conference on Robot Learning, 2024.

  17. [17]

    Uncertainty-aware likelihood ratio estimation for pixel-wise out-of-distribution detection

    Marc Hölle, Walter Kellermann, and Vasileios Belagiannis. Uncertainty-aware likelihood ratio estimation for pixel-wise out-of-distribution detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 772–782, 2025.

  18. [18]

    Heatmap-based out-of-distribution detection

    Julia Hornauer and Vasileios Belagiannis. Heatmap-based out-of-distribution detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 2603–2612, 2023.

  19. [19]

    On the importance of gradients for detecting distributional shifts in the wild

    Rui Huang, Andrew Geng, and Yixuan Li. On the importance of gradients for detecting distributional shifts in the wild. Advances in Neural Information Processing Systems, 34:677–689, 2021.

  20. [20]

    Simple and scalable predictive uncertainty estimation using deep ensembles

    Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. Advances in neural information processing systems, 30, 2017.

  21. [21]

    An environment for autonomous driving decision-making

    Edouard Leurent. An environment for autonomous driving decision-making. https://github.com/eleurent/highway-env, 2018.

  22. [22]

    Difftad: Denoising diffusion probabilistic models for vehicle trajectory anomaly detection

    Chaoneng Li, Guanwen Feng, Yunan Li, Ruyi Liu, Qiguang Miao, and Liang Chang. Difftad: Denoising diffusion probabilistic models for vehicle trajectory anomaly detection. Knowledge-Based Systems, 286:111387, 2024.

  23. [23]

    Shiyu Liang, Yixuan Li, and R. Srikant. Enhancing the reliability of out-of-distribution image detection in neural networks. In International Conference on Learning Representations, 2018.

  24. [24]

    Dyttp: Trajectory prediction with normalization-free transformers

    Yunxiang Liu and Hongkuo Niu. Dyttp: Trajectory prediction with normalization-free transformers, 2025.

  25. [25]

    Shifts: A dataset of real distributional shift across multiple large-scale tasks

    Andrey Malinin, Neil Band, German Chesnokov, Yarin Gal, Mark JF Gales, Alexey Noskov, Andrey Ploskonosov, Liudmila Prokhorenkova, Ivan Provilkov, Vatsal Raina, et al. Shifts: A dataset of real distributional shift across multiple large-scale tasks. arXiv preprint arXiv:2107.07455, 2021.

  26. [26]

    Evidential uncertainty estimation for multi-modal trajectory prediction

    Sajad Marvi, Christoph Rist, Julian Schmidt, Julian Jordan, and Abhinav Valada. Evidential uncertainty estimation for multi-modal trajectory prediction. arXiv preprint arXiv:2503.05274, 2025.

  27. [27]

    Vt-former: An exploratory study on vehicle trajectory prediction for highway surveillance through graph isomorphism and transformer

    Armin Danesh Pazho, Ghazal Alinezhad Noghre, Vinit Katariya, and Hamed Tabkhi. Vt-former: An exploratory study on vehicle trajectory prediction for highway surveillance through graph isomorphism and transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, pages 5651–5662, 2024.

  28. [28]

    Stable-baselines3: Reliable reinforcement learning implementations

    Antonin Raffin, Ashley Hill, Adam Gleave, Anssi Kanervisto, Maximilian Ernestus, and Noah Dormann. Stable-baselines3: Reliable reinforcement learning implementations. Journal of Machine Learning Research, 22(268):1–8,

  29. [29]

    Deep imitative models for flexible inference, planning, and control

    Nicholas Rhinehart, Rowan McAllister, and Sergey Levine. Deep imitative models for flexible inference, planning, and control. In International Conference on Learning Representations, 2020.

  30. [30]

    Learning representations by back-propagating errors

    David E Rumelhart, Geoffrey E Hinton, and Ronald J Williams. Learning representations by back-propagating errors. Nature, 323(6088):533–536, 1986.

  31. [31]

    Meat: Maneuver extraction from agent trajectories

    Julian Schmidt, Julian Jordan, David Raba, Tobias Welz, and Klaus Dietmayer. Meat: Maneuver extraction from agent trajectories. In 2022 IEEE Intelligent Vehicles Symposium (IV), pages 1810–1816. IEEE, 2022.

  32. [32]

    Proximal Policy Optimization Algorithms

    John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.

  33. [33]

    Safeshift: Safety-informed distribution shifts for robust trajectory prediction in autonomous driving

    Benjamin Stoler, Ingrid Navarro, Meghdeep Jana, Soonmin Hwang, Jonathan Francis, and Jean Oh. Safeshift: Safety-informed distribution shifts for robust trajectory prediction in autonomous driving. In 2024 IEEE Intelligent Vehicles Symposium (IV), pages 1179–1186, 2024.

  34. [34]

    Attention is all you need

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017.

  35. [35]

    Jointmotion: Joint self-supervision for joint motion prediction

    Royden Wagner, Omer Sahin Tas, Marvin Klemp, and Carlos Fernandez. Jointmotion: Joint self-supervision for joint motion prediction. In 8th Annual Conference on Robot Learning, 2024.

  36. [36]

    Anomaly detection in multi-agent trajectories for automated driving

    Julian Wiederer, Arij Bouazizi, Marco Troina, Ulrich Kressel, and Vasileios Belagiannis. Anomaly detection in multi-agent trajectories for automated driving. In Conference on Robot Learning, 2021.

  37. [37]

    Anomaly detection in multi-agent trajectories for automated driving

    Julian Wiederer, Arij Bouazizi, Marco Troina, Ulrich Kressel, and Vasileios Belagiannis. Anomaly detection in multi-agent trajectories for automated driving. In Proceedings of the 5th Conference on Robot Learning, pages 1223–1233. PMLR, 2022.

  38. [38]

    Joint out-of-distribution detection and uncertainty estimation for trajectory prediction

    Julian Wiederer, Julian Schmidt, Ulrich Kressel, Klaus Dietmayer, and Vasileios Belagiannis. Joint out-of-distribution detection and uncertainty estimation for trajectory prediction. In 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023.

  39. [39]

    Argoverse 2: Next generation datasets for self-driving perception and forecasting

    Benjamin Wilson, William Qi, Tanmay Agarwal, John Lambert, Jagjeet Singh, Siddhesh Khandelwal, Bowen Pan, Ratnesh Kumar, Andrew Hartnett, Jhony Kaesemodel Pontes, Deva Ramanan, Peter Carr, and James Hays. Argoverse 2: Next generation datasets for self-driving perception and forecasting, 2023.

  40. [40]

    Improving out-of-distribution generalization of trajectory prediction for autonomous driving via polynomial representations

    Yue Yao, Shengchao Yan, Daniel Goehring, Wolfram Burgard, and Joerg Reichardt. Improving out-of-distribution generalization of trajectory prediction for autonomous driving via polynomial representations. In 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 488–495, 2024.

  41. [41]

    Agents-llm: Augmentative generation of challenging traffic scenarios with an agentic llm framework

    Yu Yao, Salil Bhatnagar, Markus Mazzola, Vasileios Belagiannis, Igor Gilitschenski, Luigi Palmieri, Simon Razniewski, and Marcel Hallgarten. Agents-llm: Augmentative generation of challenging traffic scenarios with an agentic llm framework. In 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 18400–18407, 2025.

  42. [42]

    INTERACTION dataset: An international, adversarial and cooperative motion dataset in interactive driving scenarios with semantic maps

    Wei Zhan, Liting Sun, Di Wang, Haojie Shi, Aubrey Clausse, Maximilian Naumann, Julius Kümmerle, Hendrik Königshof, Christoph Stiller, Arnaud de La Fortelle, and Masayoshi Tomizuka. INTERACTION Dataset: An INTERnational, Adversarial and Cooperative moTION Dataset in Interactive Driving Scenarios with Semantic Maps. arXiv:1910.03088 [cs, eess], 2019.

  43. [43]

    Gradient rectification for robust calibration under distribution shift

    Yilin Zhang, Cai Xu, You Wu, Ziyu Guan, and Wei Zhao. Gradient rectification for robust calibration under distribution shift. arXiv preprint arXiv:2508.19830, 2025.

  44. [44]

    Hivt: Hierarchical vector transformer for multi-agent motion prediction

    Zikang Zhou, Luyao Ye, Jianping Wang, Kui Wu, and Kejie Lu. Hivt: Hierarchical vector transformer for multi-agent motion prediction. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 8813–8823, 2022.