pith. sign in

arxiv: 2606.30777 · v1 · pith:AETP2NB3new · submitted 2026-06-29 · 💻 cs.CV

Unveiling Transferability in Trajectory Prediction via Latent Scene Embeddings

Pith reviewed 2026-07-01 06:23 UTC · model grok-4.3

classification 💻 cs.CV
keywords trajectory predictiontransferabilitylatent embeddingsdataset similaritymotion predictioncross-dataset performancescene representationsdistributional metrics
0
0 comments X

The pith

Latent embeddings of entire trajectory datasets produce similarity scores that correlate with how well models transfer across different scenes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a framework that encodes whole datasets into latent representations capturing scene layouts, behaviors, and sensing differences. Distributional metrics then turn these representations into transferability scores between any pair of datasets. When tested on 24 major motion-prediction benchmarks, the scores line up closely with actual performance drops observed when models are trained on one dataset and evaluated on another. The approach supplies a practical substitute for exhaustive cross-training experiments when choosing data for pretraining or building foundation models.

Core claim

A framework that learns latent representations of datasets and quantifies their similarity using distributional metrics produces transferability scores that strongly correlate with cross-dataset model performance in trajectory prediction.

What carries the argument

A framework that learns latent representations of entire datasets and quantifies similarity using distributional metrics.

If this is right

  • Transferability scores can guide which datasets to combine for training without running full cross-experiments.
  • Pretraining data can be selected by ranking datasets according to their embedding similarity to the target domain.
  • Large-scale foundation models for motion prediction can use the scores to prioritize compatible source data.
  • Predictive systems become more robust by avoiding training on datasets whose embeddings indicate large domain gaps.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same embedding approach could be applied to select data for other sequence prediction tasks such as video forecasting.
  • If embeddings also reflect sensor-specific artifacts, the scores might help diagnose failures caused by hardware differences rather than scene content.
  • Dataset curators could use the method to flag redundant collections that add little new information beyond existing benchmarks.

Load-bearing premise

Latent embeddings of entire datasets capture the key differences in scene layouts, agent behaviors, and sensing conditions that drive transferability failures.

What would settle it

Running the embedding procedure on a fresh collection of datasets and models and finding that the resulting similarity scores show no correlation with measured cross-dataset performance would falsify the central claim.

Figures

Figures reproduced from arXiv: 2606.30777 by Bj\"orn Olofsson, David Axelsson, Erik Frisk, Theodor Westny.

Figure 1
Figure 1. Figure 1: t-SNE projection of learned scenario embeddings (contours indicate density). Dataset proximity (e.g., ETH/inD, INTERACTION/WOMD) indicates potential for cross-dataset pretraining or knowledge transfer. 2 Related work 2.1 Trajectory prediction Trajectory prediction remains an active area of research as a result of its central role in domains such as autonomous driving and robotics [24,47,92]. Advances in de… view at source ↗
Figure 2
Figure 2. Figure 2: Latent embedding model architecture. An encoder maps input features to node￾level latent variables, which are aggregated into scene- and dataset-level embeddings. Two decoder heads handle feature reconstruction (predicting X) and future state forecasting (predicting Y). Training optimizes a combined reconstruction and forecasting loss. tensor X ∈ R N×H×D, and the output features in Y ∈ R N×F ×D, where D = … view at source ↗
Figure 3
Figure 3. Figure 3: KL divergence DKL(row∥col) between dataset embedding distributions. Entries indicate how well column datasets approximate row datasets. Lower values denote closer alignment, typically correlating with stronger zero-shot transfer from column training datasets to row evaluation datasets. Matrix asymmetry reflects directional differences in dataset coverage and variability. broader clusters. This indicates th… view at source ↗
Figure 4
Figure 4. Figure 4: Two-dimensional t-SNE visualization of the learned scenario embeddings for (a) a subset of the urban datasets (acquired with instrumented vehicles) and (b) highway datasets, with contour lines indicating density. specific data [92], indicating untapped potential in leveraging datasets that are comparatively underused in the motion prediction literature. 5.4 Transferability evaluation To evaluate the interp… view at source ↗
Figure 5
Figure 5. Figure 5: KL divergence versus zero-shot minADE6 transfer performance across all 552 dataset pairs. Larger divergence correlates with worse transferability. 20000 30000 40000 50000 60000 Training steps 0.5 0.6 0.7 0.8 0.9 1 1.1 1.2 1.3 minADE6 DKL = 588 DKL = 38.5k DKL = 13.0k WOMD pretraining uniD pretraining ApolloScape pretraining No pretraining [PITH_FULL_IMAGE:figures/full_fig_p012_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Argoverse fine-tuning from differ￾ent sources. Sources with lower divergence scores yield better adaptation. 100 1000 10000 DKL(DikDj ) (log scale) 0.1 1 ∆ M(Di, Dj ) (log scale) [PITH_FULL_IMAGE:figures/full_fig_p012_6.png] view at source ↗
Figure 8
Figure 8. Figure 8: (a) Evolution of coefficient of determination R 2 , Spearman’s ρ, and the number of selected correction terms Nc against the regularization parameter λ. (b) Coefficient paths for all terms in (11). compared with ρ = 0.811 for the KL divergence, with a 95% confidence interval of (0.782, 0.840). This further indicates that the latent representation captures discrepancies that are not adequately described by … view at source ↗
Figure 9
Figure 9. Figure 9: KL divergence DKL(row∥col) between latent embedding distributions of all dataset pairs when WOMD, exiD, and uniD are excluded from training. Each entry measures how well the column dataset can approximate the row dataset in latent space, where lower values indicate better approximation. Datasets excluded from training are highlighted. and most stable Spearman’s ρ, while the L = 16 model performs slightly w… view at source ↗
Figure 10
Figure 10. Figure 10: Evolution of Spearman’s rank correlation coefficient ρ over training epochs for models with latent dimensionalities L ∈ {16, 32, 64, 128}, illustrating how capacity influences the stability and growth of learned correlations. with prior empirical findings [100], which similarly report that highway scenarios are generally easier to predict. One row that stands out is WOMD, which appears to be one of the mo… view at source ↗
Figure 11
Figure 11. Figure 11: Cross-dataset zero-shot minADE6 results for a 3 s prediction horizon. Columns indicate training datasets, and rows indicate evaluation datasets. Lower values reflect better transferability. source. Finally, the relative gap normalizes this difference by the oracle minADE6. The reported values are averaged across the target datasets. As shown in Tab. 6, the KL-based rule selects one of the three best-perfo… view at source ↗
Figure 12
Figure 12. Figure 12: Relationship between Spearman’s rank correlation coefficient ρ and median KL divergence across the evaluated covariance sources, estimators, ranks, and regularization strengths. The displayed frontiers are nondominated under the reporting convention of maximizing ρ and minimizing KL. Stars denote maximum-correlation configurations, while diamonds denote the selected configurations [PITH_FULL_IMAGE:figure… view at source ↗
read the original abstract

The growing availability of trajectory datasets has fueled major advances in data-driven motion prediction. Yet, models trained on one dataset often fail to generalize beyond their training domain as a result of differences in scene layouts, agent behaviors, and sensing conditions. A framework that learns latent representations of datasets and quantifies their similarity using distributional metrics is presented. This large-scale study covers 24 major datasets, including the most widely used motion-prediction benchmarks, and shows that the resulting transferability scores strongly correlate with cross-dataset model performance. The results provide practical guidance for dataset selection, pretraining, and large-scale foundation models for motion prediction, paving the way toward more generalizable and robust predictive systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript presents a framework that learns latent representations of entire trajectory datasets and quantifies pairwise similarity via distributional metrics on these embeddings. A large-scale empirical study across 24 motion-prediction datasets reports that the resulting transferability scores exhibit strong correlation with observed cross-dataset model performance, and the authors argue this supplies practical guidance for dataset selection, pretraining, and foundation-model development.

Significance. If the correlation is robust and the embeddings demonstrably encode the scene-layout, behavioral, and sensing factors that drive transfer failures (rather than superficial statistics), the result would be useful for data curation in trajectory prediction. The scale of the study (24 datasets) is a positive feature. The provided text supplies no methods, equations, ablation studies, or qualitative inspections that would allow verification of these conditions.

major comments (2)
  1. The central claim—that distributional metrics on dataset-level latent embeddings produce transferability scores whose correlation with cross-dataset performance is driven by differences in scene layout, agent behavior, and sensing conditions—rests on an untested assumption. No section supplies a probing experiment, controlled ablation, or qualitative inspection showing that the learned embeddings isolate these factors rather than trajectory-length distributions, point-cloud density, or other superficial statistics.
  2. The abstract states that the scores 'strongly correlate' with cross-dataset performance, yet the manuscript provides neither the correlation coefficient, the number of model–dataset pairs evaluated, nor any statistical controls for confounding variables such as dataset size or annotation quality. Without these details the claim cannot be evaluated.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the review and the opportunity to clarify our work. We address the two major comments point by point below.

read point-by-point responses
  1. Referee: The central claim—that distributional metrics on dataset-level latent embeddings produce transferability scores whose correlation with cross-dataset performance is driven by differences in scene layout, agent behavior, and sensing conditions—rests on an untested assumption. No section supplies a probing experiment, controlled ablation, or qualitative inspection showing that the learned embeddings isolate these factors rather than trajectory-length distributions, point-cloud density, or other superficial statistics.

    Authors: The framework is trained directly on raw trajectory data from the 24 datasets, and the resulting embeddings are evaluated solely through their ability to predict observed transfer gaps; the strong empirical correlation therefore provides indirect support that the embeddings reflect the factors driving those gaps. We nevertheless agree that direct evidence would be stronger and will add controlled ablations (e.g., length-matched subsets, density-normalized inputs) together with qualitative embedding visualizations in the revised manuscript. revision: yes

  2. Referee: The abstract states that the scores 'strongly correlate' with cross-dataset performance, yet the manuscript provides neither the correlation coefficient, the number of model–dataset pairs evaluated, nor any statistical controls for confounding variables such as dataset size or annotation quality. Without these details the claim cannot be evaluated.

    Authors: We will revise both the abstract and the results section to report the precise Pearson/Spearman coefficient, the exact count of model–dataset transfer pairs used in the correlation analysis, and additional regression controls that account for dataset size and annotation quality. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical correlation between learned embeddings and transfer performance stands as independent observation

full rationale

The paper introduces a framework that learns latent dataset embeddings and applies distributional metrics to produce transferability scores, then reports an empirical correlation of those scores with observed cross-dataset model performance across 24 datasets. No equations, fitting procedures, or self-citations are described that would make the transferability scores or the correlation tautological by construction. The central result is an observed statistical relationship rather than a derived identity, and the abstract supplies no load-bearing self-referential step that reduces the claimed prediction to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no free parameters, axioms, or invented entities can be identified from the provided text.

pith-pipeline@v0.9.1-grok · 5645 in / 990 out tokens · 29684 ms · 2026-07-01T06:23:16.376964+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

119 extracted references · 15 canonical work pages · 7 internal anchors

  1. [1]

    In: IEEE/CVF Int

    Achille, A., Lam, M., Tewari, R., Ravichandran, A., Maji, S., Fowlkes, C.C., Soatto, S., Perona, P.: Task2Vec: Task embedding for meta-learning. In: IEEE/CVF Int. Conf. Comput. Vision. pp. 6430–6439 (2019)

  2. [2]

    In: IEEE Conf

    Alahi, A., Goel, K., Ramanathan, V., Robicquet, A., Fei-Fei, L., Savarese, S.: Social LSTM: Human trajectory prediction in crowded spaces. In: IEEE Conf. Comput. Vision Pattern Recog. pp. 961–971 (2016)

  3. [3]

    Amirian, J., Hayet, J.B., Pettré, J.: Social ways: Learning multi-modal distributions ofpedestriantrajectorieswithGANs.In:IEEE/CVFConf.Comput.VisionPattern Recog. (2019)

  4. [4]

    In: Asian Conf

    Amirian, J., Zhang, B., Castro, F.V., Baldelomar, J.J., Hayet, J.B., Pettré, J.: OpenTraj: Assessing prediction complexity in human trajectories datasets. In: Asian Conf. Comput. Vision (2020)

  5. [5]

    Layer Normalization

    Ba, J.L., Kiros, J.R., Hinton, G.E.: Layer normalization. arXiv preprint arXiv:1607.06450 (2016)

  6. [6]

    In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (2024)

    Bae, I., Park, Y.J., Jeon, H.G.: SingularTrajectory: Universal trajectory predictor using diffusion model. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (2024)

  7. [7]

    Communic

    Berghaus, M., Lamberty, S., Ehlers, J., Kalló, E., Oeser, M.: Vehicle trajectory dataset from drone videos including off-ramp and congested traffic–analysis of data quality, traffic flow, and accident risk. Communic. in Transport. Research4, 100133 (2024)

  8. [8]

    Springer (2024)

    Bishop, C.M., Bishop, H.: Deep learning. Springer (2024)

  9. [9]

    In: IEEE Intell

    Bock, J., Krajewski, R., Moers, T., Runde, S., Vater, L., Eckstein, L.: The inD dataset: A drone dataset of naturalistic road user trajectories at German intersections. In: IEEE Intell. Veh. Symp. pp. 1929–1934 (2020)

  10. [10]

    IEEE Robot

    Boekema, H.J.H., Martens, B.K., Kooij, J.F., Gavrila, D.M.: Multi-class trajectory prediction in urban traffic using the view-of-delft prediction dataset. IEEE Robot. Automat. Lett.9(5), 4806–4813 (2024)

  11. [11]

    On the Opportunities and Risks of Foundation Models

    Bommasani, R., Hudson, D.A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M.S., Bohg, J., Bosselut, A., Brunskill, E., et al.: On the opportunities and risks of foundation models. arXiv preprint arXiv:2108.07258 (2021)

  12. [12]

    In: IEEE Intell

    Breuer, A., Termöhlen, J.A., Homoceanu, S., Fingscheidt, T.: openDD: A large- scale roundabout drone dataset. In: IEEE Intell. Transport. Syst. Conf. pp. 1–6. IEEE (2020)

  13. [13]

    In: IEEE/CVF Conf

    Caesar, H., Bankiti, V., Lang, A.H., Vora, S., Liong, V.E., Xu, Q., Krishnan, A., Pan, Y., Baldan, G., Beijbom, O.: nuScenes: A multimodal dataset for autonomous driving. In: IEEE/CVF Conf. Comput. Vision Pattern Recog. pp. 11621–11631 (2020)

  14. [14]

    In: IEEE/CVF Conf

    Chang, M.F., Lambert, J., Sangkloy, P., Singh, J., Bak, S., Hartnett, A., Wang, D., Carr, P., Lucey, S., Ramanan, D., et al.: Argoverse: 3D tracking and forecasting with rich maps. In: IEEE/CVF Conf. Comput. Vision Pattern Recog. pp. 8748– 8757 (2019)

  15. [15]

    In: SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation

    Cho, K., van Merriënboer, B., Bahdanau, D., Bengio, Y.: On the properties of neural machine translation: Encoder–decoder approaches. In: SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation. pp. 103–111 (2014)

  16. [16]

    In: IEEE Intell

    Choi, Y., Mercurius, R.C., Shabestary, S.M.A., Rasouli, A.: DICE: Diverse diffusion model with scoring for trajectory prediction. In: IEEE Intell. Veh. Symp. pp. 3023– 3029 (2024) Unveiling Transferability in Trajectory Prediction 17

  17. [17]

    Clark, R.: Predicting Transfer Learning Performance Using Dataset Similarity for Time Series Classification of Human Activity Recognition. Ph.D. thesis, McMaster University (2022)

  18. [18]

    In: IEEE Int

    Cui, H., Radosavljevic, V., Chou, F.C., Lin, T.H., Nguyen, T., Huang, T.K., Schneider, J., Djuric, N.: Multimodal trajectory predictions for autonomous driving using deep convolutional networks. In: IEEE Int. Conf. Robot. Automat. pp. 2090– 2096 (2019)

  19. [19]

    In: IEEE Conf

    Deo, N., Trivedi, M.M.: Convolutional social pooling for vehicle trajectory predic- tion. In: IEEE Conf. Comput. Vision Pattern Recog. pp. 1468–1476 (2018)

  20. [20]

    In: Conf

    Deo, N., Wolff, E., Beijbom, O.: Multimodal trajectory prediction conditioned on lane-graph traversals. In: Conf. on Robot Learn. pp. 203–212. PMLR (2022)

  21. [21]

    In: IEEE Intell

    Diehl, F., Brunner, T., Le, M.T., Knoll, A.: Graph neural networks for modelling traffic participant interaction. In: IEEE Intell. Veh. Symp. pp. 695–701 (2019)

  22. [22]

    arXiv preprint arXiv:2404.06198 (2024)

    Ehrig, C., Sonnleitner, B., Neumann, U., Cleophas, C., Forestier, G.: The impact of data set similarity and diversity on transfer learning success in time series forecasting. arXiv preprint arXiv:2404.06198 (2024)

  23. [23]

    In: IEEE/CVF Int

    Ettinger, S., Cheng, S., Caine, B., Liu, C., Zhao, H., Pradhan, S., Chai, Y., Sapp, B., Qi, C.R., Zhou, Y., et al.: Large scale interactive motion forecasting for autonomous driving: The Waymo open motion dataset. In: IEEE/CVF Int. Conf. Comput. Vision. pp. 9710–9719 (2021)

  24. [24]

    IEEE Transactions on Intelligent Transportation Systems25(8), 8334–8355 (2024)

    Fang, J., Wang, F., Xue, J., Chua, T.S.: Behavioral intention prediction in driving scenes: A survey. IEEE Transactions on Intelligent Transportation Systems25(8), 8334–8355 (2024)

  25. [25]

    In: IEEE/CVF Conf

    Feng, K., Li, C., Ren, D., Yuan, Y., Wang, G.: On the road to portability: Compressing end-to-end motion planner for autonomous driving. In: IEEE/CVF Conf. Comput. Vision Pattern Recog. pp. 15099–15108 (2024)

  26. [26]

    Feng, L., Bahari, M., Amor, K.M.B., Zablocki, É., Cord, M., Alahi, A.: UniTraj: A unified framework for scalable vehicle trajectory prediction. In: Eur. Conf. Comput. Vision. pp. 106–123. Springer (2024)

  27. [27]

    In: IEEE Int

    Feng, L., Li, Q., Peng, Z., Tan, S., Zhou, B.: TrafficGen: Learning to generate diverse and realistic traffic scenarios. In: IEEE Int. Conf. Robot. Automat. pp. 3567–3575. IEEE (2023)

  28. [28]

    Feng, L., Tung, F., Ahmed, M.O., Bengio, Y., Hajimirsadegh, H.: Were RNNs All We Needed? arXiv preprint arXiv:2410.01201 (2024)

  29. [29]

    In: ICLR Workshop on Representation Learn

    Fey, M., Lenssen, J.E.: Fast graph representation learning with PyTorch Geometric. In: ICLR Workshop on Representation Learn. on Graphs and Manifolds (2019)

  30. [30]

    arXiv preprint arXiv:2402.01105 (2024)

    Gao, H., Wang, Z., Li, Y., Long, K., Yang, M., Shen, Y.: A survey for foundation models in autonomous driving. arXiv preprint arXiv:2402.01105 (2024)

  31. [31]

    IEEE Robot

    Gao, X., Jia, X., Li, Y., Xiong, H.: Dynamic scenario representation learning for motion forecasting with heterogeneous graph convolutional recurrent networks. IEEE Robot. Automat. Lett.8(5), 2946–2953 (2023)

  32. [32]

    IEEE Trans

    Geng, M., Li, J., Li, C., Xie, N., Chen, X., Lee, D.H.: Adaptive and simultane- ous trajectory prediction for heterogeneous agents via transferable hierarchical transformer network. IEEE Trans. Intell. Transport. Syst.24(10), 11479–11492 (2023)

  33. [33]

    In: IEEE Int

    Gilles, T., Sabatini, S., Tsishkou, D., Stanciulescu, B., Moutarde, F.: GOHOME: Graph-oriented heatmap output for future motion estimation. In: IEEE Int. Conf. Robot. Automat. pp. 9107–9114. IEEE (2022)

  34. [34]

    In: Workshop on 18 T

    Gilles, T., Sabatini, S., Tsishkou, D., Stanciulescu, B., Moutarde, F.: Uncertainty estimation for cross-dataset performance in trajectory prediction. In: Workshop on 18 T. Westny et al. Fresh Perspectives on the Future of Autonomous Driving, IEEE Int. Conf. Robot. Automat. (2022)

  35. [35]

    In: 25th international conference on pattern recognition (ICPR)

    Giuliari, F., Hasan, I., Cristani, M., Galasso, F.: Transformer networks for trajec- tory forecasting. In: 25th international conference on pattern recognition (ICPR). pp. 10335–10342. IEEE (2021)

  36. [36]

    An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks

    Goodfellow, I.J., Mirza, M., Xiao, D., Courville, A., Bengio, Y.: An empirical investigation of catastrophic forgetting in gradient-based neural networks. arXiv preprint arXiv:1312.6211 (2013)

  37. [37]

    Inter- national journal of computer vision129(6), 1789–1819 (2021)

    Gou, J., Yu, B., Maybank, S.J., Tao, D.: Knowledge distillation: A survey. Inter- national journal of computer vision129(6), 1789–1819 (2021)

  38. [38]

    The Journal of Machine Learning Research13(1), 723–773 (2012)

    Gretton, A., Borgwardt, K.M., Rasch, M.J., Schölkopf, B., Smola, A.: A kernel two-sample test. The Journal of Machine Learning Research13(1), 723–773 (2012)

  39. [39]

    In: IEEE/CVF Conf

    Gu, T., Chen, G., Li, J., Lin, C., Rao, Y., Zhou, J., Lu, J.: Stochastic trajectory prediction via motion indeterminacy diffusion. In: IEEE/CVF Conf. Comput. Vision Pattern Recog. pp. 17113–17122. IEEE/CVF (2022)

  40. [40]

    In: IEEE Conf

    Gupta, A., Johnson, J., Fei-Fei, L., Savarese, S., Alahi, A.: Social GAN: Socially ac- ceptable trajectories with generative adversarial networks. In: IEEE Conf. Comput. Vision Pattern Recog. (2018)

  41. [41]

    In: IEEE/CVF Conf

    Hegde, D., Yasarla, R., Cai, H., Han, S., Bhattacharyya, A., Mahajan, S., Liu, L., Garrepalli, R., Patel, V.M., Porikli, F.: Distilling multi-modal large language models for autonomous driving. In: IEEE/CVF Conf. Comput. Vision Pattern Recog. pp. 27575–27585 (2025)

  42. [42]

    Gaussian Error Linear Units (GELUs)

    Hendrycks, D., Gimpel, K.: Gaussian error linear units (GELUs). arXiv preprint arXiv:1606.08415 (2016)

  43. [43]

    Distilling the Knowledge in a Neural Network

    Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531 (2015)

  44. [44]

    In: Conference on Robot Learning (CoRL)

    Houston, J., Zuidhof, G., Bergamini, L., Ye, Y., Chen, L., Jain, A., Omari, S., Iglovikov, V., Ondruska, P.: One thousand and one hours: Self-driving motion prediction dataset. In: Conference on Robot Learning (CoRL). pp. 409–418. PMLR (2021)

  45. [45]

    In: IEEE Intell

    Hu, Y., Zhan, W., Tomizuka, M.: Probabilistic prediction of vehicle semantic intention and motion. In: IEEE Intell. Veh. Symp. pp. 307–313 (2018)

  46. [46]

    IEEE Trans

    Hu, Y., Zhan, W., Tomizuka, M.: Scenario-transferable semantic graph reasoning for interaction-aware probabilistic prediction. IEEE Trans. Intell. Transport. Syst. 23(12), 23212–23230 (2022)

  47. [47]

    IEEE Trans

    Huang, Y., Du, J., Yang, Z., Zhou, Z., Zhang, L., Chen, H.: A survey on trajectory- prediction methods for autonomous driving. IEEE Trans. Intell. Veh.7(3), 652–674 (2022)

  48. [48]

    In: IEEE Int

    Huang, Z., Mo, X., Lv, C.: Multi-modal motion prediction with transformer-based neural network for autonomous driving. In: IEEE Int. Conf. Robot. Automat. pp. 2605–2611 (2022)

  49. [49]

    In: IEEE Int

    Ivanovic, B., Harrison, J., Pavone, M.: Expanding the deployment envelope of be- havior prediction via adaptive meta-learning. In: IEEE Int. Conf. Robot. Automat. pp. 7786–7793. IEEE (2023)

  50. [50]

    In: Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2023)

    Ivanovic, B., Song, G., Gilitschenski, I., Pavone, M.: trajdata: A unified interface to multiple human trajectory datasets. In: Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2023)

  51. [51]

    In: IEEE Intell

    Jaipuria, N., Habibi, G., How, J.P.: Learning in the curbside coordinate frame for a transferable pedestrian trajectory prediction model. In: IEEE Intell. Transport. Syst. Conf. pp. 3125–3131 (2018) Unveiling Transferability in Trajectory Prediction 19

  52. [52]

    In: IEEE/CVF Conf

    Jiang, C., Cornman, A., Park, C., Sapp, B., Zhou, Y., Anguelov, D., et al.: MotionDiffuser: Controllable multi-agent motion prediction using diffusion. In: IEEE/CVF Conf. Comput. Vision Pattern Recog. pp. 9644–9653 (2023)

  53. [53]

    In: IEEE Int

    Kong, H., Xu, J., Gong, S., Yang, J., Zhang, S.: Adaptive pedestrian trajectory prediction via target-directed angle augmentation. In: IEEE Int. Conf. Acoust., Speech, Signal Processing. pp. 4065–4069. IEEE (2024)

  54. [54]

    In: IEEE Intell

    Krajewski, R., Bock, J., Kloeker, L., Eckstein, L.: The highD dataset: A drone dataset of naturalistic vehicle trajectories on German highways for validation of highly automated driving systems. In: IEEE Intell. Transport. Syst. Conf. pp. 2118–2125 (2018)

  55. [55]

    In: IEEE Intell

    Krajewski, R., Moers, T., Bock, J., Vater, L., Eckstein, L.: The rounD dataset: A drone dataset of road user trajectories at roundabouts in Germany. In: IEEE Intell. Transport. Syst. Conf. pp. 1–6 (2020)

  56. [56]

    Deduplicating Training Data Makes Language Models Better

    Lee, K., Ippolito, D., Nystrom, A., Zhang, C., Eck, D., Callison-Burch, C., Carlini, N.: Deduplicating training data makes language models better. arXiv preprint arXiv:2107.06499 (2021)

  57. [57]

    In: Computer graphics forum

    Lerner, A., Chrysanthou, Y., Lischinski, D.: Crowds by example. In: Computer graphics forum. vol. 26, pp. 655–664. Wiley Online Library (2007)

  58. [58]

    com/unid-dataset/(2024), accessed: September 8, 2024

    leveLXData: The uniD Dataset: A university drone dataset.https://levelxdata. com/unid-dataset/(2024), accessed: September 8, 2024

  59. [59]

    IEEE Trans

    Li, J., Ma, H., Zhang, Z., Li, J., Tomizuka, M.: Spatio-temporal graph dual- attention network for multi-agent prediction and tracking. IEEE Trans. Intell. Transport. Syst.23(8), 10556–10569 (2022)

  60. [60]

    Advances in Neural Information Processing Systems (NeurIPS)36, 3894–3920 (2023)

    Li, Q., Peng, Z.M., Feng, L., Liu, Z., Duan, C., Mo, W., Zhou, B.: ScenarioNet: Open-source platform for large-scale traffic scenario simulation and modeling. Advances in Neural Information Processing Systems (NeurIPS)36, 3894–3920 (2023)

  61. [61]

    arXiv preprint arXiv:1907.07792 (2019)

    Li, X., Ying, X., Chuah, M.C.: GRIP++: Enhanced graph-based interaction-aware trajectory prediction for autonomous driving. arXiv preprint arXiv:1907.07792 (2019)

  62. [62]

    Liang, M., Yang, B., Hu, R., Chen, Y., Liao, R., Feng, S., Urtasun, R.: Learning lane graph representations for motion forecasting. In: Eur. Conf. Comput. Vision. pp. 541–556 (2020)

  63. [63]

    Liu, M., Cheng, H., Yang, M.Y.: Tracing the influence of predecessors on trajectory prediction. In: Int. Conf. on Comput. Vision Worksh. (ICCVW). pp. 3245–3255. IEEE/CVF (2023)

  64. [64]

    In: IEEE/CVF Conf

    Liu, Y., Zhang, J., Fang, L., Jiang, Q., Zhou, B.: Multimodal motion prediction with stacked transformers. In: IEEE/CVF Conf. Comput. Vision Pattern Recog. pp. 7577–7586 (2021)

  65. [65]

    In: International Conference on Learning Representations (ICLR) (2017)

    Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (ICLR) (2017)

  66. [66]

    In: International Conference on Learning Representations (ICLR) (2019)

    Loshchilov, I., Hutter, F.: Decoupled weight decay regularization. In: International Conference on Learning Representations (ICLR) (2019)

  67. [67]

    In: Proceedings of the AAAI conference on artificial intelligence

    Ma, Y., Zhu, X., Zhang, S., Yang, R., Wang, W., Manocha, D.: Trafficpredict: Trajectory prediction for heterogeneous traffic-agents. In: Proceedings of the AAAI conference on artificial intelligence. vol. 33, pp. 6120–6127 (2019)

  68. [68]

    Journal of Machine Learning Research9, 2579–2605 (2008)

    Maaten, L.v.d., Hinton, G.: Visualizing data using t-SNE. Journal of Machine Learning Research9, 2579–2605 (2008)

  69. [69]

    In: IEEE/CVF Conf

    Mao, W., Xu, C., Zhu, Q., Chen, S., Wang, Y.: Leapfrog diffusion model for stochastic trajectory prediction. In: IEEE/CVF Conf. Comput. Vision Pattern Recog. pp. 5517–5526 (2023) 20 T. Westny et al

  70. [70]

    In: IEEE/CVF Conf

    Marchetti, F., Becattini, F., Seidenari, L., Bimbo, A.D.: MANTRA: Memory augmented networks for multiple trajectory prediction. In: IEEE/CVF Conf. Comput. Vision Pattern Recog. pp. 7143–7152 (2020)

  71. [71]

    In: IEEE Intell

    Messaoud, K., Deo, N., Trivedi, M.M., Nashashibi, F.: Trajectory prediction for autonomous driving based on multi-head attention with joint agent-map representation. In: IEEE Intell. Veh. Symp. pp. 165–170 (2021)

  72. [72]

    IEEE Trans

    Messaoud, K., Yahiaoui, I., Verroust-Blondet, A., Nashashibi, F.: Attention based vehicle trajectory prediction. IEEE Trans. Intell. Veh.6(1), 175–185 (2021)

  73. [73]

    In: IEEE Intell

    Moers, T., Vater, L., Krajewski, R., Bock, J., Zlocki, A., Eckstein, L.: The exiD dataset: A real-world trajectory dataset of highly interactive highway scenarios in Germany. In: IEEE Intell. Veh. Symp. pp. 958–964 (2022)

  74. [74]

    Mohamed, A., Zhu, D., Vu, W., Elhoseiny, M., Claudel, C.: Social-Implicit: Re- thinking trajectory prediction evaluation and the effectiveness of implicit maximum likelihood estimation. In: Eur. Conf. Comput. Vision. pp. 463–479 (2022)

  75. [75]

    In: AAAI Conference on Artificial Intelligence

    Morris, C., Ritzert, M., Fey, M., Hamilton, W.L., Lenssen, J.E., Rattan, G., Grohe, M.: Weisfeiler and leman go neural: Higher-order graph neural networks. In: AAAI Conference on Artificial Intelligence. pp. 4602–4609 (2019)

  76. [76]

    In: Proceedings of the AAAI Conference on Artificial Intelligence

    Park, D., Jeong, J., Yoon, K.J.: Improving transferability for cross-domain trajec- tory prediction via neural stochastic differential equation. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 38, pp. 10145–10154 (2024)

  77. [77]

    Park, S.H., Lee, G., Seo, J., Bhat, M., Kang, M., Francis, J., Jadhav, A., Liang, P.P., Morency, L.P.: Diverse and admissible trajectory forecasting through multimodal context understanding. In: Eur. Conf. Comput. Vision. pp. 282–298 (2020)

  78. [78]

    Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. In: Int. Conf. Adv. in Neural Inf. Process. Syst. (2019)

  79. [79]

    In: International Conference on Computer Vision (ICCV)

    Pellegrini, S., Ess, A., Schindler, K., Van Gool, L.: You’ll never walk alone: Modeling social behavior for multi-target tracking. In: International Conference on Computer Vision (ICCV). pp. 261–268. IEEE (2009)

  80. [80]

    Advances in Neural Information Processing Systems37, 30811–30849 (2024)

    Penedo, G., Kydlíček, H., Lozhkov, A., Mitchell, M., Raffel, C.A., Von Werra, L., Wolf, T., et al.: The fineweb datasets: Decanting the web for the finest text data at scale. Advances in Neural Information Processing Systems37, 30811–30849 (2024)

Showing first 80 references.