arxiv: 2605.26524 · v1 · pith:GFN5L3FAnew · submitted 2026-05-26 · 💻 cs.CV · cs.AI

CmIVTP: Cross-modal Interaction-based Vessel Trajectory Prediction for Maritime Intelligence

Pith reviewed 2026-06-29 18:54 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords: vessel trajectory prediction · cross-modal interaction · AIS data · CCTV video · multimodal fusion · transformer · maritime dataset · trajectory clustering

The pith

A cross-modal transformer fuses AIS motion data with CCTV scene features to generate more accurate and feasible vessel trajectories.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Vessel trajectory prediction for maritime safety is limited when it relies on sparse AIS signals alone or on CCTV video alone. The paper introduces the CmIVTP framework, which adds a target-aware scene encoder to capture environmental context and a cross-modal interaction transformer to link motion patterns with scene semantics through attention. A vessel group trajectory bank, built by clustering historical AIS paths, supplies candidate motions. The authors also release a synchronized multimodal dataset. If the fusion works, predictions respect both vessel dynamics and surroundings, supporting safer navigation in busy waterways.

Core claim

The CmIVTP framework models the interactions between vessel dynamics and environmental constraints with three components. A target-aware scene encoder extracts scene semantic features. A cross-modal interaction transformer integrates AIS-derived motion features, CCTV-based environmental features, and scene representations, applying cross-modal attention to capture intra-modal and inter-modal relations. A vessel group trajectory bank supplies representative motion patterns from clustered historical data. The paper reports improved performance on multimodal benchmarks.

What carries the argument

The cross-modal interaction transformer, which integrates AIS motion features, CCTV environmental features, and scene representations using cross-modal attention mechanisms to capture both intra-modal semantics and inter-modal interactions.
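
To make the cross-modal attention idea concrete, here is a minimal Python sketch in which AIS motion tokens attend over CCTV scene tokens using plain scaled dot-product attention. It is not the authors' implementation. The function names and toy token values are assumptions, and the sketch leaves out learned projections, multiple heads, and the separate scene-representation stream that a real transformer layer would include.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention from one modality onto another.

    queries: list of d-dimensional vectors (here: hypothetical AIS motion tokens)
    keys, values: lists of vectors from the other modality (here: CCTV scene tokens)
    Returns (fused vectors, attention weights), one row per query.
    """
    d = len(queries[0])
    fused, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors, one output channel at a time.
        fused.append([sum(wj * v[c] for wj, v in zip(w, values))
                      for c in range(len(values[0]))])
    return fused, weights


# Toy tokens, purely illustrative (assumed values, not data from the paper).
ais_tokens = [[1.0, 0.0], [0.0, 1.0]]
cctv_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, weights = cross_attention(ais_tokens, cctv_tokens, cctv_tokens)
```

Each row of `weights` sums to one, so every fused vector is a convex combination of scene tokens. Swapping the roles of the two token lists gives the opposite attention direction, and passing the same list as queries, keys, and values gives the intra-modal case.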

If this is right

  • Trajectory predictions become both dynamically consistent and aligned with environmental features extracted from CCTV.
  • Candidate trajectories can be generated efficiently at scale using the pre-clustered vessel group trajectory bank.
  • Research on multimodal maritime prediction gains support from the released synchronized AIS-CCTV dataset.
  • Overall accuracy improves over single-source methods on standard multimodal vessel trajectory benchmarks.
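
The abstract does not say which clustering algorithm builds the vessel group trajectory bank. As a rough sketch of the general idea only, the Python below assumes plain k-means over AIS tracks that have already been resampled to the same number of points. The function names, the deterministic initialization, and the toy tracks are all hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_trajectory_bank(trajectories, k, iterations=20):
    """Cluster equal-length (x, y) tracks with k-means; return k centroid tracks.

    Assumption: k-means stands in for whatever clustering the paper uses.
    """
    if not 1 <= k <= len(trajectories):
        raise ValueError("k must be between 1 and the number of trajectories")
    # Flatten each track [(x0, y0), (x1, y1), ...] into [x0, y0, x1, y1, ...].
    data = [[c for point in traj for c in point] for traj in trajectories]
    # Deterministic start: evenly spaced tracks serve as initial centroids.
    step = len(data) // k
    centroids = [list(data[i * step]) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for vec in data:
            nearest = min(range(k), key=lambda j: squared_distance(vec, centroids[j]))
            groups[nearest].append(vec)
        for j, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    # Restore the (x, y) point structure for each centroid track.
    return [[(c[i], c[i + 1]) for i in range(0, len(c), 2)] for c in centroids]


# Toy history (assumed values): two eastbound and two northbound tracks.
history = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(0.0, 0.0), (1.0, 0.2), (2.0, 0.2)],
    [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0)],
    [(0.0, 0.0), (0.2, 1.0), (0.2, 2.0)],
]
bank = build_trajectory_bank(history, k=2)
```

The bank is computed once offline, so at inference time candidate generation reduces to looking up a small set of centroid tracks rather than searching the full history.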

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same fusion pattern could be tested on other sensor pairs, such as radar plus camera, for surface vehicle tracking.
  • The trajectory bank approach suggests that pre-computed motion clusters may reduce inference cost in real-time maritime systems.
  • If the attention fusion proves stable, similar cross-modal designs might apply to prediction tasks in aviation or rail transport.

Load-bearing premise

The cross-modal attention mechanisms will produce dynamically consistent and environmentally feasible predictions when fusing sparse AIS data with CCTV features.

What would settle it

A held-out test set in which the model's generated trajectories repeatedly exceed realistic vessel speed limits, turning radii, or navigation constraints would show the claim does not hold.
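
A check of this kind could be scripted roughly as follows. The Python sketch flags a predicted track when any step exceeds a speed limit or a heading-change rate, then reports the share of flagged tracks. The thresholds, time step, coordinate units, and function names are assumptions for illustration. The paper does not specify such a test, and navigation constraints such as channel boundaries are not covered.

```python
import math


def violates_limits(track, dt, max_speed, max_turn_rate):
    """Return True if any step breaks the speed or turn-rate limit.

    track: list of (x, y) positions in metres, sampled every dt seconds
    max_speed: metres per second; max_turn_rate: radians per second
    All limits here are hypothetical placeholders.
    """
    headings = []
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dx, dy = x1 - x0, y1 - y0
        distance = math.hypot(dx, dy)
        if distance / dt > max_speed:
            return True
        if distance > 0:  # heading is undefined for a stationary step
            headings.append(math.atan2(dy, dx))
    for h0, h1 in zip(headings, headings[1:]):
        # Wrap the heading difference into [-pi, pi] before comparing.
        turn = math.atan2(math.sin(h1 - h0), math.cos(h1 - h0))
        if abs(turn) / dt > max_turn_rate:
            return True
    return False


def violation_rate(tracks, dt, max_speed, max_turn_rate):
    """Fraction of tracks that break at least one limit."""
    flagged = sum(violates_limits(t, dt, max_speed, max_turn_rate) for t in tracks)
    return flagged / len(tracks)


# Toy predictions (assumed values): one smooth, one sharp turn, one too fast.
smooth = [(0, 0), (50, 0), (100, 5), (150, 15)]
sharp = [(0, 0), (50, 0), (50, 50)]
fast = [(0, 0), (200, 0)]
rate = violation_rate([smooth, sharp, fast], dt=10.0, max_speed=10.0, max_turn_rate=0.05)
```

A violation rate that stays high on held-out data would count against the feasibility claim, while a low rate would be consistent with it without proving an architectural guarantee.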

Figures

Figures reproduced from arXiv: 2605.26524 by Congcong Zhao, Dong Yang, Mengwei Bao, Xiaoyu Li, Yuxu Lu.

Figure 1: The MITS integrates advanced infrastructure and artificial intelligence
Figure 2: The flowchart of the proposed cross-modal interaction-based vessel trajectory prediction (named CmIVTP) framework. It consists of four main modules:
Figure 3: Vessel trajectory prediction has evolved from traditional model-based methods to unimodal, and subsequently to advanced multimodal learning
Figure 4: Problem setup for cross-modal interaction-based vessel trajectory pre
Figure 5: The pipeline of the VSTaE. It extracts features from image sequences
Figure 6: The pipeline of the cross-modal interaction Transformer (CMIT). It
Figure 7: The pipeline of the uncertainty-aware variational decoder (UaVD). It
Figure 8: The proposed Maritime-MmD+ focuses on four critical areas, including the bridge area and the curved waterway sections, which are characterized by their elevated safety risks and critical importance to navigational safety.
Table I: Details of the Maritime-MmD+. "VD" represents the vessel density, with ϕl, ϕm, and ϕh indicating low, medium, and high densities, respectively. "NOVA" and "NOVC" are the…
Figure 10: Qualitative comparison of trajectory prediction methods on the 36-step task using no-missing AIS data. From left to right, the images include (a)
Figure 11: Qualitative comparison of trajectory prediction methods on the 36-step task using missing AIS data. We manually remove AIS trajectories and use
Figure 12: Qualitative comparison of CmIVTP at 12, 24, and 36 prediction
Figure 13: PCA visualization of the learned 2D latent space across four
Figure 15: Ablation study on the number of clusters (
Original abstract

Maritime intelligent transportation systems (MITS) are essential for ensuring navigation safety and efficiency in busy waterways. However, accurate vessel trajectory prediction remains challenging due to the limitations of single-source data. Automatic identification system (AIS) data is often sparse or unavailable for small vessels, while closed-circuit television (CCTV) data alone cannot fully capture dynamic vessel behavior. To mitigate these challenges, we propose a cross-modal interaction-based vessel trajectory prediction (named CmIVTP) framework to model the intricate interactions between vessel dynamics and environmental constraints. Specifically, we introduce a target-aware scene encoder to extract scene semantic features, effectively capturing vessel-environment interactions and enhancing trajectory prediction accuracy. In addition, we propose a cross-modal interaction transformer, which integrates AIS-derived motion features, CCTV-based environmental features, and scene representations. It leverages cross-modal attention mechanisms to simultaneously capture intra-modal semantics and inter-modal interactions, ensuring dynamically consistent and environmentally feasible predictions. Furthermore, we construct a vessel group trajectory bank by clustering historical AIS trajectories into representative motion patterns, providing an efficient and scalable approach for candidate trajectory generation. Additionally, we introduce the maritime multimodal dataset plus (named Maritime-MmD$^+$), a large-scale dataset that synchronizes AIS data and CCTV video data, providing robust support for multimodal trajectory prediction research. Extensive experiments demonstrate that CmIVTP achieves better performance on multimodal-driven vessel trajectory prediction benchmarks. The code resources for this work can be available at https://github.com/LouisYxLu/CmIVTP.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes CmIVTP, a multimodal framework for vessel trajectory prediction that fuses sparse AIS motion data with CCTV scene features. It introduces a target-aware scene encoder, a cross-modal interaction transformer using attention to capture intra- and inter-modal interactions, a vessel group trajectory bank derived from clustered historical AIS trajectories, and the new Maritime-MmD+ synchronized dataset. The central claims are that the architecture produces dynamically consistent and environmentally feasible predictions and achieves superior benchmark performance.

Significance. If the quantitative gains and feasibility claims hold under rigorous validation, the work would address a practical gap in maritime intelligence systems by demonstrating effective use of complementary sparse and visual data sources for trajectory forecasting.

major comments (2)
  1. [cross-modal interaction transformer] The cross-modal interaction transformer description asserts that its attention mechanisms 'ensure dynamically consistent and environmentally feasible predictions,' yet no kinematic constraints, collision penalties, waterway masks, or other explicit regularizers are specified; feasibility therefore reduces to an emergent statistical property of the training distribution rather than an architectural guarantee. This is load-bearing for the central claim.
  2. [experiments] The experimental claims of superior performance on multimodal-driven benchmarks are stated without any reported quantitative metrics, baseline comparisons, ablation results, error distributions, or protocol details (e.g., train/test splits, missing-data handling). This prevents evaluation of the performance assertions.
minor comments (2)
  1. [abstract] Abstract: 'The code resources for this work can be available at' is grammatically awkward; rephrase to 'Code is available at'.
  2. [dataset introduction] Notation for the new dataset is introduced as 'Maritime-MmD$^+$' but the superscript is not consistently rendered or explained in the text.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough and constructive review. We address each major comment below and commit to revisions where needed to strengthen the manuscript.

Point-by-point responses
  1. Referee: [cross-modal interaction transformer] The cross-modal interaction transformer description asserts that its attention mechanisms 'ensure dynamically consistent and environmentally feasible predictions,' yet no kinematic constraints, collision penalties, waterway masks, or other explicit regularizers are specified; feasibility therefore reduces to an emergent statistical property of the training distribution rather than an architectural guarantee. This is load-bearing for the central claim.

    Authors: We agree that the manuscript wording overstates the role of the attention mechanisms. The cross-modal interaction transformer captures intra- and inter-modal dependencies from the synchronized AIS-CCTV data, allowing the model to learn dynamically consistent and feasible behaviors as an emergent property of the training distribution. No explicit kinematic or collision constraints are imposed. We will revise the relevant sections to remove any implication of an architectural guarantee and instead describe the outcome as data-driven. revision: yes

  2. Referee: [experiments] The experimental claims of superior performance on multimodal-driven benchmarks are stated without any reported quantitative metrics, baseline comparisons, ablation results, error distributions, or protocol details (e.g., train/test splits, missing-data handling). This prevents evaluation of the performance assertions.

    Authors: The current manuscript version presents only high-level claims in the abstract and introduction. We will add a complete experimental section in the revision that reports all quantitative metrics, baseline comparisons, ablation studies, error distributions, and full protocol details including train/test splits and missing-data handling procedures. revision: yes

Circularity Check

0 steps flagged

No significant circularity in the derivation chain

Full rationale

The paper is a method description of a cross-modal transformer architecture for multimodal vessel trajectory prediction. No equations, derivations, or first-principles results are presented that could reduce to their inputs by construction. The central claim is empirical performance on benchmarks, evaluated externally via experiments rather than derived from self-referential definitions, fitted parameters renamed as predictions, or self-citation chains. The architecture uses standard attention mechanisms and clustering of historical data for candidate generation, with no load-bearing self-citations or uniqueness theorems invoked. This is a self-contained empirical ML contribution with no circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical details, parameters, or assumptions are visible in the abstract, so the ledger is empty.

pith-pipeline@v0.9.1-grok · 5809 in / 987 out tokens · 40451 ms · 2026-06-29T18:54:54.255713+00:00 · methodology
