ART: Adaptive Relational Transformer for Pedestrian Trajectory Prediction with Temporal-Aware Relations

arxiv: 2604.03649 · v1 · submitted 2026-04-04 · 💻 cs.CV · cs.AI

Ruochen Li, Ziyi Chang, Junyan Hu, Jiannan Li, Amir Atapour-Abarghouei, Hubert P. H. Shum

Pith reviewed 2026-05-13 17:40 UTC · model grok-4.3

keywords pedestrian trajectory prediction · relational transformer · temporal-aware relation graph · adaptive interaction pruning · human interaction modeling · ETH/UCY benchmark · NBA trajectory dataset · transformer efficiency

The pith

The Adaptive Relational Transformer improves pedestrian trajectory prediction by explicitly modeling how pairwise interactions change over time while pruning unnecessary computations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Predicting where pedestrians will walk next is essential for robots navigating crowds safely. Existing graph and transformer methods either add too much computation or fail to capture how human interactions shift from moment to moment. This paper introduces the Adaptive Relational Transformer, which builds a temporal-aware relation graph to track evolving interactions and uses adaptive pruning to skip redundant calculations. On standard benchmarks like ETH/UCY and NBA, the model reaches higher accuracy than prior work while running more efficiently. A reader should care because better trajectory forecasts could enable smoother robot-human interactions in real environments.

Core claim

The Adaptive Relational Transformer (ART) introduces a Temporal-Aware Relation Graph (TARG) to explicitly capture the evolution of pairwise interactions among pedestrians and an Adaptive Interaction Pruning (AIP) mechanism to reduce redundant computations. This combination allows the model to represent diverse and time-varying human interactions more effectively than previous graph-based or transformer-based approaches, leading to state-of-the-art accuracy and computational efficiency on the ETH/UCY and NBA benchmarks.
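To make the idea of a time-varying pairwise relation concrete, here is a minimal sketch of weighting one pair's relation across observed time steps with a softmax. It is not the paper's TARG formulation, which this page does not reproduce. The function names and the scoring rule (negative distance, so that closer moments count as more informative) are assumptions chosen purely for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_relation(traj_i, traj_j):
    """Aggregate the relation between two agents over time.

    traj_i, traj_j: lists of (x, y) positions, one per observed time step.
    ASSUMPTION: each time step is scored by negative distance, so moments
    where the two agents are close receive more weight. The paper learns
    its weights with attention; this rule is only a stand-in.
    """
    offsets = [(xj - xi, yj - yi)
               for (xi, yi), (xj, yj) in zip(traj_i, traj_j)]
    scores = [-math.hypot(dx, dy) for dx, dy in offsets]
    weights = softmax(scores)
    # Weighted sum of per-step relative offsets: one relation vector per pair.
    rel_x = sum(w * dx for w, (dx, _) in zip(weights, offsets))
    rel_y = sum(w * dy for w, (_, dy) in zip(weights, offsets))
    return weights, (rel_x, rel_y)


# Agent j walks toward a stationary agent i, so later steps get more weight.
weights, relation = temporal_relation(
    [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)],
    [(4.0, 0.0), (2.0, 0.0), (1.0, 0.0)],
)
```

The point of the sketch is only that a single pair can carry different weights at different moments, in contrast to a uniform average over the observation window.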

What carries the argument

Temporal-Aware Relation Graph (TARG) combined with Adaptive Interaction Pruning (AIP), which together model changing pairwise relations and eliminate unnecessary interaction computations.
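The Figure 2 caption describes AIP as top-p filtering over cumulative interaction strength. A generic top-p selection can be sketched as follows. The function name, the default threshold, and the use of raw non-negative strengths are assumptions; the paper's actual scoring and threshold are not given on this page.

```python
def top_p_neighbors(strengths, p=0.9):
    """Keep the smallest set of neighbors whose normalized strengths sum to >= p.

    strengths: non-negative interaction strengths, one per neighbor
               (hypothetical inputs; in the paper these would come from
               the learned relation weights).
    Returns the kept neighbor indices, strongest first.
    """
    total = sum(strengths)
    if total <= 0:
        return []
    # Sort neighbor indices by strength, strongest first (ties keep input order).
    order = sorted(range(len(strengths)),
                   key=lambda k: strengths[k], reverse=True)
    kept, cumulative = [], 0.0
    for k in order:
        kept.append(k)
        cumulative += strengths[k] / total
        if cumulative >= p:
            break
    return kept


peaked = top_p_neighbors([5, 3, 1, 1], p=0.7)  # -> [0, 1]
flat = top_p_neighbors([1, 1, 1, 1], p=0.7)    # -> [0, 1, 2]
```

Unlike a fixed top-k cut, the number of retained neighbors adapts to the distribution: a peaked distribution keeps few neighbors, a flat one keeps more.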

Load-bearing premise

The assumption that the temporal-aware graph and pruning mechanism can consistently identify and preserve all critical time-varying interactions without introducing bias or discarding essential data on the tested scenarios.

What would settle it

A new dataset featuring highly complex, rapidly changing group interactions where the pruned model produces significantly higher prediction errors than a full-interaction baseline.

Figures

Figures reproduced from arXiv: 2604.03649 by Amir Atapour-Abarghouei, Hubert P. H. Shum, Jiannan Li, Junyan Hu, Ruochen Li, Ziyi Chang.

**Figure 2.** Overview of ART. Left: Temporal-Aware Relation Graph (TARG) leverages pairwise attention to model agent interactions across time steps, assigning higher weights to informative moments. Right: Adaptive Interaction Pruning (AIP) uses top-p filtering to adaptively retain informative neighbors based on cumulative interaction strength, producing a sparsified graph for trajectory prediction. The proposed ART ach…

**Figure 3.** Ablation study of Top-p threshold on the ETH/UCY dataset.

**Table III.** Ablation study of relation weighting strategies on the ETH/UCY dataset.

| Weighting strategy | min ADE20 | min FDE20 |
| --- | --- | --- |
| Cosine Similarity | 0.22 | 0.36 |
| Random Weighting | 0.23 | 0.37 |
| Uniform Weighting | 0.23 | 0.36 |
| Ours | 0.20 | 0.32 |

**Figure 4.** Qualitative comparisons with MART [17] on the ETH/UCY…

**Figure 5.** Qualitative comparisons with MART [17] on the NBA dataset. Past…
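The min ADE20 and min FDE20 figures quoted in the Figure 3 excerpt above are best-of-K metrics with K = 20. As a reminder of what they measure, the sketch below uses the standard definitions: average displacement error (ADE) is the mean Euclidean error over all predicted steps, final displacement error (FDE) is the error at the last step, and each minimum is taken independently over the K sampled futures. The function names are illustrative, and the paper's evaluation code may differ in details such as per-scene averaging.

```python
import math


def ade(pred, truth):
    """Average displacement error: mean Euclidean distance over time steps."""
    return sum(math.dist(p, g) for p, g in zip(pred, truth)) / len(truth)


def fde(pred, truth):
    """Final displacement error: Euclidean distance at the last time step."""
    return math.dist(pred[-1], truth[-1])


def min_ade_fde(samples, truth):
    """Best-of-K metrics for one agent.

    samples: K predicted trajectories, each a list of (x, y) points.
    truth:   the ground-truth future, same length as each sample.
    The two minima are taken independently, so they may come from
    different samples.
    """
    return (min(ade(s, truth) for s in samples),
            min(fde(s, truth) for s in samples))


truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
samples = [
    [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)],  # constant 1-unit offset
    [(0.0, 0.0), (1.0, 0.0), (2.0, 2.0)],  # exact until the final step
]
best_ade, best_fde = min_ade_fde(samples, truth)
```

In this toy case the best ADE comes from the second sample and the best FDE from the first, which is why the two columns should be read as separate best-case scores rather than as properties of one predicted trajectory.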

Original abstract

Accurate prediction of real-world pedestrian trajectories is crucial for a wide range of robot-related applications. Recent approaches typically adopt graph-based or transformer-based frameworks to model interactions. Despite their effectiveness, these methods either introduce unnecessary computational overhead or struggle to represent the diverse and time-varying characteristics of human interactions. In this work, we present an Adaptive Relational Transformer (ART), which introduces a Temporal-Aware Relation Graph (TARG) to explicitly capture the evolution of pairwise interactions and an Adaptive Interaction Pruning (AIP) mechanism to reduce redundant computations efficiently. Extensive evaluations on ETH/UCY and NBA benchmarks show that ART delivers state-of-the-art accuracy with high computational efficiency.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

ART pairs temporal-aware relations with adaptive pruning in a transformer to achieve better accuracy and efficiency on pedestrian trajectory benchmarks.

The punchline on this paper is that it delivers a practical improvement to transformer models for pedestrian trajectory prediction by adding explicit temporal modeling of relations and an adaptive pruning mechanism. The results on ETH/UCY and NBA show state-of-the-art accuracy paired with better efficiency than the baselines. What is new is the specific way they combine the Temporal-Aware Relation Graph to follow how interactions evolve and the Adaptive Interaction Pruning to reduce redundant work without hurting performance.

The paper does well in laying out the architecture clearly and running ablations that isolate each component's impact. Efficiency metrics are reported in a way that lets you compare directly to prior work, and the gains appear consistent across the two benchmarks.

The soft spots are minor. The evaluation relies on standard datasets, which is fine but means we don't see how it handles more unusual interaction patterns. The pruning step could in theory drop something important in edge cases, but the ablations and overall numbers suggest it doesn't happen on these tests. No issues with the underlying math or citations stand out.

This paper is for researchers and engineers working on trajectory prediction in robotics and autonomous systems. Anyone looking for efficiency improvements in transformer-based models for this task will find usable ideas here. It deserves a serious referee. The claims are backed by concrete experiments and breakdowns, so peer review would help refine the presentation and check for any overlooked scenarios.

Referee Report

0 major / 3 minor

Summary. The paper proposes an Adaptive Relational Transformer (ART) for pedestrian trajectory prediction. It introduces a Temporal-Aware Relation Graph (TARG) to explicitly model the evolution of time-varying pairwise interactions and an Adaptive Interaction Pruning (AIP) mechanism to reduce redundant computations. The approach is evaluated on the ETH/UCY and NBA benchmarks, where it reports state-of-the-art accuracy alongside high computational efficiency.

Significance. If the results hold, the work is significant for trajectory prediction in robotics applications. It improves upon prior graph- and transformer-based methods by explicitly handling diverse, time-varying interactions while maintaining efficiency. Strengths include an internally consistent architecture and loss formulation, ablations that isolate the contributions of TARG and AIP, and direct efficiency comparisons against comparable baselines on standard benchmarks.

minor comments (3)

[Abstract] The claim of SOTA accuracy would benefit from a brief quantitative summary (e.g., ADE/FDE deltas on ETH/UCY) to allow readers to assess the magnitude of improvement without reading the full results section.
[§4, Experiments] While ablations are reported, include a short error analysis or failure-case discussion (e.g., crowded scenes or long-horizon predictions) to strengthen the claim that AIP does not discard critical interactions.
[§3] Notation: Ensure consistent use of symbols for temporal relations in TARG across equations and figures; a small table of key symbols would improve readability.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive review and recommendation of minor revision. We are pleased that the significance for robotics applications, internal consistency of the architecture and loss, ablations isolating TARG and AIP, and efficiency comparisons are recognized. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity detected

The paper introduces an Adaptive Relational Transformer (ART) architecture consisting of a Temporal-Aware Relation Graph (TARG) and Adaptive Interaction Pruning (AIP) mechanism for modeling pedestrian interactions. Its claims rest entirely on empirical evaluation against standard external benchmarks (ETH/UCY and NBA), with reported SOTA accuracy and efficiency metrics. No equations, derivations, or parameter-fitting steps are described that would reduce any prediction or result to the inputs by construction. Ablations isolate component contributions independently of the final numbers, and the work is self-contained against those benchmarks without load-bearing self-citation chains or self-definitional reductions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities can be extracted from the provided text.

Reference graph

Works this paper leans on

40 extracted references · 40 canonical work pages

[1] Rina Akabane and Yuka Kato. Pedestrian trajectory prediction based on transfer learning for human-following mobile robots. IEEE Access, 9:126172–126185, 2021.

[2] Alexandre Alahi, Kratarth Goel, Vignesh Ramanathan, Alexandre Robicquet, Li Fei-Fei, and Silvio Savarese. Social lstm: Human trajectory prediction in crowded spaces. In CVPR, pages 961–971, 2016.

[3] Inhwan Bae, Jean Oh, and Hae-Gon Jeon. Eigentrajectory: Low-rank descriptors for multi-modal trajectory forecasting. In ICCV, pages 10017–10029, 2023.

[4] Inhwan Bae, Jin-Hwi Park, and Hae-Gon Jeon. Non-probability sampling network for stochastic human trajectory prediction. In CVPR, pages 6477–6487, 2022.

[5] Inhwan Bae, Young-Jae Park, and Hae-Gon Jeon. Singulartrajectory: Universal trajectory predictor using diffusion model. In CVPR, pages 17890–17901, 2024.

[6] Haoyu Bai, Shaojun Cai, Nan Ye, David Hsu, and Wee Sun Lee. Intention-aware online pomdp planning for autonomous driving in a crowd. In ICRA, pages 454–460, 2015.

[7] Rashmi Bhaskara, Hrishikesh Viswanath, and Aniket Bera. Trajectory prediction for robot navigation using flow-guided markov neural operator. In ICRA, pages 15209–15216. IEEE, 2024.

[8] Ziyi Chang, George A Koulieris, Hyung Jin Chang, and Hubert PH Shum. On the design fundamentals of diffusion models: A survey. Pattern Recognition, 169:111934, 2026.

[9] Ziyi Chang, He Wang, George Koulieris, and Hubert PH Shum. Large-scale multi-character interaction synthesis. In Proceedings of the Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Papers, pages 1–10, 2025.

[10] Zhixian Chen, Chao Song, Yuanyuan Yang, Baoliang Zhao, Ying Hu, Shoubin Liu, and Jianwei Zhang. Robot navigation based on human trajectory prediction and multiple travel modes. Applied Sciences, 8(11):2205, 2018.

[11] Cameron Diao and Ricky Loynd. Relational attention: Generalizing transformers for graph-structured tasks. In ICLR, 2023.

[12] Tianpei Gu, Guangyi Chen, Junlong Li, Chunze Lin, Yongming Rao, Jie Zhou, and Jiwen Lu. Stochastic trajectory prediction via motion indeterminacy diffusion. In CVPR, pages 17113–17122, 2022.

[13] Ke Guo, Wenxi Liu, and Jia Pan. End-to-end trajectory distribution prediction based on occupancy grid maps. In CVPR, pages 2242–2251, 2022.

[14] Agrim Gupta, Justin Johnson, Li Fei-Fei, Silvio Savarese, and Alexandre Alahi. Social gan: Socially acceptable trajectories with generative adversarial networks. In CVPR, pages 2255–2264, 2018.

[15] Seungwoong Ha and Hawoong Jeong. Learning heterogeneous interaction strengths by trajectory prediction with graph neural network. arXiv, 2022.

[16] Zhe Huang, Ruohua Li, Kazuki Shin, and Katherine Driggs-Campbell. Learning sparse interaction graphs of partially detected pedestrians for trajectory prediction. IEEE RAL, 7(2):1198–1205, 2022.

[17] Seongju Lee, Junseok Lee, Yeonguk Yu, Taeri Kim, and Kyoobin Lee. Mart: Multiscale relational transformer networks for multi-agent trajectory prediction. In ECCV, pages 89–107, 2024.

[18] Alon Lerner, Yiorgos Chrysanthou, and Dani Lischinski. Crowds by example. In Computer graphics forum, volume 26, pages 655–664, 2007.

[19] Ruochen Li, Stamos Katsigiannis, Tae-Kyun Kim, and Hubert PH Shum. Bp-sgcn: Behavioral pseudo-label informed sparse graph convolution network for pedestrian and heterogeneous trajectory prediction. TNNLS, 2025.

[20] Ruochen Li, Stamos Katsigiannis, and Hubert PH Shum. Multiclass-sgcn: Sparse graph-based trajectory prediction with agent class embedding. In ICIP, pages 2346–2350. IEEE, 2022.

[21] Ruochen Li, Tanqiu Qiao, Stamos Katsigiannis, Zhanxing Zhu, and Hubert PH Shum. Unified spatial-temporal edge-enhanced graph networks for pedestrian trajectory prediction. TCSVT, 2025.

[22] Ruochen Li, Zhanxing Zhu, Tanqiu Qiao, and Hubert PH Shum. Vite: Virtual graph trajectory expert router for pedestrian trajectory prediction. arXiv, 2025.

[23] Ruochen Li, Zhanxing Zhu, Tanqiu Qiao, and Hubert PH Shum. Vite: Virtual graph trajectory expert router for pedestrian trajectory prediction. In AAAI, volume 40, pages 17535–17543, 2026.

[24] Chaofan Lin, Jiaming Tang, Shuo Yang, Hanshuo Wang, Tian Tang, Boyu Tian, Ion Stoica, Song Han, and Mingyu Gao. Twilight: Adaptive attention sparsity with hierarchical top-$p$ pruning. In NeurIPS, 2025.

[25] Yuanfu Luo, Panpan Cai, Aniket Bera, David Hsu, Wee Sun Lee, and Dinesh Manocha. Porca: Modeling and planning for autonomous driving among many pedestrians. IEEE RAL, 3(4):3418–3425, 2018.

[26] Weibo Mao, Chenxin Xu, Qi Zhu, Siheng Chen, and Yanfeng Wang. Leapfrog diffusion model for stochastic trajectory prediction. In CVPR, 2023.

[27] Stefano Pellegrini, Andreas Ess, Konrad Schindler, and Luc Van Gool. You’ll never walk alone: Modeling social behavior for multi-target tracking. In ICCV, pages 261–268, 2009.

[28] Tanqiu Qiao, Ruochen Li, Frederick WB Li, Yoshiki Kubotani, Shigeo Morishima, and Hubert PH Shum. Geometric visual fusion graph neural networks for multi-person human-object interaction recognition in videos. arXiv, 2025.

[29] Tanqiu Qiao, Ruochen Li, Frederick WB Li, and Hubert PH Shum. From category to scenery: An end-to-end framework for multi-person human-object interaction recognition in videos. In ICPR, pages 262–277, 2024.

[30] Liushuai Shi, Le Wang, Sanping Zhou, and Gang Hua. Trajectory unified transformer for pedestrian trajectory prediction. In ICCV, pages 9675–9684, 2023.

[31] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. NeurIPS, 30, 2017.

[32] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph attention networks. In ICLR, 2018.

[33] Chenxin Xu, Maosen Li, Zhenyang Ni, Ya Zhang, and Siheng Chen. Groupnet: Multiscale hypergraph neural networks for trajectory prediction with relational reasoning. In CVPR, pages 6488–6497, 2022.

[34] Chenxin Xu, Weibo Mao, Wenjun Zhang, and Siheng Chen. Remember intentions: Retrospective-memory-based trajectory prediction. In CVPR, pages 6488–6497, 2022.

[35] Chenxin Xu, Robby T. Tan, Yuhong Tan, Siheng Chen, Yu Guang Wang, Xinchao Wang, and Yanfeng Wang. EqMotion: Equivariant multi-agent motion prediction with invariant interaction reasoning. In CVPR, 2023.

[36] Chenxin Xu, Yuxi Wei, Bohan Tang, Sheng Yin, Ya Zhang, Siheng Chen, and Yanfeng Wang. Dynamic-group-aware networks for multi-agent trajectory prediction with relational reasoning. Neural Networks, 170:564–577, 2024.

[37] Hao Xue, Du Q Huynh, and Mark Reynolds. Ss-lstm: A hierarchical lstm model for pedestrian trajectory prediction. In WACV, pages 1186–1194, 2018.

[38] Jing Yang, Yuehai Chen, Shaoyi Du, Badong Chen, and Jose C Principe. Ia-lstm: Interaction-aware lstm for pedestrian trajectory prediction. IEEE transactions on cybernetics, 54(7):3904–3917, 2024.

[39] Cunjun Yu, Xiao Ma, Jiawei Ren, Haiyu Zhao, and Shuai Yi. Spatio-temporal graph transformer networks for pedestrian trajectory prediction. In ECCV, pages 507–523. Springer, 2020.

[40] Ye Yuan, Xinshuo Weng, Yanglan Ou, and Kris M Kitani. Agentformer: Agent-aware transformers for socio-temporal multi-agent forecasting. In ICCV, pages 9813–9823, 2021.
