TrajTok: Adaptive Spatial Tokenization for Trajectory Representation Learning

Cyrus Shahabi; Shang-Ling Hsu; Zhen Xiong

arxiv: 2605.20134 · v1 · pith:P3CEOT4Anew · submitted 2026-05-19 · 💻 cs.LG

Pith reviewed 2026-05-20 06:33 UTC · model grok-4.3

classification 💻 cs.LG

keywords: trajectory representation learning, spatial tokenization, hexagonal partitioning, masked token modeling, factorized transformer, transferable embeddings, GPS trajectory, pretraining

The pith

TrajTok turns noisy GPS traces into multi-resolution hexagonal tokens that let one frozen encoder support similarity search, classification, and travel-time tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents TrajTok as a method to learn generalizable trajectory embeddings from continuous, noisy, and irregularly sampled GPS data. It first builds an adaptive multi-resolution hexagonal cell partition directly from the spatial distribution of points, turning raw trajectories into discrete token sequences. A factorized transformer then processes geometry and kinematics separately before fusing them, using spatiotemporal rotary embeddings and masked-token pretraining to recover both structure and movement patterns. When the encoder stays frozen and only lightweight adapters are trained, the same representations perform competitively or better than specialized models across four different tasks on the Porto dataset, indicating they capture transferable trajectory properties rather than narrow shortcuts.

Core claim

TrajTok shows that a learned multi-resolution hexagonal partition of space, paired with a factorized transformer using per-modality self-attention, cross-attention fusion, and ST-RoPE, plus masked-token pretraining, yields embeddings that remain effective when the encoder is frozen and reused across geometry-heavy and kinematics-heavy downstream tasks.

What carries the argument

The adaptive multi-resolution hexagonal cell partition learned from GPS point distribution, which discretizes trajectories into tokens for the factorized transformer with early modality-specific attention, cross-attention fusion, and ST-RoPE position embeddings.
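
A minimal sketch of the density-adaptive splitting idea, under stated assumptions. The paper is quoted further down this page as building its vocabulary on the hierarchical H3 hexagonal grid and splitting cells whose counts exceed a capacity threshold. Here a square grid (`cell_of`) stands in for H3 so the example needs no dependencies, and `build_vocab`, `tokenize`, `capacity`, and `max_res` are hypothetical names, not the authors' code.

```python
def cell_of(x, y, res):
    """Stand-in hierarchical index: square cells whose side halves per level.
    Assumption: the paper uses H3 hexagons; squares keep this sketch self-contained."""
    size = 1.0 / (2 ** res)
    return (res, int(x // size), int(y // size))


def build_vocab(points, base_res=0, max_res=4, capacity=3):
    """Split any cell holding more than `capacity` points into its children."""
    groups = {}
    for x, y in points:
        groups.setdefault(cell_of(x, y, base_res), []).append((x, y))
    stack = list(groups.items())
    vocab = set()
    while stack:
        cell, pts = stack.pop()
        res = cell[0]
        if len(pts) > capacity and res < max_res:
            children = {}
            for x, y in pts:
                children.setdefault(cell_of(x, y, res + 1), []).append((x, y))
            stack.extend(children.items())  # refine dense cells further
        else:
            vocab.add(cell)  # sparse enough (or at max depth): keep as a token
    return vocab


def tokenize(traj, vocab, max_res=4):
    """Map each point to the finest vocabulary cell that contains it."""
    tokens = []
    for x, y in traj:
        for res in range(max_res, -1, -1):
            cell = cell_of(x, y, res)
            if cell in vocab:
                tokens.append(cell)
                break
    return tokens


# Five points clustered in one corner, one isolated point in the opposite corner.
points = [(0.05, 0.05), (0.06, 0.2), (0.2, 0.05), (0.2, 0.2), (0.1, 0.1), (0.9, 0.9)]
vocab = build_vocab(points)
tokens = tokenize([(0.9, 0.9), (0.1, 0.1)], vocab)
```

In this toy run the dense corner ends up covered by level-3 cells while the isolated point keeps a coarse level-1 cell. Points that fall in an unobserved child of a split cell get no token in this sketch; a real tokenizer would need a fallback for them.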

If this is right

A single pretrained trajectory encoder can be reused for both similarity-based retrieval and regression-style time prediction.
Lightweight task adapters suffice once the encoder has learned general structure through masked pretraining.
Hexagonal multi-resolution tokenization handles sparse cells and heterogeneous patterns better than fixed fine or coarse grids.
Masked recovery of both geometry and kinematics during pretraining supports transfer without task-specific shortcuts.
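
The frozen-encoder protocol in the points above can be illustrated with a toy example. This is a sketch under loud assumptions: `encode` is a hypothetical fixed random projection standing in for a pretrained encoder, the adapter is a plain linear head, and the data and target are synthetic. Only the adapter's weights are updated.

```python
import random

DIM = 4
_rng = random.Random(0)
# Hypothetical frozen encoder: a fixed random projection of two trajectory features.
FROZEN_W = [[_rng.uniform(-1, 1) for _ in range(2)] for _ in range(DIM)]


def encode(features):
    """Frozen-encoder stand-in: its weights are read-only during adapter training."""
    return [sum(w * f for w, f in zip(row, features)) for row in FROZEN_W]


def predict(w, b, emb):
    return sum(wi * ei for wi, ei in zip(w, emb)) + b


def mse(w, b, embs, targets):
    return sum((predict(w, b, e) - t) ** 2 for e, t in zip(embs, targets)) / len(targets)


def train_adapter(embs, targets, lr=0.05, epochs=200):
    """Fit a linear head on frozen embeddings with full-batch gradient descent."""
    w, b, n = [0.0] * DIM, 0.0, len(targets)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * DIM, 0.0
        for e, t in zip(embs, targets):
            err = predict(w, b, e) - t
            for i in range(DIM):
                grad_w[i] += 2 * err * e[i] / n
            grad_b += 2 * err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Synthetic data: features are (normalized length, normalized mean speed).
data = [(_rng.random(), _rng.random()) for _ in range(50)]
targets = [3.0 * length - 2.0 * speed + 1.0 for length, speed in data]
embs = [encode(f) for f in data]  # computed once; the encoder is never retrained
w, b = train_adapter(embs, targets)
```

Because the embeddings are computed once and cached, swapping in a different task means training only a new `(w, b)` pair, which is the practical appeal of the setup described above.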

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same tokenization approach could be tested on datasets with different sampling rates or urban densities to check geographic robustness.
Adding explicit speed or heading features at the token level might further strengthen kinematics capture beyond what position and time already provide.
Direct comparison against vector-quantized or learned grid alternatives would isolate whether the hexagonal multi-resolution choice is the main driver of transfer performance.
Deployment in online settings such as live route planning would reveal whether the frozen encoder supports low-latency inference without retraining.

Load-bearing premise

Learning a multi-resolution hexagonal partition from the spatial distribution of GPS points will produce tokens that keep the geometric and movement information needed for multiple tasks without unacceptable loss from noise or irregular sampling.

What would settle it

A large performance gap appearing on one or more Porto tasks when the encoder is frozen, or clear failure to distinguish distinct movement patterns on a new city dataset, would show the tokenization does not preserve the required information.

Figures

Figures reproduced from arXiv: 2605.20134 by Cyrus Shahabi, Shang-Ling Hsu, Zhen Xiong.

**Figure 1.** Overview of TrajTok. (Left) Adaptive spatial tokenization: raw, irregular GPS trajectories are partitioned using a density-adaptive multi-resolution hexagonal grid, producing a compact vocabulary of spatial cells and a discrete token sequence. (Middle) Factorized transformer encoder: tokens are represented through two streams: a geometric channel (cell IDs) and a kinematic channel (motion features such as s…

**Figure 2.** Fixed vs. adaptive spatial tokenization on Porto. Top: raw data (left) and adaptive partition …

**Figure 3.** Mismatch between pretraining optimization and downstream utility. Although both training …

Original abstract

Learning generalizable trajectory representations from raw GPS traces remains difficult because the data is continuous, noisy, and irregularly sampled. Spatial tokenization is also challenging: fine grids yield sparse cells with weak embeddings, while coarse grids merge heterogeneous movement patterns into the same token. We present TrajTok, a trajectory encoder with a simple pretraining recipe for transferable trajectory embeddings. TrajTok first learns a multi-resolution hexagonal cell partition from the spatial distribution of GPS points, converting noisy GPS sequences into discrete cell tokens. To capture both geometry and kinematics, it uses a factorized transformer encoder with early per-modality self-attention blocks, cross-attention fusion layers, and spatiotemporal rotary position embeddings, ST-RoPE, to encode where and when each token occurs. TrajTok is pretrained with masked-token modeling that recovers both geometric structure and kinematic patterns from partial trajectory observations. On the Porto dataset, a frozen TrajTok encoder with lightweight task adapters achieves strong performance across trajectory similarity search, classification, estimated time of arrival, and full travel-time regression, outperforming multiple task-specific methods. The same frozen encoder supports both geometry-dominated and kinematics-dominated tasks, suggesting that TrajTok learns transferable trajectory structure rather than task-specific shortcuts. These results indicate that learned multi-resolution spatial tokenization combined with masked-token pretraining is a promising direction for general-purpose trajectory foundation models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

TrajTok gives a concrete recipe for data-driven hexagonal tokenization plus factorized attention and masked pretraining that lets one frozen encoder handle several trajectory tasks on Porto, but the abstract leaves the actual gains and kinematic preservation hard to assess.

TrajTok learns a multi-resolution hexagonal partition directly from GPS point distributions, turns traces into cell sequences, and runs them through a factorized transformer that does separate self-attention per modality before cross-attention fusion, all with ST-RoPE for space-time positions. It pretrains with masked token prediction and then freezes the encoder for downstream work with small adapters. On Porto this setup reportedly beats task-specific baselines on similarity search, classification, ETA, and travel-time regression while supporting both geometry-heavy and kinematics-heavy jobs from the same representation.
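
The masked-token pretraining step can be sketched as an input-corruption routine. This is an illustration, not the authors' recipe: the masking ratio, the choice to mask both streams at the same positions, and the names `mask_tokens` and `MASK` are all assumptions; the source only states that pretraining recovers geometric and kinematic content from partial observations.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token


def mask_tokens(cells, kinematics, ratio=0.3, seed=0):
    """Hide a random subset of positions in both streams and return the targets.

    cells:      list of cell-ID tokens (geometric stream)
    kinematics: list of per-token motion features (kinematic stream)
    """
    rng = random.Random(seed)
    n = len(cells)
    k = max(1, round(ratio * n))
    masked_idx = sorted(rng.sample(range(n), k))

    in_cells, in_kin = list(cells), list(kinematics)
    targets = {}
    for i in masked_idx:
        targets[i] = (cells[i], kinematics[i])  # what the model must recover
        in_cells[i] = MASK
        in_kin[i] = None
    return in_cells, in_kin, targets


cells = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8", "c9", "c10"]
kin = [float(i) for i in range(10)]
in_cells, in_kin, targets = mask_tokens(cells, kin, ratio=0.3, seed=1)
```

A training loop would feed `in_cells` and `in_kin` to the encoder and score its predictions at the masked positions against `targets`.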

Referee Report

2 major / 2 minor

Summary. The manuscript introduces TrajTok, a trajectory encoder that learns a multi-resolution hexagonal cell partition directly from the spatial distribution of GPS points to produce discrete tokens. These tokens are processed by a factorized transformer with early per-modality self-attention, cross-attention fusion, and spatiotemporal rotary position embeddings (ST-RoPE). The model is pretrained via masked-token modeling to recover both geometric structure and kinematic patterns. The central empirical claim is that a frozen TrajTok encoder, combined with lightweight task adapters, outperforms multiple task-specific methods on the Porto dataset across trajectory similarity search, classification, estimated time of arrival, and full travel-time regression, while supporting both geometry-dominated and kinematics-dominated tasks.

Significance. If the reported gains hold under rigorous controls, the work would represent a meaningful step toward general-purpose trajectory foundation models. Demonstrating that a single pretrained encoder can be reused across diverse downstream tasks without retraining the backbone would reduce the need for task-specific architectures in mobility and transportation applications. The combination of adaptive spatial tokenization with masked pretraining is a plausible direction, though its advantage over existing grid- or point-based methods requires clear quantification.

major comments (2)

[§4] §4 (Experiments on Porto dataset): The headline claim that the frozen encoder supports kinematics-dominated tasks (e.g., travel-time regression) rests on the unverified assumption that the spatially-derived hexagonal partition preserves sufficient velocity and acceleration information. No quantitative diagnostic—such as mutual information between token sequences and velocity profiles, or per-cell kinematic variance—is reported to show that aggregation under irregular GPS sampling does not erase critical movement details.
[§3.1] §3.1 (Tokenization): The multi-resolution hexagonal partition is learned solely from point density; the manuscript does not demonstrate that this choice avoids merging trajectories with dissimilar headings or speeds into the same token at coarser levels, which would undermine the transferability argument for kinematics-heavy tasks.

minor comments (2)

[Abstract] The abstract asserts 'strong performance' and 'outperforming multiple task-specific methods' without numerical values, baseline names, or error bars; these details should appear in the main results table for immediate assessment.
[§3.2] Notation for ST-RoPE and the factorized attention blocks should be defined with explicit equations in §3.2 to allow reproduction of the position encoding.
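
For orientation on what such equations would need to pin down, here is a sketch of standard rotary position embedding (RoFormer, reference 21) with a continuous timestamp used as the position. This is not the paper's ST-RoPE, whose exact form is not given on this page; using timestamps as positions and the function names are assumptions made for illustration.

```python
import math


def rope(vec, pos, base=10000.0):
    """Rotate consecutive pairs of `vec` by angles proportional to `pos`.

    Standard RoPE frequencies: pair j (dims 2j, 2j+1) uses base ** (-2j / d).
    Assumption: `pos` may be a real-valued timestamp rather than an integer index.
    """
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]
# Same time gap (2.5) at two different absolute times.
score_late = dot(rope(q, 10.0), rope(k, 7.5))
score_early = dot(rope(q, 3.5), rope(k, 1.0))
```

The two scores agree up to floating-point error, which is the defining property of rotary embeddings: attention logits depend only on the relative offset, so irregular sampling gaps enter the score directly.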

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below and have incorporated revisions to provide the requested quantitative support for our claims about kinematic preservation.

Referee: [§4] §4 (Experiments on Porto dataset): The headline claim that the frozen encoder supports kinematics-dominated tasks (e.g., travel-time regression) rests on the unverified assumption that the spatially-derived hexagonal partition preserves sufficient velocity and acceleration information. No quantitative diagnostic—such as mutual information between token sequences and velocity profiles, or per-cell kinematic variance—is reported to show that aggregation under irregular GPS sampling does not erase critical movement details.

Authors: We agree that an explicit diagnostic would make the claim more robust. While the strong empirical results on travel-time regression and ETA tasks (which are kinematics-dominated) already indicate that velocity and acceleration signals are retained through the token sequences and ST-RoPE, we will add in the revised manuscript a quantitative analysis including per-cell kinematic variance and mutual information between token IDs and velocity/acceleration statistics derived from the underlying GPS points. This will directly verify that the adaptive hexagonal aggregation preserves sufficient movement detail under irregular sampling. revision: yes
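
The mutual-information diagnostic mentioned in this exchange could take roughly the following form. It is a sketch under assumptions: speeds are discretized into fixed-width bins (`bin_speed`, with a hypothetical 5-unit width), and the plug-in estimator below is one common choice, not necessarily what the authors will report.

```python
import math
from collections import Counter


def bin_speed(speed, width=5.0):
    """Discretize a speed value into a fixed-width bin (the width is an assumption)."""
    return int(speed // width)


def mutual_information(tokens, speed_bins):
    """Plug-in estimate of I(token; speed bin) in bits from paired observations."""
    n = len(tokens)
    joint = Counter(zip(tokens, speed_bins))
    p_tok = Counter(tokens)
    p_bin = Counter(speed_bins)
    mi = 0.0
    for (t, s), c in joint.items():
        # p(t,s) * log2( p(t,s) / (p(t) p(s)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (p_tok[t] * p_bin[s]))
    return mi


# Token fully determines the speed bin: 1 bit of information.
mi_dependent = mutual_information(["a", "a", "b", "b"], [0, 0, 1, 1])
# Token says nothing about the speed bin: 0 bits.
mi_independent = mutual_information(["a", "a", "b", "b"], [0, 1, 0, 1])
```

A value near zero on real data would suggest that cell IDs alone carry little speed information, so any kinematic signal would have to come from the kinematic stream or the position encoding.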
Referee: [§3.1] §3.1 (Tokenization): The multi-resolution hexagonal partition is learned solely from point density; the manuscript does not demonstrate that this choice avoids merging trajectories with dissimilar headings or speeds into the same token at coarser levels, which would undermine the transferability argument for kinematics-heavy tasks.

Authors: The density-driven multi-resolution design primarily addresses sparsity at fine scales and over-merging at coarse scales. The masked pretraining objective further encourages the model to recover kinematic patterns from context, supporting transfer to kinematics-heavy tasks. To strengthen this, we will include in the revision additional visualizations and metrics (e.g., intra-token variance of headings and speeds across resolutions) demonstrating that coarser tokens maintain acceptable homogeneity in movement characteristics rather than indiscriminately merging dissimilar trajectories. revision: yes
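
One way the promised intra-token heading metric could be computed is circular variance per token. This is a sketch: the function name and the choice of circular variance are assumptions, since the rebuttal does not specify a formula.

```python
import math
from collections import defaultdict


def heading_dispersion(tokens, headings_deg):
    """Circular variance of headings within each token.

    Returns values in [0, 1]: 0 means all headings in the cell agree,
    1 means they cancel out (for example, opposing traffic directions).
    """
    groups = defaultdict(list)
    for tok, heading in zip(tokens, headings_deg):
        groups[tok].append(math.radians(heading))

    dispersion = {}
    for tok, angles in groups.items():
        mean_cos = sum(math.cos(a) for a in angles) / len(angles)
        mean_sin = sum(math.sin(a) for a in angles) / len(angles)
        dispersion[tok] = 1.0 - math.hypot(mean_cos, mean_sin)
    return dispersion


# Token "a": two points heading the same way; token "b": opposite directions.
disp = heading_dispersion(["a", "a", "b", "b"], [90.0, 90.0, 0.0, 180.0])
```

Comparing this quantity across resolution levels would show whether coarser cells merge dissimilar headings, which is the referee's concern in this comment.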

Circularity Check

0 steps flagged

TrajTok's tokenization and pretraining pipeline shows no circular reductions

The derivation begins with learning a multi-resolution hexagonal partition from the empirical spatial distribution of GPS points, converts sequences to discrete tokens, applies a factorized transformer with ST-RoPE, and pretrains via standard masked-token modeling. Downstream results on Porto use frozen encoder plus separate lightweight adapters for similarity search, classification, ETA, and travel-time regression; these are independent held-out tasks rather than re-derivations of the pretraining loss or partition objective. No equations or claims reduce a prediction to a fitted input by construction, and the abstract contains no self-citations or uniqueness theorems invoked to force the architecture. The chain therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Based solely on the abstract, the approach relies on a learned spatial partition whose resolution choices and fitting procedure are not detailed; standard transformer assumptions are also invoked.

free parameters (1)

multi-resolution levels
The number and boundaries of resolution levels in the hexagonal partition are learned or chosen from data distribution and function as hyperparameters.

axioms (1)

domain assumption: Hexagonal cells provide a suitable discrete representation for continuous GPS trajectories.
The choice of hexagonal partitioning is presented without justification or comparison in the abstract.

pith-pipeline@v0.9.0 · 5772 in / 1398 out tokens · 64256 ms · 2026-05-20T06:33:01.596712+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "We build the token vocabulary with the hierarchical H3 hexagonal grid... Cells whose counts exceed a capacity threshold are split into their children... allocating finer cells to dense regions while preserving a compact vocabulary."

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "factorized transformer encoder with early per-modality self-attention blocks, cross-attention fusion layers, and spatiotemporal rotary position embeddings, ST-RoPE"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages · 1 internal anchor

[1] Serina Chang, Emma Pierson, Pang Wei Koh, Jaline Gerardin, Beth Redbird, David Grusky, and Jure Leskovec. Mobility network models of covid-19 explain inequities and inform reopening. Nature, 589(7840):82–87, 2021.

[2] Yanchuan Chang, Jianzhong Qi, Yuxuan Liang, and Egemen Tanin. Contrastive trajectory similarity learning with dual-feature attention. In Proceedings of the 39th IEEE International Conference on Data Engineering (ICDE). IEEE, 2023. doi: 10.1109/ICDE55515.2023.00119

[3] Huanran Chen, Huaqing Zhang, Xiao Li, Yinpeng Dong, Ke Shen, and Jun Zhu. Nexus: Same pretraining loss, better downstream generalization via common minima, 2026. URL https://arxiv.org/abs/2604.09258. Direct support: the paper reports significantly better downstream performance despite achieving the same pretraining loss, challenging reliance on pretrainin...

[4] Zhen Chen, Dalin Zhang, Shanshan Feng, Kaixuan Chen, Lisi Chen, Peng Han, and Shuo Shang. KGTS: Contrastive trajectory similarity learning over prompt knowledge graph embedding. In Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI). AAAI Press, 2024. doi: 10.1609/aaai.v38i8.28672

[5] Liwei Deng, Yan Zhao, Zidan Wang, Hao Fan, and Kai Zheng. Efficient trajectory similarity computation with contrastive learning. In Proceedings of the 31st ACM International Conference on Information & Knowledge Management (CIKM), pages 229–239. ACM, 2022. doi: 10.1145/3511808.3557308

[6] Yanyan Fan, Ye Yuan, Guoliang Lv, Hao Chen, Jian Li, and Feifei Liu. MetaER-TTE: An adaptive meta-learning model for en route travel time estimation. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, pages 1806–1812, 2022. doi: 10.24963/ijcai.2022/251

[7] Xiaomin Fang, Jizhou Huang, Fan Wang, Lingke Zeng, Haijin Liang, and Haifeng Wang. ConSTGAT: Contextual spatial-temporal graph attention network for travel time estimation at baidu maps. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 2697–2705, 2020. doi: 10.1145/3394486.3403316

[8] Xiaomin Fang, Jizhou Huang, Fan Wang, Lihang Liu, Yibo Sun, and Haifeng Wang. SSML: Self-supervised meta-learner for en route travel time estimation at baidu maps. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 2840–2848, 2021. doi: 10.1145/3447548.3467109

[9] Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning, pages 1126–1135, 2017.

[10] Kun Fu, Fanlin Meng, Jieping Ye, and Zheng Wang. CompactETA: A fast inference system for travel time prediction. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 3337–3345, 2020. doi: 10.1145/3394486.3403387

[11] Jiawei Jiang, Dayan Pan, Houxing Ren, Xiaohan Jiang, Chao Li, and Jingyuan Wang. Self-supervised trajectory representation learning with temporal regularities and travel semantics. In 2023 IEEE 39th International Conference on Data Engineering (ICDE), pages 843–855. IEEE, 2023.

[12] Konstantin Klemmer, Esther Rolf, Caleb Robinson, Lester Mackey, and Marc Rußwurm. Satclip: Global, general-purpose location embeddings with satellite imagery. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 4347–4355, 2025.

[13] Xiucheng Li, Kaiqi Zhao, Gao Cong, Christian S. Jensen, and Wei Wei. Deep representation learning for trajectory similarity computation. In Proceedings of the 34th IEEE International Conference on Data Engineering (ICDE), pages 617–628. IEEE, 2018.

[14] Yan Lin, Huaiyu Wan, Shengnan Guo, Jilin Hu, Christian S. Jensen, and Youfang Lin. Pre-training general trajectory embeddings with maximum multi-view entropy coding. IEEE Transactions on Knowledge and Data Engineering, 2023. doi: 10.48550/arXiv.2207.14539

[15] Hao Long, Silin Zhou, Lisi Chen, and Shuo Shang. Region-point joint representation for effective trajectory similarity learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(18):15439–15447, Mar. 2026. doi: 10.1609/aaai.v40i18.38571. URL https://ojs.aaai.org/index.php/AAAI/article/view/38571

[16] Zhipeng Ma, Zheyan Tu, Xinhai Chen, Yan Zhang, Deguo Xia, Guyue Zhou, Yilun Chen, Yu Zheng, and Jiangtao Gong. More than routing: Joint gps and route modeling for refine trajectory representation learning. In Proceedings of the ACM Web Conference 2024, pages 3064–3075, 2024.

[17] Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, and Ni Lao. Multi-scale representation learning for spatial feature distributions using grid cells. arXiv preprint arXiv:2003.00824, 2020.

[18] Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, and Ni Lao. Multi-scale representation learning for spatial feature distributions using grid cells. In 8th International Conference on Learning Representations, ICLR 2020, 2020.

[19] Alameen Najjar. Towards a foundation model for trajectory intelligence. In 2023 IEEE International Conference on Data Mining Workshops (ICDMW), pages 832–835. IEEE, 2023.

[20] Stefan Schestakov and Simon Gottschalk. Trajectory representation learning on road networks and grids with spatio-temporal dynamics. arXiv preprint arXiv:2411.14014, 2024.

[21] Jianlin Su, Murtadha Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. Roformer: Enhanced transformer with rotary position embedding. Neurocomputing, 568:127063, 2024.

[22] Uber Technologies. H3: A Hexagonal Hierarchical Geospatial Indexing System (version 4). https://h3geo.org/, 2022. Accessed: April 22, 2026.

[23] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.

[24] Dong Wang, Junbo Zhang, Wei Cao, Jian Li, and Yu Zheng. When will you arrive? estimating travel time based on deep neural networks. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, pages 2500–2507, 2018.

[25] Zhe Xu, Zhixin Li, Qingwen Guan, Dingshui Zhang, Qiang Li, Junxiao Nan, Chunyang Liu, Wei Bian, and Jieping Ye. Large-scale order dispatch in on-demand ride-hailing platforms: A learning and planning approach. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 905–913, 2018.

[26] Chen Yang, Junzhuo Li, Xinyao Niu, Xinrun Du, Songyang Gao, Haoran Zhang, Zhaoliang Chen, Xingwei Qu, Ruibin Yuan, Yizhi Li, Jiaheng Liu, Stephen W. Huang, Shawn Yue, and Ge Zhang. The fine line: Navigating large language model pretraining with down-streaming capability analysis, 2024. URL https://arxiv.org/abs/2404.01204. Direct support: the abstract exp...

[27] Chuang Yang, Renhe Jiang, Xiaohang Xu, Chuan Xiao, and Kaoru Sezaki. SIMformer: Single-layer vanilla transformer can learn free-space trajectory similarity. Proceedings of the VLDB Endowment, 2025. doi: 10.14778/3705829.3705853

[28] Sean Bin Yang, Chenjuan Guo, Jilin Hu, Jian Tang, and Bin Yang. Unsupervised path representation learning with curriculum negative sampling. arXiv preprint arXiv:2106.09373, 2021.

[29] Di Yao, Chao Zhang, Zhihua Zhu, Qin Hu, Zheng Wang, Jianhui Huang, and Jingping Bi. Learning deep representation for trajectory clustering. Expert Systems, 35(2):e12252, 2018. doi: 10.1111/exsy.12252. URL https://onlinelibrary.wiley.com/doi/abs/10.1111/exsy.12252

[30] Di Yao, Chao Zhang, Zhihua Zhu, Jianhui Hu, and Jingping Bi. Computing trajectory similarity in linear time: A generic seed-guided neural metric learning approach. In Proceedings of the 35th IEEE International Conference on Data Engineering (ICDE), pages 1358–1369. IEEE, 2019. doi: 10.1109/ICDE.2019.00123

[32] Di Yao, Haonan Hu, Lun Du, Gao Cong, Shi Han, and Jingping Bi. TrajGAT: A graph-based long-term dependency modeling approach for trajectory similarity computation. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pages 2275–2285. ACM, 2022. doi: 10.1145/3534678.3539358

[33] Yu Zheng, Licia Capra, Ouri Wolfson, and Hai Yang. Urban computing: concepts, methodologies, and applications. ACM Transactions on Intelligent Systems and Technology (TIST), 5(3):1–55, 2014.

[34] Yuanshao Zhu, James Jianqiao Yu, Xiangyu Zhao, Xun Zhou, Liang Han, Xuetao Wei, and Yuxuan Liang. Unitraj: Learning a universal trajectory foundation model from billion-scale worldwide traces. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.

[1] [1]

Mobility network models of covid-19 explain inequities and inform reopening

Serina Chang, Emma Pierson, Pang Wei Koh, Jaline Gerardin, Beth Redbird, David Grusky, and Jure Leskovec. Mobility network models of covid-19 explain inequities and inform reopening. Nature, 589(7840):82–87, 2021

work page 2021

[2] [2]

Contrastive trajectory similarity learning with dual-feature attention

Yanchuan Chang, Jianzhong Qi, Yuxuan Liang, and Egemen Tanin. Contrastive trajectory similarity learning with dual-feature attention. InProceedings of the 39th IEEE International Conference on Data Engineering (ICDE). IEEE, 2023. doi: 10.1109/ICDE55515.2023.00119

work page doi:10.1109/icde55515.2023.00119 2023

[3] [3]

Nexus: Same Pretraining Loss, Better Downstream Generalization via Common Minima

Huanran Chen, Huaqing Zhang, Xiao Li, Yinpeng Dong, Ke Shen, and Jun Zhu. Nexus: Same pretraining loss, better downstream generalization via common minima, 2026. URL https://arxiv.org/abs/2604.09258. Direct support: the paper reports significantly better downstream performance despite achieving the same pretraining loss, challenging reliance on pretrainin...

work page internal anchor Pith review Pith/arXiv arXiv 2026

[4] [4]

KGTS: Contrastive trajectory similarity learning over prompt knowledge graph embedding

Zhen Chen, Dalin Zhang, Shanshan Feng, Kaixuan Chen, Lisi Chen, Peng Han, and Shuo Shang. KGTS: Contrastive trajectory similarity learning over prompt knowledge graph embedding. In Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI). AAAI Press, 2024. doi: 10.1609/aaai.v38i8.28672

work page doi:10.1609/aaai.v38i8.28672 2024

[5] [5]

Efficient trajectory similarity computation with contrastive learning

Liwei Deng, Yan Zhao, Zidan Wang, Hao Fan, and Kai Zheng. Efficient trajectory similarity computation with contrastive learning. InProceedings of the 31st ACM International Conference on Information & Knowledge Management (CIKM), pages 229–239. ACM, 2022. doi: 10.1145/ 3511808.3557308

work page arXiv 2022

[6] [6]

and Lagerkvist, V

Yanyan Fan, Ye Yuan, Guoliang Lv, Hao Chen, Jian Li, and Feifei Liu. MetaER-TTE: An adaptive meta-learning model for en route travel time estimation. InProceedings of the Thirty- First International Joint Conference on Artificial Intelligence, pages 1806–1812, 2022. doi: 10.24963/ijcai.2022/251

work page doi:10.24963/ijcai.2022/251 2022

[7] [7]

ConSTGAT: Contextual spatial-temporal graph attention network for travel time estimation at baidu maps

Xiaomin Fang, Jizhou Huang, Fan Wang, Lingke Zeng, Haijin Liang, and Haifeng Wang. ConSTGAT: Contextual spatial-temporal graph attention network for travel time estimation at baidu maps. InProceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 2697–2705, 2020. doi: 10.1145/3394486.3403316

[8]

SSML: Self-supervised meta-learner for en route travel time estimation at baidu maps

Xiaomin Fang, Jizhou Huang, Fan Wang, Lihang Liu, Yibo Sun, and Haifeng Wang. SSML: Self-supervised meta-learner for en route travel time estimation at baidu maps. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 2840–2848, 2021. doi: 10.1145/3447548.3467109

[9]

Model-agnostic meta-learning for fast adaptation of deep networks

Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning, pages 1126–1135, 2017

[10]

CompactETA: A fast inference system for travel time prediction

Kun Fu, Fanlin Meng, Jieping Ye, and Zheng Wang. CompactETA: A fast inference system for travel time prediction. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 3337–3345, 2020. doi: 10.1145/3394486.3403387

[11]

Self-supervised trajectory representation learning with temporal regularities and travel semantics

Jiawei Jiang, Dayan Pan, Houxing Ren, Xiaohan Jiang, Chao Li, and Jingyuan Wang. Self-supervised trajectory representation learning with temporal regularities and travel semantics. In 2023 IEEE 39th International Conference on Data Engineering (ICDE), pages 843–855. IEEE, 2023

[12]

Satclip: Global, general-purpose location embeddings with satellite imagery

Konstantin Klemmer, Esther Rolf, Caleb Robinson, Lester Mackey, and Marc Rußwurm. Satclip: Global, general-purpose location embeddings with satellite imagery. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 4347–4355, 2025.

[13]

Deep representation learning for trajectory similarity computation

Xiucheng Li, Kaiqi Zhao, Gao Cong, Christian S. Jensen, and Wei Wei. Deep representation learning for trajectory similarity computation. In Proceedings of the 34th IEEE International Conference on Data Engineering (ICDE), pages 617–628. IEEE, 2018

[14]

Pre-training general trajectory embeddings with maximum multi-view entropy coding

Yan Lin, Huaiyu Wan, Shengnan Guo, Jilin Hu, Christian S. Jensen, and Youfang Lin. Pre-training general trajectory embeddings with maximum multi-view entropy coding. IEEE Transactions on Knowledge and Data Engineering, 2023. doi: 10.48550/arXiv.2207.14539

[15]

Region-point joint representation for effective trajectory similarity learning

Hao Long, Silin Zhou, Lisi Chen, and Shuo Shang. Region-point joint representation for effective trajectory similarity learning. Proceedings of the AAAI Conference on Artificial Intelligence, 40(18):15439–15447, Mar. 2026. doi: 10.1609/aaai.v40i18.38571. URL https://ojs.aaai.org/index.php/AAAI/article/view/38571

[16]

More than routing: Joint gps and route modeling for refine trajectory representation learning

Zhipeng Ma, Zheyan Tu, Xinhai Chen, Yan Zhang, Deguo Xia, Guyue Zhou, Yilun Chen, Yu Zheng, and Jiangtao Gong. More than routing: Joint gps and route modeling for refine trajectory representation learning. In Proceedings of the ACM Web Conference 2024, pages 3064–3075, 2024

[17]

Multi-scale representation learning for spatial feature distributions using grid cells

Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, and Ni Lao. Multi-scale representation learning for spatial feature distributions using grid cells. arXiv preprint arXiv:2003.00824, 2020

[18]

Multi-scale representation learning for spatial feature distributions using grid cells

Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, and Ni Lao. Multi-scale representation learning for spatial feature distributions using grid cells. In 8th International Conference on Learning Representations, ICLR 2020, 2020

[19]

Towards a foundation model for trajectory intelligence

Alameen Najjar. Towards a foundation model for trajectory intelligence. In 2023 IEEE International Conference on Data Mining Workshops (ICDMW), pages 832–835. IEEE, 2023

[20]

Trajectory representation learning on road networks and grids with spatio-temporal dynamics

Stefan Schestakov and Simon Gottschalk. Trajectory representation learning on road networks and grids with spatio-temporal dynamics. arXiv preprint arXiv:2411.14014, 2024

[21]

Roformer: Enhanced transformer with rotary position embedding

Jianlin Su, Murtadha Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. Roformer: Enhanced transformer with rotary position embedding. Neurocomputing, 568:127063, 2024

[22]

H3: A Hexagonal Hierarchical Geospatial Indexing System (version 4)

Uber Technologies. H3: A Hexagonal Hierarchical Geospatial Indexing System (version 4). https://h3geo.org/, 2022. Accessed: April 22, 2026

[23]

Attention is all you need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017

[24]

When will you arrive? estimating travel time based on deep neural networks

Dong Wang, Junbo Zhang, Wei Cao, Jian Li, and Yu Zheng. When will you arrive? estimating travel time based on deep neural networks. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, pages 2500–2507, 2018

[25]

Large-scale order dispatch in on-demand ride-hailing platforms: A learning and planning approach

Zhe Xu, Zhixin Li, Qingwen Guan, Dingshui Zhang, Qiang Li, Junxiao Nan, Chunyang Liu, Wei Bian, and Jieping Ye. Large-scale order dispatch in on-demand ride-hailing platforms: A learning and planning approach. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 905–913, 2018

[26]

The fine line: Navigating large language model pretraining with down-streaming capability analysis

Chen Yang, Junzhuo Li, Xinyao Niu, Xinrun Du, Songyang Gao, Haoran Zhang, Zhaoliang Chen, Xingwei Qu, Ruibin Yuan, Yizhi Li, Jiaheng Liu, Stephen W. Huang, Shawn Yue, and Ge Zhang. The fine line: Navigating large language model pretraining with down-streaming capability analysis, 2024. URL https://arxiv.org/abs/2404.01204. Direct support: the abstract exp...

[27]

SIMformer: Single-layer vanilla transformer can learn free-space trajectory similarity

Chuang Yang, Renhe Jiang, Xiaohang Xu, Chuan Xiao, and Kaoru Sezaki. SIMformer: Single-layer vanilla transformer can learn free-space trajectory similarity. Proceedings of the VLDB Endowment, 2025. doi: 10.14778/3705829.3705853.

[28]

Unsupervised path representation learning with curriculum negative sampling

Sean Bin Yang, Chenjuan Guo, Jilin Hu, Jian Tang, and Bin Yang. Unsupervised path representation learning with curriculum negative sampling. arXiv preprint arXiv:2106.09373, 2021

[29]

Learning deep representation for trajectory clustering

Di Yao, Chao Zhang, Zhihua Zhu, Qin Hu, Zheng Wang, Jianhui Huang, and Jingping Bi. Learning deep representation for trajectory clustering. Expert Systems, 35(2):e12252, 2018. doi: 10.1111/exsy.12252. URL https://onlinelibrary.wiley.com/doi/abs/10.1111/exsy.12252

[30]

Computing trajectory similarity in linear time: A generic seed-guided neural metric learning approach

Di Yao, Chao Zhang, Zhihua Zhu, Jianhui Hu, and Jingping Bi. Computing trajectory similarity in linear time: A generic seed-guided neural metric learning approach. In Proceedings of the 35th IEEE International Conference on Data Engineering (ICDE), pages 1358–1369. IEEE, 2019. doi: 10.1109/ICDE.2019.00123

[32]

TrajGAT: A graph-based long-term dependency modeling approach for trajectory similarity computation

Di Yao, Haonan Hu, Lun Du, Gao Cong, Shi Han, and Jingping Bi. TrajGAT: A graph-based long-term dependency modeling approach for trajectory similarity computation. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), pages 2275–2285. ACM, 2022. doi: 10.1145/3534678.3539358

[33]

Urban computing: concepts, methodologies, and applications

Yu Zheng, Licia Capra, Ouri Wolfson, and Hai Yang. Urban computing: concepts, methodologies, and applications. ACM Transactions on Intelligent Systems and Technology (TIST), 5(3):1–55, 2014

[34]

Unitraj: Learning a universal trajectory foundation model from billion-scale worldwide traces

Yuanshao Zhu, James Jianqiao Yu, Xiangyu Zhao, Xun Zhou, Liang Han, Xuetao Wei, and Yuxuan Liang. Unitraj: Learning a universal trajectory foundation model from billion-scale worldwide traces. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. A Algorithm: Density-Adaptive Vocabulary Construction Algorithm 1Density-Adapti...
