From GPS Points to Travel Patterns: Flexible and Semantic Trajectory Generation with LLMs

Chenhao Wang; Lisi Chen; Panos Kalnis; Shuo Shang; Silin Zhou; Yuntao Wen

arxiv: 2605.30014 · v1 · pith:MQM3Y6HWnew · submitted 2026-05-28 · 💻 cs.AI

Pith reviewed 2026-06-29 07:15 UTC · model grok-4.3

keywords: trajectory generation, large language models, urban trajectories, RQ-VAE, travel patterns, GPS synthesis, privacy preservation, semantic generation

The pith

HTP generates flexible urban trajectories in two stages: an RQ-VAE first compresses GPS trajectories into travel pattern tokens, and a fine-tuned LLM then generates pattern sequences under varied conditions, which are decoded back into GPS points.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the limits of prior trajectory generators that produce only fixed-length paths without explicit travel patterns. It introduces HTP to synthesize realistic GPS data while protecting privacy by working at the level of semantic patterns first. A specialized RQ-VAE quantizes raw trajectories into compact travel-pattern tokens that retain spatial details such as density variations. These tokens extend an LLM's vocabulary, after which supervised fine-tuning teaches the model to output valid pattern sequences conditioned on different scenarios. The resulting points are then decoded from the patterns, yielding higher-quality outputs than direct GPS generation.

Core claim

HTP first applies a trajectory-specific residual quantization variational autoencoder to turn micro-level GPS trajectories into macro-level travel pattern tokens that encode segment irregularities, then extends the LLM vocabulary with these tokens and applies supervised fine-tuning so the model can generate variable-length travel pattern sequences under multiple conditions before decoding back to GPS points.

What carries the argument

Trajectory-specific residual quantization variational autoencoder (RQ-VAE) that converts GPS trajectories into compact travel pattern tokens in a coarse-to-fine manner.
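The coarse-to-fine idea behind residual quantization can be sketched in a few lines. This is a minimal illustration, not the paper's trajectory-specific RQ-VAE: the encoder, decoder, and training losses are omitted, and the two toy codebooks, the 2-D vectors, and the function names are assumptions made for the example. Each layer picks the nearest code for whatever the previous layers left unexplained.

```python
from typing import List, Sequence, Tuple


def nearest_code(vec: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry closest to vec (squared L2 distance)."""
    def sq_dist(code: Sequence[float]) -> float:
        return sum((v - c) ** 2 for v, c in zip(vec, code))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def residual_quantize(
    vec: Sequence[float],
    codebooks: Sequence[Sequence[Sequence[float]]],
) -> Tuple[List[int], List[float]]:
    """Quantize vec layer by layer; return one code index per layer and the reconstruction."""
    residual = list(vec)
    approx = [0.0] * len(vec)
    codes: List[int] = []
    for codebook in codebooks:
        idx = nearest_code(residual, codebook)
        codes.append(idx)
        # Add the chosen code to the running reconstruction ...
        approx = [a + c for a, c in zip(approx, codebook[idx])]
        # ... and pass what is still unexplained on to the next (finer) layer.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes, approx


# Toy codebooks (made up for illustration): a coarse layer and a fine layer.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]],
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],
]
codes, approx = residual_quantize([1.2, 0.8], codebooks)
print(codes, approx)
```

In this toy run the first layer captures the rough location and the second layer refines it, which is the sense in which the code sequence reads from coarse to fine.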

If this is right

Trajectories can be generated at variable lengths and under multiple user-specified conditions rather than a single fixed setting.
Travel pattern tokens capture point-density variations caused by traffic conditions that direct GPS generators miss.
Extending the LLM vocabulary with the tokens aligns trajectory data with the model's existing language capabilities.
Supervised fine-tuning on the tokens produces higher-quality outputs than direct point generation baselines.
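Vocabulary extension, as described above, amounts to giving every (codebook layer, code index) pair its own token ID. The sketch below uses a plain dictionary in place of a real tokenizer; the `<tp_layer_code>` naming scheme, the function names, and the toy vocabulary are hypothetical and are not taken from the paper or its repository.

```python
from typing import Dict, List, Sequence


def pattern_token(layer: int, code: int) -> str:
    """Hypothetical naming scheme for a travel pattern token."""
    return f"<tp_{layer}_{code}>"


def extend_vocab(vocab: Dict[str, int], num_layers: int, codebook_size: int) -> List[str]:
    """Add one token per (layer, code) pair to vocab; return the tokens that were new."""
    added: List[str] = []
    for layer in range(num_layers):
        for code in range(codebook_size):
            tok = pattern_token(layer, code)
            if tok not in vocab:
                vocab[tok] = len(vocab)  # next free ID
                added.append(tok)
    return added


def codes_to_tokens(codes_per_segment: Sequence[Sequence[int]]) -> List[str]:
    """Flatten per-segment code tuples into one token sequence, coarse layer first."""
    return [
        pattern_token(layer, code)
        for codes in codes_per_segment
        for layer, code in enumerate(codes)
    ]


# Toy base vocabulary standing in for an LLM tokenizer.
vocab = {"<pad>": 0, "<eos>": 1, "hello": 2}
added = extend_vocab(vocab, num_layers=2, codebook_size=3)
tokens = codes_to_tokens([[1, 1], [2, 0]])
ids = [vocab[t] for t in tokens]
print(tokens, ids)
```

With Hugging Face Transformers, one common way to do the same thing is `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`; whether HTP does it this way is not stated in the text above.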

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same tokenization-plus-LLM pipeline could be tested on other sequential spatial data such as animal movement tracks or delivery routes.
If the tokens prove robust, the approach might reduce reliance on raw location traces for training downstream urban models.
Conditioning the LLM on external signals like weather or events could be added to simulate scenario-specific mobility without retraining the quantizer.

Load-bearing premise

The RQ-VAE tokens preserve all necessary spatial irregularities, and supervised fine-tuning makes the LLM generate valid pattern sequences without mode collapse or loss of fidelity.

What would settle it

On a held-out real-world dataset, if the generated trajectories show no improvement over baselines in metrics for length variability, spatial irregularity, or semantic match, or if they exhibit mode collapse, the performance claim would be falsified.
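Two of the failure signs named above, mode collapse and missing length variability, can be screened with very simple statistics before running full metrics. The sketch below is an illustrative check, not the paper's evaluation protocol; the function names and the toy sequences are assumptions.

```python
from statistics import pstdev
from typing import List, Sequence


def distinct_ratio(sequences: Sequence[Sequence[str]]) -> float:
    """Fraction of sequences that are unique; values near 0 hint at mode collapse."""
    if not sequences:
        return 0.0
    return len({tuple(s) for s in sequences}) / len(sequences)


def length_spread(sequences: Sequence[Sequence[str]]) -> float:
    """Population standard deviation of sequence lengths; 0 means every length is identical."""
    lengths: List[int] = [len(s) for s in sequences]
    return pstdev(lengths)


# Toy token sequences (made up): a varied "real" set and a collapsed "generated" set.
real = [["a", "b"], ["a", "b", "c", "d"], ["c"]]
collapsed = [["a", "b"], ["a", "b"], ["a", "b"]]

print(distinct_ratio(real), length_spread(real))
print(distinct_ratio(collapsed), length_spread(collapsed))
```

A generated set whose distinct ratio or length spread is far below that of the real set would be a reason to look more closely; these numbers alone do not establish quality.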

Figures

Figures reproduced from arXiv: 2605.30014 by Chenhao Wang, Lisi Chen, Panos Kalnis, Shuo Shang, Silin Zhou, Yuntao Wen.

**Figure 2.** The training overview of HTP. The details of the encoder and decoder are shown in Figure 13 of the Appendix.

**Figure 3.** Illustration of the process for obtaining training ...

**Figure 5.** Visualization comparisons on Chengdu dataset.

**Figure 6.** Visualization comparisons on Porto dataset.

**Figure 7.** Comparisons of trajectory length density distribu...

**Figure 8.** Comparisons of generation speed for one trajectory.

**Figure 9.** Proportion of token usage in each codebook layer of RQ-VAE during training on the Chengdu dataset.

**Figure 11.** Visualization of case study.

Excerpt from Section 5.6, Case Study (RQ5), Codebook: To further investigate what the codebook has learned, we visualize the trajectories sharing the same codes at the first layer in ...

**Figure 12.** The length token transformation of odd-even vari...

**Figure 13.** The details of the encoder and decoder of trajectory-specific RQ-VAE.
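The Figure 9 caption refers to the proportion of token usage in each codebook layer. One plausible reading, assumed here rather than confirmed by the caption, is the fraction of a layer's codebook entries that are selected at least once. A minimal sketch under that assumption, with made-up code assignments:

```python
from typing import List, Sequence, Set


def codebook_usage(code_tuples: Sequence[Sequence[int]], codebook_size: int) -> List[float]:
    """Per layer, the fraction of codebook entries that appear at least once."""
    if not code_tuples:
        return []
    num_layers = len(code_tuples[0])
    used: List[Set[int]] = [set() for _ in range(num_layers)]
    for codes in code_tuples:
        for layer, code in enumerate(codes):
            used[layer].add(code)
    return [len(u) / codebook_size for u in used]


# Each tuple holds one code index per layer for a single quantized segment (toy data).
assignments = [(0, 3), (0, 1), (2, 1), (0, 0)]
usage = codebook_usage(assignments, codebook_size=4)
print(usage)
```

Low usage in a layer would suggest that part of the codebook is dead, which is a common concern with vector-quantized models.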

Abstract

Urban trajectories play a crucial role in modeling urban dynamics and supporting various smart city applications. However, privacy concerns restrict access to large-scale and high-quality trajectory datasets. Trajectory generation provides a promising alternative by synthesizing realistic data to mitigate privacy risks. However, existing methods fail to explicitly capture travel patterns and can only generate fixed-length trajectories under a single condition. To address these limitations, we propose HTP, which Hierarchically generates Travel patterns first and then generates GPS Points by using large language models (LLMs), rather than directly generating GPS points. We first design a trajectory-specific residual quantization variational autoencoder (RQ-VAE) that quantizes micro-level GPS trajectories into compact, macro-level travel pattern tokens in a coarse-to-fine manner. These tokens capture rich segment spatial irregularities, such as point density variations caused by traffic conditions. Then, we extend the LLM vocabulary with travel pattern tokens to align trajectory representations with the LLM input, and apply supervised fine-tuning (SFT) to align the LLM with the trajectory generation task, enabling generation of travel pattern sequences under various conditions. Extensive experiments on two real-world datasets show that HTP outperforms the strongest baseline by an average of 29.78% in terms of generation quality. Our code is available at https://github.com/slzhou-xy/HTP.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

HTP's RQ-VAE tokenization plus LLM vocabulary extension and SFT produces conditioned variable-length trajectories and reports a 29.78% average gain over the strongest baseline, but the abstract leaves the metrics and baselines underspecified.

The main point is that this work generates trajectories by first learning compact travel pattern tokens via a residual quantization VAE that works coarse-to-fine on GPS data, then extending an LLM's vocabulary with those tokens and fine-tuning it to produce sequences under different conditions before decoding back to points. The result is flexible length and multi-condition output instead of the fixed-length single-condition limits in earlier methods.

The new piece is the specific pipeline: the trajectory-tuned RQ-VAE that keeps segment-level irregularities like density changes from traffic, followed by the vocab extension and SFT step to make the LLM handle the token sequences. That combination is not in the prior direct-generation or fixed-length papers they cite. Releasing the code helps.

It does a clean job addressing the privacy problem by synthesizing data that can support urban modeling tasks. The reported 29.78% average improvement over the strongest baseline on two real datasets is the central empirical claim.

The soft spot is that the abstract states the improvement number without naming the exact metrics, listing all baselines, or noting statistical tests or split details. That makes it difficult to assess how solid the gain is until the experiments section is checked. The assumption that the quantized tokens retain everything needed for valid patterns also needs explicit checks for fidelity loss or generation artifacts.

This is for researchers in mobility synthesis and LLM use on structured sequences. Readers working on smart-city data or privacy-preserving simulation will find the method and results useful. The work is coherent enough on its own terms to deserve a serious referee.

Referee Report

2 major / 2 minor

Summary. The paper proposes HTP, a hierarchical trajectory generation method that first uses a residual quantization VAE (RQ-VAE) to tokenize micro-level GPS trajectories into macro-level travel pattern tokens capturing spatial irregularities, then extends an LLM's vocabulary with these tokens and applies supervised fine-tuning (SFT) to generate variable-length travel pattern sequences under different conditions before producing GPS points. It reports that HTP outperforms the strongest baseline by an average of 29.78% in generation quality on two real-world datasets.

Significance. If the empirical results hold under rigorous evaluation, the work would be significant for privacy-preserving urban trajectory synthesis, as the hierarchical LLM-based approach enables flexible, condition-aware generation of semantic travel patterns rather than fixed-length GPS sequences. Strengths include the explicit handling of travel patterns via RQ-VAE tokenization, code release, and focus on real-world applicability in smart city modeling.

major comments (2)

[Experiments] Experiments section (and abstract): the central 29.78% average improvement claim requires explicit reporting of the underlying metrics (e.g., trajectory similarity measures), full list of baselines, data splits, statistical significance tests, and ablation results on the RQ-VAE quantization levels; without these, it is impossible to assess whether the gain is robust or affected by evaluation choices.
[Method] Method section on RQ-VAE: the claim that the coarse-to-fine quantization preserves all necessary spatial irregularities (e.g., point density variations) is load-bearing for the hierarchical advantage; an ablation quantifying information loss or downstream generation fidelity at different codebook sizes would be needed to support this.

minor comments (2)

[Abstract] Abstract: the performance number is presented without any metric or baseline detail, which should be added for clarity even in the abstract.
[Notation] Notation: ensure consistent use of symbols for travel pattern tokens versus raw GPS points throughout the text and figures.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which highlights important areas for improving the clarity and rigor of our experimental reporting and methodological validation. We address each major comment below and commit to revisions that strengthen the manuscript without altering its core contributions.

Referee: [Experiments] Experiments section (and abstract): the central 29.78% average improvement claim requires explicit reporting of the underlying metrics (e.g., trajectory similarity measures), full list of baselines, data splits, statistical significance tests, and ablation results on the RQ-VAE quantization levels; without these, it is impossible to assess whether the gain is robust or affected by evaluation choices.

Authors: We agree that these details are necessary for full assessment of the results. The 29.78% figure represents the average relative improvement across the primary generation quality metrics on the two datasets. In the revised manuscript, we will expand both the abstract and Experiments section to explicitly report the underlying metrics (including trajectory similarity measures such as DTW and Fréchet distance), the complete list of baselines, data split details, statistical significance tests (e.g., paired t-tests with p-values), and additional ablation results on RQ-VAE quantization levels. This will enable readers to evaluate robustness directly.

Revision: yes

Referee: [Method] Method section on RQ-VAE: the claim that the coarse-to-fine quantization preserves all necessary spatial irregularities (e.g., point density variations) is load-bearing for the hierarchical advantage; an ablation quantifying information loss or downstream generation fidelity at different codebook sizes would be needed to support this.

Authors: We recognize that an explicit ablation is required to substantiate the preservation of spatial irregularities. We will add a dedicated ablation study in the revised Method and Experiments sections that quantifies reconstruction error (information loss) and downstream trajectory generation fidelity (e.g., similarity metrics) across multiple codebook sizes and quantization levels. This will provide direct empirical support for the coarse-to-fine RQ-VAE design.

Revision: yes
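The rebuttal describes the headline figure as an average relative improvement across metrics. How such an average is typically formed can be sketched as follows. The numbers are invented for illustration and are not results from the paper; the function names and the `lower_is_better` flag are assumptions, included because divergence-style metrics improve downward while similarity-style metrics improve upward.

```python
from typing import List, Sequence, Tuple


def relative_improvement(baseline: float, ours: float, lower_is_better: bool = True) -> float:
    """Relative gain of `ours` over `baseline`, positive when `ours` is better."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero for a relative improvement")
    gain = (baseline - ours) if lower_is_better else (ours - baseline)
    return gain / abs(baseline)


def average_relative_improvement(rows: Sequence[Tuple[float, float, bool]]) -> float:
    """Mean of per-metric relative improvements; each row is (baseline, ours, lower_is_better)."""
    gains: List[float] = [relative_improvement(b, o, lib) for b, o, lib in rows]
    return sum(gains) / len(gains)


# Invented example: one divergence-style metric (lower is better)
# and one similarity-style metric (higher is better).
rows = [
    (0.20, 0.10, True),
    (0.50, 0.75, False),
]
avg = average_relative_improvement(rows)
print(f"{avg:.2%}")
```

Because the average weights every metric equally, a single metric with a very small baseline value can dominate it, which is one reason the referee's request for per-metric numbers matters.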

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper presents an empirical ML pipeline (RQ-VAE tokenization of GPS trajectories into travel pattern tokens, LLM vocabulary extension, and SFT) whose central claim is an observed 29.78% quality improvement on two real-world datasets. No equations, first-principles derivations, or predictions appear in the provided text that reduce by construction to fitted inputs, self-definitions, or self-citation chains. The result is framed as an experimental outcome rather than a mathematical necessity, satisfying the criteria for a self-contained empirical contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract, no explicit free parameters, axioms, or invented entities are described; the method relies on standard VAE and LLM fine-tuning techniques.

Reference graph

Works this paper leans on

67 extracted references · 12 canonical work pages · 10 internal anchors

[1] Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, et al. 2023. Qwen technical report. arXiv preprint arXiv:2309.16609 (2023).
[2] Hannah Bast, Daniel Delling, Andrew Goldberg, Matthias Müller-Hannemann, Thomas Pajor, Peter Sanders, Dorothea Wagner, and Renato F Werneck. 2016. Route planning in transportation networks. In Algorithm engineering: Selected results and surveys. 19–80.
[3] Geoff Boeing. 2017. OSMnx: New methods for acquiring, constructing, analyzing, and visualizing complex street networks. Computers, environment and urban systems 65 (2017), 126–139.
[4] Wei Chen, Yuxuan Liang, Yuanshao Zhu, Yanchuan Chang, Kang Luo, Haomin Wen, Lei Li, Yanwei Yu, Qingsong Wen, Chao Chen, Kai Zheng, Yunjun Gao, Xiaofang Zhou, and Yu Zheng. 2024. Deep Learning for Trajectory Data Management and Mining: A Survey and Beyond. In arXiv. 2403.14151.
[5] Xin Chen, Chengrui Huang, Chenhao Wang, and Lisi Chen. 2025. Trajectory generation: a survey on methods and techniques. GeoInformatica 29, 3 (2025), 351–376.
[6] Xinyu Chen, Jiajie Xu, Rui Zhou, Wei Chen, Junhua Fang, and Chengfei Liu. 2021. TrajVAE: A Variational AutoEncoder model for trajectory generation. Neurocomputing 428 (2021), 332–339.
[7] DeepSeek-AI. 2025. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. In arXiv. 2501.12948.
[8] Bangchao Deng, Xin Jing, Tianyue Yang, Bingqing Qu, Dingqi Yang, and Philippe Cudré-Mauroux. 2025. Revisiting Synthetic Human Trajectories: Imitative Generation and Benchmarks Beyond Datasaurus. In KDD. 201–212.
[9] Xiaomin Fang, Jizhou Huang, Fan Wang, Lihang Liu, Yibo Sun, and Haifeng Wang. 2021. SSML: Self-Supervised Meta-Learner for En Route Travel Time Estimation at Baidu Maps. In KDD. 2840–2848.
[10] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron C. Courville, and Yoshua Bengio. 2020. Generative adversarial networks. Commun. ACM 63, 11 (2020), 139–144.
[11] Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable Feature Learning for Networks. In KDD. 855–864.
[12] Baoshen Guo, Zhiqing Hong, Junyi Li, Shenhao Wang, and Jinhua Zhao. 2026. Leveraging the Spatial Hierarchy: Coarse-to-fine Trajectory Generation via Cascaded Hybrid Diffusion. In KDD. 359–370.
[13] Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising Diffusion Probabilistic Models. In NeurIPS.
[14] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Comput. 9, 8 (1997), 1735–1780.
[15] Danlei Hu, Lu Chen, Hanxi Fang, Ziquan Fang, Tianyi Li, and Yunjun Gao. 2024. Spatio-Temporal Trajectory Similarity Measures: A Comprehensive Survey and Quantitative Study. TKDE 36, 5 (2024), 2191–2212.
[16] Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2022. LoRA: Low-Rank Adaptation of Large Language Models. In ICLR.
[17] Yujia Hu, Yuntao Du, Zhikun Zhang, Ziquan Fang, Lu Chen, Kai Zheng, and Yunjun Gao. 2024. Real-Time Trajectory Synthesis with Local Differential Privacy. In ICDE. 1685–1698.
[18] Diederik P. Kingma and Max Welling. 2014. Auto-Encoding Variational Bayes. In ICLR.
[19] Zhifeng Kong, Wei Ping, Jiaji Huang, Kexin Zhao, and Bryan Catanzaro. 2021. DiffWave: A Versatile Diffusion Model for Audio Synthesis. In ICLR.
[20] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet Classification with Deep Convolutional Neural Networks. In NIPS. 1106–1114.
[21] Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph Gonzalez, Hao Zhang, and Ion Stoica. 2023. Efficient Memory Management for Large Language Model Serving with PagedAttention. In SOSP. 611–626.
[22] Siqi Lai, Zhao Xu, Weijia Zhang, Hao Liu, and Hui Xiong. 2025. LLMLight: Large Language Models as Traffic Signal Control Agents. In KDD. 2335–2346.
[23] Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, and Wook-Shin Han. 2022. Autoregressive Image Generation using Residual Quantization. In CVPR. 11513–11522.
[24] Jianhua Lin. 2002. Divergence measures based on the Shannon entropy. IEEE Transactions on Information theory 37, 1 (2002), 145–151.
[25] Xia Liu, Hanzhou Chen, and Clio Andris. 2018. trajGANs: Using generative adversarial networks for geo-privacy protection of trajectory data (Vision paper). In Location privacy and security workshop. 1–7.
[26] Ilya Loshchilov and Frank Hutter. 2019. Decoupled Weight Decay Regularization. In ICLR.
[27] Yin Lou, Chengyang Zhang, Yu Zheng, Xing Xie, Wei Wang, and Yan Huang. Map-matching for low-sampling-rate GPS trajectories. In SIGSPATIAL. 352–361.
[28] Xuebin Ma, Zinan Ding, and Xiaoyan Zhang. 2024. ST-TrajGAN: A synthetic trajectory generation algorithm for privacy preservation. Future Generation Computer Systems 161 (2024), 226–238.
[29] Xiaowei Mao, Yan Lin, Shengnan Guo, Yubin Chen, Xingyu Xian, Haomin Wen, Qisen Xu, Youfang Lin, and Huaiyu Wan. 2025. DutyTTE: Deciphering Uncertainty in Origin-Destination Travel Time Estimation. In AAAI. 12390–12398.
[30] Mehmet Ercan Nergiz, Maurizio Atzori, Yücel Saygin, and Baris Güç. 2009. Towards Trajectory Anonymization: a Generalization-Based Approach. Transactions on Data Privacy 2, 1 (2009), 47–75.
[31] OpenAI. 2023. GPT-4 Technical Report. In arXiv. 2303.08774.
[32] Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Journal of machine learning research 21 (2020), 140:1–140:67.
[33] Jinmeng Rao, Song Gao, Yuhao Kang, and Qunying Huang. 2020. LSTM-TrajGAN: A deep learning approach to trajectory privacy protection. In arXiv. 2006.10521.
[34] Xuan Rao, Shuo Shang, Renhe Jiang, Peng Han, and Lisi Chen. 2025. Seed: Bridging Sequence and Diffusion Models for Road Trajectory Generation. In WWW. 2007–2017.
[35] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. 2015. U-Net: Convolutional Networks for Biomedical Image Segmentation. In MICCAI, Vol. 9351. 234–241.
[36] Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Mingchuan Zhang, Y. K. Li, Y. Wu, and Daya Guo. 2024. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models. In arXiv. 2402.03300.
[37] Filippo Simini, Gianni Barlacchi, Massimiliano Luca, and Luca Pappalardo. 2021. A deep gravity model for mobility flows generation. Nature communications 12, 1 (2021), 6576.
[38] Song Tao and Jia Wang. 2020. Alleviation of Gradient Exploding in GANs: Fake Can Be Real. In CVPR. 1188–1197.
[39] Llama Team. 2024. The Llama 3 Herd of Models. In arXiv. 2407.21783.
[40] Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, et al. 2023. Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023).
[41] Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, et al. 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023).
[42] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In NIPS. 5998–6008.
[43] Chenhao Wang, Lisi Chen, Shuo Shang, Christian S Jensen, and Panos Kalnis. Multi-scale detection of anomalous spatio-temporal trajectories in evolving trajectory datasets. In KDD. 2980–2990.
[44] Huandong Wang, Changzheng Gao, Yuchen Wu, Depeng Jin, Lina Yao, and Yong Li. 2023. PateGail: A Privacy-Preserving Mobility Trajectory Generator with Imitation Learning. In AAAI. 14539–14547.
[45] Jingyuan Wang, Yujing Lin, and Yudong Li. 2025. GTG: Generalizable Trajectory Generation Model for Urban Mobility. In AAAI. AAAI Press, 834–842.
[46] Jian Xie, Kai Zhang, Jiangjie Chen, Tinghui Zhu, Renze Lou, Yuandong Tian, Yanghua Xiao, and Yu Su. 2024. TravelPlanner: A Benchmark for Real-World Planning with Language Agents. In ICML.
[47] An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, et al. 2025. Qwen3 technical report. arXiv preprint arXiv:2505.09388 (2025).
[48] An Yang, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoran Wei, et al. 2024. Qwen2.5 Technical Report. arXiv preprint arXiv:2412.15115 (2024).
[49] Can Yang and Gyözö Gidófalvi. 2018. Fast map matching, an algorithm integrating hidden Markov model with precomputation. International Journal of Geographical Information Science 32, 3 (2018), 547–570.
[50] Chuang Yang, Renhe Jiang, Xiaohang Xu, Chuan Xiao, and Kaoru Sezaki. 2024. SIMformer: Single-Layer Vanilla Transformer Can Learn Free-Space Trajectory Similarity. Proc. VLDB Endow. 18, 2 (2024), 390–398.
[51] Hao Yang, Xiaobai Angela Yao, Christopher C. Whalen, and Gengchen Mai. 2025. BERT4Traj: Transformer-Based Trajectory Reconstruction for Sparse Mobility Data. In GIScience, Vol. 346. 8:1–8:9.
[52] Jiahui Yu, Xin Li, Jing Yu Koh, Han Zhang, Ruoming Pang, James Qin, Alexander Ku, Yuanzhong Xu, Jason Baldridge, and Yonghui Wu. 2022. Vector-quantized Image Modeling with Improved VQGAN. In ICLR.
[53] Jing Zhang, Qihan Huang, Yirui Huang, Qian Ding, and Pei-Wei Tsai. 2023. DP-TrajGAN: A privacy-aware trajectory generation model with differential privacy. Future Generation Computer Systems 142 (2023), 25–40.
[54] Qianru Zhang, Peng Yang, Junliang Yu, Haixin Wang, Xingwei He, Siu-Ming Yiu, and Hongzhi Yin. 2025. A Survey on Point-of-Interest Recommendation: Models, Architectures, and Security. TKDE 37, 6 (2025), 3153–3172.
[55] Ruixing Zhang, Yunqi Liu, Liangzhe Han, Leilei Sun, Chuanren Liu, Jibin Wang, and Weifeng Lv. 2025. Large-scale Human Mobility Data Regeneration for Open Urban Research. In KDD. 2827–2836.
[56] Weijia Zhang, Jindong Han, Zhao Xu, Hang Ni, Hao Liu, and Hui Xiong. 2024. Urban Foundation Models: A Survey. In KDD. 6633–6643.
[57] Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, Yifan Du, Chen Yang, Yushuo Chen, Zhipeng Chen, Jinhao Jiang, Ruiyang Ren, Yifan Li, Xinyu Tang, Zikang Liu, Peiyu Liu, Jian-Yun Nie, and Ji-Rong Wen. 2023. A Survey of Large Language Models. In arXiv. 2303.18223.
[58] Lianmin Zheng, Liangsheng Yin, Zhiqiang Xie, Chuyue Sun, Jeff Huang, Cody Hao Yu, Shiyi Cao, Christos Kozyrakis, Ion Stoica, Joseph E. Gonzalez, Clark W. Barrett, and Ying Sheng. 2024. SGLang: Efficient Execution of Structured Language Model Programs. In NeurIPS.
[59] Silin Zhou, Yao Chen, Shuo Shang, Lisi Chen, Bingsheng He, and Ryosuke Shibasaki. 2025. Blurred Encoding for Trajectory Representation Learning. In KDD. 4132–4143.
[60] Yuanshao Zhu, Yongchao Ye, Shiyao Zhang, Xiangyu Zhao, and James Yu. 2023. DiffTraj: Generating GPS Trajectory with Diffusion Probabilistic Model. In NeurIPS.
[61] Yuanshao Zhu, James Jian Qiao Yu, Xiangyu Zhao, Qidong Liu, Yongchao Ye, Wei Chen, Zijian Zhang, Xuetao Wei, and Yuxuan Liang. 2024. ControlTraj: Controllable Trajectory Generation with Topology-Constrained Diffusion Model. In KDD. 4676–4687.

Venue (from the paper's running header): KDD 2026, August 9–13, 2026, Jeju Island, Republic of Korea.

Appendix excerpts (truncated in the source)

A.1 Details of Conditions. The input information of the LLM includes road segment tokens describing the r...

Point-level. Point-level metrics evaluate the statistical properties of trajectories at the raw GPS point, focusing on micro-level and fine-grained geometric consistency between generated and real trajectories.
• Travel Distance (T-Dist): Measures the JSD on distributions in total travel distances between real trajectories and generated trajectories. The tr...

Grid-level. Grid-level metrics evaluate the spatial aggregation and regional movement patterns of trajectories, capturing how generated trajectories distribute over urban space and whether they preserve high-frequency activity regions. In this paper, each city is partitioned into grids of size 100 m × 100 m to transform GPS trajectories into grid trajectories...

Road-level. Road-level metrics evaluate trajectory realism from a road network perspective, measuring whether generated trajectories follow realistic road usage patterns and align with the underlying road topology. Aligned road segments are obtained via map-matching [27, 49].
• Density (R-Den): Measures the cosine similarity between the density distribu...
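The T-Dist metric in the point-level excerpt compares two distributions of total travel distance with the Jensen-Shannon divergence (JSD). A minimal sketch of that comparison follows. The bin edges, the toy distances, and the choice of log base 2 (which bounds the value in [0, 1]) are assumptions; the truncated excerpt does not say how the paper bins distances or which log base it uses.

```python
import math
from typing import List, Sequence


def histogram(values: Sequence[float], bin_edges: Sequence[float]) -> List[float]:
    """Normalized histogram over [edge_i, edge_{i+1}) bins; the last bin also includes its right edge."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if bin_edges[i] <= v < bin_edges[i + 1] or (is_last and v == bin_edges[-1]):
                counts[i] += 1
                break  # values outside all bins are silently ignored
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def jsd(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence between two discrete distributions, log base 2."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x: Sequence[float], y: Sequence[float]) -> float:
        # Terms with x_i == 0 contribute 0; y_i > 0 whenever x_i > 0 because y is the mixture.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy total travel distances in km (made up), binned into 2 km bins.
edges = [0.0, 2.0, 4.0, 6.0]
real_dist = histogram([1.0, 2.5, 2.7, 4.0], edges)
gen_dist = histogram([1.2, 2.6, 4.1, 4.5], edges)
t_dist = jsd(real_dist, gen_dist)
print(real_dist, gen_dist, t_dist)
```

Lower values mean the generated distance distribution is closer to the real one; identical histograms give 0.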

[1] [1]

Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, et al. 2023. Qwen technical report.arXiv preprint arXiv:2309.16609(2023)

work page internal anchor Pith review Pith/arXiv arXiv 2023

[2] [2]

Hannah Bast, Daniel Delling, Andrew Goldberg, Matthias Müller-Hannemann, Thomas Pajor, Peter Sanders, Dorothea Wagner, and Renato F Werneck. 2016. Route planning in transportation networks. InAlgorithm engineering: Selected results and surveys. 19–80

2016

[3] [3]

Geoff Boeing. 2017. OSMnx: New methods for acquiring, constructing, analyzing, and visualizing complex street networks.Computers, environment and urban systems65 (2017), 126–139

2017

[4] [4]

Wei Chen, Yuxuan Liang, Yuanshao Zhu, Yanchuan Chang, Kang Luo, Haomin Wen, Lei Li, Yanwei Yu, Qingsong Wen, Chao Chen, Kai Zheng, Yunjun Gao, Xiao- fang Zhou, and Yu Zheng. 2024. Deep Learning for Trajectory Data Management and Mining: A Survey and Beyond. InarXiv. 2403.14151

work page arXiv 2024

[5] [5]

Xin Chen, Chengrui Huang, Chenhao Wang, and Lisi Chen. 2025. Trajectory generation: a survey on methods and techniques.GeoInformatica29, 3 (2025), 351–376

2025

[6] [6]

Xinyu Chen, Jiajie Xu, Rui Zhou, Wei Chen, Junhua Fang, and Chengfei Liu

[7] [7]

Neurocomputing428 (2021), 332–339

TrajVAE: A Variational AutoEncoder model for trajectory generation. Neurocomputing428 (2021), 332–339

2021

[8] [8]

DeepSeek-AI. 2025. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. InarXiv. 2501.12948

work page internal anchor Pith review Pith/arXiv arXiv 2025

[9] [9]

Bangchao Deng, Xin Jing, Tianyue Yang, Bingqing Qu, Dingqi Yang, and Philippe Cudré-Mauroux. 2025. Revisiting Synthetic Human Trajectories: Imitative Gen- eration and Benchmarks Beyond Datasaurus. InKDD. 201–212

2025

[10] [10]

Xiaomin Fang, Jizhou Huang, Fan Wang, Lihang Liu, Yibo Sun, and Haifeng Wang. 2021. SSML: Self-Supervised Meta-Learner for En Route Travel Time Estimation at Baidu Maps. InKDD. 2840–2848

2021

[11] [11]

Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde- Farley, Sherjil Ozair, Aaron C

Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde- Farley, Sherjil Ozair, Aaron C. Courville, and Yoshua Bengio. 2020. Generative adversarial networks.Commun. ACM63, 11 (2020), 139–144

2020

[12] [12]

Aditya Grover and Jure Leskovec. 2016. node2vec: Scalable Feature Learning for Networks. InKDD. 855–864

2016

[13] Baoshen Guo, Zhiqing Hong, Junyi Li, Shenhao Wang, and Jinhua Zhao. 2026. Leveraging the Spatial Hierarchy: Coarse-to-fine Trajectory Generation via Cascaded Hybrid Diffusion. In KDD. 359–370.

[14] Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising Diffusion Probabilistic Models. In NeurIPS.

[15] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long Short-Term Memory. Neural Comput. 9, 8 (1997), 1735–1780.

[16] Danlei Hu, Lu Chen, Hanxi Fang, Ziquan Fang, Tianyi Li, and Yunjun Gao. 2024. Spatio-Temporal Trajectory Similarity Measures: A Comprehensive Survey and Quantitative Study. TKDE 36, 5 (2024), 2191–2212.

[17] Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. 2022. LoRA: Low-Rank Adaptation of Large Language Models. In ICLR.

[18] Yujia Hu, Yuntao Du, Zhikun Zhang, Ziquan Fang, Lu Chen, Kai Zheng, and Yunjun Gao. 2024. Real-Time Trajectory Synthesis with Local Differential Privacy. In ICDE. 1685–1698.

[19] Diederik P. Kingma and Max Welling. 2014. Auto-Encoding Variational Bayes. In ICLR.

[20] Zhifeng Kong, Wei Ping, Jiaji Huang, Kexin Zhao, and Bryan Catanzaro. 2021. DiffWave: A Versatile Diffusion Model for Audio Synthesis. In ICLR.

[21] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2012. ImageNet Classification with Deep Convolutional Neural Networks. In NIPS. 1106–1114.

[22] Woosuk Kwon, Zhuohan Li, Siyuan Zhuang, Ying Sheng, Lianmin Zheng, Cody Hao Yu, Joseph Gonzalez, Hao Zhang, and Ion Stoica. 2023. Efficient Memory Management for Large Language Model Serving with PagedAttention. In SOSP. 611–626.

[23] Siqi Lai, Zhao Xu, Weijia Zhang, Hao Liu, and Hui Xiong. 2025. LLMLight: Large Language Models as Traffic Signal Control Agents. In KDD. 2335–2346.

[24] Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, and Wook-Shin Han. 2022. Autoregressive Image Generation using Residual Quantization. In CVPR. 11513–11522.

[25] Jianhua Lin. 2002. Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory 37, 1 (2002), 145–151.

[26] Xia Liu, Hanzhou Chen, and Clio Andris. 2018. trajGANs: Using generative adversarial networks for geo-privacy protection of trajectory data (Vision paper). In Location privacy and security workshop. 1–7.

[27] Ilya Loshchilov and Frank Hutter. 2019. Decoupled Weight Decay Regularization. In ICLR.

[28] Yin Lou, Chengyang Zhang, Yu Zheng, Xing Xie, Wei Wang, and Yan Huang. Map-matching for low-sampling-rate GPS trajectories. In SIGSPATIAL. 352–361.

[30] Xuebin Ma, Zinan Ding, and Xiaoyan Zhang. 2024. ST-TrajGAN: A synthetic trajectory generation algorithm for privacy preservation. Future Generation Computer Systems 161 (2024), 226–238.

[31] Xiaowei Mao, Yan Lin, Shengnan Guo, Yubin Chen, Xingyu Xian, Haomin Wen, Qisen Xu, Youfang Lin, and Huaiyu Wan. 2025. DutyTTE: Deciphering Uncertainty in Origin-Destination Travel Time Estimation. In AAAI. 12390–12398.

[32] Mehmet Ercan Nergiz, Maurizio Atzori, Yücel Saygin, and Baris Güç. 2009. Towards Trajectory Anonymization: a Generalization-Based Approach. Transactions on Data Privacy 2, 1 (2009), 47–75.

[33] OpenAI. 2023. GPT-4 Technical Report. In arXiv. 2303.08774.

[34] Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Journal of Machine Learning Research 21 (2020), 140:1–140:67.

[35] Jinmeng Rao, Song Gao, Yuhao Kang, and Qunying Huang. 2020. LSTM-TrajGAN: A deep learning approach to trajectory privacy protection. In arXiv. 2006.10521.

[36] Xuan Rao, Shuo Shang, Renhe Jiang, Peng Han, and Lisi Chen. 2025. Seed: Bridging Sequence and Diffusion Models for Road Trajectory Generation. In WWW. 2007–2017.

[37] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. 2015. U-Net: Convolutional Networks for Biomedical Image Segmentation. In MICCAI, Vol. 9351. 234–241.

[38] Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Mingchuan Zhang, Y. K. Li, Y. Wu, and Daya Guo. 2024. DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models. In arXiv. 2402.03300.

[39] Filippo Simini, Gianni Barlacchi, Massimiliano Luca, and Luca Pappalardo. 2021. A deep gravity model for mobility flows generation. Nature Communications 12, 1 (2021), 6576.

[40] Song Tao and Jia Wang. 2020. Alleviation of Gradient Exploding in GANs: Fake Can Be Real. In CVPR. 1188–1197.

[41] Llama Team. 2024. The Llama 3 Herd of Models. In arXiv. 2407.21783.

[42] Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, et al. 2023. Llama: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023).

[43] Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, et al. 2023. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023).

[44] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In NIPS. 5998–6008.

[45] Chenhao Wang, Lisi Chen, Shuo Shang, Christian S Jensen, and Panos Kalnis. Multi-scale detection of anomalous spatio-temporal trajectories in evolving trajectory datasets. In KDD. 2980–2990.

[47] Huandong Wang, Changzheng Gao, Yuchen Wu, Depeng Jin, Lina Yao, and Yong Li. 2023. PateGail: A Privacy-Preserving Mobility Trajectory Generator with Imitation Learning. In AAAI. 14539–14547.

[48] Jingyuan Wang, Yujing Lin, and Yudong Li. 2025. GTG: Generalizable Trajectory Generation Model for Urban Mobility. In AAAI. AAAI Press, 834–842.

[49] Jian Xie, Kai Zhang, Jiangjie Chen, Tinghui Zhu, Renze Lou, Yuandong Tian, Yanghua Xiao, and Yu Su. 2024. TravelPlanner: A Benchmark for Real-World Planning with Language Agents. In ICML.

[50] An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, et al. 2025. Qwen3 technical report. arXiv preprint arXiv:2505.09388 (2025).

[51] An Yang, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chengyuan Li, Dayiheng Liu, Fei Huang, Haoran Wei, et al. 2024. Qwen2.5 Technical Report. arXiv preprint arXiv:2412.15115 (2024).

[52] Can Yang and Gyözö Gidófalvi. 2018. Fast map matching, an algorithm integrating hidden Markov model with precomputation. International Journal of Geographical Information Science 32, 3 (2018), 547–570.

[53] Chuang Yang, Renhe Jiang, Xiaohang Xu, Chuan Xiao, and Kaoru Sezaki. 2024. SIMformer: Single-Layer Vanilla Transformer Can Learn Free-Space Trajectory Similarity. Proc. VLDB Endow. 18, 2 (2024), 390–398.

[54] Hao Yang, Xiaobai Angela Yao, Christopher C. Whalen, and Gengchen Mai. 2025. BERT4Traj: Transformer-Based Trajectory Reconstruction for Sparse Mobility Data. In GIScience, Vol. 346. 8:1–8:9.

[55] Jiahui Yu, Xin Li, Jing Yu Koh, Han Zhang, Ruoming Pang, James Qin, Alexander Ku, Yuanzhong Xu, Jason Baldridge, and Yonghui Wu. 2022. Vector-quantized Image Modeling with Improved VQGAN. In ICLR.

[56] Jing Zhang, Qihan Huang, Yirui Huang, Qian Ding, and Pei-Wei Tsai. 2023. DP-TrajGAN: A privacy-aware trajectory generation model with differential privacy. Future Generation Computer Systems 142 (2023), 25–40.

[57] Qianru Zhang, Peng Yang, Junliang Yu, Haixin Wang, Xingwei He, Siu-Ming Yiu, and Hongzhi Yin. 2025. A Survey on Point-of-Interest Recommendation: Models, Architectures, and Security. TKDE 37, 6 (2025), 3153–3172.

[58] Ruixing Zhang, Yunqi Liu, Liangzhe Han, Leilei Sun, Chuanren Liu, Jibin Wang, and Weifeng Lv. 2025. Large-scale Human Mobility Data Regeneration for Open Urban Research. In KDD. 2827–2836.

From GPS Points to Travel Patterns: Flexible and Semantic Trajectory Generation with LLMs. KDD 2026, August 9–13, 2026, Jeju Island, Republic of Korea

[59] Weijia Zhang, Jindong Han, Zhao Xu, Hang Ni, Hao Liu, and Hui Xiong. 2024. Urban Foundation Models: A Survey. In KDD. 6633–6643.

[60] Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, Zican Dong, Yifan Du, Chen Yang, Yushuo Chen, Zhipeng Chen, Jinhao Jiang, Ruiyang Ren, Yifan Li, Xinyu Tang, Zikang Liu, Peiyu Liu, Jian-Yun Nie, and Ji-Rong Wen. 2023. A Survey of Large Language Models. In arXiv. 2303.18223.

[61] Lianmin Zheng, Liangsheng Yin, Zhiqiang Xie, Chuyue Sun, Jeff Huang, Cody Hao Yu, Shiyi Cao, Christos Kozyrakis, Ion Stoica, Joseph E. Gonzalez, Clark W. Barrett, and Ying Sheng. 2024. SGLang: Efficient Execution of Structured Language Model Programs. In NeurIPS.

[62] Silin Zhou, Yao Chen, Shuo Shang, Lisi Chen, Bingsheng He, and Ryosuke Shibasaki. 2025. Blurred Encoding for Trajectory Representation Learning. In KDD. 4132–4143.

[63] Yuanshao Zhu, Yongchao Ye, Shiyao Zhang, Xiangyu Zhao, and James Yu. 2023. DiffTraj: Generating GPS Trajectory with Diffusion Probabilistic Model. In NeurIPS.

[64] Yuanshao Zhu, James Jian Qiao Yu, Xiangyu Zhao, Qidong Liu, Yongchao Ye, Wei Chen, Zijian Zhang, Xuetao Wei, and Yuxuan Liang. 2024. ControlTraj: Controllable Trajectory Generation with Topology-Constrained Diffusion Model. In KDD. 4676–4687.

A Appendix

A.1 Details of Conditions

The input information of the LLM includes road segment tokens describing the r...

Point-level. Point-level metrics evaluate the statistical properties of trajectories at the level of raw GPS points, focusing on micro-level, fine-grained geometric consistency between generated and real trajectories.

• Travel Distance (T-Dist): Measures the JSD between the distributions of total travel distance for real and generated trajectories. The tr...
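As an illustration of how a T-Dist style score could be computed, the sketch below histograms total travel distances and compares the two histograms with the Jensen-Shannon divergence (JSD). It is a minimal sketch, not the paper's implementation: the haversine distance, the bin width, the number of bins, the base-2 logarithm, and all function names (`travel_distance_m`, `histogram`, `jsd`, `t_dist`) are assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_m(a: Point, b: Point) -> float:
    """Great-circle distance in metres between two GPS points."""
    r = 6_371_000.0  # mean Earth radius in metres
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(h))


def travel_distance_m(traj: Sequence[Point]) -> float:
    """Total travel distance: sum of distances between consecutive points."""
    return sum(haversine_m(p, q) for p, q in zip(traj, traj[1:]))


def histogram(values: Sequence[float], bin_width_m: float, n_bins: int) -> List[float]:
    """Normalised histogram; values beyond the last bin are clipped into it."""
    if not values:
        raise ValueError("need at least one value")
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v // bin_width_m), n_bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts]


def jsd(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence with base-2 logs, so the result lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x: Sequence[float], y: Sequence[float]) -> float:
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def t_dist(real: Sequence[Sequence[Point]], generated: Sequence[Sequence[Point]],
           bin_width_m: float = 500.0, n_bins: int = 40) -> float:
    """JSD between travel-distance histograms (bin settings are assumed values)."""
    p = histogram([travel_distance_m(t) for t in real], bin_width_m, n_bins)
    q = histogram([travel_distance_m(t) for t in generated], bin_width_m, n_bins)
    return jsd(p, q)
```

A lower value indicates that the generated trajectories reproduce the real travel-distance distribution more closely; identical distributions give 0.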


Grid-level. Grid-level metrics evaluate the spatial aggregation and regional movement patterns of trajectories, capturing how generated trajectories distribute over urban space and whether they preserve high-frequency activity regions. In this paper, each city is partitioned into grids of size 100 m × 100 m to transform GPS trajectories into grid trajectories...
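The conversion from GPS points to grid cells can be sketched as follows. This is a hedged example rather than the authors' procedure: the equirectangular approximation, the constant of 111,320 m per degree, the choice of a south-west origin `(lat0, lon0)`, the collapsing of consecutive duplicate cells, and the function names are all assumptions introduced for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees
Cell = Tuple[int, int]       # (row, col) index of a grid cell


def to_grid_cell(lat: float, lon: float, lat0: float, lon0: float,
                 cell_m: float = 100.0) -> Cell:
    """Map a GPS point to a cell of size cell_m x cell_m.

    (lat0, lon0) is the assumed south-west corner of the city bounding box.
    Metres per degree use a flat-Earth approximation, adequate at city scale.
    """
    m_per_deg_lat = 111_320.0
    m_per_deg_lon = 111_320.0 * math.cos(math.radians(lat0))
    row = int((lat - lat0) * m_per_deg_lat // cell_m)
    col = int((lon - lon0) * m_per_deg_lon // cell_m)
    return row, col


def to_grid_trajectory(traj: Sequence[Point], lat0: float, lon0: float,
                       cell_m: float = 100.0) -> List[Cell]:
    """Convert a GPS trajectory to a grid trajectory.

    Consecutive points that fall in the same cell are merged (an assumption).
    """
    cells: List[Cell] = []
    for lat, lon in traj:
        cell = to_grid_cell(lat, lon, lat0, lon0, cell_m)
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells
```

Grid-level statistics can then be computed on the resulting cell sequences, for example by counting visits per cell.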

Road-level. Road-level metrics evaluate trajectory realism from a road network perspective, measuring whether generated trajectories follow realistic road usage patterns and align with the underlying road topology. Aligned road segments are obtained via map-matching [27, 49].

• Density (R-Den): Measures the cosine similarity between the density distribu...
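Because the definition of R-Den is truncated here, the following is only one plausible reading: the density of a road segment is taken as its share of all segment visits, and the real and generated densities are compared with cosine similarity. The input format (lists of hypothetical road segment IDs produced by a map-matcher) and the function names are assumptions.

```python
import math
from collections import Counter
from typing import Dict, Hashable, Sequence


def segment_density(trajs: Sequence[Sequence[Hashable]]) -> Dict[Hashable, float]:
    """Fraction of all segment visits that fall on each road segment."""
    counts = Counter(seg for traj in trajs for seg in traj)
    total = sum(counts.values())
    return {seg: c / total for seg, c in counts.items()}


def cosine_similarity(p: Dict[Hashable, float], q: Dict[Hashable, float]) -> float:
    """Cosine similarity of two sparse vectors keyed by segment ID."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # convention chosen here for empty inputs
    return dot / (norm_p * norm_q)


def r_den(real: Sequence[Sequence[Hashable]],
          generated: Sequence[Sequence[Hashable]]) -> float:
    """Cosine similarity between real and generated segment densities."""
    return cosine_similarity(segment_density(real), segment_density(generated))
```

Under this reading, a value close to 1 means the generated trajectories use road segments in nearly the same proportions as the real ones.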