pith. sign in

arxiv: 1907.08489 · v1 · pith:F6RGVKQQnew · submitted 2019-07-19 · 💻 cs.AI · cs.CY· cs.LG

Empowering A* Search Algorithms with Neural Networks for Personalized Route Recommendation

Pith reviewed 2026-05-24 19:23 UTC · model grok-4.3

classification 💻 cs.AI cs.CYcs.LG
keywords personalized route recommendationA* algorithmneural networksattention RNNgraph attention networkscost function learningpathfindingheuristic estimation
0
0 comments X

The pith

Neural networks learn the cost functions for A* search to generate personalized routes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to replace manually designed heuristics in the A* algorithm with costs that neural networks learn directly from historical route data. An attention-based RNN tracks the cost and context from the origin to each candidate point while producing a time-varying vector for the user's state; a separate value network built on graph attention layers estimates the remaining cost to the destination. These two estimates are combined inside the search to score candidate locations. A reader would care because the approach removes the need for experts to tune cost functions and lets context information shape the search automatically.

Core claim

The authors propose using neural networks to automatically learn the cost functions of the A* algorithm for the personalized route recommendation task. The model consists of an attention-based RNN that models the cost from source to candidate location by incorporating context and learns a time-varying vectorized representation of the user's moving state, together with a value network on improved graph attention networks that estimates the cost from candidate to destination while capturing structural characteristics and context. The two components are integrated to produce more accurate costs for candidate locations during search.

What carries the argument

Two-component neural cost estimator: attention RNN for source-to-candidate costs with vectorized user state and graph-attention value network for candidate-to-destination costs.

If this is right

  • Context information such as user state is incorporated directly into the cost estimates during search.
  • Manual setting of heuristic cost functions is no longer required.
  • The search uses time-varying vector representations rather than single scalar costs.
  • The integrated estimator produces more accurate total costs for candidate locations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same learned-cost structure could be tried on other graph search tasks where defining good heuristics by hand is difficult.
  • Because the state is represented as a vector, the model might support richer forms of personalization than scalar costs alone allow.
  • Periodic retraining on fresh trip data could let the system track changes in traffic patterns without rewriting the heuristic rules.

Load-bearing premise

Neural network estimates of path costs trained on historical routes will be accurate enough to guide A* search effectively on new queries without systematic bias.

What would settle it

A controlled test on held-out route queries in which the neural A* returns paths whose measured travel times or user acceptance rates are no better than those produced by standard A* with hand-tuned heuristics.

Figures

Figures reproduced from arXiv: 1907.08489 by Fanzhang Peng, Jingyuan Wang, Ning Wu, Wayne Xin Zhao, Xin Lin.

Figure 1
Figure 1. Figure 1: The overall architecture of the NASR model. [PITH_FULL_IMAGE:figures/full_fig_p004_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Detailed analysis of our model on the dataset of [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗
Figure 4
Figure 4. Figure 4: Visualization of the search procedure with the esti [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗
read the original abstract

Personalized Route Recommendation (PRR) aims to generate user-specific route suggestions in response to users' route queries. Early studies cast the PRR task as a pathfinding problem on graphs, and adopt adapted search algorithms by integrating heuristic strategies. Although these methods are effective to some extent, they require setting the cost functions with heuristics. In addition, it is difficult to utilize useful context information in the search procedure. To address these issues, we propose using neural networks to automatically learn the cost functions of a classic heuristic algorithm, namely A* algorithm, for the PRR task. Our model consists of two components. First, we employ attention-based Recurrent Neural Networks (RNN) to model the cost from the source to the candidate location by incorporating useful context information. Instead of learning a single cost value, the RNN component is able to learn a time-varying vectorized representation for the moving state of a user. Second, we propose to use a value network for estimating the cost from a candidate location to the destination. For capturing structural characteristics, the value network is built on top of improved graph attention networks by incorporating the moving state of a user and other context information. The two components are integrated in a principled way for deriving a more accurate cost of a candidate location. Extensive experiment results on three real-world datasets have shown the effectiveness and robustness of the proposed model.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a hybrid model that augments the A* algorithm for personalized route recommendation (PRR) on graphs. It replaces hand-crafted costs with two learned components—an attention-based RNN that produces a time-varying state vector for source-to-candidate cost, and a graph-attention value network that estimates candidate-to-destination cost—then integrates them to produce the node evaluation function used inside A*. Experiments on three real-world datasets are reported to demonstrate improved recommendation accuracy and robustness over baselines.

Significance. If the empirical gains hold and the learned costs remain sufficiently accurate for search, the work would illustrate a practical way to inject context-aware neural representations into classic heuristic search without abandoning the A* framework. The explicit separation of forward and backward cost estimators, together with the use of moving-state vectors, is a concrete architectural contribution that could be reused in other path-finding settings.

major comments (2)
  1. [§3 (model description) and integration paragraph] The manuscript repeatedly describes the procedure as an instance of the A* algorithm (abstract; §1; §3), yet the learned heuristic produced by the value network is trained on historical trajectories with no stated constraint that it be admissible or consistent. Because admissibility is required for A* optimality guarantees, the claim that the method remains a valid A* procedure (and therefore inherits its completeness and optimality properties) is not supported by the given architecture or loss; this directly affects the central methodological claim.
  2. [§3.3 (integration)] The integration step that combines the RNN state vector with the value-network output to obtain the final node cost is described only at a high level (“integrated in a principled way”). Without an explicit equation showing how g(n) and h(n) are formed and whether the resulting f(n) preserves the monotonicity or consistency properties required by A*, it is impossible to verify that the search procedure is still A* rather than a generic best-first search.
minor comments (2)
  1. [abstract] The abstract states that “extensive experiment results … have shown the effectiveness,” but supplies no numerical values, baselines, or ablation settings; readers must reach the experimental section to evaluate the strength of the claims.
  2. [§3] Notation for the moving-state vector and the precise form of the attention mechanisms is introduced without a consolidated table of symbols, making it easy to lose track of which quantities are time-varying versus static.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and indicate where revisions will be made to improve clarity and precision.

read point-by-point responses
  1. Referee: [§3 (model description) and integration paragraph] The manuscript repeatedly describes the procedure as an instance of the A* algorithm (abstract; §1; §3), yet the learned heuristic produced by the value network is trained on historical trajectories with no stated constraint that it be admissible or consistent. Because admissibility is required for A* optimality guarantees, the claim that the method remains a valid A* procedure (and therefore inherits its completeness and optimality properties) is not supported by the given architecture or loss; this directly affects the central methodological claim.

    Authors: We agree that the learned heuristic is not constrained to be admissible or consistent, since it is trained directly on historical trajectories without such regularization. Consequently, while the search follows the A* node-expansion procedure, we do not claim that the method inherits A* optimality or completeness guarantees. In the revision we will explicitly qualify all references to A* by stating that the algorithm structure is used but optimality is not guaranteed due to the data-driven heuristic. This removes the unsupported claim while preserving the description of the search procedure. revision: yes

  2. Referee: [§3.3 (integration)] The integration step that combines the RNN state vector with the value-network output to obtain the final node cost is described only at a high level (“integrated in a principled way”). Without an explicit equation showing how g(n) and h(n) are formed and whether the resulting f(n) preserves the monotonicity or consistency properties required by A*, it is impossible to verify that the search procedure is still A* rather than a generic best-first search.

    Authors: We accept that the current description of the integration is insufficiently precise. The revised manuscript will add an explicit equation (or set of equations) in §3.3 that defines how the time-varying state vector from the attention-based RNN is combined with the value-network estimate to produce the forward cost g(n) and the heuristic estimate h(n), and therefore the evaluation function f(n). We will also note that, because admissibility is not enforced, monotonicity/consistency is not guaranteed and the procedure is best characterized as A*-style best-first search with learned costs. revision: yes

Circularity Check

0 steps flagged

No circularity: architecture proposal with independent empirical validation

full rationale

The paper's derivation chain consists of proposing an attention RNN for source-to-candidate costs and a graph-attention value network for candidate-to-destination costs, then integrating them inside A* search. No equation defines a quantity in terms of itself, no fitted parameter is relabeled as a prediction, and no uniqueness theorem or self-citation is invoked to force the architecture. The central claim is an empirical modeling contribution whose validity rests on experimental results on three datasets rather than any reduction to its own inputs by construction. This is the normal case of a self-contained modeling paper.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

The central claim rests on the capacity of neural networks with many trainable weights to learn suitable cost functions from route data; the A* component inherits the standard mathematical requirement that learned costs remain admissible for optimality guarantees.

free parameters (2)
  • attention RNN weights
    Trainable parameters of the attention-based RNN that are fitted to historical route data to produce moving-state vectors.
  • graph attention value network weights
    Trainable parameters of the improved GAT value network fitted to data to estimate remaining costs.
axioms (1)
  • standard math A* search yields optimal paths when the heuristic cost estimate is admissible
    The paper builds directly on the classic A* algorithm and its optimality properties.

pith-pipeline@v0.9.0 · 5785 in / 1338 out tokens · 31643 ms · 2026-05-24T19:23:36.376566+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · 3 internal anchors

  1. [1]

    Abdulrahman Al-Molegi, Mohammed Jabreel, and Baraq Ghaleb. 2016. STF-RNN: Space Time Features-based Recurrent Neural Network for predicting people next location. In SSCI. 1–7

  2. [2]

    Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural Machine Translation by Jointly Learning to Align and Translate. (2014). arXiv:1409.0473

  3. [3]

    Dawei Chen, Cheng Soon Ong, and Lexing Xie. 2016. Learning points and routes to recommend trajectories. In CIKM. ACM, 2227–2232

  4. [4]

    Zaiben Chen, Heng Tao Shen, and Xiaofang Zhou. 2011. Discovering popular routes from trajectories. In ICDE. 900–911

  5. [5]

    Junyoung Chung, Caglar Gulcehre, Kyunghyun Cho, and Yoshua Bengio. 2015. Gated feedback recurrent neural networks. In ICML. 2067–2075

  6. [6]

    Ge Cui, Jun Luo, and Xin Wang. 2018. Personalized travel route recommendation using collaborative filtering based on GPS trajectories.IJED 11, 3 (2018), 284–307

  7. [7]

    J. Dai, B. Yang, C. Guo, and Z. Ding. 2015. Personalized route recommendation using big trajectory data. In ICDE. 543–554

  8. [8]

    Marco Ernandes and Marco Gori. 2004. Likely-admissible and sub-symbolic heuristics. In ECAI. Citeseer, 613–617

  9. [9]

    Jie Feng, Yong Li, Chao Zhang, Funing Sun, Fanchao Meng, Ang Guo, and Depeng Jin. 2018. DeepMove: Predicting Human Mobility with Attentional Recurrent Networks. In WWW. 1459–1468

  10. [10]

    Peter E Hart, Nils J Nilsson, and Bertram Raphael. 1968. A formal basis for the heuristic determination of minimum cost paths. IEEE T-SSC 4, 2 (1968), 100–107

  11. [11]

    Evangelos Kanoulas, Yang Du, Tian Xia, and Donghui Zhang. 2006. Finding fastest paths on a road network with speed patterns. In ICDE. IEEE, 10–10

  12. [12]

    Takeshi Kurashima, Tomoharu Iwata, Go Irie, and Ko Fujimura. 2010. Travel route recommendation using geotags in photo sharing sites. In CIKM. 579–588

  13. [13]

    Levi Lelis, Roni Stern, and Shahab Jabbari Arfaee. 2011. Predicting solution cost with conditional probabilities. In ASSC

  14. [14]

    Kwan Hui Lim, Jeffrey Chan, Christopher Leckie, and Shanika Karunasekera

  15. [15]

    In IJCAI, Vol

    Personalized Tour Recommendation Based on User Interests and Points of Interest Visit Durations.. In IJCAI, Vol. 15. 1778–1784

  16. [16]

    Hechen Liu, Ling Yin Wei, Yu Zheng, and M Schneider. 2011. Route Discovery from Mining Uncertain Trajectories. (2011), 1239–1242

  17. [17]

    Qiang Liu, Shu Wu, Liang Wang, and Tieniu Tan. 2016. Predicting the next location: a recurrent model with spatial and temporal contexts. In AAAI. 194– 200

  18. [18]

    Wuman Luo, Haoyu Tan, Lei Chen, and Lionel M. Ni. 2013. Finding time period- based most frequent path in big trajectory data. In SIGMOD. 713–724

  19. [19]

    Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. 2015. Human-level control through deep reinforcement learning. Nature 518, 7540 (2015), 529

  20. [20]

    Karl Nachtigall. 1995. Time depending shortest-path problems with applications to railway networks. EJOR 83, 1 (1995), 154–166

  21. [21]

    Stefano Pallottino and Maria Grazia Scutella. 1998. Shortest path algorithms in transportation models: classical and innovative aspects. In Equilibrium and advanced transportation modelling. Springer, 245–281

  22. [22]

    Mehdi Samadi, Ariel Felner, and Jonathan Schaeffer. 2008. Learning from Multiple Heuristics.. In AAAI. 357–362

  23. [23]

    Samia Shafique and Mohammed Eunus Ali. 2016. Recommending most pop- ular travel path within a region of interest from historical trajectory data. In SIGSPATIAL. ACM, 2–11

  24. [24]

    David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershel- vam, Marc Lanctot, et al . 2016. Mastering the game of Go with deep neural networks and tree search. Nature 529, 7587 (2016), 484

  25. [25]

    Richard S Sutton. 1988. Learning to predict by the methods of temporal differences. Machine learning 3, 1 (1988), 9–44

  26. [26]

    Richard S Sutton and Andrew G Barto. 2018. Reinforcement learning: An intro- duction. MIT press

  27. [27]

    Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 1, 2 (2017)

  28. [28]

    Jingyuan Wang, Chao Chen, Junjie Wu, and Zhang Xiong. 2017. No longer sleep- ing with a bomb: a duet system for protecting urban safety from dangerous goods. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining . ACM, 1673–1681

  29. [29]

    Jingyuan Wang, Xu He, Ze Wang, Junjie Wu, Nicholas Jing Yuan, Xing Xie, and Zhang Xiong. 2018. CD-CNN: a partially supervised cross-domain deep learning model for urban resident recognition. In Thirty-Second AAAI Conference on Artificial Intelligence

  30. [30]

    Jingyuan Wang, Junjie Wu, Fei Gao, and Zhang Xiong. 2019. Understanding Urban Dynamics via Context-aware Tensor Factorization with Neighboring Regularization. arXiv preprint arXiv:1905.00702 (2019)

  31. [31]

    Ling Yin Wei, Yu Zheng, and Wen Chih Peng. 2012. Constructing popular routes from uncertain trajectories. In SIGKDD. 195–203

  32. [32]

    Hao Wu, Ziyang Chen, Weiwei Sun, Baihua Zheng, and Wei Wang. 2017. Model- ing Trajectories with Recurrent Neural Networks. In ICJAI. 3083–3090

  33. [33]

    Hao Wu, Jiangyun Mao, Weiwei Sun, Baihua Zheng, Hanyuan Zhang, Ziyang Chen, and Wei Wang. 2016. Probabilistic Robust Route Recovery with Spatio- Temporal Dynamics. In SIGKDD. 1915–1924

  34. [34]

    Can Yang and Gyozo Gidofalvi. 2018. Fast map matching, an algorithm integrating hidden Markov model with precomputation. IJGIS 32, 3 (2018), 547–570

  35. [35]

    Cheng Yang, Maosong Sun, Wayne Xin Zhao, Zhiyuan Liu, and Edward Y. Chang

  36. [36]

    ACM T-IS 35, 4 (2017), 36:1–36:28

    A Neural Network Approach to Jointly Modeling Social Networks and Mobile Trajectories. ACM T-IS 35, 4 (2017), 36:1–36:28

  37. [37]

    Jing Yuan, Yu Zheng, Chengyang Zhang, Wenlei Xie, Xing Xie, Guangzhong Sun, and Yan Huang. 2010. T-drive: driving directions based on taxi trajectories. In SIGSPATIAL. ACM, 99–108

  38. [38]

    Stephan Zheng, Yisong Yue, and Patrick Lucey. 2017. Generating Long-term Trajectories Using Deep Hierarchical Networks. In NIPS. 1543–1551