NaviFormer: A Deep Reinforcement Learning Transformer-like Model to Holistically Solve the Navigation Problem

Andrea Cavallaro; Carlos R. del-Blanco; Daniel Fuertes; Fernando Jaureguizar; Narciso Garc\'ia

arxiv: 2604.16967 · v1 · submitted 2026-04-18 · 💻 cs.RO · cs.AI

NaviFormer: A Deep Reinforcement Learning Transformer-like Model to Holistically Solve the Navigation Problem

Daniel Fuertes , Andrea Cavallaro , Carlos R. del-Blanco , Fernando Jaureguizar , Narciso Garc\'ia This is my paper

Pith reviewed 2026-05-10 06:33 UTC · model grok-4.3

classification 💻 cs.RO cs.AI

keywords navigationpath planningroute planningreinforcement learningtransformertrajectory predictionautonomous systems

0 comments

The pith

NaviFormer uses one Transformer-based reinforcement learning model to solve both high-level route planning and low-level trajectory generation together.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces NaviFormer to address navigation as a single problem rather than splitting it into separate route sequencing and local path prediction tasks. Most existing methods solve these subproblems independently, which can lead to inefficiencies when the two must work together in real environments. NaviFormer applies a Transformer architecture inside a deep reinforcement learning setup so the same model learns to output both waypoint sequences and collision-free trajectories. Tests show the model reaches competitive accuracy while running faster than compared methods, supporting its use in time-sensitive operations. A reader would care because a unified model could remove the need to maintain and switch between multiple planners for autonomous movement.

Core claim

NaviFormer is a deep reinforcement learning model based on a Transformer architecture that solves the global navigation problem by predicting both high-level routes and low-level trajectories. It addresses the limitations of solving route planning and path planning separately by using a holistic approach that understands the constraints of each subproblem and improves performance accordingly.

What carries the argument

The NaviFormer model, a Transformer architecture integrated with deep reinforcement learning that jointly predicts high-level waypoint routes and low-level collision-avoiding trajectories.

If this is right

The model can be deployed in real-time missions because its computation speed exceeds that of separate planning components.
Performance improves when the model simultaneously accounts for the difficulties of both route sequencing and local avoidance.
The approach reduces the complexity of building navigation systems by replacing multiple specialized modules with one learned component.
Competitive accuracy holds across the tested navigation scenarios when compared to existing algorithms.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same unified architecture could extend to settings with moving obstacles if retrained on dynamic data.
Direct integration of raw sensor inputs into the Transformer might further reduce reliance on pre-processed maps.
Similar holistic Transformer models could be tested in related sequential decision tasks such as multi-agent coordination.

Load-bearing premise

A single model can learn and optimize the distinct constraints of global route sequencing and local trajectory generation without hidden performance trade-offs.

What would settle it

Compare NaviFormer against separate specialized route and path planners on navigation tasks where optimal waypoint choices create tight local constraints, measuring whether the unified model achieves equal or higher success rates and lower computation time.

Figures

Figures reproduced from arXiv: 2604.16967 by Andrea Cavallaro, Carlos R. del-Blanco, Daniel Fuertes, Fernando Jaureguizar, Narciso Garc\'ia.

**Figure 2.** Figure 2: NaviFormer’s architecture: a modified Transformer encoder combines simple linear representations of nodes ( [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: Novel NaviFormer modules: (a) the combined multi-head attention operation to merge node and obstacle information, [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: A scenario with (a) cultivation and biocultivation [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗

**Figure 5.** Figure 5: Solutions provided by NaviFormer for some random [PITH_FULL_IMAGE:figures/full_fig_p006_5.png] view at source ↗

read the original abstract

Path planning is usually solved by addressing either the (high-level) route planning problem (waypoint sequencing to achieve the final goal) or the (low-level) path planning problem (trajectory prediction between two waypoints avoiding collisions). However, real-world problems usually require simultaneous solutions to the route and path planning subproblems with a holistic and efficient approach. In this paper, we introduce NaviFormer, a deep reinforcement learning model based on a Transformer architecture that solves the global navigation problem by predicting both high-level routes and low-level trajectories. To evaluate NaviFormer, several experiments have been conducted, including comparisons with other algorithms. Results show competitive accuracy from NaviFormer since it can understand the constraints and difficulties of each subproblem and act consequently to improve performance. Moreover, its superior computation speed proves its suitability for real-time missions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript introduces NaviFormer, a Transformer-based deep reinforcement learning model designed to holistically solve the global navigation problem by jointly predicting high-level routes (waypoint sequencing) and low-level trajectories (collision-free paths between waypoints). The approach uses an encoder-decoder Transformer with custom state embeddings for global maps and local sensor data, trained via a multi-objective reward that balances route progress and trajectory safety. Experiments compare NaviFormer against separate A* + DWA baselines and other DRL agents, reporting metrics that indicate no degradation in either sub-task relative to specialized models, along with claims of competitive accuracy and superior computation speed for real-time suitability.

Significance. If the reported empirical results hold, this work demonstrates that a single Transformer DRL agent can integrate high-level route sequencing and low-level trajectory generation without apparent performance trade-offs, offering a more efficient alternative to modular pipelines for real-time robotic navigation tasks. The internal consistency of the architecture and training procedure, combined with direct baseline comparisons, strengthens the case for holistic models in this domain.

minor comments (3)

Abstract: The claims of 'competitive accuracy' and 'superior computation speed' are stated without any quantitative values, specific metrics, or baseline names. Including at least one key result (e.g., success rate or inference time) would make the abstract self-contained and better aligned with the experimental section.
Section on experiments: While comparisons to A* + DWA and other DRL agents are mentioned, the manuscript would benefit from explicit reporting of error bars, number of trials, and dataset/environment details to allow reproducibility assessment.
Notation and figures: The description of the custom state embedding for global map and local sensor data could be clarified with a diagram or equation reference; current presentation leaves the exact input representation somewhat underspecified for readers.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of NaviFormer, the constructive summary of our contributions, and the recommendation for minor revision. We will incorporate minor improvements to presentation and clarity in the revised manuscript.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper presents NaviFormer as an empirical deep reinforcement learning Transformer model trained to jointly address high-level route planning and low-level trajectory generation. No mathematical derivations, first-principles results, or equations appear in the abstract or described content that could reduce to inputs by construction. The central claims rest on architecture design, multi-objective reward training, and experimental comparisons to baselines such as A* + DWA, without any self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations that collapse the holistic performance assertion. The derivation chain is therefore self-contained as an empirical engineering contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no equations, training details, or architectural specifications, so no free parameters, axioms, or invented entities can be identified.

pith-pipeline@v0.9.0 · 5458 in / 1030 out tokens · 34688 ms · 2026-05-10T06:33:45.101354+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages

[1]

Vehicle routing problems over time: a survey,

A. Mor and M. G. Speranza, “Vehicle routing problems over time: a survey,”Annals of Oper . Res., vol. 314, no. 1, pp. 255–275, 2022

work page 2022
[2]

A survey of path planning algorithms for mobile robots,

K. Karur, N. Sharma, C. Dharmatti, and J. E. Siegel, “A survey of path planning algorithms for mobile robots,”V ehicles, vol. 3, no. 3, pp. 448–468, 2021

work page 2021
[3]

The orienteering problem,

B. L. Golden, L. Levy, and R. V ohra, “The orienteering problem,” Naval Res. Logistics, vol. 34, no. 3, pp. 307–318, 1987

work page 1987
[4]

Conflict- based search with d* lite algorithm for robot path planning in unknown dynamic environments,

J. Jin, Y . Zhang, Z. Zhou, M. Jin, X. Yang, and F. Hu, “Conflict- based search with d* lite algorithm for robot path planning in unknown dynamic environments,”Comput. and Electr . Eng., vol. 105, p. 108473, 2023

work page 2023
[5]

Path planning using neural a* search,

R. Yonetani, T. Taniai, M. Barekatain, M. Nishimura, and A. Kanezaki, “Path planning using neural a* search,” inProc. of the 38th Int. Conf. on Mach. Learn., M. Meila and T. Zhang, Eds., vol. 139, 2021, pp. 12 029–12 039

work page 2021
[6]

Control transformer: Robot navigation in unknown environments through prm-guided return-conditioned se- quence modeling,

D. Lawson and A. H. Qureshi, “Control transformer: Robot navigation in unknown environments through prm-guided return-conditioned se- quence modeling,” inIEEE/RSJ Int. Conf. on Intell. Robots and Syst., 2023, pp. 9324–9331

work page 2023
[7]

Attention, learn to solve routing problems!

W. Kool, H. van Hoof, and M. Welling, “Attention, learn to solve routing problems!” inInt. Conf. on Learn. Represent., 2019

work page 2019
[8]

Solving routing problems for multiple cooperative un- manned aerial vehicles using transformer networks,

D. Fuertes, C. R. del Blanco, F. Jaureguizar, J. J. Navarro, and N. Garc ´ıa, “Solving routing problems for multiple cooperative un- manned aerial vehicles using transformer networks,”Eng. App. of Artif. Intell., vol. 122, p. 106085, 2023

work page 2023
[9]

OR-Tools,

L. Perron and V . Furnon, “OR-Tools,” https://developers.google.com/ optimization/, Google, 2024

work page 2024
[10]

A multi-waypoint motion planning framework for quadrotor drones in cluttered environments,

D. Shi, J. Shen, M. Gao, and X. Yang, “A multi-waypoint motion planning framework for quadrotor drones in cluttered environments,” Drones, vol. 8, no. 8, 2024

work page 2024
[11]

Leveraging single-goal predictions to improve the efficiency of multi-goal motion planning with dynamics,

Y . Lu and E. Plaku, “Leveraging single-goal predictions to improve the efficiency of multi-goal motion planning with dynamics,” inIEEE/RSJ Int. Conf. on Intell. Robots and Syst., 2023, pp. 850–857

work page 2023
[12]

A branch-and-price algorithm for a team orienteering problem with fixed-wing drones,

K. Sundar, S. Sanjeevi, and C. Montez, “A branch-and-price algorithm for a team orienteering problem with fixed-wing drones,”EURO J. on Transp. and Logistics, vol. 11, p. 100070, 2022

work page 2022
[13]

Gurobi Optimizer Reference Manual,

Gurobi Optimization, LLC, “Gurobi Optimizer Reference Manual,” https://www.gurobi.com, 2024

work page 2024
[14]

Study and analysis of various heuristic algorithms for solving trav- elling salesman problem—a survey,

R. Purkayastha, T. Chakraborty, A. Saha, and D. Mukhopadhyay, “Study and analysis of various heuristic algorithms for solving trav- elling salesman problem—a survey,” inProc. of the Global AI Congr . 2019, Singapore, 2020, pp. 61–70

work page 2019
[15]

Nature-inspired metaheuristic techniques for combinatorial optimization problems: Overview and recent advances,

M. A. Rahman, R. Sokkalingam, M. Othman, K. Biswas, L. Abdullah, and E. Abdul Kadir, “Nature-inspired metaheuristic techniques for combinatorial optimization problems: Overview and recent advances,” Mathematics, vol. 9, no. 20, 2021

work page 2021
[16]

A general VNS for the multi-depot open vehicle routing problem with time windows,

S. N. Bezerra, S. R. de Souza, and M. J. F. Souza, “A general VNS for the multi-depot open vehicle routing problem with time windows,” Transp. Optim. Letters, 2023

work page 2023
[17]

A grasp with penalty objective function for the green vehicle routing problem with private capacitated stations,

M. Bruglieri, D. Ferone, P. Festa, and O. Pisacane, “A grasp with penalty objective function for the green vehicle routing problem with private capacitated stations,”Comput. & Oper . Res., vol. 143, p. 105770, 2022

work page 2022
[18]

A bench- mark for multi-uav task assignment of an extended team orienteering problem,

K. Xiao, J. Lu, Y . Nie, L. Ma, X. Wang, and G. Wang, “A bench- mark for multi-uav task assignment of an extended team orienteering problem,” inChina Automation Congr ., 2022, pp. 6966–6970

work page 2022
[19]

A lightweight cnn-transformer model for learning traveling salesman problems,

M. Jung, J. Lee, and J. Kim, “A lightweight cnn-transformer model for learning traveling salesman problems,”Applied Intell., vol. 54, no. 17, pp. 7982–7993, 2024

work page 2024
[20]

imtsp: Solving min-max multiple traveling salesman problem with imperative learning,

Y . Guo, Z. Ren, and C. Wang, “imtsp: Solving min-max multiple traveling salesman problem with imperative learning,” inIEEE/RSJ Int. Conf. on Intell. Robots and Syst., 2024, pp. 10 245–10 252

work page 2024
[21]

Extended attention mechanism for tsp problem,

H. Yang, “Extended attention mechanism for tsp problem,” inInt. Joint Conf. on Neural Netw., 2021, pp. 1–8

work page 2021
[22]

A reinforcement learning approach to the orienteering problem with time windows,

R. Gama and H. L. Fernandes, “A reinforcement learning approach to the orienteering problem with time windows,”Comput. & Oper . Res., vol. 133, p. 105357, 2021

work page 2021
[23]

Branch-and-cut- and-price for multi-agent path finding,

E. Lam, P. Le Bodic, D. Harabor, and P. J. Stuckey, “Branch-and-cut- and-price for multi-agent path finding,”Comput. & Oper . Res., vol. 144, p. 105809, 2022

work page 2022
[24]

Unmanned aerial vehicle path planning based on a* algorithm and its variants in 3d environment,

D. Mandloi, R. Arya, and A. K. Verma, “Unmanned aerial vehicle path planning based on a* algorithm and its variants in 3d environment,” Int. J. of Syst. Assurance Eng. and Management, vol. 12, no. 5, pp. 990–1000, 2021

work page 2021
[25]

Symbiotic navigation in multi-robot systems with remote obstacle knowledge sharing,

A. Ravankar, A. A. Ravankar, Y . Kobayashi, and T. Emaru, “Symbiotic navigation in multi-robot systems with remote obstacle knowledge sharing,”Sensors, vol. 17, no. 7, 2017

work page 2017
[26]

An improved artificial potential field method for path planning and formation control of the multi-uav systems,

Z. Pan, C. Zhang, Y . Xia, H. Xiong, and X. Shao, “An improved artificial potential field method for path planning and formation control of the multi-uav systems,”IEEE Tran. on Circuits and Syst. II: Express Briefs, vol. 69, no. 3, pp. 1129–1133, 2022

work page 2022
[27]

Path planning of mobile robots based on genetic algorithm,

Y . Zhang, B. Ou, Y . Xu, and C. Dai, “Path planning of mobile robots based on genetic algorithm,” in2023 8th Int. Conf. on Cloud Computing and Big Data Analytics, 2023, pp. 501–505

work page 2023
[28]

Modified adaptive ant colony optimization algorithm and its application for solving path planning of mobile robot,

L. Wu, X. Huang, J. Cui, C. Liu, and W. Xiao, “Modified adaptive ant colony optimization algorithm and its application for solving path planning of mobile robot,”Expert Syst. with App., vol. 215, p. 119410, 2023

work page 2023
[29]

A novel hybrid particle swarm optimization algorithm for path planning of uavs,

Z. Yu, Z. Si, X. Li, D. Wang, and H. Song, “A novel hybrid particle swarm optimization algorithm for path planning of uavs,” IEEE Internet of Things J., vol. 9, no. 22, pp. 22 547–22 558, 2022

work page 2022
[30]

Mapper: Multi-agent path planning with evolutionary reinforcement learning in mixed dynamic environments,

Z. Liu, B. Chen, H. Zhou, G. Koushik, M. Hebert, and D. Zhao, “Mapper: Multi-agent path planning with evolutionary reinforcement learning in mixed dynamic environments,” inIEEE/RSJ Int. Conf. on Intell. Robots and Syst., 2020, pp. 11 748–11 754

work page 2020
[31]

Autonomous emergency landing for multicopters using deep reinforcement learn- ing,

L. Bartolomei, Y . Kompis, L. Teixeira, and M. Chli, “Autonomous emergency landing for multicopters using deep reinforcement learn- ing,” inIEEE/RSJ Int. Conf. on Intell. Robots and Syst., 2022, pp. 3392–3399

work page 2022
[32]

Transformer-based imitative reinforcement learning for multi-robot path planning,

L. Chen, Y . Wang, Z. Miao, Y . Mo, M. Feng, Z. Zhou, and H. Wang, “Transformer-based imitative reinforcement learning for multi-robot path planning,”IEEE Tran. on Industrial Informatics, pp. 1–11, 2023

work page 2023
[33]

The orien- teering problem: A survey,

P. Vansteenwegen, W. Souffriau, and D. V . Oudheusden, “The orien- teering problem: A survey,”European J. of Oper . Res., vol. 209, no. 1, pp. 1–10, 2011

work page 2011
[34]

Panoptic segmentation of satellite image time series with convolutional temporal attention networks,

V . F. Garnot and L. Landrieu, “Panoptic segmentation of satellite image time series with convolutional temporal attention networks,” inIEEE/CVF Int. Conf. on Comput. Vision, 2021, pp. 4852–4861

work page 2021
[35]

Transpath: Learning heuristics for grid-based pathfinding via transformers,

D. Kirilenko, A. Andreychuk, A. Panov, and K. Yakovlev, “Transpath: Learning heuristics for grid-based pathfinding via transformers,”Pro- ceedings of the AAAI Conf. on Artif. Intell., vol. 37, no. 10, pp. 12 436– 12 443, 2023

work page 2023
[36]

Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning,

Q. Ma, S. Ge, D. He, D. Thaker, and I. Drori, “Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning,” inAAAI Workshop on Deep Learn. on Graphs: Methodolo- gies and App., 2020

work page 2020

[1] [1]

Vehicle routing problems over time: a survey,

A. Mor and M. G. Speranza, “Vehicle routing problems over time: a survey,”Annals of Oper . Res., vol. 314, no. 1, pp. 255–275, 2022

work page 2022

[2] [2]

A survey of path planning algorithms for mobile robots,

K. Karur, N. Sharma, C. Dharmatti, and J. E. Siegel, “A survey of path planning algorithms for mobile robots,”V ehicles, vol. 3, no. 3, pp. 448–468, 2021

work page 2021

[3] [3]

The orienteering problem,

B. L. Golden, L. Levy, and R. V ohra, “The orienteering problem,” Naval Res. Logistics, vol. 34, no. 3, pp. 307–318, 1987

work page 1987

[4] [4]

Conflict- based search with d* lite algorithm for robot path planning in unknown dynamic environments,

J. Jin, Y . Zhang, Z. Zhou, M. Jin, X. Yang, and F. Hu, “Conflict- based search with d* lite algorithm for robot path planning in unknown dynamic environments,”Comput. and Electr . Eng., vol. 105, p. 108473, 2023

work page 2023

[5] [5]

Path planning using neural a* search,

R. Yonetani, T. Taniai, M. Barekatain, M. Nishimura, and A. Kanezaki, “Path planning using neural a* search,” inProc. of the 38th Int. Conf. on Mach. Learn., M. Meila and T. Zhang, Eds., vol. 139, 2021, pp. 12 029–12 039

work page 2021

[6] [6]

Control transformer: Robot navigation in unknown environments through prm-guided return-conditioned se- quence modeling,

D. Lawson and A. H. Qureshi, “Control transformer: Robot navigation in unknown environments through prm-guided return-conditioned se- quence modeling,” inIEEE/RSJ Int. Conf. on Intell. Robots and Syst., 2023, pp. 9324–9331

work page 2023

[7] [7]

Attention, learn to solve routing problems!

W. Kool, H. van Hoof, and M. Welling, “Attention, learn to solve routing problems!” inInt. Conf. on Learn. Represent., 2019

work page 2019

[8] [8]

Solving routing problems for multiple cooperative un- manned aerial vehicles using transformer networks,

D. Fuertes, C. R. del Blanco, F. Jaureguizar, J. J. Navarro, and N. Garc ´ıa, “Solving routing problems for multiple cooperative un- manned aerial vehicles using transformer networks,”Eng. App. of Artif. Intell., vol. 122, p. 106085, 2023

work page 2023

[9] [9]

OR-Tools,

L. Perron and V . Furnon, “OR-Tools,” https://developers.google.com/ optimization/, Google, 2024

work page 2024

[10] [10]

A multi-waypoint motion planning framework for quadrotor drones in cluttered environments,

D. Shi, J. Shen, M. Gao, and X. Yang, “A multi-waypoint motion planning framework for quadrotor drones in cluttered environments,” Drones, vol. 8, no. 8, 2024

work page 2024

[11] [11]

Leveraging single-goal predictions to improve the efficiency of multi-goal motion planning with dynamics,

Y . Lu and E. Plaku, “Leveraging single-goal predictions to improve the efficiency of multi-goal motion planning with dynamics,” inIEEE/RSJ Int. Conf. on Intell. Robots and Syst., 2023, pp. 850–857

work page 2023

[12] [12]

A branch-and-price algorithm for a team orienteering problem with fixed-wing drones,

K. Sundar, S. Sanjeevi, and C. Montez, “A branch-and-price algorithm for a team orienteering problem with fixed-wing drones,”EURO J. on Transp. and Logistics, vol. 11, p. 100070, 2022

work page 2022

[13] [13]

Gurobi Optimizer Reference Manual,

Gurobi Optimization, LLC, “Gurobi Optimizer Reference Manual,” https://www.gurobi.com, 2024

work page 2024

[14] [14]

Study and analysis of various heuristic algorithms for solving trav- elling salesman problem—a survey,

R. Purkayastha, T. Chakraborty, A. Saha, and D. Mukhopadhyay, “Study and analysis of various heuristic algorithms for solving trav- elling salesman problem—a survey,” inProc. of the Global AI Congr . 2019, Singapore, 2020, pp. 61–70

work page 2019

[15] [15]

Nature-inspired metaheuristic techniques for combinatorial optimization problems: Overview and recent advances,

M. A. Rahman, R. Sokkalingam, M. Othman, K. Biswas, L. Abdullah, and E. Abdul Kadir, “Nature-inspired metaheuristic techniques for combinatorial optimization problems: Overview and recent advances,” Mathematics, vol. 9, no. 20, 2021

work page 2021

[16] [16]

A general VNS for the multi-depot open vehicle routing problem with time windows,

S. N. Bezerra, S. R. de Souza, and M. J. F. Souza, “A general VNS for the multi-depot open vehicle routing problem with time windows,” Transp. Optim. Letters, 2023

work page 2023

[17] [17]

A grasp with penalty objective function for the green vehicle routing problem with private capacitated stations,

M. Bruglieri, D. Ferone, P. Festa, and O. Pisacane, “A grasp with penalty objective function for the green vehicle routing problem with private capacitated stations,”Comput. & Oper . Res., vol. 143, p. 105770, 2022

work page 2022

[18] [18]

A bench- mark for multi-uav task assignment of an extended team orienteering problem,

K. Xiao, J. Lu, Y . Nie, L. Ma, X. Wang, and G. Wang, “A bench- mark for multi-uav task assignment of an extended team orienteering problem,” inChina Automation Congr ., 2022, pp. 6966–6970

work page 2022

[19] [19]

A lightweight cnn-transformer model for learning traveling salesman problems,

M. Jung, J. Lee, and J. Kim, “A lightweight cnn-transformer model for learning traveling salesman problems,”Applied Intell., vol. 54, no. 17, pp. 7982–7993, 2024

work page 2024

[20] [20]

imtsp: Solving min-max multiple traveling salesman problem with imperative learning,

Y . Guo, Z. Ren, and C. Wang, “imtsp: Solving min-max multiple traveling salesman problem with imperative learning,” inIEEE/RSJ Int. Conf. on Intell. Robots and Syst., 2024, pp. 10 245–10 252

work page 2024

[21] [21]

Extended attention mechanism for tsp problem,

H. Yang, “Extended attention mechanism for tsp problem,” inInt. Joint Conf. on Neural Netw., 2021, pp. 1–8

work page 2021

[22] [22]

A reinforcement learning approach to the orienteering problem with time windows,

R. Gama and H. L. Fernandes, “A reinforcement learning approach to the orienteering problem with time windows,”Comput. & Oper . Res., vol. 133, p. 105357, 2021

work page 2021

[23] [23]

Branch-and-cut- and-price for multi-agent path finding,

E. Lam, P. Le Bodic, D. Harabor, and P. J. Stuckey, “Branch-and-cut- and-price for multi-agent path finding,”Comput. & Oper . Res., vol. 144, p. 105809, 2022

work page 2022

[24] [24]

Unmanned aerial vehicle path planning based on a* algorithm and its variants in 3d environment,

D. Mandloi, R. Arya, and A. K. Verma, “Unmanned aerial vehicle path planning based on a* algorithm and its variants in 3d environment,” Int. J. of Syst. Assurance Eng. and Management, vol. 12, no. 5, pp. 990–1000, 2021

work page 2021

[25] [25]

Symbiotic navigation in multi-robot systems with remote obstacle knowledge sharing,

A. Ravankar, A. A. Ravankar, Y . Kobayashi, and T. Emaru, “Symbiotic navigation in multi-robot systems with remote obstacle knowledge sharing,”Sensors, vol. 17, no. 7, 2017

work page 2017

[26] [26]

An improved artificial potential field method for path planning and formation control of the multi-uav systems,

Z. Pan, C. Zhang, Y . Xia, H. Xiong, and X. Shao, “An improved artificial potential field method for path planning and formation control of the multi-uav systems,”IEEE Tran. on Circuits and Syst. II: Express Briefs, vol. 69, no. 3, pp. 1129–1133, 2022

work page 2022

[27] [27]

Path planning of mobile robots based on genetic algorithm,

Y . Zhang, B. Ou, Y . Xu, and C. Dai, “Path planning of mobile robots based on genetic algorithm,” in2023 8th Int. Conf. on Cloud Computing and Big Data Analytics, 2023, pp. 501–505

work page 2023

[28] [28]

Modified adaptive ant colony optimization algorithm and its application for solving path planning of mobile robot,

L. Wu, X. Huang, J. Cui, C. Liu, and W. Xiao, “Modified adaptive ant colony optimization algorithm and its application for solving path planning of mobile robot,”Expert Syst. with App., vol. 215, p. 119410, 2023

work page 2023

[29] [29]

A novel hybrid particle swarm optimization algorithm for path planning of uavs,

Z. Yu, Z. Si, X. Li, D. Wang, and H. Song, “A novel hybrid particle swarm optimization algorithm for path planning of uavs,” IEEE Internet of Things J., vol. 9, no. 22, pp. 22 547–22 558, 2022

work page 2022

[30] [30]

Mapper: Multi-agent path planning with evolutionary reinforcement learning in mixed dynamic environments,

Z. Liu, B. Chen, H. Zhou, G. Koushik, M. Hebert, and D. Zhao, “Mapper: Multi-agent path planning with evolutionary reinforcement learning in mixed dynamic environments,” inIEEE/RSJ Int. Conf. on Intell. Robots and Syst., 2020, pp. 11 748–11 754

work page 2020

[31] [31]

Autonomous emergency landing for multicopters using deep reinforcement learn- ing,

L. Bartolomei, Y . Kompis, L. Teixeira, and M. Chli, “Autonomous emergency landing for multicopters using deep reinforcement learn- ing,” inIEEE/RSJ Int. Conf. on Intell. Robots and Syst., 2022, pp. 3392–3399

work page 2022

[32] [32]

Transformer-based imitative reinforcement learning for multi-robot path planning,

L. Chen, Y . Wang, Z. Miao, Y . Mo, M. Feng, Z. Zhou, and H. Wang, “Transformer-based imitative reinforcement learning for multi-robot path planning,”IEEE Tran. on Industrial Informatics, pp. 1–11, 2023

work page 2023

[33] [33]

The orien- teering problem: A survey,

P. Vansteenwegen, W. Souffriau, and D. V . Oudheusden, “The orien- teering problem: A survey,”European J. of Oper . Res., vol. 209, no. 1, pp. 1–10, 2011

work page 2011

[34] [34]

Panoptic segmentation of satellite image time series with convolutional temporal attention networks,

V . F. Garnot and L. Landrieu, “Panoptic segmentation of satellite image time series with convolutional temporal attention networks,” inIEEE/CVF Int. Conf. on Comput. Vision, 2021, pp. 4852–4861

work page 2021

[35] [35]

Transpath: Learning heuristics for grid-based pathfinding via transformers,

D. Kirilenko, A. Andreychuk, A. Panov, and K. Yakovlev, “Transpath: Learning heuristics for grid-based pathfinding via transformers,”Pro- ceedings of the AAAI Conf. on Artif. Intell., vol. 37, no. 10, pp. 12 436– 12 443, 2023

work page 2023

[36] [36]

Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning,

Q. Ma, S. Ge, D. He, D. Thaker, and I. Drori, “Combinatorial optimization by graph pointer networks and hierarchical reinforcement learning,” inAAAI Workshop on Deep Learn. on Graphs: Methodolo- gies and App., 2020

work page 2020