Beyond Static Priors: Dynamic Neural Guidance for Large-Scale Ant Colony Optimization

Dat Thanh Tran; Van Khu Vu; Yining Ma

arxiv: 2606.04039 · v1 · pith:H5OEXRAGnew · submitted 2026-06-02 · 💻 cs.NE · cs.AI· cs.LG

Beyond Static Priors: Dynamic Neural Guidance for Large-Scale Ant Colony Optimization

Dat Thanh Tran , Van Khu Vu , Yining Ma This is my paper

Pith reviewed 2026-06-28 07:48 UTC · model grok-4.3

classification 💻 cs.NE cs.AIcs.LG

keywords Ant Colony OptimizationNeural-guided searchDynamic priorsTraveling Salesman ProblemCapacitated Vehicle RoutingLarge-scale combinatorial optimization

0 comments

The pith

DyNACO replaces static neural priors with periodic observations of pheromone and incumbent solutions to guide ACO on large TSP and CVRP instances.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses a training-inference mismatch in neural-guided ACO where policies trained on static outputs are used in iterative search. It introduces DyNACO to generate dynamic guidance by observing the current pheromone distribution and best solution at intervals. To keep this tractable at scale, the approach pairs the policy with a perturbation-based ACO solver and a scope-restricted refinement step. On TSP this reaches 100,000-node problems, beats other neural methods, and often cuts total runtime versus the unguided solver. The same framework extends to CVRP with under 1 percent neural overhead while still improving the baseline.

Core claim

DyNACO achieves dynamic neural guidance for ACO by periodically feeding the pheromone matrix and incumbent tour into a policy network, then uses a perturbation-based backend and scope-restricted refinement to maintain efficacy and stable credit assignment; the resulting solver scales to 100,000-node TSP instances while outperforming neural baselines and frequently running faster than unguided ACO, and extends to capacity-aware CVRP with negligible overhead.

What carries the argument

Dynamic policy that takes pheromone distribution and incumbent solution as input at regular intervals, paired with perturbation-based ACO backend and scope-restricted refinement for credit assignment.

If this is right

ACO can be guided by live search state rather than fixed heatmaps without losing scalability.
The same dynamic-guidance pattern transfers from TSP to CVRP via a capacity-aware backend.
Neural overhead stays below 1 percent while still improving the unguided baseline on both problems.
Generalization holds across instance sizes because the policy sees actual pheromone and solution states during search.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same observation-and-refine loop could be tested on other iterative solvers such as local search or evolutionary algorithms.
If credit assignment remains stable, the approach might reduce the need for separate training phases that match the exact search horizon.
Periodic rather than continuous guidance suggests a tunable trade-off between neural cost and solution quality that could be measured on other combinatorial domains.

Load-bearing premise

The perturbation-based ACO backend and scope-restricted refinement together keep the dynamic policy effective and credit assignment stable at large scale.

What would settle it

Run the same policy on 100,000-node TSP instances after removing either the periodic observation step or the perturbation backend and measure whether solution quality or runtime advantage disappears.

Figures

Figures reproduced from arXiv: 2606.04039 by Dat Thanh Tran, Van Khu Vu, Yining Ma.

**Figure 1.** Figure 1: Comparison of training paradigms and our DyNACO. (a) The original DeepACO paradigm [1, 15, 39]. (b) Our DyNACO framework for dynamic guidance. (c) Trajectory-aware training that aligns with iterative search dynamics. (d) State-aware representation conditioning the policy on the evolving search state. • Action 𝑎ℎ = 𝑧𝜃 (Sℎ) ∈ R 𝑁 ×𝐾 : A residual logit matrix emitted by the policy that shapes the pheromone m… view at source ↗

**Figure 2.** Figure 2: Credit-assignment trade-off under different local [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: Visualization of pheromone concentration and neu [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗

**Figure 4.** Figure 4: Visualization of the guidance-pheromone dynamics [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗

read the original abstract

Neural-guided Ant Colony Optimization (ACO) suffers from a fundamental training-inference misalignment: policies are typically trained to generate static priors (e.g., heatmaps), yet deployed to guide iterative, long-horizon search processes. In this paper, we present DyNACO, a novel framework that achieves dynamic neural guidance by periodically observing the pheromone distribution and the incumbent solution. To make DyNACO tractable at scale, we pair the policy with a perturbation-based ACO backend and a scope-restricted refinement mechanism that jointly ensure efficacy and stable credit assignment. On TSP, DyNACO scales to 100,000-node instances and outperforms neural baselines while often reducing total runtime compared to the unguided solver. We extend DyNACO to CVRP via a capacity-aware backend, consistently improving the unguided baseline with less than 1% neural overhead. We further provide in-depth analysis validating the model's generalization capabilities and elucidating why dynamic guidance outperforms static priors. Our work underscores the necessity of aligning neural training with iterative search dynamics in learning-guided optimization. The code is available at https://github.com/shoraaa/DyNACO.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

DyNACO's dynamic observation of pheromone and incumbent is a sensible fix for static-prior misalignment, but the scaling claims rest on an unshown assertion about stable credit assignment.

read the letter

The paper's core move is to replace static neural priors with a policy that periodically reads the current pheromone map and best solution during ACO search. That framing is new relative to earlier neural-guided ACO work and directly targets the train-inference gap the authors describe.

They pair the policy with a perturbation-based backend and scope-restricted refinement, claim this combination keeps the approach tractable at 100k-node TSP scale, and report outperformance over neural baselines plus occasional runtime wins versus plain ACO. The CVRP extension with capacity-aware backend and sub-1% overhead is a straightforward addition. Code release is useful.

The soft spot is the central mechanism. The abstract states that the backend and refinement "jointly ensure efficacy and stable credit assignment," yet supplies no metric, ablation, or variance check on policy updates, especially at the largest sizes. If the signals to the dynamic policy are noisy, the claimed gains could collapse to whatever the perturbation backend already provides. No error bars, dataset details, or sensitivity results appear in the given text, so the 100k-node outperformance and runtime claims stay unverified.

This is for researchers already working on hybrid neural-combinatorial solvers who want to try closing the static-prior gap. A reader hunting for concrete implementation ideas on dynamic feedback would find the high-level recipe worth testing, but anyone needing reproducible evidence for the scaling story will have to wait for the full experiments.

I would not send this to peer review in its current form; the credit-assignment claim needs explicit checks before the scaling results can be taken at face value.

Referee Report

1 major / 1 minor

Summary. The manuscript proposes DyNACO, a framework for dynamic neural guidance in ACO that periodically observes pheromone distributions and incumbent solutions to correct the training-inference misalignment of static-prior policies. It pairs the learned policy with a perturbation-based ACO backend and a scope-restricted refinement mechanism claimed to ensure efficacy and stable credit assignment, enabling scaling to 100k-node TSP instances with outperformance over neural baselines and frequent runtime reductions versus the unguided solver; the approach is extended to CVRP via a capacity-aware backend with <1% neural overhead, accompanied by generalization analysis.

Significance. If the dynamic guidance and stable credit assignment hold at the claimed scales, the work would meaningfully advance learning-guided combinatorial optimization by aligning policy training with iterative search dynamics rather than static priors. The public code release supports reproducibility and is a clear strength.

major comments (1)

[Abstract] Abstract: the central claim that the perturbation-based ACO backend and scope-restricted refinement 'jointly ensure efficacy and stable credit assignment' is load-bearing for the 100k-node scaling, outperformance, and runtime-reduction results, yet the manuscript provides no explicit metric, ablation, or analysis (e.g., gradient variance, policy-update consistency, or sensitivity to pheromone noise) quantifying assignment stability on the largest instances.

minor comments (1)

The abstract states that 'in-depth analysis' validates generalization and explains dynamic vs. static superiority, but the summary does not indicate the corresponding section or figures where these results appear.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for highlighting the need for explicit quantification of credit assignment stability, which is central to our scaling claims. We address the comment below and will revise the manuscript accordingly.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim that the perturbation-based ACO backend and scope-restricted refinement 'jointly ensure efficacy and stable credit assignment' is load-bearing for the 100k-node scaling, outperformance, and runtime-reduction results, yet the manuscript provides no explicit metric, ablation, or analysis (e.g., gradient variance, policy-update consistency, or sensitivity to pheromone noise) quantifying assignment stability on the largest instances.

Authors: We agree that an explicit metric would strengthen the manuscript. The empirical scaling to 100k nodes and runtime reductions provide indirect evidence of stability, but we will add a new subsection with ablations measuring gradient variance and policy-update consistency on instances up to 100k nodes, including sensitivity analysis to pheromone noise. This will directly quantify the contribution of the perturbation-based backend and scope-restricted refinement to stable credit assignment. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical claims rest on external benchmarks

full rationale

The paper describes a framework pairing a dynamic policy with a perturbation-based ACO backend and scope-restricted refinement, asserting that these jointly ensure stable credit assignment. This is presented as an engineering choice enabling scalability, not as a derivation or prediction that reduces to its own inputs by construction. No equations, fitted parameters renamed as predictions, self-citation load-bearing uniqueness theorems, or ansatz smuggling appear in the provided text. Central claims (scaling to 100k-node TSP, outperformance vs. baselines, runtime reduction) are supported by empirical comparisons to unguided solvers and neural baselines, with code released for reproduction. The design assertion about credit assignment is falsifiable via the stated ablations and metrics, not self-referential.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no identifiable free parameters, axioms, or invented entities; full paper required for complete audit.

pith-pipeline@v0.9.1-grok · 5732 in / 968 out tokens · 23165 ms · 2026-06-28T07:48:43.796838+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

43 extracted references · 15 canonical work pages

[1]

Sohel Rahman, and Md Adnan Arefeen

Abrar Rahman Abir, Muhammad Ali Nayeem, M. Sohel Rahman, and Md Adnan Arefeen. 2025. GTG-ACO: Graph Transformer Guided Ant Colony Optimization for learning heuristics and pheromone dynamics for combinatorial optimization. Swarm and Evolutionary Computation99 (12 2025). doi:10.1016/j.swevo.2025. 102147

work page doi:10.1016/j.swevo.2025 2025
[2]

A Amuthan and K Deepa Thilak. 2017. Improved Ant Colony Algorithms for Eliminating Stagnation and Local Optimum Problem — A Survey. In2017 Interna- tional Conference on Technical Advancements in Computers and Communications (ICTACC). 97–101. doi:10.1109/ICTACC.2017.33 Dat Thanh Tran, Van Khu Vu, and Yining Ma

work page doi:10.1109/ictacc.2017.33 2017
[3]

Federico Berto, Chuanbo Hua, Junyoung Park, Laurin Luttmann, Yining Ma, Fanchen Bu, Jiarui Wang, Haoran Ye, Minsu Kim, Sanghyeok Choi, Nayeli Gast Zepeda, André Hottung, Jianan Zhou, Jieyi Bi, Yu Hu, Fei Liu, Hyeonah Kim, Jiwoo Son, Haeyeon Kim, Davide Angioni, Wouter Kool, Zhiguang Cao, Qingfu Zhang, Joungho Kim, Jie Zhang, Kijung Shin, Cathy Wu, Sungsoo...

work page doi:10.1145/3711896.3737433 2025
[4]

Darren Chitty. 2018. Applying ACO to Large Scale TSP Instances. InAdvances in Intelligent Systems and Computing, Vol. 650. 104–118. doi:10.1007/978-3-319- 66939-7_9

work page doi:10.1007/978-3-319- 2018
[5]

Paulo da Costa, Jason Rhuggenaath, Yingqian Zhang, Alp Akcay, and Uzay Kay- mak. 2021. Learning 2-Opt Heuristics for Routing Problems via Deep Reinforce- ment Learning.SN Computer Science2 (2021), 388. Issue 5. doi:10.1007/s42979- 021-00779-2

work page doi:10.1007/s42979- 2021
[6]

2019.Ant colony optimization: overview and recent advances

Marco Dorigo and Thomas Stützle. 2019.Ant colony optimization: overview and recent advances. Springer

2019
[7]

Darko Drakulic, Sofia Michel, Florian Mai, Arnaud Sors, and Jean-Marc Andreoli
[8]

InProceedings of the 37th International Conference on Neural Information Processing Systems(Red Hook, NY, USA)

BQ-NCO: bisimulation quotienting for efficient neural combinatorial opti- mization. InProceedings of the 37th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc
[9]

Han Fang, Zhihao Song, Paul Weng, and Yutong Ban. 2024. INViT: a generalizable routing problem solver with invariant nested view transformer. InProceedings of the 41st International Conference on Machine Learning. JMLR.org

2024
[10]

Zhang-Hua Fu, Kai-Bin Qiu, and Hongyuan Zha. 2021. Generalize a small pre-trained model to arbitrarily large tsp instances. InProceedings of the AAAI conference on artificial intelligence, Vol. 35. 7474–7482

2021
[11]

2019.Exact combinatorial optimization with graph convolutional neural networks

Maxime Gasse, Didier Chételat, Nicola Ferroni, Laurent Charlin, and Andrea Lodi. 2019.Exact combinatorial optimization with graph convolutional neural networks. Curran Associates Inc

2019
[12]

Michael Guntsch and Martin Middendorf. 2002. A population based approach for ACO. InApplications of Evolutionary Computing: EvoWorkshops 2002: EvoCOP, EvoIASP, EvoSTIM/EvoPLAN Kinsale, Ireland, April 3–4, 2002 Proceedings. Springer, 72–81

2002
[13]

Keld Helsgaun. 2000. An effective implementation of the Lin–Kernighan traveling salesman heuristic.European Journal of Operational Research126 (10 2000), 106–

2000
[14]

doi:10.1016/S0377-2217(99)00284-2

Issue 1. doi:10.1016/S0377-2217(99)00284-2

work page doi:10.1016/s0377-2217(99)00284-2
[15]

Keld Helsgaun. 2017. An Extension of the Lin-Kernighan-Helsgaun TSP Solver for Constrained Traveling Salesman and Vehicle Routing Problems. doi:10.13140/ RG.2.2.25569.40807

work page arXiv 2017
[16]

Taoan Huang, Aaron M Ferber, Arman Zharmagambetov, Yuandong Tian, and Bistra Dilkina. 2024. Contrastive Predict-and-Search for Mixed Integer Linear Programs. https://openreview.net/forum?id=J2kRjUAOLh

2024
[17]

Minsu Kim, Sanghyeok Choi, Hyeonah Kim, Jiwoo Son, Jinkyoo Park, and Yoshua Bengio. 2025. Ant Colony Sampling with GFlowNets for Combinatorial Optimiza- tion.Proceedings of Machine Learning Research258 (2025), 469–477. Publisher Copyright: Copyright 2025 by the author(s).; 28th International Conference on Artificial Intelligence and Statistics, AISTATS 2025

2025
[18]

Wouter Kool, Herke van Hoof, and Max Welling. 2019. Attention, Learn to Solve Routing Problems!. InInternational Conference on Learning Representations. https://openreview.net/forum?id=ByxBFsRqYm

2019
[19]

Yeong-Dae Kwon, Jinho Choo, Byoungjip Kim, Iljoo Yoon, Youngjune Gwon, and Seungjai Min. 2020. POMO: policy optimization with multiple optima for reinforcement learning. InProceedings of the 34th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2020
[20]

Sirui Li, Zhongxia Yan, and Cathy Wu. 2021. Learning to delegate for large-scale vehicle routing. InProceedings of the 35th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2021
[21]

Bo-Cheng Lin, Yi Mei, and Mengjie Zhang. 2026. HEATACO: Heatmap-Guided Ant Colony Decoding for Large-Scale Travelling Salesman Problems. https: //arxiv.org/abs/2601.19041

work page arXiv 2026
[22]

Fu Luo, Xi Lin, Fei Liu, Qingfu Zhang, and Zhenkun Wang. 2023. Neural combi- natorial optimization with heavy decoder: toward large scale generalization. In Proceedings of the 37th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2023
[23]

Fu Luo, Xi Lin, Yaoxin Wu, Zhenkun Wang, Tong Xialiang, Mingxuan Yuan, and Qingfu Zhang. 2025. Boosting Neural Combinatorial Optimization for Large- Scale Vehicle Routing Problems. InThe Thirteenth International Conference on Learning Representations. https://openreview.net/forum?id=TbTJJNjumY

2025
[24]

Fu Luo, Xi Lin, Mengyuan Zhong, Fei Liu, Zhenkun Wang, Jianyong Sun, and Qingfu Zhang. 2025. Learning to Insert for Constructive Neural Vehicle Routing Solver. InNeural Information Processing Systems

2025
[25]

Yining Ma, Zhiguang Cao, and Yeow Meng Chee. 2023. Learning to search feasible and infeasible regions of routing problems with flexible neural k-opt. In Proceedings of the 37th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2023
[26]

Martínez and José M

Pablo A. Martínez and José M. García. 2021. ACOTSP-MF: A memory-friendly and highly scalable ACOTSP approach.Engineering Applications of Artificial Intelligence99 (2021), 104131

2021
[27]

2010.Traveling sales- man problem: an overview of applications, formulations, and solution approaches

Rajesh Matai, Surya Prakash Singh, and Murari Lal Mittal. 2010.Traveling sales- man problem: an overview of applications, formulations, and solution approaches. Vol. 1. InTech Rijeka, 1–25

2010
[28]

Nina Mazyavkina, Sergey Sviridov, Sergei Ivanov, and Evgeny Burnaev. 2021. Reinforcement learning for combinatorial optimization: A survey.Computers & Operations Research134 (10 2021), 105400. doi:10.1016/J.COR.2021.105400

work page doi:10.1016/j.cor.2021.105400 2021
[29]

Yimeng Min, Yiwei Bai, and Carla P. Gomes. 2023. Unsupervised learning for solving the travelling salesman problem. InProceedings of the 37th International Conference on Neural Information Processing Systems(New Orleans, LA, USA) (NIPS ’23). Curran Associates Inc., Red Hook, NY, USA, Article 2047, 15 pages

2023
[30]

Wenbin Ouyang, Sirui Li, Yining Ma, and Cathy Wu. 2026. Learning to Seg- ment for Vehicle Routing Problems. InInternational Conference on Learning Representations

2026
[31]

Xuanhao Pan, Yan Jin, Yuandong Ding, Mingxiao Feng, Li Zhao, Lei Song, and Jiang Bian. 2023. H-TSP: hierarchically solving the large-scale traveling sales- man problem. InProceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on...

work page doi:10.1609/aaai.v37i8.26120 2023
[32]

Jonathan Pirnay and Dominik G Grimm. 2024. Self-Improvement for Neural Combinatorial Optimization: Sample Without Replacement, but Improvement. Transactions on Machine Learning Research(2024). https://openreview.net/ forum?id=agT8ojoH0X Featured Certification

2024
[33]

Ruizhong Qiu, Zhiqing Sun, and Yiming Yang. 2022. DIMES: a differentiable meta solver for combinatorial optimization problems. InProceedings of the 36th International Conference on Neural Information Processing Systems(New Orleans, LA, USA)(NIPS ’22). Curran Associates Inc., Red Hook, NY, USA, Article 1851, 16 pages

2022
[34]

Gerhard Reinelt. 1991. TSPLIB - A Traveling Salesman Problem Library.INFORMS J. Comput.3 (1991), 376–384

1991
[35]

Rafał Skinderowicz. 2022. Improving Ant Colony Optimization efficiency for solving large TSP instances.Applied Soft Computing120 (5 2022), 108653. doi:10. 1016/J.ASOC.2022.108653

work page arXiv 2022
[36]

Thomas Stützle and Holger H. Hoos. 2000. MAX–MIN Ant System.Future Generation Computer Systems16 (6 2000), 889–914. Issue 8. doi:10.1016/S0167- 739X(00)00043-1

work page doi:10.1016/s0167- 2000
[37]

Zhiqing Sun and Yiming Yang. 2023. DIFUSCO: graph-based diffusion solvers for combinatorial optimization. InProceedings of the 37th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2023
[38]

Eduardo Uchoa, Diego Pecin, Artur Pessoa, Marcus Poggi, Thibaut Vidal, and Anand Subramanian. 2017. New benchmark instances for the Capacitated Vehicle Routing Problem.European Journal of Operational Research257 (3 2017), 845–858. Issue 3. doi:10.1016/J.EJOR.2016.08.012

work page doi:10.1016/j.ejor.2016.08.012 2017
[39]

Thibaut Vidal. 2022. Hybrid genetic search for the CVRP: Open-source imple- mentation and SWAP* neighborhood.Computers & Operations Research140 (4 2022), 105643. doi:10.1016/J.COR.2021.105643

work page doi:10.1016/j.cor.2021.105643 2022
[40]

Liang Xin, Wen Song, Zhiguang Cao, and Jie Zhang. 2021. NeuroLKH: combin- ing deep learning model with lin-kernighan-helsgaun heuristic for solving the traveling salesman problem. InProceedings of the 35th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2021
[41]

Haoran Ye, Jiarui Wang, Zhiguang Cao, Helan Liang, and Yong Li. 2023. DeepACO: neural-enhanced ant systems for combinatorial optimization. InProceedings of the 37th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2023
[42]

Haoran Ye, Jiarui Wang, Helan Liang, Zhiguang Cao, Yong Li, and Fanzhang Li
[43]

BFS Repair

GLOP: learning global partition and local construction for solving large- scale routing problems in real-time. InProceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intelligence. AAAI Pres...

work page doi:10.1609/aaai.v38i18.30009

[1] [1]

Sohel Rahman, and Md Adnan Arefeen

Abrar Rahman Abir, Muhammad Ali Nayeem, M. Sohel Rahman, and Md Adnan Arefeen. 2025. GTG-ACO: Graph Transformer Guided Ant Colony Optimization for learning heuristics and pheromone dynamics for combinatorial optimization. Swarm and Evolutionary Computation99 (12 2025). doi:10.1016/j.swevo.2025. 102147

work page doi:10.1016/j.swevo.2025 2025

[2] [2]

A Amuthan and K Deepa Thilak. 2017. Improved Ant Colony Algorithms for Eliminating Stagnation and Local Optimum Problem — A Survey. In2017 Interna- tional Conference on Technical Advancements in Computers and Communications (ICTACC). 97–101. doi:10.1109/ICTACC.2017.33 Dat Thanh Tran, Van Khu Vu, and Yining Ma

work page doi:10.1109/ictacc.2017.33 2017

[3] [3]

Federico Berto, Chuanbo Hua, Junyoung Park, Laurin Luttmann, Yining Ma, Fanchen Bu, Jiarui Wang, Haoran Ye, Minsu Kim, Sanghyeok Choi, Nayeli Gast Zepeda, André Hottung, Jianan Zhou, Jieyi Bi, Yu Hu, Fei Liu, Hyeonah Kim, Jiwoo Son, Haeyeon Kim, Davide Angioni, Wouter Kool, Zhiguang Cao, Qingfu Zhang, Joungho Kim, Jie Zhang, Kijung Shin, Cathy Wu, Sungsoo...

work page doi:10.1145/3711896.3737433 2025

[4] [4]

Darren Chitty. 2018. Applying ACO to Large Scale TSP Instances. InAdvances in Intelligent Systems and Computing, Vol. 650. 104–118. doi:10.1007/978-3-319- 66939-7_9

work page doi:10.1007/978-3-319- 2018

[5] [5]

Paulo da Costa, Jason Rhuggenaath, Yingqian Zhang, Alp Akcay, and Uzay Kay- mak. 2021. Learning 2-Opt Heuristics for Routing Problems via Deep Reinforce- ment Learning.SN Computer Science2 (2021), 388. Issue 5. doi:10.1007/s42979- 021-00779-2

work page doi:10.1007/s42979- 2021

[6] [6]

2019.Ant colony optimization: overview and recent advances

Marco Dorigo and Thomas Stützle. 2019.Ant colony optimization: overview and recent advances. Springer

2019

[7] [7]

Darko Drakulic, Sofia Michel, Florian Mai, Arnaud Sors, and Jean-Marc Andreoli

[8] [8]

InProceedings of the 37th International Conference on Neural Information Processing Systems(Red Hook, NY, USA)

BQ-NCO: bisimulation quotienting for efficient neural combinatorial opti- mization. InProceedings of the 37th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

[9] [9]

Han Fang, Zhihao Song, Paul Weng, and Yutong Ban. 2024. INViT: a generalizable routing problem solver with invariant nested view transformer. InProceedings of the 41st International Conference on Machine Learning. JMLR.org

2024

[10] [10]

Zhang-Hua Fu, Kai-Bin Qiu, and Hongyuan Zha. 2021. Generalize a small pre-trained model to arbitrarily large tsp instances. InProceedings of the AAAI conference on artificial intelligence, Vol. 35. 7474–7482

2021

[11] [11]

2019.Exact combinatorial optimization with graph convolutional neural networks

Maxime Gasse, Didier Chételat, Nicola Ferroni, Laurent Charlin, and Andrea Lodi. 2019.Exact combinatorial optimization with graph convolutional neural networks. Curran Associates Inc

2019

[12] [12]

Michael Guntsch and Martin Middendorf. 2002. A population based approach for ACO. InApplications of Evolutionary Computing: EvoWorkshops 2002: EvoCOP, EvoIASP, EvoSTIM/EvoPLAN Kinsale, Ireland, April 3–4, 2002 Proceedings. Springer, 72–81

2002

[13] [13]

Keld Helsgaun. 2000. An effective implementation of the Lin–Kernighan traveling salesman heuristic.European Journal of Operational Research126 (10 2000), 106–

2000

[14] [14]

doi:10.1016/S0377-2217(99)00284-2

Issue 1. doi:10.1016/S0377-2217(99)00284-2

work page doi:10.1016/s0377-2217(99)00284-2

[15] [15]

Keld Helsgaun. 2017. An Extension of the Lin-Kernighan-Helsgaun TSP Solver for Constrained Traveling Salesman and Vehicle Routing Problems. doi:10.13140/ RG.2.2.25569.40807

work page arXiv 2017

[16] [16]

Taoan Huang, Aaron M Ferber, Arman Zharmagambetov, Yuandong Tian, and Bistra Dilkina. 2024. Contrastive Predict-and-Search for Mixed Integer Linear Programs. https://openreview.net/forum?id=J2kRjUAOLh

2024

[17] [17]

Minsu Kim, Sanghyeok Choi, Hyeonah Kim, Jiwoo Son, Jinkyoo Park, and Yoshua Bengio. 2025. Ant Colony Sampling with GFlowNets for Combinatorial Optimiza- tion.Proceedings of Machine Learning Research258 (2025), 469–477. Publisher Copyright: Copyright 2025 by the author(s).; 28th International Conference on Artificial Intelligence and Statistics, AISTATS 2025

2025

[18] [18]

Wouter Kool, Herke van Hoof, and Max Welling. 2019. Attention, Learn to Solve Routing Problems!. InInternational Conference on Learning Representations. https://openreview.net/forum?id=ByxBFsRqYm

2019

[19] [19]

Yeong-Dae Kwon, Jinho Choo, Byoungjip Kim, Iljoo Yoon, Youngjune Gwon, and Seungjai Min. 2020. POMO: policy optimization with multiple optima for reinforcement learning. InProceedings of the 34th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2020

[20] [20]

Sirui Li, Zhongxia Yan, and Cathy Wu. 2021. Learning to delegate for large-scale vehicle routing. InProceedings of the 35th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2021

[21] [21]

Bo-Cheng Lin, Yi Mei, and Mengjie Zhang. 2026. HEATACO: Heatmap-Guided Ant Colony Decoding for Large-Scale Travelling Salesman Problems. https: //arxiv.org/abs/2601.19041

work page arXiv 2026

[22] [22]

Fu Luo, Xi Lin, Fei Liu, Qingfu Zhang, and Zhenkun Wang. 2023. Neural combi- natorial optimization with heavy decoder: toward large scale generalization. In Proceedings of the 37th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2023

[23] [23]

Fu Luo, Xi Lin, Yaoxin Wu, Zhenkun Wang, Tong Xialiang, Mingxuan Yuan, and Qingfu Zhang. 2025. Boosting Neural Combinatorial Optimization for Large- Scale Vehicle Routing Problems. InThe Thirteenth International Conference on Learning Representations. https://openreview.net/forum?id=TbTJJNjumY

2025

[24] [24]

Fu Luo, Xi Lin, Mengyuan Zhong, Fei Liu, Zhenkun Wang, Jianyong Sun, and Qingfu Zhang. 2025. Learning to Insert for Constructive Neural Vehicle Routing Solver. InNeural Information Processing Systems

2025

[25] [25]

Yining Ma, Zhiguang Cao, and Yeow Meng Chee. 2023. Learning to search feasible and infeasible regions of routing problems with flexible neural k-opt. In Proceedings of the 37th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2023

[26] [26]

Martínez and José M

Pablo A. Martínez and José M. García. 2021. ACOTSP-MF: A memory-friendly and highly scalable ACOTSP approach.Engineering Applications of Artificial Intelligence99 (2021), 104131

2021

[27] [27]

2010.Traveling sales- man problem: an overview of applications, formulations, and solution approaches

Rajesh Matai, Surya Prakash Singh, and Murari Lal Mittal. 2010.Traveling sales- man problem: an overview of applications, formulations, and solution approaches. Vol. 1. InTech Rijeka, 1–25

2010

[28] [28]

Nina Mazyavkina, Sergey Sviridov, Sergei Ivanov, and Evgeny Burnaev. 2021. Reinforcement learning for combinatorial optimization: A survey.Computers & Operations Research134 (10 2021), 105400. doi:10.1016/J.COR.2021.105400

work page doi:10.1016/j.cor.2021.105400 2021

[29] [29]

Yimeng Min, Yiwei Bai, and Carla P. Gomes. 2023. Unsupervised learning for solving the travelling salesman problem. InProceedings of the 37th International Conference on Neural Information Processing Systems(New Orleans, LA, USA) (NIPS ’23). Curran Associates Inc., Red Hook, NY, USA, Article 2047, 15 pages

2023

[30] [30]

Wenbin Ouyang, Sirui Li, Yining Ma, and Cathy Wu. 2026. Learning to Seg- ment for Vehicle Routing Problems. InInternational Conference on Learning Representations

2026

[31] [31]

Xuanhao Pan, Yan Jin, Yuandong Ding, Mingxiao Feng, Li Zhao, Lei Song, and Jiang Bian. 2023. H-TSP: hierarchically solving the large-scale traveling sales- man problem. InProceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on...

work page doi:10.1609/aaai.v37i8.26120 2023

[32] [32]

Jonathan Pirnay and Dominik G Grimm. 2024. Self-Improvement for Neural Combinatorial Optimization: Sample Without Replacement, but Improvement. Transactions on Machine Learning Research(2024). https://openreview.net/ forum?id=agT8ojoH0X Featured Certification

2024

[33] [33]

Ruizhong Qiu, Zhiqing Sun, and Yiming Yang. 2022. DIMES: a differentiable meta solver for combinatorial optimization problems. InProceedings of the 36th International Conference on Neural Information Processing Systems(New Orleans, LA, USA)(NIPS ’22). Curran Associates Inc., Red Hook, NY, USA, Article 1851, 16 pages

2022

[34] [34]

Gerhard Reinelt. 1991. TSPLIB - A Traveling Salesman Problem Library.INFORMS J. Comput.3 (1991), 376–384

1991

[35] [35]

Rafał Skinderowicz. 2022. Improving Ant Colony Optimization efficiency for solving large TSP instances.Applied Soft Computing120 (5 2022), 108653. doi:10. 1016/J.ASOC.2022.108653

work page arXiv 2022

[36] [36]

Thomas Stützle and Holger H. Hoos. 2000. MAX–MIN Ant System.Future Generation Computer Systems16 (6 2000), 889–914. Issue 8. doi:10.1016/S0167- 739X(00)00043-1

work page doi:10.1016/s0167- 2000

[37] [37]

Zhiqing Sun and Yiming Yang. 2023. DIFUSCO: graph-based diffusion solvers for combinatorial optimization. InProceedings of the 37th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2023

[38] [38]

Eduardo Uchoa, Diego Pecin, Artur Pessoa, Marcus Poggi, Thibaut Vidal, and Anand Subramanian. 2017. New benchmark instances for the Capacitated Vehicle Routing Problem.European Journal of Operational Research257 (3 2017), 845–858. Issue 3. doi:10.1016/J.EJOR.2016.08.012

work page doi:10.1016/j.ejor.2016.08.012 2017

[39] [39]

Thibaut Vidal. 2022. Hybrid genetic search for the CVRP: Open-source imple- mentation and SWAP* neighborhood.Computers & Operations Research140 (4 2022), 105643. doi:10.1016/J.COR.2021.105643

work page doi:10.1016/j.cor.2021.105643 2022

[40] [40]

Liang Xin, Wen Song, Zhiguang Cao, and Jie Zhang. 2021. NeuroLKH: combin- ing deep learning model with lin-kernighan-helsgaun heuristic for solving the traveling salesman problem. InProceedings of the 35th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2021

[41] [41]

Haoran Ye, Jiarui Wang, Zhiguang Cao, Helan Liang, and Yong Li. 2023. DeepACO: neural-enhanced ant systems for combinatorial optimization. InProceedings of the 37th International Conference on Neural Information Processing Systems(Red Hook, NY, USA). Curran Associates Inc

2023

[42] [42]

Haoran Ye, Jiarui Wang, Helan Liang, Zhiguang Cao, Yong Li, and Fanzhang Li

[43] [43]

BFS Repair

GLOP: learning global partition and local construction for solving large- scale routing problems in real-time. InProceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence and Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence and Fourteenth Symposium on Educational Advances in Artificial Intelligence. AAAI Pres...

work page doi:10.1609/aaai.v38i18.30009