pith. machine review for the scientific record. sign in

arxiv: 2605.14416 · v1 · submitted 2026-05-14 · 💻 cs.AI

Recognition: no theorem link

A Unified Knowledge Embedded Reinforcement Learning-based Framework for Generalized Capacitated Vehicle Routing Problems

Authors on Pith no claims yet

Pith reviewed 2026-05-15 02:06 UTC · model grok-4.3

classification 💻 cs.AI
keywords Capacitated Vehicle RoutingReinforcement LearningKnowledge EmbeddingHeuristic DecompositionDynamic ProgrammingGeneralizationCombinatorial OptimizationVehicle Routing Problem
0
0 comments X

The pith

A framework embedding classical routing knowledge into RL achieves better solutions for diverse CVRP variants.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to create a unified RL framework for generalized Capacitated Vehicle Routing Problems by incorporating knowledge from Route-First Cluster-Second heuristics. It decomposes the problem so that dynamic programming solves the clustering part and guides the RL agent on route construction. A history-enhanced context module helps with the partial observability from this split. If correct, this hybrid approach delivers higher quality solutions than pure learning methods while generalizing well to variants with extra constraints like time windows. This would matter for practical logistics where problems vary and end-to-end RL often falls short of classical methods.

Core claim

By decomposing generalized CVRPs into a route-first subproblem solved by RL and a cluster-second subproblem solved by dynamic programming whose results guide the RL solver, and adding a unified history-enhanced context processing module to handle partial observability, the framework produces superior solution quality compared to state-of-the-art learning-based methods and shows strong generalization across CVRP variants.

What carries the argument

The Route-First Cluster-Second decomposition combined with dynamic programming guidance for clustering to inform the RL-based route constructor, along with the history-enhanced context processing module.

If this is right

  • Superior solution quality over state-of-the-art learning-based methods for CVRPs.
  • Smaller performance gap to classical heuristics.
  • Strong generalization to diverse CVRP variants including those with time windows or backhauls.
  • Unified framework applicable to multiple problem types without custom redesign.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar knowledge embedding could improve RL for other combinatorial problems like scheduling or packing.
  • Integration with more advanced DP or exact methods might further close the gap to optimal solutions.
  • Testing on real-world logistics data with noisy constraints could reveal practical robustness.

Load-bearing premise

That decomposing the problem via Route-First Cluster-Second and using dynamic programming to guide RL will consistently mitigate partial observability and yield generalizable gains without adding new biases or causing overfitting to specific variants.

What would settle it

Running the framework on an unseen CVRP variant, such as one involving pickup and delivery with time windows not in the original experiments, and verifying if it still outperforms learning baselines by a similar margin.

Figures

Figures reproduced from arXiv: 2605.14416 by Hao Hu, Liang Wang, Wen Wang, Xiangchen Wu, Xianping Tao.

Figure 1
Figure 1. Figure 1: The Overall Framework of the Proposed Knowledge-embedded RFCS method. [PITH_FULL_IMAGE:figures/full_fig_p004_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: The Average Optimality Gap. 5.4 Results for RQ2 We evaluate the zero-shot transfer capability of RFCS across different capacity distributions encountered during training. During training, for scenarios with n = 50, the vehicle capac￾ity was set to 40. We directly evaluate the model obtained in RQ1 on instances with varying vehicle capacities to examine its performance changes. The target distribution of ve… view at source ↗
read the original abstract

The Capacitated Vehicle Routing Problem (CVRP) is a fundamental NP-hard problem with broad applications in logistics and transportation. Real-world CVRPs often involve diverse objectives and complex constraints, such as time windows or backhaul requirements, motivating the development of a unified solution framework. Recent reinforcement learning (RL) approaches have shown promise in combinatorial optimization, yet they rely on end-to-end learning and lack explicit problem-solving knowledge, limiting solution quality. In this paper, we propose a knowledge-embedded framework inspired by the Route-First Cluster-Second heuristics. It incorporates knowledge at two levels: (1) decomposing CVRPs into the route-first and cluster-second subproblems, and (2) leveraging dynamic programming to solve the second subproblem, whose results guide the RL-based constructive solver to solve the first problem. To mitigate partial observability caused by problem decomposition, we introduce a unified history-enhanced context processing module. Extensive experiments show that this framework achieves superior solution quality compared with state-of-the-art learning-based methods, with a smaller gap to classical heuristics, demonstrating strong generalization across diverse CVRP variants.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a knowledge-embedded RL framework for generalized CVRPs by decomposing into route-first (RL) and cluster-second (DP) subproblems, using DP outputs to guide the RL policy, and introducing a history-enhanced context module to handle partial observability from the split. It claims this yields better solution quality than SOTA learning methods and closer performance to classical heuristics, with strong generalization across CVRP variants.

Significance. If the experimental claims hold, the result would be significant for combinatorial optimization because it supplies a concrete mechanism for injecting domain knowledge (heuristic decomposition plus exact DP sub-solvers) into RL, directly addressing the generalization limits of end-to-end learning highlighted in the abstract. The hybrid Route-First Cluster-Second structure with feedback is a clear, non-circular strength that could influence future hybrid solvers.

major comments (2)
  1. Abstract: the central claim of 'superior solution quality' and 'strong generalization' is asserted without any quantitative metrics, baselines, instance sizes, error bars, or ablation results; this is load-bearing because the abstract supplies no evidence that the history module or DP guidance actually delivers the claimed improvements.
  2. Framework description (Route-First Cluster-Second + DP guidance): the skeptic concern lands—the decomposition induces partial observability whose repair by the unified history-enhanced module is not shown to be robust; no referenced test demonstrates that DP-derived guidance avoids systematic bias on out-of-distribution constraint combinations (e.g., unseen time-window + backhaul mixes).
minor comments (1)
  1. Abstract: the phrase 'smaller gap to classical heuristics' is imprecise; name the specific heuristics and report the numerical gap reduction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We agree that the abstract should contain quantitative evidence and will revise it. We also acknowledge the need to demonstrate robustness of the history module on mixed out-of-distribution constraints and will add targeted experiments. Point-by-point responses are provided below.

read point-by-point responses
  1. Referee: Abstract: the central claim of 'superior solution quality' and 'strong generalization' is asserted without any quantitative metrics, baselines, instance sizes, error bars, or ablation results; this is load-bearing because the abstract supplies no evidence that the history module or DP guidance actually delivers the claimed improvements.

    Authors: We agree that the abstract is currently qualitative and should be strengthened with concrete evidence. In the revised manuscript we will insert specific metrics: average optimality gaps (with standard deviations over 5 runs) on CVRP instances of 20-100 customers, direct comparisons to learning baselines (AM, POMO) and classical solvers (LKH, OR-Tools), plus ablation results isolating the DP guidance and history-enhanced module contributions. revision: yes

  2. Referee: Framework description (Route-First Cluster-Second + DP guidance): the skeptic concern lands—the decomposition induces partial observability whose repair by the unified history-enhanced module is not shown to be robust; no referenced test demonstrates that DP-derived guidance avoids systematic bias on out-of-distribution constraint combinations (e.g., unseen time-window + backhaul mixes).

    Authors: We appreciate the concern. The history-enhanced module maintains a running record of prior route decisions to compensate for the split-induced partial observability, and our current experiments already cover separate CVRP variants (time windows, backhauls, etc.) with consistent gains. To directly test mixed OOD combinations, we will add new experiments in the revision: models trained on single-constraint data will be evaluated on unseen time-window + backhaul mixes, with explicit reporting of any systematic bias in the DP-guided policy. Because the DP sub-solver is exact, we expect limited bias, but the new results will quantify this. revision: partial

Circularity Check

0 steps flagged

No significant circularity detected; derivation rests on external heuristics and DP components.

full rationale

The paper's framework decomposes CVRP via the established Route-First Cluster-Second heuristic and feeds DP outputs as guidance to an RL constructive solver, with a history-enhanced module added to address partial observability. These steps are presented as incorporation of independent problem-solving knowledge rather than self-definitions, fitted parameters renamed as predictions, or load-bearing self-citations. No equations or claims in the abstract or described chain reduce the reported generalization or solution quality to the inputs by construction; performance claims are tied to external experimental baselines. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

The framework rests on the assumption that the Route-First Cluster-Second heuristic decomposition is a valid and useful split for RL, plus standard RL training assumptions and the correctness of dynamic programming for the cluster subproblem.

axioms (2)
  • domain assumption Route-First Cluster-Second heuristic provides a beneficial decomposition for CVRP variants
    Invoked in the framework design to justify splitting the problem and using DP on the second subproblem.
  • domain assumption Dynamic programming solves the cluster-second subproblem exactly and its outputs are useful guidance for RL
    Central to the knowledge-embedding claim.
invented entities (1)
  • unified history-enhanced context processing module no independent evidence
    purpose: Mitigate partial observability caused by problem decomposition
    New module introduced to handle information loss from the split; no independent evidence provided beyond the claim.

pith-pipeline@v0.9.0 · 5496 in / 1438 out tokens · 33911 ms · 2026-05-15T02:06:16.113961+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages

  1. [1]

    Routefinder: Towards foundation models for vehicle routing problems

    [Bertoet al., 2024 ] Federico Berto, Chuanbo Hua, Nayeli Gast Zepeda, Andr ´e Hottung, Niels Wouda, Leon Lan, Kevin Tierney, and Jinkyoo Park. Routefinder: Towards foundation models for vehicle routing problems. InICML 2024 Workshop on Foundation Models in the Wild,

  2. [2]

    Learn- ing to handle complex constraints for vehicle routing prob- lems.Advances in Neural Information Processing Sys- tems, 37:93479–93509,

    [Biet al., 2024 ] Jieyi Bi, Yining Ma, Jianan Zhou, Wen Song, Zhiguang Cao, Yaoxin Wu, and Jie Zhang. Learn- ing to handle complex constraints for vehicle routing prob- lems.Advances in Neural Information Processing Sys- tems, 37:93479–93509,

  3. [3]

    Learning to perform local rewriting for combinatorial opti- mization.Advances in neural information processing sys- tems, 32,

    [Chen and Tian, 2019] Xinyun Chen and Yuandong Tian. Learning to perform local rewriting for combinatorial opti- mization.Advances in neural information processing sys- tems, 32,

  4. [4]

    Select and optimize: Learning to solve large-scale tsp instances

    [Chenget al., 2023 ] Hanni Cheng, Haosi Zheng, Ya Cong, Weihao Jiang, and Shiliang Pu. Select and optimize: Learning to solve large-scale tsp instances. InInterna- tional conference on artificial intelligence and statistics, pages 1219–1231. PMLR,

  5. [5]

    Learning 2-opt heuristics for the traveling salesman problem via deep re- inforcement learning

    [d O Costaet al., 2020 ] Paulo R d O Costa, Jason Rhugge- naath, Yingqian Zhang, and Alp Akcay. Learning 2-opt heuristics for the traveling salesman problem via deep re- inforcement learning. InAsian conference on machine learning, pages 465–480. PMLR,

  6. [6]

    Moco: A learnable meta optimizer for combinatorial optimization

    [Derneddeet al., 2025 ] Tim Dernedde, Daniela Thyssens, S¨oren Dittrich, Maximilian Stubbemann, and Lars Schmidt-Thieme. Moco: A learnable meta optimizer for combinatorial optimization. InPacific-Asia Conference on Knowledge Discovery and Data Mining, pages 236–248. Springer,

  7. [7]

    Goal: A generalist combinatorial optimization agent learner

    [Drakulicet al., 2025 ] Darko Drakulic, Sofia Michel, and Jean-Marc Andreoli. Goal: A generalist combinatorial optimization agent learner. InProceedings of the Inter- national Conference on Learning Representations (ICLR) 2025,

  8. [8]

    [Furnon and Perron, ] Vincent Furnon and Laurent Perron

    Poster. [Furnon and Perron, ] Vincent Furnon and Laurent Perron. Or-tools routing library. [Gaoet al., 2024 ] Chengrui Gao, Haopu Shang, Ke Xue, Dong Li, and Chao Qian. Towards generalizable neural solvers for vehicle routing problems via ensemble with transferrable local policy. InProceedings of the Thirty- Third International Joint Conference on Artific...

  9. [9]

    Long short-term memory.Neural computation, 9(8):1735–1780,

    [Hochreiter and Schmidhuber, 1997] Sepp Hochreiter and J¨urgen Schmidhuber. Long short-term memory.Neural computation, 9(8):1735–1780,

  10. [10]

    Neural large neighborhood search for the capacitated vehicle routing problem

    [Hottung and Tierney, 2020] Andr´e Hottung and Kevin Tier- ney. Neural large neighborhood search for the capacitated vehicle routing problem. InECAI 2020, pages 443–450. IOS Press,

  11. [11]

    Optimizing urban logistics: Vehicle routing prob- lem with underground transportation.IEEE Transactions on Intelligent Transportation Systems,

    [Jeong and Song, 2025] Ho Young Jeong and Byung Duk Song. Optimizing urban logistics: Vehicle routing prob- lem with underground transportation.IEEE Transactions on Intelligent Transportation Systems,

  12. [12]

    Sym-nco: Leveraging symmetricity for neural com- binatorial optimization.Advances in Neural Information Processing Systems, 35:1936–1949,

    [Kimet al., 2022a ] Minsu Kim, Junyoung Park, and Jinkyoo Park. Sym-nco: Leveraging symmetricity for neural com- binatorial optimization.Advances in Neural Information Processing Systems, 35:1936–1949,

  13. [13]

    Scale-conditioned adaptation for large scale combinatorial optimization

    [Kimet al., 2022b ] Minsu Kim, Jiwoo Son, Hyeonah Kim, and Jinkyoo Park. Scale-conditioned adaptation for large scale combinatorial optimization. InNeurIPS 2022 Work- shop on Distribution Shifts: Connecting Methods and Ap- plications,

  14. [14]

    Attention, learn to solve routing problems! In International Conference on Learning Representations,

    [Koolet al., 2018 ] Wouter Kool, Herke van Hoof, and Max Welling. Attention, learn to solve routing problems! In International Conference on Learning Representations,

  15. [15]

    Deep policy dynamic pro- gramming for vehicle routing problems

    [Koolet al., 2022 ] Wouter Kool, Herke van Hoof, Joaquim Gromicho, and Max Welling. Deep policy dynamic pro- gramming for vehicle routing problems. InInternational conference on integration of constraint programming, ar- tificial intelligence, and operations research, pages 190–

  16. [16]

    Pomo: Policy optimization with multiple optima for reinforcement learning.Advances in Neural Information Processing Systems, 33:21188–21198,

    [Kwonet al., 2020 ] Yeong-Dae Kwon, Jinho Choo, By- oungjip Kim, Iljoo Yoon, Youngjune Gwon, and Seungjai Min. Pomo: Policy optimization with multiple optima for reinforcement learning.Advances in Neural Information Processing Systems, 33:21188–21198,

  17. [17]

    Distri- bution path optimization for intelligent logistics vehicles of urban rail transportation using vrp optimization model

    [Leng and Li, 2021] Kaijun Leng and Shanghong Li. Distri- bution path optimization for intelligent logistics vehicles of urban rail transportation using vrp optimization model. IEEE Transactions on Intelligent Transportation Systems, 23(2):1661–1669,

  18. [18]

    Complexity of vehicle routing and scheduling problems.Networks, 11(2):221–227,

    [Lenstra and Kan, 1981] Jan Karel Lenstra and AHG Rin- nooy Kan. Complexity of vehicle routing and scheduling problems.Networks, 11(2):221–227,

  19. [19]

    The open vehicle routing problem: Algorithms, large-scale test problems, and computational results.Com- puters & operations research, 34(10):2918–2930,

    [Liet al., 2007 ] Feiyue Li, Bruce Golden, and Edward Wasil. The open vehicle routing problem: Algorithms, large-scale test problems, and computational results.Com- puters & operations research, 34(10):2918–2930,

  20. [20]

    Destroy and repair using hyper-graphs for rout- ing

    [Liet al., 2025 ] Ke Li, Fei Liu, Zhenkun Wang, and Qingfu Zhang. Destroy and repair using hyper-graphs for rout- ing. InProceedings of the AAAI Conference on Artificial Intelligence, pages 18341–18349,

  21. [21]

    Neural combinatorial optimiza- tion with heavy decoder: Toward large scale generaliza- tion.Advances in Neural Information Processing Systems, 36:8845–8864,

    [Luoet al., 2023 ] Fu Luo, Xi Lin, Fei Liu, Qingfu Zhang, and Zhenkun Wang. Neural combinatorial optimiza- tion with heavy decoder: Toward large scale generaliza- tion.Advances in Neural Information Processing Systems, 36:8845–8864,

  22. [22]

    Order-first split-second methods for vehicle routing problems: A review.Transportation Re- search Part C-emerging Technologies, 40:179–200,

    [Prinset al., 2014 ] Christian Prins, Philippe Lacomme, and Caroline Prodhon. Order-first split-second methods for vehicle routing problems: A review.Transportation Re- search Part C-emerging Technologies, 40:179–200,

  23. [23]

    Two meta-heuristics for solving the capacitated ve- hicle routing problem: the case of the tunisian post office

    [Sbaiet al., 2022 ] Ines Sbai, Saoussen Krichen, and Olfa Li- mam. Two meta-heuristics for solving the capacitated ve- hicle routing problem: the case of the tunisian post office. Operational Research, 22(1):507–549,

  24. [24]

    Symmetry-informed marl: A de- centralized and cooperative uav swarm control approach for communication coverage.IEEE Transactions on Mo- bile Computing, 24(01):1–18,

    [Shiet al., 2025 ] Rongye Shi, Xin Yu, Yandong Wang, Yongkai Tian, Zhenyu Liu, Wenjun Wu, Xiao-Ping Zhang, and Manuela M Veloso. Symmetry-informed marl: A de- centralized and cooperative uav swarm control approach for communication coverage.IEEE Transactions on Mo- bile Computing, 24(01):1–18,

  25. [25]

    Equity-transformer: Solving np-hard min-max routing problems as sequential generation with equity context

    [Sonet al., 2024 ] Jiwoo Son, Minsu Kim, Sanghyeok Choi, Hyeonah Kim, and Jinkyoo Park. Equity-transformer: Solving np-hard min-max routing problems as sequential generation with equity context. InProceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 20265–20273,

  26. [26]

    Learning 3-opt heuristics for traveling salesman problem via deep reinforcement learning

    [Suiet al., 2021 ] Jingyan Sui, Shizhe Ding, Ruizhi Liu, Liming Xu, and Dongbo Bu. Learning 3-opt heuristics for traveling salesman problem via deep reinforcement learning. InAsian conference on machine learning, pages 1301–1316. PMLR,

  27. [27]

    An optimization model for vehicle routing problem in last-mile delivery.Expert Systems with Applications, 222:119789,

    [Tiwari and Sharma, 2023] Krishna Veer Tiwari and Satyen- dra Kumar Sharma. An optimization model for vehicle routing problem in last-mile delivery.Expert Systems with Applications, 222:119789,

  28. [28]

    A unified solution framework for multi-attribute vehicle routing problems

    [Vidalet al., 2014 ] Thibaut Vidal, Teodor Gabriel Crainic, Michel Gendreau, and Christian Prins. A unified solution framework for multi-attribute vehicle routing problems. European Journal of Operational Research, 234(3):658– 673,

  29. [29]

    Technical note: Split algorithm in o(n) for the capacitated vehicle routing problem.Com- put

    [Vidal, 2015] Thibaut Vidal. Technical note: Split algorithm in o(n) for the capacitated vehicle routing problem.Com- put. Oper. Res., 69:40–47,

  30. [30]

    Pointer networks.Advances in neural in- formation processing systems, 28,

    [Vinyalset al., 2015 ] Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly. Pointer networks.Advances in neural in- formation processing systems, 28,

  31. [31]

    Solv- ing the min-max multiple traveling salesmen problem via learning-based path generation and optimal splitting

    [Wanget al., 2025 ] Wen Wang, Xiangchen Wu, Liang Wang, Hao Hu, Xianping Tao, and Linghao Zhang. Solv- ing the min-max multiple traveling salesmen problem via learning-based path generation and optimal splitting. arXiv preprint arXiv:2508.17087,

  32. [32]

    Williams

    [Williams, 1992] Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforce- ment learning.Mach. Learn., 8(3–4):229–256, May

  33. [33]

    Wouda, Leon Lan, and Wouter Kool

    [Woudaet al., 2024 ] Niels A. Wouda, Leon Lan, and Wouter Kool. PyVRP: a high-performance VRP solver package. INFORMS Journal on Computing, 36(4):943–955,

  34. [34]

    Learning improvement heuris- tics for solving routing problems.IEEE transactions on neural networks and learning systems, 33(9):5057–5069,

    [Wuet al., 2021 ] Yaoxin Wu, Wen Song, Zhiguang Cao, Jie Zhang, and Andrew Lim. Learning improvement heuris- tics for solving routing problems.IEEE transactions on neural networks and learning systems, 33(9):5057–5069,

  35. [35]

    Neural combinatorial optimiza- tion algorithms for solving vehicle routing problems: A comprehensive survey with perspectives.arXiv preprint arXiv:2406.00415,

    [Wuet al., 2024 ] Xuan Wu, Di Wang, Lijie Wen, Yubin Xiao, Chunguo Wu, Yuesong Wu, Chaoyu Yu, Douglas L Maskell, and You Zhou. Neural combinatorial optimiza- tion algorithms for solving vehicle routing problems: A comprehensive survey with perspectives.arXiv preprint arXiv:2406.00415,

  36. [36]

    [Xinet al., 2021 ] Liang Xin, Wen Song, Zhiguang Cao, and Jie Zhang. Neurolkh: Combining deep learning model with lin-kernighan-helsgaun heuristic for solving the trav- eling salesman problem.Advances in Neural Information Processing Systems, 34:7472–7483,

  37. [37]

    Deep neural network ap- proximated dynamic programming for combinatorial opti- mization

    [Xuet al., 2020 ] Shenghe Xu, Shivendra S Panwar, Murali Kodialam, and TV Lakshman. Deep neural network ap- proximated dynamic programming for combinatorial opti- mization. InProceedings of the AAAI Conference on Arti- ficial Intelligence, volume 34, pages 1684–1691,

  38. [38]

    Leveraging partial symmetry for multi-agent reinforcement learning

    [Yuet al., 2024 ] Xin Yu, Rongye Shi, Pu Feng, Yongkai Tian, Simin Li, Shuhao Liao, and Wenjun Wu. Leveraging partial symmetry for multi-agent reinforcement learning. InProceedings of the AAAI Conference on Artificial Intel- ligence, volume 38, pages 17583–17590,

  39. [39]

    Review of vehicle rout- ing problems: Models, classification and solving algo- rithms.Archives of Computational Methods in Engineer- ing, 29(1):195–221,

    [Zhanget al., 2022 ] Haifei Zhang, Hongwei Ge, Jinlong Yang, and Yubing Tong. Review of vehicle rout- ing problems: Models, classification and solving algo- rithms.Archives of Computational Methods in Engineer- ing, 29(1):195–221,

  40. [40]

    Combining reinforce- ment learning with lin-kernighan-helsgaun algorithm for the traveling salesman problem

    [Zhenget al., 2021 ] Jiongzhi Zheng, Kun He, Jianrong Zhou, Yan Jin, and Chu-Min Li. Combining reinforce- ment learning with lin-kernighan-helsgaun algorithm for the traveling salesman problem. InProceedings of the AAAI conference on artificial intelligence, volume 35, pages 12445–12452,

  41. [41]

    Dpn: Decoupling partition and navigation for neural solvers of min-max vehicle routing problems

    [Zhenget al., 2024 ] Zhi Zheng, Shunyu Yao, Zhenkun Wang, Tong Xialiang, Mingxuan Yuan, and Ke Tang. Dpn: Decoupling partition and navigation for neural solvers of min-max vehicle routing problems. InInternational Conference on Machine Learning, pages 61559–61592. PMLR,

  42. [42]

    Mvmoe: Multi-task vehicle routing solver with mixture-of-experts

    [Zhouet al., 2024 ] Jianan Zhou, Zhiguang Cao, Yaoxin Wu, Wen Song, Yining Ma, Jie Zhang, and Xu Chi. Mvmoe: Multi-task vehicle routing solver with mixture-of-experts. InInternational Conference on Machine Learning, pages 61804–61824. PMLR, 2024