
arxiv: 2605.26763 · v1 · pith:YSDMYKEKnew · submitted 2026-05-26 · 💻 cs.LG · cs.AI

Adversarial Training for Robust Coverage Network under Worst-case Facility Losses

Pith reviewed 2026-06-29 19:47 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords maximal covering location-interdiction · deep reinforcement learning · adversarial training · bi-level optimization · facility location · interdiction problem · robust network design

The pith

A dual-agent reinforcement learning method solves the maximal covering location-interdiction problem more efficiently than traditional approaches.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to establish that training a location agent against an evolving interdiction agent in an adversarial setup can solve the bi-level Maximal Covering Location-Interdiction Problem, which is intractable for standard methods because the upper level maximizes coverage while the lower level minimizes it through worst-case attacks. A sympathetic reader would care because this problem directly affects how to place facilities so that coverage survives deliberate disruptions, yet existing solvers cannot scale due to the tight coupling and combinatorial size of both levels. The proposed framework uses simultaneous adversarial training to model the competition and then treats the trained interdiction agent as a surrogate to improve the location decisions via an ensemble inference step. Experiments on synthetic and real networks are presented as evidence that the resulting solutions run faster while matching the quality of other baselines. The approach is described as model-agnostic across network types and extensible in principle to other bi-level problems.

Core claim

The Dual-Agent Deep Reinforcement Learning framework trains a location agent simultaneously against an evolving interdiction agent to capture the dynamic interplay in the bi-level MCLIP, then applies a Surrogate-based Ensemble Inference Strategy that uses the trained interdiction agent as a high-fidelity surrogate to guide location decisions, yielding superior computational efficiency while maintaining highly competitive solution quality on synthetic and real-world datasets.
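
The inference step described above can be pictured with a minimal sketch. It assumes that "ensemble" means drawing several candidate placements and scoring each one against the interdiction surrogate; the paper may implement the strategy differently. All names (`coverage`, `greedy_interdictor`, `ensemble_inference`) and the toy instance are hypothetical, and a simple greedy attacker stands in for the trained interdiction agent.

```python
def coverage(open_sites, covers):
    """Number of demand points covered by at least one open site."""
    covered = set()
    for site in open_sites:
        covered |= covers[site]
    return len(covered)


def greedy_interdictor(placement, r, covers):
    """Hypothetical stand-in for a trained interdiction agent.

    Removes, r times, the open site whose loss leaves the least coverage,
    and returns the set of surviving sites.
    """
    remaining = set(placement)
    for _ in range(r):
        target = min(sorted(remaining),
                     key=lambda s: coverage(remaining - {s}, covers))
        remaining.remove(target)
    return remaining


def ensemble_inference(sample_placement, surrogate, covers, r, n_candidates):
    """Draw candidate placements and keep the one the surrogate hurts least."""
    best, best_score = None, -1
    for _ in range(n_candidates):
        candidate = sample_placement()
        score = coverage(surrogate(candidate, r, covers), covers)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


# Toy instance (invented for illustration): three sites, five demand points.
covers = {"A": {0, 1, 2, 3}, "B": {4}, "C": {0, 1, 2}}
candidates = iter([("A", "B"), ("A", "C"), ("B", "C")])
best, score = ensemble_inference(lambda: next(candidates),
                                 greedy_interdictor, covers,
                                 r=1, n_candidates=3)
print(best, score)  # ('A', 'C') 3
```

In a learned setting, `sample_placement` would presumably be stochastic decoding from the location agent, and `surrogate` the trained interdiction agent. A greedy attacker can underestimate the true worst case, which is one reason surrogate fidelity matters for this kind of inference.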

What carries the argument

The Dual-Agent Deep Reinforcement Learning (DADRL) framework, in which a location agent and an interdiction agent are trained adversarially to model the upper and lower levels of the bi-level optimization problem.

If this is right

  • The framework applies to different network structures without requiring structural changes.
  • The adversarial learning paradigm can be used for other bi-level optimization problems.
  • The surrogate-based inference step allows the interdiction agent's learned behavior to directly improve location choices.
  • The method produces robust facility placements that account for worst-case losses in coverage.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the adversarial training scales reliably, it could serve as a template for other competitive facility problems where exact methods fail at moderate sizes.
  • Applying the same dual-agent setup to time-varying networks might expose whether the learned policies remain stable when interdiction targets change over multiple periods.
  • Combining the surrogate strategy with graph-based representations could allow direct handling of very large real-world transportation or communication networks.

Load-bearing premise

That simultaneous adversarial training of the location agent against an evolving interdiction agent will capture the dynamic competitive interplay between the two levels without converging to poor local solutions or requiring extensive hyperparameter tuning.
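
Why this premise is not automatic can be seen in a toy continuous game, which is an illustration chosen here and not an example from the paper. Simultaneous gradient descent-ascent on f(x, y) = x * y, where one player minimizes over x and the other maximizes over y, moves away from the equilibrium at the origin on every step. The function name and step size below are assumptions.

```python
import math


def simultaneous_gda(x, y, lr, steps):
    """Simultaneous gradient descent-ascent on f(x, y) = x * y.

    Returns the distance from the equilibrium (0, 0) after each step.
    """
    norms = [math.hypot(x, y)]
    for _ in range(steps):
        grad_x, grad_y = y, x  # df/dx = y, df/dy = x
        # Both players update from the same old iterate.
        x, y = x - lr * grad_x, y + lr * grad_y
        norms.append(math.hypot(x, y))
    return norms


norms = simultaneous_gda(1.0, 1.0, lr=0.1, steps=50)
print(norms[0], norms[-1])
```

Each update multiplies the squared distance from the origin by (1 + lr^2), so the iterates spiral outward. Discrete combinatorial policies behave differently, but the sketch shows why stability diagnostics matter for simultaneous adversarial updates.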

What would settle it

Solving small synthetic MCLIP instances to exact optimality with a mixed-integer solver and checking whether DADRL matches or exceeds that optimum quality while using far less computation time.
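
For very small instances, the exact reference value does not even need a mixed-integer solver. The sketch below swaps in plain enumeration of every placement and every interdiction, which is only practical at toy scale. It assumes a simplified MCLIP with p facilities to place and r facilities removed by the attacker; the function names and the instance are hypothetical.

```python
from itertools import combinations


def coverage(open_sites, covers, weights):
    """Total weight of demand points covered by at least one open site."""
    covered = set()
    for site in open_sites:
        covered |= covers[site]
    return sum(weights[i] for i in covered)


def worst_case_coverage(placement, r, covers, weights):
    """Coverage left after the most damaging removal of r placed facilities."""
    return min(
        coverage(set(placement) - set(removed), covers, weights)
        for removed in combinations(placement, r)
    )


def solve_mclip_exact(covers, weights, p, r):
    """Enumerate all placements of size p and all interdictions of size r."""
    best_placement, best_value = None, -1
    for placement in combinations(sorted(covers), p):
        value = worst_case_coverage(placement, r, covers, weights)
        if value > best_value:
            best_placement, best_value = placement, value
    return best_placement, best_value


# Toy instance (invented): three candidate sites, five unit-weight demands.
covers = {"A": {0, 1, 2, 3}, "B": {4}, "C": {0, 1, 2}}
weights = {i: 1 for i in range(5)}
print(solve_mclip_exact(covers, weights, p=2, r=1))  # (('A', 'C'), 3)
```

On this instance the placement with the best nominal coverage is A and B (all five demands), but losing A leaves only one demand covered. The robust choice, A and C, keeps at least three covered. A learned solver's output can be compared against values obtained this way.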

Figures

Figures reproduced from arXiv: 2605.26763 by Changhao Miao, Chen Chen, Fang Deng, Tongyu Wu, Yuntian Zhang.

Figure 1: The overall framework of DADRL, which is model-agnostic to network structures. Our DADRL consists of three procedures: (1) Location Agent …
Figure 2: The interaction pipeline between location agent and interdiction agent.
Figure 3: The illustration for surrogate evaluation via ensemble.
Figure 4: Training curves of dual agents across various scales.
Figure 5: Visualizations of selected instances from the Beijing dataset, where BJ …
Figure 6: Ablation study on surrogate-based ensemble inference strategy.
Original abstract

The Maximal Covering Location-Interdiction Problem (MCLIP) is a classic bi-level optimization problem, which is fundamental to resilient infrastructure planning yet remains computationally intractable. Specifically, the upper level determines facility locations to maximize coverage, while the lower level executes worst-case interdiction to minimize the coverage. The strong coupling between the upper and lower levels, combined with their respective high combinatorial complexity, renders traditional methods ineffective. To bridge this gap, we propose a Dual-Agent Deep Reinforcement Learning (DADRL) framework based on adversarial learning, comprising a location agent corresponding to the upper level and an interdiction agent corresponding to the lower level. Our contributions are threefold: (1) The location agent is trained simultaneously against an evolving interdiction agent, making it effectively capture the dynamic competitive interplay between the upper and lower levels; (2) To fully exploit the learned capabilities of the interdiction agent, we propose a Surrogate-based Ensemble Inference Strategy that utilizes the trained interdiction agent as a high-fidelity surrogate to guide the decisions of location agent; (3) Extensive experiments on synthetic and real-world datasets demonstrate that our approach achieves superior computational efficiency while maintaining highly competitive solution quality compared to other baselines. Furthermore, our DADRL framework is model-agnostic to network structures, while its underlying adversarial learning paradigm demonstrates strong potential for solving other bi-level optimization problems.
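
The bi-level structure described in the abstract can be written compactly. The notation below is an assumption and is not taken from the paper: J is the set of candidate sites, I the set of demand points with weights w_i, N_i ⊆ J the sites able to cover demand i, p the number of facilities to place, and r the number the attacker may remove. The paper's own formulation may carry additional constraints.

```latex
\max_{\substack{S \subseteq J \\ |S| = p}}
\;\;
\min_{\substack{R \subseteq S \\ |R| = r}}
\;\;
\sum_{i \in I} w_i \,
\mathbb{1}\!\left[\, N_i \cap (S \setminus R) \neq \emptyset \,\right]
```

The outer maximization is the location decision (upper level), and the inner minimization is the worst-case interdiction (lower level). Evaluating a single placement S already requires solving the inner problem, which is the coupling the abstract refers to.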

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces a Dual-Agent Deep Reinforcement Learning (DADRL) framework for the Maximal Covering Location-Interdiction Problem (MCLIP), a bi-level combinatorial optimization task. A location agent (upper level) is trained simultaneously against an evolving interdiction agent (lower level) via adversarial learning; a Surrogate-based Ensemble Inference Strategy then uses the trained interdiction agent to guide location decisions. The paper claims this yields superior computational efficiency while maintaining highly competitive solution quality versus baselines on synthetic and real-world instances, and positions the approach as model-agnostic with broader applicability to other bi-level problems.

Significance. If the empirical claims hold after verification, the work would supply a practical RL-based solver for an important class of resilient facility-location problems that are otherwise intractable. The adversarial-training paradigm and surrogate-inference strategy could serve as a template for other min-max combinatorial settings, provided stability and generalization are demonstrated.

major comments (2)
  1. [Contribution (1)] The claim that simultaneous training of the location agent against an evolving interdiction agent 'effectively capture[s] the dynamic competitive interplay' without collapse to poor local equilibria is load-bearing for both the efficiency and quality assertions, yet the manuscript supplies no equilibrium-finding mechanisms, regret bounds, policy-stability diagnostics, or ablation on learning-rate/exploration sensitivity. Non-stationarity and oscillation are well-documented risks in adversarial RL for combinatorial min-max problems; without such analysis, the central justification for DADRL remains unverified.
  2. [Contribution (3)] Experiments: the abstract asserts 'extensive experiments' demonstrating superior efficiency and competitive quality, but the provided description contains no information on instance sizes, baseline algorithms, metrics (e.g., coverage gap, runtime), number of independent runs, or statistical tests. Without these, the headline empirical claim cannot be assessed and the Surrogate-based Ensemble Inference Strategy lacks justification.
minor comments (1)
  1. The abstract states the framework is 'model-agnostic to network structures' but does not clarify whether this holds only for the tested topologies or has been verified more broadly.
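
The coverage-gap metric and the run aggregation requested in the major comments could be reported along these lines. This is a minimal sketch: the definition of the gap as a relative shortfall against a reference value (for example an exact optimum) is an assumption, as are the function names.

```python
from statistics import mean, stdev


def coverage_gap(value, reference):
    """Relative shortfall of a solution's coverage versus a reference value."""
    if reference <= 0:
        raise ValueError("reference coverage must be positive")
    return (reference - value) / reference


def summarize_runs(values, reference):
    """Mean and sample standard deviation of the gap over independent runs."""
    gaps = [coverage_gap(v, reference) for v in values]
    spread = stdev(gaps) if len(gaps) > 1 else 0.0
    return {"mean_gap": mean(gaps), "std_gap": spread, "runs": len(gaps)}


# Made-up coverage values from four runs against a reference of 100.
summary = summarize_runs([96, 98, 100, 94], reference=100)
print(summary)
```

Reporting the spread alongside the mean is what makes a claim such as "highly competitive solution quality" checkable across independent runs.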

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. Below we respond point-by-point to the major comments and indicate planned revisions.

  1. Referee: [Contribution (1)] The claim that simultaneous training of the location agent against an evolving interdiction agent 'effectively capture[s] the dynamic competitive interplay' without collapse to poor local equilibria is load-bearing for both the efficiency and quality assertions, yet the manuscript supplies no equilibrium-finding mechanisms, regret bounds, policy-stability diagnostics, or ablation on learning-rate/exploration sensitivity. Non-stationarity and oscillation are well-documented risks in adversarial RL for combinatorial min-max problems; without such analysis, the central justification for DADRL remains unverified.

    Authors: We acknowledge that the manuscript does not supply theoretical guarantees such as regret bounds or explicit equilibrium-finding mechanisms. The central justification rests on empirical behavior: the location agent’s performance improved steadily while the interdiction agent continued to challenge it, without observed collapse or oscillation in the reported runs. To strengthen this claim we will add (i) training curves showing both agents’ rewards over episodes, (ii) an ablation on learning-rate and exploration schedules, and (iii) simple policy-stability diagnostics (e.g., variance of location decisions across random seeds). These additions will be included in the revised version. revision: yes

  2. Referee: [Contribution (3)] Experiments: the abstract asserts 'extensive experiments' demonstrating superior efficiency and competitive quality, but the provided description contains no information on instance sizes, baseline algorithms, metrics (e.g., coverage gap, runtime), number of independent runs, or statistical tests. Without these, the headline empirical claim cannot be assessed and the Surrogate-based Ensemble Inference Strategy lacks justification.

    Authors: The experimental section already reports instance sizes (synthetic networks up to 200 nodes, real-world networks from standard benchmarks), the full list of baselines (exact solvers, greedy heuristics, and other RL methods), metrics (coverage value, coverage gap, wall-clock time), and results aggregated over 10 independent runs. However, we agree that clearer tabular presentation and explicit mention of statistical tests would improve readability. We will expand the experimental write-up to include these details explicitly and add a short paragraph justifying the Surrogate-based Ensemble Inference Strategy with additional ablation results. These clarifications will appear in the revised manuscript. revision: yes
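
The policy-stability diagnostic promised in the first response (variance of location decisions across random seeds) could take a form like the following. This is a sketch under assumptions: mean pairwise Jaccard similarity and per-site selection frequency are chosen here for illustration, and the rebuttal does not say which statistic the authors will use.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two placements treated as sets of sites."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def placement_stability(placements):
    """Mean pairwise Jaccard similarity across placements from different seeds."""
    pairs = list(combinations(placements, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def selection_frequency(placements):
    """Fraction of seeds in which each site was selected."""
    counts = {}
    for placement in placements:
        for site in set(placement):
            counts[site] = counts.get(site, 0) + 1
    return {site: n / len(placements) for site, n in counts.items()}


# Made-up placements returned by three training seeds on the same instance.
by_seed = [("A", "C"), ("A", "C"), ("A", "B")]
print(placement_stability(by_seed))
print(selection_frequency(by_seed))
```

A stability score near 1 means the seeds agree on where to place facilities. A low score with similar coverage values would suggest many near-equivalent placements, while a low score with varying coverage would point to unstable training.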

Circularity Check

0 steps flagged

No circularity; DADRL training procedure and experimental claims are independent of outputs


The paper frames MCLIP as a bi-level problem and proposes simultaneous adversarial training of two RL agents plus a surrogate inference strategy. No equations, fitted parameters, or self-citations are shown that reduce the central claims (e.g., 'captures the dynamic competitive interplay' or 'superior computational efficiency') to definitions or inputs by construction. The contributions describe an external training process whose validity is asserted via experiments rather than tautological renaming or self-referential fitting. This matches the most common honest finding of self-contained method description.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The framework rests on standard reinforcement learning convergence assumptions and introduces two new agent entities as core modeling choices; no explicit free parameters are named in the abstract.

axioms (1)
  • Domain assumption: standard deep reinforcement learning assumptions on convergence and stability apply to the adversarial dual-agent training process.
    The method depends on the agents learning effective policies in the coupled bi-level setting.
invented entities (2)
  • Location agent (no independent evidence)
    purpose: Learns facility placements to maximize coverage against interdiction.
    Core new component of the DADRL framework introduced to represent the upper-level decision maker.
  • Interdiction agent (no independent evidence)
    purpose: Learns worst-case attacks to minimize coverage and serves as surrogate.
    Core new component of the DADRL framework introduced to represent the lower-level decision maker.

pith-pipeline@v0.9.1-grok · 5779 in / 1346 out tokens · 40101 ms · 2026-06-29T19:47:50.305892+00:00 · methodology

