Deception and Counter Deception in Adversarial Graph Traversal Game
Pith reviewed 2026-05-25 04:10 UTC · model grok-4.3
The pith
An adaptation of the XDO algorithm terminates in finite time and returns an epsilon-Nash equilibrium for indefinite-horizon adversarial graph traversal games with two-sided incomplete information.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The proposed adaptation of the XDO algorithm terminates in finite time and returns an epsilon-Nash equilibrium in the adversarial graph traversal game with two-sided incomplete information.
What carries the argument
The adapted Extensive-Form Double Oracle (XDO) algorithm that handles endogenous termination in indefinite-horizon games.
If this is right
- Equilibrium strategies can be computed for deception games on graphs with mutual belief updating.
- Value of Information can be used to characterize deceptive and counter-deceptive behaviors that arise in equilibrium.
- The approach provides a method for solving indefinite-horizon games with endogenous termination and two-sided incomplete information.
Where Pith is reading between the lines
- The same adaptation technique may apply to other security or pursuit-evasion settings that involve repeated belief updates.
- Scalability tests on larger graphs would show whether the finite-time guarantee remains practical.
- The framework connects to repeated games with partial observability where both sides hold private state information.
Load-bearing premise
That the standard XDO algorithm designed for finite games can be adapted in a way that guarantees bounded computation despite endogenous termination in indefinite-horizon games with two-sided incomplete information.
What would settle it
A concrete graph instance and payoff structure where the adapted algorithm either fails to terminate in finite time or the returned strategy pair is not an epsilon-Nash equilibrium.
Figures
read the original abstract
We study deception in adversarial graph traversal, where a mobile agent seeks to reach a goal with minimum cost while an adversary alters edge costs to increase the total traversal cost. Unlike prior works that assume fixed observer-deceiver roles, we model this problem with two-sided incomplete information in which both players possess private information and update beliefs from observed actions. To solve the resulting indefinite-horizon game, we develop an adaptation of the Extensive-Form Double Oracle (XDO) algorithm. While the standard XDO algorithm is designed for finite games, the proposed adaptation ensures bounded computation despite endogenous game termination. We show that the proposed algorithm terminates in finite time and returns an epsilon-Nash equilibrium. Finally, we use Value of Information to characterize the deceptive and counter-deceptive behaviors that emerge from equilibrium strategies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript models deception in adversarial graph traversal as an indefinite-horizon game with two-sided incomplete information, where both players hold private information and update beliefs from observed actions. It proposes an adaptation of the Extensive-Form Double Oracle (XDO) algorithm that is asserted to ensure bounded computation, terminate in finite time, and return an epsilon-Nash equilibrium; the work then applies Value of Information analysis to characterize the deceptive and counter-deceptive strategies that arise in equilibrium.
Significance. If the finite-termination guarantee can be established, the adaptation would meaningfully extend double-oracle methods beyond finite game trees to endogenous-termination settings with mutual belief updating, supplying a practical computational tool for equilibrium analysis in deception problems on graphs. The VoI characterization of equilibrium behavior is a constructive element that could aid interpretability.
major comments (1)
- [Abstract] Abstract: the central claim that the XDO adaptation 'terminates in finite time and returns an epsilon-Nash equilibrium' and 'ensures bounded computation despite endogenous game termination' is load-bearing, yet the text provides no argument establishing that the set of reachable belief states (or information sets) remains finite under two-sided incomplete information and indefinite horizon; without an explicit bound (via graph structure, finite action space, or Value-of-Information analysis), termination does not follow from the standard XDO proof that assumes a finite game tree.
Simulated Author's Rebuttal
We thank the referee for the careful review and for highlighting the need for an explicit finiteness argument. We address the major comment below.
read point-by-point responses
-
Referee: [Abstract] Abstract: the central claim that the XDO adaptation 'terminates in finite time and returns an epsilon-Nash equilibrium' and 'ensures bounded computation despite endogenous game termination' is load-bearing, yet the text provides no argument establishing that the set of reachable belief states (or information sets) remains finite under two-sided incomplete information and indefinite horizon; without an explicit bound (via graph structure, finite action space, or Value-of-Information analysis), termination does not follow from the standard XDO proof that assumes a finite game tree.
Authors: We agree that the manuscript does not provide a self-contained argument establishing finiteness of the reachable belief states (or information sets) under two-sided incomplete information. While the finite graph and finite action space do limit the possible observations, this was not made explicit. In the revision we will add a dedicated paragraph in Section 4 (Algorithm) that proves the set of reachable information sets is finite: because the underlying graph is finite, the private information of each player is drawn from a finite set of possible goal locations or edge-cost configurations, and each observation is an action on a finite edge set, the belief simplex is discretized over a finite collection of possible posterior distributions. This supplies the missing bound and allows the standard XDO termination argument to carry over directly to the endogenous-termination case. revision: yes
Circularity Check
No circularity: XDO adaptation termination and equilibrium claims are independent of paper inputs
full rationale
The manuscript adapts the standard XDO algorithm (designed for finite games) to an indefinite-horizon setting with two-sided incomplete information and claims finite termination plus epsilon-Nash equilibrium. No quoted step reduces the termination guarantee to a self-definition, a fitted parameter renamed as prediction, or a load-bearing self-citation chain. The derivation is presented as following from the algorithm's properties plus the graph structure and Value-of-Information analysis, remaining self-contained against external benchmarks rather than circular.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Multi-robot co- ordination in an adversarial graph-traversal game,
J. Berneburg, X. Wang, X. Xiao, and D. Shishika, “Multi-robot co- ordination in an adversarial graph-traversal game,” in2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3915–3922, IEEE, 2025
work page 2025
-
[2]
Finding time-dependent shortest paths over large graphs,
B. Ding, J. X. Yu, and L. Qin, “Finding time-dependent shortest paths over large graphs,” inProceedings of the 11th international conference on Extending database technology: Advances in database technology, pp. 205–216, 2008
work page 2008
-
[3]
Time-varying shortest path prob- lems with constraints,
X. Cai, T. Kloks, and C.-K. Wong, “Time-varying shortest path prob- lems with constraints,”Networks: An International Journal, vol. 29, no. 3, pp. 141–150, 1997
work page 1997
-
[4]
Constrained shortest path query in a large time-dependent graph,
Y . Yuan, X. Lian, G. Wang, Y . Ma, and Y . Wang, “Constrained shortest path query in a large time-dependent graph,”Proceedings of the VLDB Endowment, vol. 12, no. 10, pp. 1058–1070, 2019
work page 2019
-
[5]
Risk-aware path planning for autonomous underwater vehicles using predictive ocean models,
A. A. Pereira, J. Binney, G. A. Hollinger, and G. S. Sukhatme, “Risk-aware path planning for autonomous underwater vehicles using predictive ocean models,”Journal of Field Robotics, vol. 30, no. 5, pp. 741–762, 2013
work page 2013
-
[6]
A risk-aware path planning strategy for uavs in urban environments,
S. Primatesta, G. Guglieri, and A. Rizzo, “A risk-aware path planning strategy for uavs in urban environments,”Journal of Intelligent & Robotic Systems, vol. 95, pp. 629–643, 2019
work page 2019
-
[7]
Risk-aware path planning using hir- erachical constrained markov decision processes,
S. Feyzabadi and S. Carpin, “Risk-aware path planning using hir- erachical constrained markov decision processes,” in2014 IEEE Inter- national Conference on Automation Science and Engineering (CASE), pp. 297–303, IEEE, 2014
work page 2014
-
[8]
Deceptive robot motion: synthesis, analysis and experiments,
A. Dragan, R. Holladay, and S. Srinivasa, “Deceptive robot motion: synthesis, analysis and experiments,”Autonomous Robots, vol. 39, pp. 331–345, 2015
work page 2015
-
[9]
P. Masters and S. Sardina, “Deceptive path-planning,” inInt. Joint Conf. on Artif. Intell., pp. 4368–4375, 2017
work page 2017
-
[10]
M. Ornik and U. Topcu, “Deception in optimal control,” in2018 56th Annu. Allerton Conf. on Communication, Control, and Comput., pp. 821–828
-
[11]
Deception in supervisory control,
M. O. Karabag, M. Ornik, and U. Topcu, “Deception in supervisory control,”IEEE Transactions on Automatic Control, vol. 67, no. 2, pp. 738–753, 2021
work page 2021
-
[12]
Stochastic shortest path games,
S. D. Patek and D. P. Bertsekas, “Stochastic shortest path games,” SIAM Journal on Control and Optimization, vol. 37, no. 3, pp. 804– 824, 1999
work page 1999
-
[13]
Optimal network security hardening using attack graph games,
K. Durkota, V . Lisy, B. Bo ˇsansky, and C. Kiekintveld, “Optimal network security hardening using attack graph games,” inProceedings of IJCAI, pp. 7–14, 2015
work page 2015
-
[14]
T. H. Nguyen, M. Wright, M. P. Wellman, and S. Baveja, “Multi-stage attack graph security games: Heuristic strategies, with empirical game- theoretic analysis,” inProceedings of the 2017 Workshop on Moving Target Defense, pp. 87–97, 2017
work page 2017
-
[15]
Deception by motion: The eater and the mover game,
V . Rostobaya, Y . Guan, J. Berneburg, M. Dorothy, and D. Shishika, “Deception by motion: The eater and the mover game,”IEEE Control Systems Letters, vol. 7, pp. 3157–3162, 2023
work page 2023
-
[16]
Deceptive path planning: A Bayesian game approach,
V . Rostobaya, J. Berneburg, Y . Guan, M. Dorothy, and D. Shishika, “Deceptive path planning: A Bayesian game approach,” 2025, arXiv:2506.13650
-
[17]
Strategic concealment of envi- ronment representations in competitive games,
Y . Guan, D. Maity, and P. Tsiotras, “Strategic concealment of envi- ronment representations in competitive games,”IEEE Control Systems Letters, vol. 9, pp. 2609–2614, 2025
work page 2025
-
[18]
Solving zero-sum one-sided partially observable stochastic games,
K. Hor ´ak, B. Bo ˇsansk`y, V . Kovaˇr´ık, and C. Kiekintveld, “Solving zero-sum one-sided partially observable stochastic games,”Artificial Intelligence, vol. 316, p. 103838, 2023
work page 2023
-
[19]
Dynamic pro- gramming for partially observable stochastic games,
E. A. Hansen, D. S. Bernstein, and S. Zilberstein, “Dynamic pro- gramming for partially observable stochastic games,” inAAAI, vol. 4, pp. 709–715, 2004
work page 2004
-
[20]
Xdo: A double oracle algorithm for extensive-form games,
S. McAleer, J. B. Lanier, K. A. Wang, P. Baldi, and R. Fox, “Xdo: A double oracle algorithm for extensive-form games,”Advances in Neural Information Processing Systems, vol. 34, pp. 23128–23139, 2021
work page 2021
-
[21]
An exact double-oracle algorithm for zero-sum extensive-form games with imperfect information,
B. Bo ˇsansk´y, C. Kiekintveld, V . Lis ´y, and M. Pechou ˇcek, “An exact double-oracle algorithm for zero-sum extensive-form games with imperfect information,”J. Artif. Intell. Res., vol. 51, pp. 829–866, 2014
work page 2014
-
[22]
Synthesis of dynamic masks for information-theoretic opacity in stochastic systems,
S. Udupa, C. Shi, and J. Fu, “Synthesis of dynamic masks for information-theoretic opacity in stochastic systems,” inProceedings of the ACM/IEEE 16th International Conference on Cyber-Physical Systems (with CPS-IoT Week 2025), ICCPS ’25, (New York, NY , USA), Association for Computing Machinery, 2025
work page 2025
-
[23]
A decentralized shotgun approach for team deception,
C. Probine, M. O. Karabag, and U. Topcu, “A decentralized shotgun approach for team deception,” inInternational Conference on Decision and Game Theory for Security, pp. 177–197, Springer, 2024
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.