pith. sign in

arxiv: 2605.23129 · v1 · pith:MZ7TO2W6new · submitted 2026-05-22 · 📡 eess.SY · cs.GT· cs.SY

Deception and Counter Deception in Adversarial Graph Traversal Game

Pith reviewed 2026-05-25 04:10 UTC · model grok-4.3

classification 📡 eess.SY cs.GTcs.SY
keywords deceptioncounter-deceptionadversarial graph traversalincomplete informationXDO algorithmNash equilibriumValue of Information
0
0 comments X

The pith

An adaptation of the XDO algorithm terminates in finite time and returns an epsilon-Nash equilibrium for indefinite-horizon adversarial graph traversal games with two-sided incomplete information.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an adaptation of the Extensive-Form Double Oracle algorithm to solve games of deception and counter-deception on graphs. In these games a mobile agent tries to minimize traversal cost to a goal while an adversary changes edge costs to maximize it, and both players have private information that they update based on observed actions. The adaptation ensures the algorithm terminates in finite time despite the game having an indefinite horizon. It returns an epsilon-Nash equilibrium, allowing analysis of deceptive behaviors using the Value of Information.

Core claim

The proposed adaptation of the XDO algorithm terminates in finite time and returns an epsilon-Nash equilibrium in the adversarial graph traversal game with two-sided incomplete information.

What carries the argument

The adapted Extensive-Form Double Oracle (XDO) algorithm that handles endogenous termination in indefinite-horizon games.

If this is right

  • Equilibrium strategies can be computed for deception games on graphs with mutual belief updating.
  • Value of Information can be used to characterize deceptive and counter-deceptive behaviors that arise in equilibrium.
  • The approach provides a method for solving indefinite-horizon games with endogenous termination and two-sided incomplete information.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same adaptation technique may apply to other security or pursuit-evasion settings that involve repeated belief updates.
  • Scalability tests on larger graphs would show whether the finite-time guarantee remains practical.
  • The framework connects to repeated games with partial observability where both sides hold private state information.

Load-bearing premise

That the standard XDO algorithm designed for finite games can be adapted in a way that guarantees bounded computation despite endogenous termination in indefinite-horizon games with two-sided incomplete information.

What would settle it

A concrete graph instance and payoff structure where the adapted algorithm either fails to terminate in finite time or the returned strategy pair is not an epsilon-Nash equilibrium.

Figures

Figures reproduced from arXiv: 2605.23129 by Daigo Shishika, James Berneburg, Violetta Rostobaya.

Figure 1
Figure 1. Figure 1: Illustration of incomplete-information adversarial graph traversal [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 4
Figure 4. Figure 4: Equilibrium strategies in the Blue’s game. Each Blue type mixes [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗
Figure 3
Figure 3. Figure 3: a) Equilibrium behavior of the players in a game where only the [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗
Figure 5
Figure 5. Figure 5: Equilibrium strategies in a game with two-sided incomplete [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗
read the original abstract

We study deception in adversarial graph traversal, where a mobile agent seeks to reach a goal with minimum cost while an adversary alters edge costs to increase the total traversal cost. Unlike prior works that assume fixed observer-deceiver roles, we model this problem with two-sided incomplete information in which both players possess private information and update beliefs from observed actions. To solve the resulting indefinite-horizon game, we develop an adaptation of the Extensive-Form Double Oracle (XDO) algorithm. While the standard XDO algorithm is designed for finite games, the proposed adaptation ensures bounded computation despite endogenous game termination. We show that the proposed algorithm terminates in finite time and returns an epsilon-Nash equilibrium. Finally, we use Value of Information to characterize the deceptive and counter-deceptive behaviors that emerge from equilibrium strategies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript models deception in adversarial graph traversal as an indefinite-horizon game with two-sided incomplete information, where both players hold private information and update beliefs from observed actions. It proposes an adaptation of the Extensive-Form Double Oracle (XDO) algorithm that is asserted to ensure bounded computation, terminate in finite time, and return an epsilon-Nash equilibrium; the work then applies Value of Information analysis to characterize the deceptive and counter-deceptive strategies that arise in equilibrium.

Significance. If the finite-termination guarantee can be established, the adaptation would meaningfully extend double-oracle methods beyond finite game trees to endogenous-termination settings with mutual belief updating, supplying a practical computational tool for equilibrium analysis in deception problems on graphs. The VoI characterization of equilibrium behavior is a constructive element that could aid interpretability.

major comments (1)
  1. [Abstract] Abstract: the central claim that the XDO adaptation 'terminates in finite time and returns an epsilon-Nash equilibrium' and 'ensures bounded computation despite endogenous game termination' is load-bearing, yet the text provides no argument establishing that the set of reachable belief states (or information sets) remains finite under two-sided incomplete information and indefinite horizon; without an explicit bound (via graph structure, finite action space, or Value-of-Information analysis), termination does not follow from the standard XDO proof that assumes a finite game tree.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful review and for highlighting the need for an explicit finiteness argument. We address the major comment below.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that the XDO adaptation 'terminates in finite time and returns an epsilon-Nash equilibrium' and 'ensures bounded computation despite endogenous game termination' is load-bearing, yet the text provides no argument establishing that the set of reachable belief states (or information sets) remains finite under two-sided incomplete information and indefinite horizon; without an explicit bound (via graph structure, finite action space, or Value-of-Information analysis), termination does not follow from the standard XDO proof that assumes a finite game tree.

    Authors: We agree that the manuscript does not provide a self-contained argument establishing finiteness of the reachable belief states (or information sets) under two-sided incomplete information. While the finite graph and finite action space do limit the possible observations, this was not made explicit. In the revision we will add a dedicated paragraph in Section 4 (Algorithm) that proves the set of reachable information sets is finite: because the underlying graph is finite, the private information of each player is drawn from a finite set of possible goal locations or edge-cost configurations, and each observation is an action on a finite edge set, the belief simplex is discretized over a finite collection of possible posterior distributions. This supplies the missing bound and allows the standard XDO termination argument to carry over directly to the endogenous-termination case. revision: yes

Circularity Check

0 steps flagged

No circularity: XDO adaptation termination and equilibrium claims are independent of paper inputs

full rationale

The manuscript adapts the standard XDO algorithm (designed for finite games) to an indefinite-horizon setting with two-sided incomplete information and claims finite termination plus epsilon-Nash equilibrium. No quoted step reduces the termination guarantee to a self-definition, a fitted parameter renamed as prediction, or a load-bearing self-citation chain. The derivation is presented as following from the algorithm's properties plus the graph structure and Value-of-Information analysis, remaining self-contained against external benchmarks rather than circular.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities beyond standard concepts in game theory and algorithmic game solving.

pith-pipeline@v0.9.0 · 5672 in / 993 out tokens · 25462 ms · 2026-05-25T04:10:12.611902+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages

  1. [1]

    Multi-robot co- ordination in an adversarial graph-traversal game,

    J. Berneburg, X. Wang, X. Xiao, and D. Shishika, “Multi-robot co- ordination in an adversarial graph-traversal game,” in2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3915–3922, IEEE, 2025

  2. [2]

    Finding time-dependent shortest paths over large graphs,

    B. Ding, J. X. Yu, and L. Qin, “Finding time-dependent shortest paths over large graphs,” inProceedings of the 11th international conference on Extending database technology: Advances in database technology, pp. 205–216, 2008

  3. [3]

    Time-varying shortest path prob- lems with constraints,

    X. Cai, T. Kloks, and C.-K. Wong, “Time-varying shortest path prob- lems with constraints,”Networks: An International Journal, vol. 29, no. 3, pp. 141–150, 1997

  4. [4]

    Constrained shortest path query in a large time-dependent graph,

    Y . Yuan, X. Lian, G. Wang, Y . Ma, and Y . Wang, “Constrained shortest path query in a large time-dependent graph,”Proceedings of the VLDB Endowment, vol. 12, no. 10, pp. 1058–1070, 2019

  5. [5]

    Risk-aware path planning for autonomous underwater vehicles using predictive ocean models,

    A. A. Pereira, J. Binney, G. A. Hollinger, and G. S. Sukhatme, “Risk-aware path planning for autonomous underwater vehicles using predictive ocean models,”Journal of Field Robotics, vol. 30, no. 5, pp. 741–762, 2013

  6. [6]

    A risk-aware path planning strategy for uavs in urban environments,

    S. Primatesta, G. Guglieri, and A. Rizzo, “A risk-aware path planning strategy for uavs in urban environments,”Journal of Intelligent & Robotic Systems, vol. 95, pp. 629–643, 2019

  7. [7]

    Risk-aware path planning using hir- erachical constrained markov decision processes,

    S. Feyzabadi and S. Carpin, “Risk-aware path planning using hir- erachical constrained markov decision processes,” in2014 IEEE Inter- national Conference on Automation Science and Engineering (CASE), pp. 297–303, IEEE, 2014

  8. [8]

    Deceptive robot motion: synthesis, analysis and experiments,

    A. Dragan, R. Holladay, and S. Srinivasa, “Deceptive robot motion: synthesis, analysis and experiments,”Autonomous Robots, vol. 39, pp. 331–345, 2015

  9. [9]

    Deceptive path-planning,

    P. Masters and S. Sardina, “Deceptive path-planning,” inInt. Joint Conf. on Artif. Intell., pp. 4368–4375, 2017

  10. [10]

    Deception in optimal control,

    M. Ornik and U. Topcu, “Deception in optimal control,” in2018 56th Annu. Allerton Conf. on Communication, Control, and Comput., pp. 821–828

  11. [11]

    Deception in supervisory control,

    M. O. Karabag, M. Ornik, and U. Topcu, “Deception in supervisory control,”IEEE Transactions on Automatic Control, vol. 67, no. 2, pp. 738–753, 2021

  12. [12]

    Stochastic shortest path games,

    S. D. Patek and D. P. Bertsekas, “Stochastic shortest path games,” SIAM Journal on Control and Optimization, vol. 37, no. 3, pp. 804– 824, 1999

  13. [13]

    Optimal network security hardening using attack graph games,

    K. Durkota, V . Lisy, B. Bo ˇsansky, and C. Kiekintveld, “Optimal network security hardening using attack graph games,” inProceedings of IJCAI, pp. 7–14, 2015

  14. [14]

    Multi-stage attack graph security games: Heuristic strategies, with empirical game- theoretic analysis,

    T. H. Nguyen, M. Wright, M. P. Wellman, and S. Baveja, “Multi-stage attack graph security games: Heuristic strategies, with empirical game- theoretic analysis,” inProceedings of the 2017 Workshop on Moving Target Defense, pp. 87–97, 2017

  15. [15]

    Deception by motion: The eater and the mover game,

    V . Rostobaya, Y . Guan, J. Berneburg, M. Dorothy, and D. Shishika, “Deception by motion: The eater and the mover game,”IEEE Control Systems Letters, vol. 7, pp. 3157–3162, 2023

  16. [16]

    Deceptive path planning: A Bayesian game approach,

    V . Rostobaya, J. Berneburg, Y . Guan, M. Dorothy, and D. Shishika, “Deceptive path planning: A Bayesian game approach,” 2025, arXiv:2506.13650

  17. [17]

    Strategic concealment of envi- ronment representations in competitive games,

    Y . Guan, D. Maity, and P. Tsiotras, “Strategic concealment of envi- ronment representations in competitive games,”IEEE Control Systems Letters, vol. 9, pp. 2609–2614, 2025

  18. [18]

    Solving zero-sum one-sided partially observable stochastic games,

    K. Hor ´ak, B. Bo ˇsansk`y, V . Kovaˇr´ık, and C. Kiekintveld, “Solving zero-sum one-sided partially observable stochastic games,”Artificial Intelligence, vol. 316, p. 103838, 2023

  19. [19]

    Dynamic pro- gramming for partially observable stochastic games,

    E. A. Hansen, D. S. Bernstein, and S. Zilberstein, “Dynamic pro- gramming for partially observable stochastic games,” inAAAI, vol. 4, pp. 709–715, 2004

  20. [20]

    Xdo: A double oracle algorithm for extensive-form games,

    S. McAleer, J. B. Lanier, K. A. Wang, P. Baldi, and R. Fox, “Xdo: A double oracle algorithm for extensive-form games,”Advances in Neural Information Processing Systems, vol. 34, pp. 23128–23139, 2021

  21. [21]

    An exact double-oracle algorithm for zero-sum extensive-form games with imperfect information,

    B. Bo ˇsansk´y, C. Kiekintveld, V . Lis ´y, and M. Pechou ˇcek, “An exact double-oracle algorithm for zero-sum extensive-form games with imperfect information,”J. Artif. Intell. Res., vol. 51, pp. 829–866, 2014

  22. [22]

    Synthesis of dynamic masks for information-theoretic opacity in stochastic systems,

    S. Udupa, C. Shi, and J. Fu, “Synthesis of dynamic masks for information-theoretic opacity in stochastic systems,” inProceedings of the ACM/IEEE 16th International Conference on Cyber-Physical Systems (with CPS-IoT Week 2025), ICCPS ’25, (New York, NY , USA), Association for Computing Machinery, 2025

  23. [23]

    A decentralized shotgun approach for team deception,

    C. Probine, M. O. Karabag, and U. Topcu, “A decentralized shotgun approach for team deception,” inInternational Conference on Decision and Game Theory for Security, pp. 177–197, Springer, 2024