Deception and Counter Deception in Adversarial Graph Traversal Game

Daigo Shishika; James Berneburg; Violetta Rostobaya

arxiv: 2605.23129 · v1 · pith:MZ7TO2W6new · submitted 2026-05-22 · 📡 eess.SY · cs.GT· cs.SY

Deception and Counter Deception in Adversarial Graph Traversal Game

Violetta Rostobaya , James Berneburg , Daigo Shishika This is my paper

Pith reviewed 2026-05-25 04:10 UTC · model grok-4.3

classification 📡 eess.SY cs.GTcs.SY

keywords deceptioncounter-deceptionadversarial graph traversalincomplete informationXDO algorithmNash equilibriumValue of Information

0 comments

The pith

An adaptation of the XDO algorithm terminates in finite time and returns an epsilon-Nash equilibrium for indefinite-horizon adversarial graph traversal games with two-sided incomplete information.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an adaptation of the Extensive-Form Double Oracle algorithm to solve games of deception and counter-deception on graphs. In these games a mobile agent tries to minimize traversal cost to a goal while an adversary changes edge costs to maximize it, and both players have private information that they update based on observed actions. The adaptation ensures the algorithm terminates in finite time despite the game having an indefinite horizon. It returns an epsilon-Nash equilibrium, allowing analysis of deceptive behaviors using the Value of Information.

Core claim

The proposed adaptation of the XDO algorithm terminates in finite time and returns an epsilon-Nash equilibrium in the adversarial graph traversal game with two-sided incomplete information.

What carries the argument

The adapted Extensive-Form Double Oracle (XDO) algorithm that handles endogenous termination in indefinite-horizon games.

If this is right

Equilibrium strategies can be computed for deception games on graphs with mutual belief updating.
Value of Information can be used to characterize deceptive and counter-deceptive behaviors that arise in equilibrium.
The approach provides a method for solving indefinite-horizon games with endogenous termination and two-sided incomplete information.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same adaptation technique may apply to other security or pursuit-evasion settings that involve repeated belief updates.
Scalability tests on larger graphs would show whether the finite-time guarantee remains practical.
The framework connects to repeated games with partial observability where both sides hold private state information.

Load-bearing premise

That the standard XDO algorithm designed for finite games can be adapted in a way that guarantees bounded computation despite endogenous termination in indefinite-horizon games with two-sided incomplete information.

What would settle it

A concrete graph instance and payoff structure where the adapted algorithm either fails to terminate in finite time or the returned strategy pair is not an epsilon-Nash equilibrium.

Figures

Figures reproduced from arXiv: 2605.23129 by Daigo Shishika, James Berneburg, Violetta Rostobaya.

**Figure 4.** Figure 4: Equilibrium strategies in the Blue’s game. Each Blue type mixes [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗

**Figure 3.** Figure 3: a) Equilibrium behavior of the players in a game where only the [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗

**Figure 5.** Figure 5: Equilibrium strategies in a game with two-sided incomplete [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗

read the original abstract

We study deception in adversarial graph traversal, where a mobile agent seeks to reach a goal with minimum cost while an adversary alters edge costs to increase the total traversal cost. Unlike prior works that assume fixed observer-deceiver roles, we model this problem with two-sided incomplete information in which both players possess private information and update beliefs from observed actions. To solve the resulting indefinite-horizon game, we develop an adaptation of the Extensive-Form Double Oracle (XDO) algorithm. While the standard XDO algorithm is designed for finite games, the proposed adaptation ensures bounded computation despite endogenous game termination. We show that the proposed algorithm terminates in finite time and returns an epsilon-Nash equilibrium. Finally, we use Value of Information to characterize the deceptive and counter-deceptive behaviors that emerge from equilibrium strategies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Adapts XDO to two-sided incomplete information in graph traversal but the finite termination claim rests on an unshown bound on reachable belief states.

read the letter

The main addition is modeling mutual deception where both the traversing agent and the cost-altering adversary hold private information and update beliefs from observed moves. They adapt the XDO algorithm to the indefinite-horizon case with endogenous termination and claim it still finishes in finite time while returning an epsilon-Nash equilibrium. They then apply Value of Information to describe the deceptive and counter-deceptive strategies that appear at equilibrium. This moves past the fixed-role setups common in earlier security and robotics pathfinding papers, and the concrete graph setting matches real applications. The formulation itself is clear enough to be usable by people already working on extensive-form games with incomplete information. The central soft spot is the termination guarantee. The abstract asserts finite termination without a sketch, and nothing in the provided material shows why the set of distinct information sets or posterior beliefs stays finite when both sides update continuously and the horizon is open. Standard XDO proofs rely on a finite game tree; here that property needs separate justification from the graph structure or action space, and it is not supplied. No experiments, error bounds, or validation data are referenced either, which leaves the practical behavior of the adaptation untested. The work is aimed at algorithmic game theorists who already know XDO and want to extend it to mutual deception. A reader in that niche could extract the problem setup and the Value of Information characterization, but the equilibrium claim needs the missing finiteness argument before it can be used. I would bring it to a reading group to walk through whether the belief sets can actually be shown finite. I would not cite it yet. It is worth sending to peer review so the authors can supply the required argument on reachable beliefs and any supporting proofs or runs.

Referee Report

1 major / 0 minor

Summary. The manuscript models deception in adversarial graph traversal as an indefinite-horizon game with two-sided incomplete information, where both players hold private information and update beliefs from observed actions. It proposes an adaptation of the Extensive-Form Double Oracle (XDO) algorithm that is asserted to ensure bounded computation, terminate in finite time, and return an epsilon-Nash equilibrium; the work then applies Value of Information analysis to characterize the deceptive and counter-deceptive strategies that arise in equilibrium.

Significance. If the finite-termination guarantee can be established, the adaptation would meaningfully extend double-oracle methods beyond finite game trees to endogenous-termination settings with mutual belief updating, supplying a practical computational tool for equilibrium analysis in deception problems on graphs. The VoI characterization of equilibrium behavior is a constructive element that could aid interpretability.

major comments (1)

[Abstract] Abstract: the central claim that the XDO adaptation 'terminates in finite time and returns an epsilon-Nash equilibrium' and 'ensures bounded computation despite endogenous game termination' is load-bearing, yet the text provides no argument establishing that the set of reachable belief states (or information sets) remains finite under two-sided incomplete information and indefinite horizon; without an explicit bound (via graph structure, finite action space, or Value-of-Information analysis), termination does not follow from the standard XDO proof that assumes a finite game tree.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful review and for highlighting the need for an explicit finiteness argument. We address the major comment below.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim that the XDO adaptation 'terminates in finite time and returns an epsilon-Nash equilibrium' and 'ensures bounded computation despite endogenous game termination' is load-bearing, yet the text provides no argument establishing that the set of reachable belief states (or information sets) remains finite under two-sided incomplete information and indefinite horizon; without an explicit bound (via graph structure, finite action space, or Value-of-Information analysis), termination does not follow from the standard XDO proof that assumes a finite game tree.

Authors: We agree that the manuscript does not provide a self-contained argument establishing finiteness of the reachable belief states (or information sets) under two-sided incomplete information. While the finite graph and finite action space do limit the possible observations, this was not made explicit. In the revision we will add a dedicated paragraph in Section 4 (Algorithm) that proves the set of reachable information sets is finite: because the underlying graph is finite, the private information of each player is drawn from a finite set of possible goal locations or edge-cost configurations, and each observation is an action on a finite edge set, the belief simplex is discretized over a finite collection of possible posterior distributions. This supplies the missing bound and allows the standard XDO termination argument to carry over directly to the endogenous-termination case. revision: yes

Circularity Check

0 steps flagged

No circularity: XDO adaptation termination and equilibrium claims are independent of paper inputs

full rationale

The manuscript adapts the standard XDO algorithm (designed for finite games) to an indefinite-horizon setting with two-sided incomplete information and claims finite termination plus epsilon-Nash equilibrium. No quoted step reduces the termination guarantee to a self-definition, a fitted parameter renamed as prediction, or a load-bearing self-citation chain. The derivation is presented as following from the algorithm's properties plus the graph structure and Value-of-Information analysis, remaining self-contained against external benchmarks rather than circular.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities beyond standard concepts in game theory and algorithmic game solving.

pith-pipeline@v0.9.0 · 5672 in / 993 out tokens · 25462 ms · 2026-05-25T04:10:12.611902+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages

[1]

Multi-robot co- ordination in an adversarial graph-traversal game,

J. Berneburg, X. Wang, X. Xiao, and D. Shishika, “Multi-robot co- ordination in an adversarial graph-traversal game,” in2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3915–3922, IEEE, 2025

work page 2025
[2]

Finding time-dependent shortest paths over large graphs,

B. Ding, J. X. Yu, and L. Qin, “Finding time-dependent shortest paths over large graphs,” inProceedings of the 11th international conference on Extending database technology: Advances in database technology, pp. 205–216, 2008

work page 2008
[3]

Time-varying shortest path prob- lems with constraints,

X. Cai, T. Kloks, and C.-K. Wong, “Time-varying shortest path prob- lems with constraints,”Networks: An International Journal, vol. 29, no. 3, pp. 141–150, 1997

work page 1997
[4]

Constrained shortest path query in a large time-dependent graph,

Y . Yuan, X. Lian, G. Wang, Y . Ma, and Y . Wang, “Constrained shortest path query in a large time-dependent graph,”Proceedings of the VLDB Endowment, vol. 12, no. 10, pp. 1058–1070, 2019

work page 2019
[5]

Risk-aware path planning for autonomous underwater vehicles using predictive ocean models,

A. A. Pereira, J. Binney, G. A. Hollinger, and G. S. Sukhatme, “Risk-aware path planning for autonomous underwater vehicles using predictive ocean models,”Journal of Field Robotics, vol. 30, no. 5, pp. 741–762, 2013

work page 2013
[6]

A risk-aware path planning strategy for uavs in urban environments,

S. Primatesta, G. Guglieri, and A. Rizzo, “A risk-aware path planning strategy for uavs in urban environments,”Journal of Intelligent & Robotic Systems, vol. 95, pp. 629–643, 2019

work page 2019
[7]

Risk-aware path planning using hir- erachical constrained markov decision processes,

S. Feyzabadi and S. Carpin, “Risk-aware path planning using hir- erachical constrained markov decision processes,” in2014 IEEE Inter- national Conference on Automation Science and Engineering (CASE), pp. 297–303, IEEE, 2014

work page 2014
[8]

Deceptive robot motion: synthesis, analysis and experiments,

A. Dragan, R. Holladay, and S. Srinivasa, “Deceptive robot motion: synthesis, analysis and experiments,”Autonomous Robots, vol. 39, pp. 331–345, 2015

work page 2015
[9]

Deceptive path-planning,

P. Masters and S. Sardina, “Deceptive path-planning,” inInt. Joint Conf. on Artif. Intell., pp. 4368–4375, 2017

work page 2017
[10]

Deception in optimal control,

M. Ornik and U. Topcu, “Deception in optimal control,” in2018 56th Annu. Allerton Conf. on Communication, Control, and Comput., pp. 821–828

work page
[11]

Deception in supervisory control,

M. O. Karabag, M. Ornik, and U. Topcu, “Deception in supervisory control,”IEEE Transactions on Automatic Control, vol. 67, no. 2, pp. 738–753, 2021

work page 2021
[12]

Stochastic shortest path games,

S. D. Patek and D. P. Bertsekas, “Stochastic shortest path games,” SIAM Journal on Control and Optimization, vol. 37, no. 3, pp. 804– 824, 1999

work page 1999
[13]

Optimal network security hardening using attack graph games,

K. Durkota, V . Lisy, B. Bo ˇsansky, and C. Kiekintveld, “Optimal network security hardening using attack graph games,” inProceedings of IJCAI, pp. 7–14, 2015

work page 2015
[14]

Multi-stage attack graph security games: Heuristic strategies, with empirical game- theoretic analysis,

T. H. Nguyen, M. Wright, M. P. Wellman, and S. Baveja, “Multi-stage attack graph security games: Heuristic strategies, with empirical game- theoretic analysis,” inProceedings of the 2017 Workshop on Moving Target Defense, pp. 87–97, 2017

work page 2017
[15]

Deception by motion: The eater and the mover game,

V . Rostobaya, Y . Guan, J. Berneburg, M. Dorothy, and D. Shishika, “Deception by motion: The eater and the mover game,”IEEE Control Systems Letters, vol. 7, pp. 3157–3162, 2023

work page 2023
[16]

Deceptive path planning: A Bayesian game approach,

V . Rostobaya, J. Berneburg, Y . Guan, M. Dorothy, and D. Shishika, “Deceptive path planning: A Bayesian game approach,” 2025, arXiv:2506.13650

work page arXiv 2025
[17]

Strategic concealment of envi- ronment representations in competitive games,

Y . Guan, D. Maity, and P. Tsiotras, “Strategic concealment of envi- ronment representations in competitive games,”IEEE Control Systems Letters, vol. 9, pp. 2609–2614, 2025

work page 2025
[18]

Solving zero-sum one-sided partially observable stochastic games,

K. Hor ´ak, B. Bo ˇsansk`y, V . Kovaˇr´ık, and C. Kiekintveld, “Solving zero-sum one-sided partially observable stochastic games,”Artificial Intelligence, vol. 316, p. 103838, 2023

work page 2023
[19]

Dynamic pro- gramming for partially observable stochastic games,

E. A. Hansen, D. S. Bernstein, and S. Zilberstein, “Dynamic pro- gramming for partially observable stochastic games,” inAAAI, vol. 4, pp. 709–715, 2004

work page 2004
[20]

Xdo: A double oracle algorithm for extensive-form games,

S. McAleer, J. B. Lanier, K. A. Wang, P. Baldi, and R. Fox, “Xdo: A double oracle algorithm for extensive-form games,”Advances in Neural Information Processing Systems, vol. 34, pp. 23128–23139, 2021

work page 2021
[21]

An exact double-oracle algorithm for zero-sum extensive-form games with imperfect information,

B. Bo ˇsansk´y, C. Kiekintveld, V . Lis ´y, and M. Pechou ˇcek, “An exact double-oracle algorithm for zero-sum extensive-form games with imperfect information,”J. Artif. Intell. Res., vol. 51, pp. 829–866, 2014

work page 2014
[22]

Synthesis of dynamic masks for information-theoretic opacity in stochastic systems,

S. Udupa, C. Shi, and J. Fu, “Synthesis of dynamic masks for information-theoretic opacity in stochastic systems,” inProceedings of the ACM/IEEE 16th International Conference on Cyber-Physical Systems (with CPS-IoT Week 2025), ICCPS ’25, (New York, NY , USA), Association for Computing Machinery, 2025

work page 2025
[23]

A decentralized shotgun approach for team deception,

C. Probine, M. O. Karabag, and U. Topcu, “A decentralized shotgun approach for team deception,” inInternational Conference on Decision and Game Theory for Security, pp. 177–197, Springer, 2024

work page 2024

[1] [1]

Multi-robot co- ordination in an adversarial graph-traversal game,

J. Berneburg, X. Wang, X. Xiao, and D. Shishika, “Multi-robot co- ordination in an adversarial graph-traversal game,” in2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 3915–3922, IEEE, 2025

work page 2025

[2] [2]

Finding time-dependent shortest paths over large graphs,

B. Ding, J. X. Yu, and L. Qin, “Finding time-dependent shortest paths over large graphs,” inProceedings of the 11th international conference on Extending database technology: Advances in database technology, pp. 205–216, 2008

work page 2008

[3] [3]

Time-varying shortest path prob- lems with constraints,

X. Cai, T. Kloks, and C.-K. Wong, “Time-varying shortest path prob- lems with constraints,”Networks: An International Journal, vol. 29, no. 3, pp. 141–150, 1997

work page 1997

[4] [4]

Constrained shortest path query in a large time-dependent graph,

Y . Yuan, X. Lian, G. Wang, Y . Ma, and Y . Wang, “Constrained shortest path query in a large time-dependent graph,”Proceedings of the VLDB Endowment, vol. 12, no. 10, pp. 1058–1070, 2019

work page 2019

[5] [5]

Risk-aware path planning for autonomous underwater vehicles using predictive ocean models,

A. A. Pereira, J. Binney, G. A. Hollinger, and G. S. Sukhatme, “Risk-aware path planning for autonomous underwater vehicles using predictive ocean models,”Journal of Field Robotics, vol. 30, no. 5, pp. 741–762, 2013

work page 2013

[6] [6]

A risk-aware path planning strategy for uavs in urban environments,

S. Primatesta, G. Guglieri, and A. Rizzo, “A risk-aware path planning strategy for uavs in urban environments,”Journal of Intelligent & Robotic Systems, vol. 95, pp. 629–643, 2019

work page 2019

[7] [7]

Risk-aware path planning using hir- erachical constrained markov decision processes,

S. Feyzabadi and S. Carpin, “Risk-aware path planning using hir- erachical constrained markov decision processes,” in2014 IEEE Inter- national Conference on Automation Science and Engineering (CASE), pp. 297–303, IEEE, 2014

work page 2014

[8] [8]

Deceptive robot motion: synthesis, analysis and experiments,

A. Dragan, R. Holladay, and S. Srinivasa, “Deceptive robot motion: synthesis, analysis and experiments,”Autonomous Robots, vol. 39, pp. 331–345, 2015

work page 2015

[9] [9]

Deceptive path-planning,

P. Masters and S. Sardina, “Deceptive path-planning,” inInt. Joint Conf. on Artif. Intell., pp. 4368–4375, 2017

work page 2017

[10] [10]

Deception in optimal control,

M. Ornik and U. Topcu, “Deception in optimal control,” in2018 56th Annu. Allerton Conf. on Communication, Control, and Comput., pp. 821–828

work page

[11] [11]

Deception in supervisory control,

M. O. Karabag, M. Ornik, and U. Topcu, “Deception in supervisory control,”IEEE Transactions on Automatic Control, vol. 67, no. 2, pp. 738–753, 2021

work page 2021

[12] [12]

Stochastic shortest path games,

S. D. Patek and D. P. Bertsekas, “Stochastic shortest path games,” SIAM Journal on Control and Optimization, vol. 37, no. 3, pp. 804– 824, 1999

work page 1999

[13] [13]

Optimal network security hardening using attack graph games,

K. Durkota, V . Lisy, B. Bo ˇsansky, and C. Kiekintveld, “Optimal network security hardening using attack graph games,” inProceedings of IJCAI, pp. 7–14, 2015

work page 2015

[14] [14]

Multi-stage attack graph security games: Heuristic strategies, with empirical game- theoretic analysis,

T. H. Nguyen, M. Wright, M. P. Wellman, and S. Baveja, “Multi-stage attack graph security games: Heuristic strategies, with empirical game- theoretic analysis,” inProceedings of the 2017 Workshop on Moving Target Defense, pp. 87–97, 2017

work page 2017

[15] [15]

Deception by motion: The eater and the mover game,

V . Rostobaya, Y . Guan, J. Berneburg, M. Dorothy, and D. Shishika, “Deception by motion: The eater and the mover game,”IEEE Control Systems Letters, vol. 7, pp. 3157–3162, 2023

work page 2023

[16] [16]

Deceptive path planning: A Bayesian game approach,

V . Rostobaya, J. Berneburg, Y . Guan, M. Dorothy, and D. Shishika, “Deceptive path planning: A Bayesian game approach,” 2025, arXiv:2506.13650

work page arXiv 2025

[17] [17]

Strategic concealment of envi- ronment representations in competitive games,

Y . Guan, D. Maity, and P. Tsiotras, “Strategic concealment of envi- ronment representations in competitive games,”IEEE Control Systems Letters, vol. 9, pp. 2609–2614, 2025

work page 2025

[18] [18]

Solving zero-sum one-sided partially observable stochastic games,

K. Hor ´ak, B. Bo ˇsansk`y, V . Kovaˇr´ık, and C. Kiekintveld, “Solving zero-sum one-sided partially observable stochastic games,”Artificial Intelligence, vol. 316, p. 103838, 2023

work page 2023

[19] [19]

Dynamic pro- gramming for partially observable stochastic games,

E. A. Hansen, D. S. Bernstein, and S. Zilberstein, “Dynamic pro- gramming for partially observable stochastic games,” inAAAI, vol. 4, pp. 709–715, 2004

work page 2004

[20] [20]

Xdo: A double oracle algorithm for extensive-form games,

S. McAleer, J. B. Lanier, K. A. Wang, P. Baldi, and R. Fox, “Xdo: A double oracle algorithm for extensive-form games,”Advances in Neural Information Processing Systems, vol. 34, pp. 23128–23139, 2021

work page 2021

[21] [21]

An exact double-oracle algorithm for zero-sum extensive-form games with imperfect information,

B. Bo ˇsansk´y, C. Kiekintveld, V . Lis ´y, and M. Pechou ˇcek, “An exact double-oracle algorithm for zero-sum extensive-form games with imperfect information,”J. Artif. Intell. Res., vol. 51, pp. 829–866, 2014

work page 2014

[22] [22]

Synthesis of dynamic masks for information-theoretic opacity in stochastic systems,

S. Udupa, C. Shi, and J. Fu, “Synthesis of dynamic masks for information-theoretic opacity in stochastic systems,” inProceedings of the ACM/IEEE 16th International Conference on Cyber-Physical Systems (with CPS-IoT Week 2025), ICCPS ’25, (New York, NY , USA), Association for Computing Machinery, 2025

work page 2025

[23] [23]

A decentralized shotgun approach for team deception,

C. Probine, M. O. Karabag, and U. Topcu, “A decentralized shotgun approach for team deception,” inInternational Conference on Decision and Game Theory for Security, pp. 177–197, Springer, 2024

work page 2024