Near-Optimal Mixed Strategy for Zero-Sum Differential Games

Jianping He; Tao Xu; Wang Xi

arxiv: 2308.01144 · v3 · submitted 2023-08-02 · 🧮 math.OC

Near-Optimal Mixed Strategy for Zero-Sum Differential Games

Tao Xu , Wang Xi , Jianping He This is my paper

Pith reviewed 2026-05-24 07:10 UTC · model grok-4.3

classification 🧮 math.OC

keywords zero-sum differential gamesmixed strategiesweak approximationstochastic differential gamesdiscretization algorithmoptimal controlapproximation error boundslinear programming

0 comments

The pith

Mapping mixed-strategy zero-sum differential games to surrogate pure-strategy stochastic games produces near-optimal executable strategies with approximation error linear in commitment delay.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework to turn the hard problem of finding mixed strategies for zero-sum differential games into a more tractable pure-strategy stochastic differential game whose solution approximates the original value and yields usable strategies. The mapping is designed so that the probability distributions over states and the expected costs remain close between the two games. A discretization procedure then converts the infinite-dimensional optimization over measures into a sequence of finite linear programs that can be solved numerically. If the mapping holds, this supplies the first constructive method for obtaining executable mixed strategies together with explicit error guarantees that scale with the longest time a player must commit to a chosen action.

Core claim

The authors introduce a weak approximation framework that maps an original mixed-strategy zero-sum differential game into a surrogate stochastic differential game under pure strategies, ensuring that state distributions and cost expectations remain close. A control-space discretization algorithm parameterizes the required measures as probability simplices and solves local linear programs to synthesize executable mixed strategies. They prove that the resulting global weak approximation error is strictly of order O(π-bar) with respect to the maximum commitment delay π-bar and supply explicit analytical upper bounds on the suboptimality gaps of the recovered strategies.

What carries the argument

The weak approximation framework that maps the mixed-strategy zero-sum differential game into a surrogate stochastic differential game under pure strategies.

If this is right

Executable mixed strategies for general zero-sum differential games become constructible by solving a sequence of linear programs after discretization.
The value of the original game can be recovered to within an error that scales linearly with the longest commitment interval.
Explicit analytical upper bounds on suboptimality gaps are available once the surrogate game is solved.
The same discretization converts the infinite-dimensional measure optimization into finite-dimensional probability simplices that standard solvers can handle.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The linear scaling in commitment delay suggests that shorter action commitments yield proportionally better approximations without changing the underlying solver.
The surrogate stochastic game formulation may allow reuse of existing numerical methods developed for stochastic control when applied to deterministic differential games.
If the discretization mesh is refined, the method could produce strategies whose performance approaches the theoretical value arbitrarily closely for fixed commitment delay.

Load-bearing premise

The mapping from the original mixed-strategy game to the surrogate stochastic differential game ensures that both state distributions and cost expectations closely match those of the original game.

What would settle it

A numerical experiment on a standard zero-sum differential game in which the observed difference in state distributions or expected costs grows faster than linearly with the maximum commitment delay π-bar.

Figures

Figures reproduced from arXiv: 2308.01144 by Jianping He, Tao Xu, Wang Xi.

**Figure 2.** Figure 2: Simulation results of Glq under the mixed strategy spaces Aπm × Bπm and G˜ lq under the pure strategy spaces A˜π p × B˜π p . the design of first two moments of control inputs, practically one always needs a candidate pdf family F to specify the pdf. E. Step 5: Certify Value Approximation Error and Strategy Suboptimality Gap Assumption 4. There exists a pair of optimal pure strategies (˜α ∗ , β˜∗ ) for G˜ a… view at source ↗

read the original abstract

Synthesizing near-optimal mixed strategies for zero-sum differential games (ZSDGs) has been a longstanding challenge. Existing research mainly focuses on characterizing the theoretical value function, while the practical design of executable mixed strategies remains open. To address this issue, we propose a novel weak approximation framework. The core idea is to map the original mixed-strategy game into a surrogate stochastic differential game (SDG) under pure strategies. This mapping ensures that both state distributions and cost expectations closely match the original game. Based on the solution of this auxiliary SDG, the original game value can be approximated, and near-optimal mixed strategies can be synthesized. To operationalize this framework, we develop a constructive control-space discretization algorithm for general ZSDGs. By parameterizing the infinite-dimensional measure optimization into standard probability simplices and solving local linear programs, our method efficiently synthesizes executable mixed strategies. Furthermore, we rigorously prove that the global weak approximation error is strictly of order $\mathcal{O}(\bar\pi)$ with respect to the maximum commitment delay $\bar\pi$, and derive explicit analytical upper bounds for the strategy suboptimality gaps. Numerical examples are provided to illustrate and validate our theoretical results.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper's main offering is a discretization algorithm that turns mixed-strategy ZSDGs into solvable pure-strategy surrogates, with a claimed O(π-bar) global error, but the mapping step that justifies the bound is the part that needs the closest look.

read the letter

The punchline is that this work gives a concrete way to synthesize executable mixed strategies for zero-sum differential games instead of stopping at value function characterizations. The new piece is the weak approximation framework that maps the original game to a surrogate stochastic differential game under pure strategies, plus the control-space discretization that reduces the infinite-dimensional problem to local linear programs over simplices. That algorithm looks like a genuine step toward making the theory usable. It also supplies explicit suboptimality bounds and some numerical checks, which is more than many papers in this area deliver. The approach is constructive and targets a real practical gap, so credit there. The soft spot is exactly the one the stress-test note flags: the O(π-bar) error rests on the surrogate matching both state distributions and cost expectations. If that match only holds under extra regularity conditions not spelled out for general dynamics, or if the joint laws do not track as tightly as the marginals, the bound does not carry through. The abstract states the result but the letter would need to see the actual derivation to judge whether the order is strict or conditional. Minor issues like how the commitment delay is chosen in practice are secondary. This is for people working on approximation methods in differential games who want something they can code and run, not pure theorists. A reader already familiar with viscosity solutions or Isaacs equations would get the most out of the algorithmic part. It deserves a serious referee because the problem is open, the method is new, and the claims are falsifiable once the proofs are checked.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a weak approximation framework for synthesizing near-optimal mixed strategies in zero-sum differential games (ZSDGs). The core idea maps the original mixed-strategy game to a surrogate stochastic differential game (SDG) under pure strategies such that state distributions and cost expectations match closely; a constructive control-space discretization algorithm then parameterizes infinite-dimensional measure optimization into probability simplices and solves local linear programs to produce executable strategies. The authors claim to prove that the global weak approximation error is strictly of order O(π-bar) with respect to the maximum commitment delay π-bar and to derive explicit analytical upper bounds on strategy suboptimality gaps, supported by numerical examples.

Significance. If the mapping and error analysis are valid, the work supplies a practical, constructive route from theoretical value functions to executable mixed strategies together with rigorous, explicit error bounds—an advance over existing literature that has largely remained at the level of value-function characterization. The combination of a discretization procedure with an O(π-bar) guarantee would be a useful contribution to differential game theory and robust control.

major comments (2)

[Section 3 (mapping construction)] The central claim that the global weak approximation error is strictly O(π-bar) rests on the surrogate SDG construction (Section 3) producing state distributions and expected costs that differ from the original game by at most O(π-bar). The provided description indicates that the mapping is defined via commitment delay π-bar and control-space discretization into simplices, but it is unclear whether the distributional match holds for the joint laws under the controlled dynamics (as opposed to marginals) for general Lipschitz or non-Lipschitz dynamics; if the joint-law match fails, the error order does not propagate to the value function or suboptimality gaps.
[Section 5 (error analysis)] Theorem establishing the O(π-bar) global weak error bound (Section 5): the proof sketch relies on the surrogate game solution approximating the original value, yet the discretization step (local LPs on simplices) introduces an additional approximation whose order is not shown to be absorbed into the O(π-bar) term without further regularity assumptions (e.g., bounded variation of measures or uniform Lipschitz continuity of the dynamics). This step is load-bearing for the explicit suboptimality-gap bounds.

minor comments (2)

[Abstract and Section 2] Notation for the maximum commitment delay is introduced as both π-bar and bar{π}; a single consistent symbol should be used throughout.
[Numerical examples] The numerical examples section would benefit from an explicit statement of the dynamics, cost functions, and discretization parameters used, to allow direct reproduction of the reported error orders.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review. The two major comments raise important points about the scope of the weak approximation and the absorption of discretization error. We address them point by point below.

read point-by-point responses

Referee: [Section 3 (mapping construction)] The central claim that the global weak approximation error is strictly O(π-bar) rests on the surrogate SDG construction (Section 3) producing state distributions and expected costs that differ from the original game by at most O(π-bar). The provided description indicates that the mapping is defined via commitment delay π-bar and control-space discretization into simplices, but it is unclear whether the distributional match holds for the joint laws under the controlled dynamics (as opposed to marginals) for general Lipschitz or non-Lipschitz dynamics; if the joint-law match fails, the error order does not propagate to the value function or suboptimality gaps.

Authors: The construction in Section 3 defines the surrogate SDG so that the controlled state processes converge weakly in the Skorokhod space; under the manuscript's standing Lipschitz assumption on the dynamics, this implies convergence of the joint laws of (state, control) processes, not merely marginals. The O(π-bar) bound on the difference of expectations then follows from the continuous mapping theorem applied to the running cost. The framework is stated for Lipschitz dynamics (see Assumption 2.1); it does not claim to cover non-Lipschitz cases. We will insert an explicit remark after Definition 3.2 clarifying that joint-law weak convergence is used and is guaranteed only under the Lipschitz hypothesis. revision: partial
Referee: [Section 5 (error analysis)] Theorem establishing the O(π-bar) global weak error bound (Section 5): the proof sketch relies on the surrogate game solution approximating the original value, yet the discretization step (local LPs on simplices) introduces an additional approximation whose order is not shown to be absorbed into the O(π-bar) term without further regularity assumptions (e.g., bounded variation of measures or uniform Lipschitz continuity of the dynamics). This step is load-bearing for the explicit suboptimality-gap bounds.

Authors: The local LP step solves the measure optimization exactly on each simplex; the only additional error is the diameter of the simplices, which is set to O(π-bar) by construction. Under the uniform Lipschitz continuity already assumed for the dynamics and value function, standard Gronwall estimates show that this mesh error is absorbed into the leading O(π-bar) term. The proof sketch in Section 5 is therefore complete under the paper's hypotheses; we will expand the argument between equations (5.8) and (5.12) to display the absorption step explicitly. revision: partial

Circularity Check

0 steps flagged

No circularity detected; derivation is self-contained.

full rationale

The paper constructs a surrogate SDG via a control-space discretization mapping from the original mixed-strategy ZSDG, then derives the O(π-bar) weak error bound and suboptimality gaps from the distributional matching properties of that mapping. No self-definitional reductions, fitted inputs renamed as predictions, load-bearing self-citations, or imported uniqueness theorems appear in the provided abstract or framework description. The central claims rest on the explicit construction and subsequent analysis rather than reducing to their own inputs by definition, making the derivation independent of the target results.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities; the mapping assumption and discretization are described at conceptual level without quantified details.

pith-pipeline@v0.9.0 · 5734 in / 1102 out tokens · 32284 ms · 2026-05-24T07:10:01.427566+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages

[1]

Differential game with mixed strategies: A weak approximation approach,

T. Xu, W. Xi, and J. He, “Differential game with mixed strategies: A weak approximation approach,” in 2023 62nd IEEE Conference on Decision and Control (CDC) , Dec. 2023, pp. 5216–5221

work page 2023
[2]

Isaacs, Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization

R. Isaacs, Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. Courier Corporation, 1999

work page 1999
[3]

Differential games and representation formulas for solutions of hamilton-jacobi-isaacs equations,

L. C. Evans and P. E. Souganidis, “Differential games and representation formulas for solutions of hamilton-jacobi-isaacs equations,” Indiana University mathematics journal , vol. 33, no. 5, pp. 773–797, 1984

work page 1984
[4]

The existence of subgame-perfect equilibrium in continuous games with almost perfect information: A case for public randomization,

C. Harris, P. Reny, and A. Robson, “The existence of subgame-perfect equilibrium in continuous games with almost perfect information: A case for public randomization,” Econometrica: Journal of the Econometric Society, pp. 507–544, 1995

work page 1995
[5]

Generative adversarial nets,

I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y . Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems , vol. 27, 2014

work page 2014
[6]

Mixed-strategy learning with continuous action sets,

S. Perkins, P. Mertikopoulos, and D. S. Leslie, “Mixed-strategy learning with continuous action sets,” IEEE Transactions on Automatic Control , vol. 62, no. 1, pp. 379–384, Jan. 2017

work page 2017
[7]

Learning mixed strategies in trajectory games,

L. Peters, D. Fridovich-Keil, L. Ferranti, C. Stachniss, J. Alonso- Mora, and F. Laine, “Learning mixed strategies in trajectory games,” in Robotics: Science and Systems XVIII . Robotics: Science and Systems Foundation, Jun. 2022

work page 2022
[8]

Existence of saddle points in differential games,

P. Varaiya and J. Lin, “Existence of saddle points in differential games,” SIAM Journal on Control , vol. 7, no. 1, pp. 141–157, Feb. 1969

work page 1969
[9]

Axiomatic approach in differential games,

E. Roxin, “Axiomatic approach in differential games,” Journal of Op- timization Theory and Applications , vol. 3, no. 3, pp. 153–163, Mar. 1969

work page 1969
[10]

The existence of value in stochastic differential games,

R. Elliott, “The existence of value in stochastic differential games,” SIAM Journal on Control and Optimization , vol. 14, no. 1, pp. 85–94, Jan. 1976

work page 1976
[11]

Engwerda, LQ Dynamic Optimization and Differential Games , 1st ed

J. Engwerda, LQ Dynamic Optimization and Differential Games , 1st ed. Chicester, West Sussex, England Hoboken, NJ: Wiley, Jun. 2005

work page 2005
[12]

Differential games with asymmetric information,

P. Cardaliaguet, “Differential games with asymmetric information,” SIAM Journal on Control and Optimization , vol. 46, no. 3, pp. 816– 838, Jan. 2007

work page 2007
[13]

Pure and random strategies in differential game with incomplete informations,

P. Cardaliaguet, C. Jimenez, and M. Quincampoix, “Pure and random strategies in differential game with incomplete informations,” Journal of Dynamics and Games , vol. 1, no. 3, pp. 363–375, 2014

work page 2014
[14]

Value function of differential games without isaacs conditions. an approach with nonanticipative mixed strategies,

R. Buckdahn, J. Li, and M. Quincampoix, “Value function of differential games without isaacs conditions. an approach with nonanticipative mixed strategies,” International Journal of Game Theory , vol. 42, no. 4, pp. 989–1020, Nov. 2013

work page 2013
[15]

Value in mixed strategies for zero-sum stochastic differential games without isaacs condition,

——, “Value in mixed strategies for zero-sum stochastic differential games without isaacs condition,” The Annals of Probability , vol. 42, no. 4, pp. 1724–1768, 2014

work page 2014
[16]

Mixed strategies for de- terministic differential games,

W. H. Fleming and D. Hernandez-Hernandez, “Mixed strategies for de- terministic differential games,” Communications on Stochastic Analysis , vol. 11, no. 2, Jun. 2017

work page 2017
[17]

Chattering analysis,

A. Levant, “Chattering analysis,” IEEE Transactions on Automatic Control, vol. 55, no. 6, pp. 1380–1389, Jun. 2010

work page 2010
[18]

Stochastic modified equations and adaptive stochastic gradient algorithms,

Q. Li, C. Tai, and E. Weinan, “Stochastic modified equations and adaptive stochastic gradient algorithms,” in International Conference on Machine Learning. PMLR, 2017, pp. 2101–2110

work page 2017
[19]

Stochastic modified equations and dynamics of stochastic gra- dient algorithms i: Mathematical foundations,

——, “Stochastic modified equations and dynamics of stochastic gra- dient algorithms i: Mathematical foundations,” The Journal of Machine Learning Research, vol. 20, no. 1, pp. 1474–1520, 2019

work page 2019
[20]

Continuized accelerations of deterministic and stochastic gradient descents, and of gossip algorithms,

M. Even, R. Berthier, F. Bach, N. Flammarion, H. Hendrikx, P. Gaillard, L. Massoulié, and A. Taylor, “Continuized accelerations of deterministic and stochastic gradient descents, and of gossip algorithms,” in Advances in Neural Information Processing Systems , vol. 34, 2021, pp. 28 054– 28 066

work page 2021
[21]

Analysis of stochastic gradient descent in continuous time,

J. Latz, “Analysis of stochastic gradient descent in continuous time,” Statistics and Computing , vol. 31, no. 4, p. 39, May 2021

work page 2021
[22]

M. J. Osborne and A. Rubinstein, A Course in Game Theory . Cam- bridge, Mass: The MIT Press, Jul. 1994

work page 1994
[23]

28. mixed and behavior strategies in infinite extensive games,

R. J. Aumann, “28. mixed and behavior strategies in infinite extensive games,” in Advances in Game Theory. (AM-52) . Princeton University Press, Dec. 1964, pp. 627–650

work page 1964
[24]

Non-cooperative games,

J. Nash Jr, “Non-cooperative games,” in Essays on Game Theory . Edward Elgar Publishing, 1996, pp. 22–33

work page 1996
[25]

Ba¸ sar and G

T. Ba¸ sar and G. J. Olsder, Dynamic Noncooperative Game Theory . SIAM, 1998

work page 1998
[26]

Stochastic perron’s method and elementary strategies for zero-sum differential games,

M. Sîrbu, “Stochastic perron’s method and elementary strategies for zero-sum differential games,” SIAM Journal on Control and Optimiza- tion, vol. 52, no. 3, pp. 1693–1711, Jan. 2014

work page 2014
[27]

Optimal control of the undamped linear wave equation with measure valued controls,

K. Kunisch, P. Trautmann, and B. Vexler, “Optimal control of the undamped linear wave equation with measure valued controls,” SIAM Journal on Control and Optimization , vol. 54, no. 3, pp. 1212–1244, Jan. 2016

work page 2016
[28]

Con- trolled measure-valued martingales: A viscosity solution approach,

A. M. G. Cox, S. Källblad, M. Larsson, and S. Svaluto-Ferro, “Con- trolled measure-valued martingales: A viscosity solution approach,” The Annals of Applied Probability, vol. 34, no. 2, pp. 1987–2035, Apr. 2024

work page 1987
[29]

Relaxed controls,

L. D. Berkovitz and N. G. Medhin, “Relaxed controls,” in Nonlinear Optimal Control Theory . Chapman and Hall/CRC, 2012

work page 2012
[30]

A pontryagin’s maximum principle for non-zero sum differential games of bsdes with applications,

G. Wang and Z. Yu, “A pontryagin’s maximum principle for non-zero sum differential games of bsdes with applications,” IEEE Transactions on Automatic Control , vol. 55, no. 7, pp. 1742–1747, Jul. 2010

work page 2010
[31]

W. H. Fleming and H. M. Soner, Controlled Markov Processes and Viscosity Solutions. Springer Science & Business Media, Feb. 2006

work page 2006
[32]

Optimal mixed strategies in a dynamic game,

P. Kumar, “Optimal mixed strategies in a dynamic game,” IEEE Trans- actions on Automatic Control , vol. 25, no. 4, pp. 743–749, Aug. 1980

work page 1980
[33]

Finding mixed strategy nash equilibrium for continuous games through deep learning,

Z. Dou, X. Yan, D. Wang, and X. Deng, “Finding mixed strategy nash equilibrium for continuous games through deep learning,” arXiv preprint arXiv:1910.12075, 2019

work page arXiv 1910
[34]

Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks,

C. Martin and T. Sandholm, “Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks,” in Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence , ser. IJCAI ’23, Aug. 2023, pp. 2844–2852

work page 2023
[35]

Probabilistic pursuit-evasion games: A one-step nash approach,

J. Hespanha, M. Prandini, and S. Sastry, “Probabilistic pursuit-evasion games: A one-step nash approach,” in Proceedings of the 39th IEEE Conference on Decision and Control , vol. 3, Dec. 2000, pp. 2272–2277 vol.3. 12

work page 2000
[36]

Probabilistic pursuit-evasion games: Theory, implementation, and experimental eval- uation,

R. Vidal, O. Shakernia, H. Kim, D. Shim, and S. Sastry, “Probabilistic pursuit-evasion games: Theory, implementation, and experimental eval- uation,” IEEE Transactions on Robotics and Automation , vol. 18, no. 5, pp. 662–669, Oct. 2002

work page 2002
[37]

Randomized pursuit-evasion in a polygonal environment,

V . Isler, S. Kannan, and S. Khanna, “Randomized pursuit-evasion in a polygonal environment,” IEEE Transactions on Robotics, vol. 21, no. 5, pp. 875–884, Oct. 2005

work page 2005
[38]

Randomized pursuit-evasion with local visibility,

——, “Randomized pursuit-evasion with local visibility,” SIAM Journal on Discrete Mathematics , vol. 20, no. 1, pp. 26–41, Jan. 2006

work page 2006
[39]

Ba¸ sar and G

T. Ba¸ sar and G. Zaccour,Handbook of Dynamic Game Theory. Springer International Publishing, Jun. 2018

work page 2018
[40]

Borel structures for function spaces,

R. J. Aumann, “Borel structures for function spaces,” Illinois Journal of Mathematics, vol. 5, no. 4, pp. 614–630, 1961

work page 1961
[41]

Spaces of measurable transformations,

——, “Spaces of measurable transformations,” Bulletin of the American Mathematical Society, vol. 66, no. 4, pp. 301–304, 1960

work page 1960
[42]

On von neumann’s minimax theorem,

H. Nikaidô, “On von neumann’s minimax theorem,” Pacific J. Math , vol. 4, no. 1, pp. 65–72, 1954

work page 1954
[43]

Weak approximation of solutions of systems of stochastic differential equations,

G. N. Mil’shtein, “Weak approximation of solutions of systems of stochastic differential equations,” Theory of Probability & Its Applica- tions, vol. 30, no. 4, pp. 750–766, 1986. APPENDIX A PROOF OF THEOREM 1 Proof. Our proof proceeds based on the principle of dynamic programming. Notice that the game value functions, presented in Definition 4, are defin...

work page 1986
[44]

Let K(t, x) = Z(t, x)⊤Ξ(t)Z(t, x), the optimization problem is equivalently transformed as max 0⪯K(t,x)⪯I tr n diag{˜λi}d2 i=1K(t, x) o

Next, consider that max 0⪯˜Σ2(t)⪯D2 tr n ˜Σ2(t) δπ(t)G2(x)⊤∂2 x ˜V G2(x) − 2R2 o = max 0⪯Ξ(t)⪯I tr n Ξ(t)Z(t, x) diag{˜λi}d2 i=1Z(t, x)⊤ o = max 0⪯Ξ(t)⪯I tr n diag{˜λi}d2 i=1Z(t, x)⊤Ξ(t)Z(t, x) o , where Ξ(t) = D − 1 2 2 ˜Σ2(t)D − 1 2 2 and Z(t, x) is an orthogonal matrix such that D 1 2 2 h δπ(t)G2(x)⊤∂2 x ˜V G2(x) − 2R2 i D 1 2 2 = Z(t, x) diag{˜λi}d2 i...

work page
[45]

Step 3: verify the smoothness condition

+ 2(b2 3 + b2 4)¯π and C = 2(1 +L0 + a2 1 + a2 2 + a2 3¯π + a2 4¯π). Step 3: verify the smoothness condition. Let δ˜Γi(t) = ˜Γd i (t) − ˜Γ∗ i (t) and δ˜Λi(t) = ˜Λd i (t) − ˜Λ∗ i (t) for i = 1 , 2, we have E∥δ˜Γi(t)∥2 = O(¯π2) and E∥δ˜Λi(t)∥2 = O(¯π2) since ˜Γ∗ i and ˜Λ∗ i are Lipschitz continuous and E∥e(t)∥2 = O(¯π2). Next, the cost difference is ˜J(t0, ...

work page 2023

[1] [1]

Differential game with mixed strategies: A weak approximation approach,

T. Xu, W. Xi, and J. He, “Differential game with mixed strategies: A weak approximation approach,” in 2023 62nd IEEE Conference on Decision and Control (CDC) , Dec. 2023, pp. 5216–5221

work page 2023

[2] [2]

Isaacs, Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization

R. Isaacs, Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. Courier Corporation, 1999

work page 1999

[3] [3]

Differential games and representation formulas for solutions of hamilton-jacobi-isaacs equations,

L. C. Evans and P. E. Souganidis, “Differential games and representation formulas for solutions of hamilton-jacobi-isaacs equations,” Indiana University mathematics journal , vol. 33, no. 5, pp. 773–797, 1984

work page 1984

[4] [4]

The existence of subgame-perfect equilibrium in continuous games with almost perfect information: A case for public randomization,

C. Harris, P. Reny, and A. Robson, “The existence of subgame-perfect equilibrium in continuous games with almost perfect information: A case for public randomization,” Econometrica: Journal of the Econometric Society, pp. 507–544, 1995

work page 1995

[5] [5]

Generative adversarial nets,

I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y . Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems , vol. 27, 2014

work page 2014

[6] [6]

Mixed-strategy learning with continuous action sets,

S. Perkins, P. Mertikopoulos, and D. S. Leslie, “Mixed-strategy learning with continuous action sets,” IEEE Transactions on Automatic Control , vol. 62, no. 1, pp. 379–384, Jan. 2017

work page 2017

[7] [7]

Learning mixed strategies in trajectory games,

L. Peters, D. Fridovich-Keil, L. Ferranti, C. Stachniss, J. Alonso- Mora, and F. Laine, “Learning mixed strategies in trajectory games,” in Robotics: Science and Systems XVIII . Robotics: Science and Systems Foundation, Jun. 2022

work page 2022

[8] [8]

Existence of saddle points in differential games,

P. Varaiya and J. Lin, “Existence of saddle points in differential games,” SIAM Journal on Control , vol. 7, no. 1, pp. 141–157, Feb. 1969

work page 1969

[9] [9]

Axiomatic approach in differential games,

E. Roxin, “Axiomatic approach in differential games,” Journal of Op- timization Theory and Applications , vol. 3, no. 3, pp. 153–163, Mar. 1969

work page 1969

[10] [10]

The existence of value in stochastic differential games,

R. Elliott, “The existence of value in stochastic differential games,” SIAM Journal on Control and Optimization , vol. 14, no. 1, pp. 85–94, Jan. 1976

work page 1976

[11] [11]

Engwerda, LQ Dynamic Optimization and Differential Games , 1st ed

J. Engwerda, LQ Dynamic Optimization and Differential Games , 1st ed. Chicester, West Sussex, England Hoboken, NJ: Wiley, Jun. 2005

work page 2005

[12] [12]

Differential games with asymmetric information,

P. Cardaliaguet, “Differential games with asymmetric information,” SIAM Journal on Control and Optimization , vol. 46, no. 3, pp. 816– 838, Jan. 2007

work page 2007

[13] [13]

Pure and random strategies in differential game with incomplete informations,

P. Cardaliaguet, C. Jimenez, and M. Quincampoix, “Pure and random strategies in differential game with incomplete informations,” Journal of Dynamics and Games , vol. 1, no. 3, pp. 363–375, 2014

work page 2014

[14] [14]

Value function of differential games without isaacs conditions. an approach with nonanticipative mixed strategies,

R. Buckdahn, J. Li, and M. Quincampoix, “Value function of differential games without isaacs conditions. an approach with nonanticipative mixed strategies,” International Journal of Game Theory , vol. 42, no. 4, pp. 989–1020, Nov. 2013

work page 2013

[15] [15]

Value in mixed strategies for zero-sum stochastic differential games without isaacs condition,

——, “Value in mixed strategies for zero-sum stochastic differential games without isaacs condition,” The Annals of Probability , vol. 42, no. 4, pp. 1724–1768, 2014

work page 2014

[16] [16]

Mixed strategies for de- terministic differential games,

W. H. Fleming and D. Hernandez-Hernandez, “Mixed strategies for de- terministic differential games,” Communications on Stochastic Analysis , vol. 11, no. 2, Jun. 2017

work page 2017

[17] [17]

Chattering analysis,

A. Levant, “Chattering analysis,” IEEE Transactions on Automatic Control, vol. 55, no. 6, pp. 1380–1389, Jun. 2010

work page 2010

[18] [18]

Stochastic modified equations and adaptive stochastic gradient algorithms,

Q. Li, C. Tai, and E. Weinan, “Stochastic modified equations and adaptive stochastic gradient algorithms,” in International Conference on Machine Learning. PMLR, 2017, pp. 2101–2110

work page 2017

[19] [19]

Stochastic modified equations and dynamics of stochastic gra- dient algorithms i: Mathematical foundations,

——, “Stochastic modified equations and dynamics of stochastic gra- dient algorithms i: Mathematical foundations,” The Journal of Machine Learning Research, vol. 20, no. 1, pp. 1474–1520, 2019

work page 2019

[20] [20]

Continuized accelerations of deterministic and stochastic gradient descents, and of gossip algorithms,

M. Even, R. Berthier, F. Bach, N. Flammarion, H. Hendrikx, P. Gaillard, L. Massoulié, and A. Taylor, “Continuized accelerations of deterministic and stochastic gradient descents, and of gossip algorithms,” in Advances in Neural Information Processing Systems , vol. 34, 2021, pp. 28 054– 28 066

work page 2021

[21] [21]

Analysis of stochastic gradient descent in continuous time,

J. Latz, “Analysis of stochastic gradient descent in continuous time,” Statistics and Computing , vol. 31, no. 4, p. 39, May 2021

work page 2021

[22] [22]

M. J. Osborne and A. Rubinstein, A Course in Game Theory . Cam- bridge, Mass: The MIT Press, Jul. 1994

work page 1994

[23] [23]

28. mixed and behavior strategies in infinite extensive games,

R. J. Aumann, “28. mixed and behavior strategies in infinite extensive games,” in Advances in Game Theory. (AM-52) . Princeton University Press, Dec. 1964, pp. 627–650

work page 1964

[24] [24]

Non-cooperative games,

J. Nash Jr, “Non-cooperative games,” in Essays on Game Theory . Edward Elgar Publishing, 1996, pp. 22–33

work page 1996

[25] [25]

Ba¸ sar and G

T. Ba¸ sar and G. J. Olsder, Dynamic Noncooperative Game Theory . SIAM, 1998

work page 1998

[26] [26]

Stochastic perron’s method and elementary strategies for zero-sum differential games,

M. Sîrbu, “Stochastic perron’s method and elementary strategies for zero-sum differential games,” SIAM Journal on Control and Optimiza- tion, vol. 52, no. 3, pp. 1693–1711, Jan. 2014

work page 2014

[27] [27]

Optimal control of the undamped linear wave equation with measure valued controls,

K. Kunisch, P. Trautmann, and B. Vexler, “Optimal control of the undamped linear wave equation with measure valued controls,” SIAM Journal on Control and Optimization , vol. 54, no. 3, pp. 1212–1244, Jan. 2016

work page 2016

[28] [28]

Con- trolled measure-valued martingales: A viscosity solution approach,

A. M. G. Cox, S. Källblad, M. Larsson, and S. Svaluto-Ferro, “Con- trolled measure-valued martingales: A viscosity solution approach,” The Annals of Applied Probability, vol. 34, no. 2, pp. 1987–2035, Apr. 2024

work page 1987

[29] [29]

Relaxed controls,

L. D. Berkovitz and N. G. Medhin, “Relaxed controls,” in Nonlinear Optimal Control Theory . Chapman and Hall/CRC, 2012

work page 2012

[30] [30]

A pontryagin’s maximum principle for non-zero sum differential games of bsdes with applications,

G. Wang and Z. Yu, “A pontryagin’s maximum principle for non-zero sum differential games of bsdes with applications,” IEEE Transactions on Automatic Control , vol. 55, no. 7, pp. 1742–1747, Jul. 2010

work page 2010

[31] [31]

W. H. Fleming and H. M. Soner, Controlled Markov Processes and Viscosity Solutions. Springer Science & Business Media, Feb. 2006

work page 2006

[32] [32]

Optimal mixed strategies in a dynamic game,

P. Kumar, “Optimal mixed strategies in a dynamic game,” IEEE Trans- actions on Automatic Control , vol. 25, no. 4, pp. 743–749, Aug. 1980

work page 1980

[33] [33]

Finding mixed strategy nash equilibrium for continuous games through deep learning,

Z. Dou, X. Yan, D. Wang, and X. Deng, “Finding mixed strategy nash equilibrium for continuous games through deep learning,” arXiv preprint arXiv:1910.12075, 2019

work page arXiv 1910

[34] [34]

Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks,

C. Martin and T. Sandholm, “Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks,” in Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence , ser. IJCAI ’23, Aug. 2023, pp. 2844–2852

work page 2023

[35] [35]

Probabilistic pursuit-evasion games: A one-step nash approach,

J. Hespanha, M. Prandini, and S. Sastry, “Probabilistic pursuit-evasion games: A one-step nash approach,” in Proceedings of the 39th IEEE Conference on Decision and Control , vol. 3, Dec. 2000, pp. 2272–2277 vol.3. 12

work page 2000

[36] [36]

Probabilistic pursuit-evasion games: Theory, implementation, and experimental eval- uation,

R. Vidal, O. Shakernia, H. Kim, D. Shim, and S. Sastry, “Probabilistic pursuit-evasion games: Theory, implementation, and experimental eval- uation,” IEEE Transactions on Robotics and Automation , vol. 18, no. 5, pp. 662–669, Oct. 2002

work page 2002

[37] [37]

Randomized pursuit-evasion in a polygonal environment,

V . Isler, S. Kannan, and S. Khanna, “Randomized pursuit-evasion in a polygonal environment,” IEEE Transactions on Robotics, vol. 21, no. 5, pp. 875–884, Oct. 2005

work page 2005

[38] [38]

Randomized pursuit-evasion with local visibility,

——, “Randomized pursuit-evasion with local visibility,” SIAM Journal on Discrete Mathematics , vol. 20, no. 1, pp. 26–41, Jan. 2006

work page 2006

[39] [39]

Ba¸ sar and G

T. Ba¸ sar and G. Zaccour,Handbook of Dynamic Game Theory. Springer International Publishing, Jun. 2018

work page 2018

[40] [40]

Borel structures for function spaces,

R. J. Aumann, “Borel structures for function spaces,” Illinois Journal of Mathematics, vol. 5, no. 4, pp. 614–630, 1961

work page 1961

[41] [41]

Spaces of measurable transformations,

——, “Spaces of measurable transformations,” Bulletin of the American Mathematical Society, vol. 66, no. 4, pp. 301–304, 1960

work page 1960

[42] [42]

On von neumann’s minimax theorem,

H. Nikaidô, “On von neumann’s minimax theorem,” Pacific J. Math , vol. 4, no. 1, pp. 65–72, 1954

work page 1954

[43] [43]

Weak approximation of solutions of systems of stochastic differential equations,

G. N. Mil’shtein, “Weak approximation of solutions of systems of stochastic differential equations,” Theory of Probability & Its Applica- tions, vol. 30, no. 4, pp. 750–766, 1986. APPENDIX A PROOF OF THEOREM 1 Proof. Our proof proceeds based on the principle of dynamic programming. Notice that the game value functions, presented in Definition 4, are defin...

work page 1986

[44] [44]

Let K(t, x) = Z(t, x)⊤Ξ(t)Z(t, x), the optimization problem is equivalently transformed as max 0⪯K(t,x)⪯I tr n diag{˜λi}d2 i=1K(t, x) o

Next, consider that max 0⪯˜Σ2(t)⪯D2 tr n ˜Σ2(t) δπ(t)G2(x)⊤∂2 x ˜V G2(x) − 2R2 o = max 0⪯Ξ(t)⪯I tr n Ξ(t)Z(t, x) diag{˜λi}d2 i=1Z(t, x)⊤ o = max 0⪯Ξ(t)⪯I tr n diag{˜λi}d2 i=1Z(t, x)⊤Ξ(t)Z(t, x) o , where Ξ(t) = D − 1 2 2 ˜Σ2(t)D − 1 2 2 and Z(t, x) is an orthogonal matrix such that D 1 2 2 h δπ(t)G2(x)⊤∂2 x ˜V G2(x) − 2R2 i D 1 2 2 = Z(t, x) diag{˜λi}d2 i...

work page

[45] [45]

Step 3: verify the smoothness condition

+ 2(b2 3 + b2 4)¯π and C = 2(1 +L0 + a2 1 + a2 2 + a2 3¯π + a2 4¯π). Step 3: verify the smoothness condition. Let δ˜Γi(t) = ˜Γd i (t) − ˜Γ∗ i (t) and δ˜Λi(t) = ˜Λd i (t) − ˜Λ∗ i (t) for i = 1 , 2, we have E∥δ˜Γi(t)∥2 = O(¯π2) and E∥δ˜Λi(t)∥2 = O(¯π2) since ˜Γ∗ i and ˜Λ∗ i are Lipschitz continuous and E∥e(t)∥2 = O(¯π2). Next, the cost difference is ˜J(t0, ...

work page 2023