Stochastic Mean-Field LQ Stackelberg Differential Games with Random Coefficients: Theory and a Deep FBSDE Picard Solver

Jie Xiong; Ying Yang; Zhouyu Wang

arxiv: 2605.12950 · v2 · pith:TNY573U6new · submitted 2026-05-13 · 🧮 math.OC

Pith reviewed 2026-05-22 09:39 UTC · model grok-4.3

classification 🧮 math.OC

keywords mean-field games · Stackelberg differential games · linear-quadratic control · FBSDE · deep learning solver · random coefficients · stochastic control · optimal control

The pith

Mean-field Stackelberg games with random coefficients admit a Riccati-free FBSDE characterization solved by a deep Picard iteration.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper studies stochastic mean-field linear-quadratic Stackelberg differential games where the coefficients are random. The combination of mean-field interaction terms and random coefficients prevents the use of standard decoupling methods. An extended Lagrange multiplier method produces an affine operator representation of the follower's optimal response. This representation converts the leader's problem into a generalized stochastic LQ control problem with operator-valued coefficients. The resulting Stackelberg optimal control is characterized by a coupled FBSDE system without Riccati equations, which is then solved numerically by a Deep FBSDE Picard Solver that respects the leader-follower hierarchy and enforces mean-field consistency via a neural augmented Lagrangian.

Core claim

The paper shows that an extended Lagrange multiplier method yields an affine operator representation of the follower's optimal response even when mean-field terms and random coefficients are present. This allows the leader's problem to be recast as a generalized stochastic linear-quadratic control problem whose coefficients are operators. The Stackelberg optimal control is then characterized through a Riccati-free coupled FBSDE system. A Deep FBSDE Picard Solver approximates the system by performing follower-response learning, extracting response sensitivities, optimizing the leader's control, and enforcing mean-field consistency constraints with a neural augmented Lagrangian.
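The last of these steps, enforcing a mean-field consistency constraint with an augmented Lagrangian, can be illustrated on a toy problem. The sketch below is not the paper's neural solver: it assumes a hypothetical sample-based problem, minimizing the average of ½(u_i − ξ_i)² subject to the sample mean of u equaling a target α, and it solves the inner minimization in closed form instead of training a network. The names `alm_mean_constraint`, `xi`, `alpha`, and `rho` are illustrative only.

```python
import numpy as np


def alm_mean_constraint(xi, alpha, rho=1.0, iters=60):
    """Toy augmented Lagrangian loop (illustrative, not the paper's solver).

    Minimizes mean(0.5 * (u - xi)**2) subject to mean(u) == alpha.
    """
    lam = 0.0  # multiplier attached to the mean constraint
    u = xi.copy()
    for _ in range(iters):
        # Inner step: exact minimizer of the augmented Lagrangian in u.
        # Stationarity gives u_i = xi_i - lam - rho * (m - alpha), m = mean(u).
        m = (xi.mean() - lam + rho * alpha) / (1.0 + rho)
        u = xi - lam - rho * (m - alpha)
        # Outer step: multiplier ascent on the constraint violation.
        lam += rho * (u.mean() - alpha)
    return u, lam


rng = np.random.default_rng(0)
xi = rng.standard_normal(1000)  # hypothetical unconstrained optimum per sample
alpha = 0.3                     # hypothetical target mean
u_opt, lam_opt = alm_mean_constraint(xi, alpha)
violation = abs(u_opt.mean() - alpha)
```

In this toy setting the multiplier error contracts by a factor 1/(1 + ρ) per outer iteration, and the constrained minimizer is simply ξ shifted by a constant so that its mean equals α.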

What carries the argument

The affine operator representation of the follower's optimal response, derived via the extended Lagrange multiplier method, which recasts the leader problem as a generalized stochastic LQ control with operator-valued coefficients and yields the Riccati-free coupled FBSDE characterization.

Load-bearing premise

The extended Lagrange multiplier method successfully yields an affine operator representation of the follower's optimal response despite the presence of both mean-field interaction terms and random coefficients.

What would settle it

A low-dimensional test case with an analytically known Stackelberg solution in which the deep solver produces controls that violate the FBSDE system or the leader-follower order would count against the claim.
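One inexpensive source of an analytic reference of this kind, offered here as an assumption and not as the paper's own test case, is a scalar constant-coefficient LQ problem whose Riccati equation has a closed form. The sketch below assumes the cost E[∫(q x² + r u²) dt + g x(T)²] with drift a x + b u, for which the Riccati ODE is −P′ = 2aP + q − (b²/r)P² with P(T) = g; when a = q = 0 the solution is P(t) = g / (1 + g b² (T − t)/r). All function and parameter names are hypothetical.

```python
def riccati_p0(a, b, q, r, g, T, n_steps=200):
    """Integrate the scalar Riccati ODE from P(T) = g back to P(0) with RK4."""

    def f(P):
        # Right-hand side in reversed time s = T - t, i.e. dP/ds.
        return 2.0 * a * P + q - (b * b / r) * P * P

    h = T / n_steps
    P = g
    for _ in range(n_steps):
        k1 = f(P)
        k2 = f(P + 0.5 * h * k1)
        k3 = f(P + 0.5 * h * k2)
        k4 = f(P + h * k3)
        P += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return P


def riccati_p0_closed_form(b, r, g, T):
    """Closed form for P(0) in the special case a = q = 0."""
    return g / (1.0 + g * b * b * T / r)


p0_numeric = riccati_p0(a=0.0, b=1.0, q=0.0, r=1.0, g=1.0, T=1.0)
p0_exact = riccati_p0_closed_form(b=1.0, r=1.0, g=1.0, T=1.0)
```

Under the assumed cost, the optimal feedback in this special case is u = −(b/r) P(t) x, so a learned control could be compared against it pathwise.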

Figures

Figures reproduced from arXiv: 2605.12950 by Jie Xiong, Ying Yang, Zhouyu Wang.

**Figure 2.** Adaptive ALM diagnostics: constraint violations (left axes, log scale) and penalty parameters

**Figure 3.** Temporal discretization convergence (constant-coefficient setting): (a) follower cost …

**Figure 4.** Computational scaling with state dimension

**Figure 5.** Unilateral deviation test: cost increment

**Figure 6.** Stackelberg vs. Nash-type baseline across …

**Figure 6.** Mean-variance portfolio Stackelberg game (…

**Figure 7.** Mean-variance portfolio Stackelberg game (…

Original abstract

This paper studies a stochastic mean-field linear-quadratic Stackelberg differential game with random coefficients. The interaction between mean-field terms and random coefficients precludes the direct use of conventional decoupling techniques. We apply an extended Lagrange multiplier method to derive an affine operator representation of the follower's optimal response. The induced leader problem is then formulated as a generalized stochastic LQ control problem with operator-valued coefficients, and the Stackelberg optimal control is characterized through a Riccati-free coupled FBSDE system. We further develop a Deep FBSDE Picard Solver that preserves the Stackelberg order through follower-response learning, response-sensitivity extraction, leader optimization, and neural augmented Lagrangian enforcement of mean-field consistency constraints. Numerical studies covering convergence diagnostics, discretization sensitivity, Riccati calibration, ablation tests, stability under control perturbations, Stackelberg--Nash comparisons, and a financial application support the effectiveness of the proposed framework.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a Riccati-free FBSDE characterization for mean-field Stackelberg games with random coefficients plus a custom deep Picard solver, but the affine operator step from the extended Lagrange multiplier is the part that needs the closest check.

The main contribution is handling the case where random adapted coefficients block standard decoupling in stochastic mean-field LQ Stackelberg games. They use an extended Lagrange multiplier method to produce an affine operator for the follower's optimal response, recast the leader problem as a generalized LQ control with operator-valued coefficients, and characterize the solution through a coupled FBSDE system without Riccati equations. They then build a Deep FBSDE Picard Solver that steps through follower-response learning, sensitivity extraction, leader optimization, and neural augmented Lagrangian enforcement of the mean-field constraints. That order-preserving structure is a reasonable way to keep the Stackelberg hierarchy intact numerically.

Referee Report

1 major / 2 minor

Summary. The paper studies stochastic mean-field linear-quadratic Stackelberg differential games with random coefficients. It applies an extended Lagrange multiplier method to obtain an affine operator representation of the follower's optimal response, recasts the leader problem as a generalized stochastic LQ control problem with operator-valued coefficients, and characterizes the Stackelberg equilibrium via a Riccati-free coupled FBSDE system. A Deep FBSDE Picard Solver is proposed that preserves the Stackelberg order through follower-response learning, sensitivity extraction, leader optimization, and neural augmented Lagrangian enforcement of mean-field constraints. Numerical studies on convergence, discretization, ablation, stability, comparisons, and a financial application are included to support the framework.

Significance. If the derivation of the affine operator representation holds under random adapted coefficients, the work provides a valuable extension of Stackelberg game theory to settings where standard decoupling fails due to mean-field interactions and stochastic coefficients. The Riccati-free FBSDE characterization and the order-preserving deep solver represent technical advances with potential applicability in finance and stochastic control. The inclusion of extensive numerical diagnostics strengthens the practical contribution.

major comments (1)

The central theoretical step relies on the extended Lagrange multiplier method producing an affine operator representation of the follower's optimal response despite mean-field terms and adapted random coefficients (abstract and the derivation leading to the leader problem reformulation). The adaptedness of coefficients risks introducing non-affine remainders in the multiplier equations; the manuscript should explicitly exhibit the form of the response operator (e.g., the relevant theorem or proposition) and verify that linearity is preserved after incorporating the stochastic coefficients and mean-field interactions.

minor comments (2)

The abstract refers to 'Riccati calibration' in the numerical studies; a brief description of the calibration procedure and its relation to the FBSDE system would improve clarity.
Notation for the operator-valued coefficients in the generalized LQ problem could be introduced earlier to aid readability when transitioning from the follower to the leader problem.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address the major comment below and believe the requested clarification strengthens the presentation of the theoretical results.

Referee: The central theoretical step relies on the extended Lagrange multiplier method producing an affine operator representation of the follower's optimal response despite mean-field terms and adapted random coefficients (abstract and the derivation leading to the leader problem reformulation). The adaptedness of coefficients risks introducing non-affine remainders in the multiplier equations; the manuscript should explicitly exhibit the form of the response operator (e.g., the relevant theorem or proposition) and verify that linearity is preserved after incorporating the stochastic coefficients and mean-field interactions.

Authors: We thank the referee for highlighting the need to make the affine structure fully explicit. In the manuscript, the extended Lagrange multiplier method is applied to the follower's stochastic LQ problem in Section 3. The resulting optimality conditions produce a linear FBSDE system whose solution yields the follower's control as an affine function of the leader's control: specifically, the response takes the form u_F = A u_L + b, where A is a linear operator whose kernel is constructed from the solutions of the multiplier BSDEs and b incorporates the mean-field consistency terms. Because the underlying dynamics are linear and the costs quadratic, the mean-field interactions enter as linear functionals of the state and control processes; the adapted random coefficients appear as multiplicative factors within these linear terms and do not generate nonlinear remainders in the response map. The well-posedness of the FBSDEs under adapted coefficients follows from standard Lipschitz assumptions on the coefficients. To address the comment directly, we will insert a new Corollary 3.2 in the revised manuscript that isolates the explicit form of the operator A, states the affine representation, and contains a short verification paragraph confirming preservation of linearity. This addition will not alter the existing proofs but will improve readability.

revision: yes
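A finite-dimensional analogue may make the affine structure concrete. The sketch below is a static quadratic Stackelberg game, not the paper's operator-valued setting: the follower minimizes ½ u_Fᵀ R u_F + u_Fᵀ (S u_L + q), so its best response is u_F = A u_L + b with A = −R⁻¹S and b = −R⁻¹q, and the leader then minimizes ½ u_Lᵀ P u_L + ½ u_Fᵀ M u_F + u_Lᵀ c along that response. The matrices R, S, P, M and the vectors q, c are hypothetical.

```python
import numpy as np


def follower_response(R, S, q):
    """Return (A, b) such that the follower's best response is A @ u_L + b."""
    A = -np.linalg.solve(R, S)  # linear part of the response map
    b = -np.linalg.solve(R, q)  # constant offset of the response map
    return A, b


def leader_optimum(P, M, c, A, b):
    """Minimize the leader's cost after substituting u_F = A @ u_L + b."""
    H = P + A.T @ M @ A      # Hessian of the reduced leader cost
    g = A.T @ M @ b + c      # linear term of the reduced leader cost
    return -np.linalg.solve(H, g)


rng = np.random.default_rng(0)
n_f, n_l = 3, 2                      # follower and leader control dimensions
R = 2.0 * np.eye(n_f)                # follower control weight (positive definite)
S = rng.standard_normal((n_f, n_l))  # coupling to the leader's control
q = rng.standard_normal(n_f)
P = np.eye(n_l)                      # leader control weight (positive definite)
M = np.eye(n_f)                      # leader's weight on the follower's control
c = rng.standard_normal(n_l)

A, b = follower_response(R, S, q)
u_L = leader_optimum(P, M, c, A, b)
u_F = A @ u_L + b
```

The order of the two solves mirrors the hierarchy: the follower's map is fixed first, and the leader optimizes only after that map has been substituted into its cost.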

Circularity Check

0 steps flagged

No significant circularity; derivation relies on adapted standard techniques

The paper derives the affine operator representation of the follower's response via an extended Lagrange multiplier method, recasts the leader problem as a generalized LQ control with operator-valued coefficients, and characterizes the optimum through a Riccati-free coupled FBSDE. These steps use standard FBSDE and multiplier techniques adapted to the random-coefficient mean-field setting without reducing any central claim to a fitted quantity, self-defined input, or load-bearing self-citation chain. The Deep FBSDE Picard Solver is a separate numerical construction. The derivation chain is therefore self-contained against external benchmarks and does not exhibit the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The framework rests on standard existence assumptions for FBSDEs and the applicability of the extended Lagrange multiplier method to the random-coefficient mean-field setting; no free parameters or invented physical entities are described in the abstract.

axioms (1)

domain assumption: Existence and uniqueness of solutions to the coupled FBSDE system under the stated random coefficients and mean-field interactions.
Invoked to guarantee that the Riccati-free characterization yields well-defined optimal controls.

invented entities (1)

Deep FBSDE Picard Solver (no independent evidence)
purpose: Numerical algorithm that learns the follower response and enforces mean-field consistency via neural networks and an augmented Lagrangian.
New computational method introduced to solve the derived FBSDE system while preserving the Stackelberg hierarchy.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

We apply an extended Lagrange multiplier method to derive an affine operator representation of the follower’s optimal response... characterized through a Riccati-free coupled FBSDE system.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

