Extended mean-field control problems with Poissonian common noise: Stochastic maximum principle and Hamiltonian-Jacobi-Bellman equation

Jingfei Wang; Lijun Bo; Xiang Yu; Xiaoli Wei

arxiv: 2407.05356 · v5 · submitted 2024-07-07 · 🧮 math.OC · math.PR

Pith reviewed 2026-05-23 23:19 UTC · model grok-4.3

classification 🧮 math.OC math.PR

keywords mean-field control · stochastic maximum principle · Hamiltonian-Jacobi-Bellman equation · Poissonian common noise · Wasserstein space · relaxed control · Fokker-Planck equation · joint law dependence

The pith

Mean-field control problems with joint state-control law dependence and Poissonian common noise admit a stochastic maximum principle connected to the HJB equation on Wasserstein space.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops the stochastic maximum principle for mean-field control problems whose running cost and dynamics depend on the joint law of the state and control, in the presence of Poissonian common noise. It first derives the principle in a strong relaxed control setting to handle non-convex domains, using an extension transformation to manage compatibility with the conditional joint law, then proves equivalence to the original strict controls. Separately, the authors recast the problem as an equivalent controlled Fokker-Planck equation with measure-valued dynamics that include Poisson jumps, yielding the HJB equation on the Wasserstein space under open-loop strict controls and establishing the link between the two optimality characterizations.

Core claim

By introducing a strong relaxed control formulation and an extension transformation to overcome compatibility issues with the joint law under Poissonian common noise, followed by proving equivalence to strict controls, the stochastic maximum principle is obtained for the original problem. Equivalently, the controlled Fokker-Planck problem with Poisson jumps produces the HJB equation on the Wasserstein space for open-loop strict controls, and the SMP and HJB are shown to be connected.
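
To fix ideas, a problem of this type can be written schematically as below. This is an illustrative sketch under stated assumptions, not the paper's formulation: the coefficient names $b$, $\sigma$, $\gamma$, $f$, $g$, the Brownian motion $W$, the scalar Poisson process $N$, the notation $\rho_t$ for the conditional joint law, and the choice of conditioning on the filtration generated by $N$ are all introduced here for illustration.

```latex
% Schematic only. All symbols below are assumptions made for illustration,
% not notation taken from the paper.
\begin{align*}
  dX_t &= b\big(t, X_t, \alpha_t, \rho_t\big)\,dt
        + \sigma\big(t, X_t, \alpha_t, \rho_t\big)\,dW_t
        + \gamma\big(t, X_{t-}, \alpha_t, \rho_{t-}\big)\,dN_t, \\
  \rho_t &= \mathcal{L}\big((X_t, \alpha_t) \,\big|\, \mathcal{F}^N_t\big)
        \quad \text{(conditional joint law of state and control)}, \\
  J(\alpha) &= \mathbb{E}\Big[\int_0^T f\big(t, X_t, \alpha_t, \rho_t\big)\,dt
        + g\big(X_T, \mathcal{L}(X_T \mid \mathcal{F}^N_T)\big)\Big].
\end{align*}
```

In such a form, the Poisson process $N$ is shared by the whole population, so the conditional law $\rho_t$ is itself random and jumps whenever $N$ does. That discontinuity is the source of the compatibility issues described above.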

What carries the argument

The extension transformation in the strong relaxed control formulation, which resolves compatibility issues arising from the joint law dependence and Poissonian common noise.

If this is right

The SMP provides necessary conditions for optimality in the strict control setting once equivalence is established.
The HJB equation on the Wasserstein space characterizes the value function for open-loop strict controls via the controlled measure-valued dynamics.
A direct connection links the necessary conditions from the SMP to the dynamic programming principle encoded in the HJB equation.
The framework applies to mean-field problems featuring both state-control joint law dependence and discontinuous Poissonian common noise.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The extension transformation technique could be tested on mean-field models with other jump processes to check robustness beyond Poisson noise.
The Wasserstein-space HJB formulation might support particle-based numerical approximations that incorporate the SMP as a verification tool.
Applications in areas with mean-field interactions and jump noise could use the SMP to derive explicit candidate controls before solving the HJB.
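
The particle-based idea in the second item can be sketched as follows. This is a toy Euler scheme, not a method from the paper: the feedback rule, the coefficients, the cost, and the function names `simulate_particles` and `mean_and_variance` are hypothetical choices made only to show how a single Poisson shock shared by all particles interacts with an empirical joint law of state and control.

```python
import math
import random


def simulate_particles(n_particles=200, n_steps=200, horizon=1.0,
                       jump_rate=2.0, jump_size=0.5, sigma=0.3,
                       gain=1.0, seed=0):
    """Euler scheme for a toy particle system with a common Poisson shock.

    All coefficients are hypothetical. Returns the final states, the
    accumulated running cost, and the number of common jumps.
    """
    rng = random.Random(seed)
    dt = horizon / n_steps
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    cost = 0.0
    n_jumps = 0
    for _ in range(n_steps):
        mean_x = sum(xs) / n_particles
        # Hypothetical feedback rule: steer each particle toward the mean.
        alphas = [-gain * (x - mean_x) for x in xs]
        # A functional of the empirical JOINT law of (state, control).
        joint_moment = sum(x * a for x, a in zip(xs, alphas)) / n_particles
        cost += (sum(a * a for a in alphas) / n_particles
                 + joint_moment ** 2) * dt
        # Common noise: ONE Bernoulli draw per step, shared by all particles,
        # approximates the increment of a Poisson process.
        common_jump = 1 if rng.random() < jump_rate * dt else 0
        n_jumps += common_jump
        xs = [
            x + (a + 0.1 * joint_moment) * dt
            + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)  # idiosyncratic noise
            + jump_size * common_jump                      # common shock
            for x, a in zip(xs, alphas)
        ]
    return xs, cost, n_jumps


def mean_and_variance(values):
    """Empirical mean and (population) variance of a list of floats."""
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / len(values)
```

With `sigma=0.0` and a fixed seed, comparing `jump_rate=0.0` against a positive `jump_rate` shows that, in this toy model with a constant jump size, each common shock translates the whole empirical distribution: the mean moves by `jump_size` per jump while the variance is unchanged.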

Load-bearing premise

The equivalence between the strong relaxed control formulation after the extension transformation and the original strict control formulation holds, so the SMP transfers to strict controls.
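
For intuition about why relaxed and strict controls can be comparable at all, the classical chattering idea can be sketched in a deterministic toy setting. This is not the paper's equivalence argument, which must also handle the joint law and the Poisson jumps. The example assumes dynamics dx/dt = u with the non-convex control set {-1, +1}; the function names are hypothetical.

```python
def relaxed_state(t, p):
    """State at time t under the relaxed control that puts weight p on
    u = +1 and weight 1 - p on u = -1 (so the averaged drift is 2p - 1)."""
    return (2.0 * p - 1.0) * t


def chattering_state(t, p, n_cycles, horizon=1.0):
    """State at time t under a strict control that, within each of n_cycles
    equal cycles, uses u = +1 for a fraction p of the cycle and u = -1 after."""
    cycle = horizon / n_cycles
    full = min(int(t // cycle), n_cycles)   # completed cycles
    rem = max(t - full * cycle, 0.0)        # time spent in the current cycle
    up = min(rem, p * cycle)                # time spent with u = +1
    down = max(rem - p * cycle, 0.0)        # time spent with u = -1
    return full * cycle * (2.0 * p - 1.0) + up - down


def max_path_gap(p, n_cycles, horizon=1.0, grid=2000):
    """Largest distance between the two trajectories on a uniform time grid."""
    times = (horizon * k / grid for k in range(grid + 1))
    return max(abs(chattering_state(t, p, n_cycles, horizon)
                   - relaxed_state(t, p)) for t in times)
```

In this toy case the gap between the two paths is at most 2p(1 - p) times the cycle length, so faster switching brings the strict trajectory arbitrarily close to the relaxed one. Whether an analogous statement survives joint-law dependence and common jumps is exactly what the premise above requires.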

What would settle it

A concrete counterexample in which an optimal relaxed control under the joint law and Poisson jumps has no corresponding strict control would show that the derived SMP does not apply to the original problem.

Figures

Figures reproduced from arXiv: 2407.05356 by Jingfei Wang, Lijun Bo, Xiang Yu, Xiaoli Wei.

**Figure 1.** Our methodology for SMP

**Figure 2.** Our methodology for the HJB equation

Original abstract

This paper studies mean-field control problems with state-control joint law dependence and Poissonian common noise. We develop the stochastic maximum principle (SMP) and establish its connection to the Hamiltonian-Jacobi-Bellman (HJB) equation on the Wasserstein space. The presence of the conditional joint law and its discontinuity under Poissonian common noise bring new technical challenges. To develop the SMP when the control domain is not necessarily convex, we first consider a strong relaxed control formulation that allows us to perform the first-order variation. We propose an extension transformation technique to overcome the compatibility issues arising from the joint law in the relaxed control formulation. By further establishing the equivalence between the relaxed control and the strict control formulations, we obtain the SMP for the original problem with strict controls. To investigate the HJB equation, we formulate an equivalent controlled Fokker-Planck problem subject to controlled measure-valued dynamics with Poisson jumps, which allows us to derive the HJB equation of the original problem under open-loop strict controls. We also establish the connection between the SMP and the HJB equation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a workable route to SMP and HJB for joint-law mean-field control with Poisson common noise by using an extension map on relaxed controls and a controlled Fokker-Planck equation.

The central contribution is the extension transformation, which lets the authors run a first-order variation on a convex relaxed-control set even when the running cost depends on the joint law of state and control. They then claim equivalence back to strict controls and derive the HJB equation from an equivalent controlled Fokker-Planck equation that carries the Poisson jumps. Both steps are presented as adaptations that do not reduce to earlier work, so the combination looks new for this setting. The logical flow (relaxed formulation, extension, equivalence, then measure-valued dynamics) is laid out cleanly in the abstract and appears to follow standard variation techniques, with the necessary adjustments for the joint law and the jumps.

The main soft spot is the equivalence step itself. The Poisson jumps create discontinuities in the conditional joint law, and it is not obvious from the high-level description whether the extension map preserves the necessary measurability and integrability without extra conditions on the intensity or the control set. If that equivalence is only shown under implicit regularity that the original strict problem may not satisfy, the SMP would not transfer. The HJB derivation via Fokker-Planck looks more straightforward once the controlled dynamics are accepted.

This is a technical paper aimed at specialists already working on mean-field stochastic control with common noise or Wasserstein HJB equations. A reader who needs the precise statement of the SMP or the form of the HJB equation under joint-law dependence will get value if the proofs hold; otherwise the claims stay at the level of a plausible plan. I would send it to referees rather than desk-reject, because the technical obstacles are clearly identified and the proposed fixes are concrete enough to be checked.

Referee Report

2 major / 2 minor

Summary. The paper studies mean-field control problems depending on the joint law of state and control, subject to Poissonian common noise. It develops the stochastic maximum principle (SMP) first in a strong relaxed-control formulation (to permit first-order variation when the control set is non-convex), introduces an extension transformation to restore compatibility with the joint-law dependence, claims equivalence between the relaxed and original strict-control problems, and thereby obtains the SMP for strict controls. It then reformulates the problem as a controlled Fokker-Planck equation on the space of measures with Poisson jumps and derives the associated HJB equation on the Wasserstein space, establishing the link between the two approaches.

Significance. If the equivalence between relaxed and strict formulations survives the discontinuities induced by Poisson jumps without extra regularity, the work would provide a technically non-trivial extension of mean-field control theory to joint-law dependence and jump noise, supplying both necessary optimality conditions (SMP) and a dynamic-programming characterization (HJB). The combination of relaxed-control techniques with measure-valued dynamics under jumps is a natural but non-obvious step that could be useful for applications involving discontinuous mean-field interactions.

major comments (2)

[Abstract (paragraph on relaxed control and equivalence)] The equivalence between the strong relaxed control formulation and the original strict control formulation after the extension transformation is the load-bearing step that transfers the SMP to the original problem. Because Poissonian common noise produces discontinuities in the conditional joint law at jump times, the manuscript must verify that this equivalence continues to hold without additional regularity on the control set or the intensity measure; otherwise the SMP does not apply to strict controls. The abstract asserts the equivalence is established, but the provided text supplies no explicit conditions, proof outline, or verification that the transformation commutes with the jump discontinuities.
[Abstract (paragraph on HJB and Fokker-Planck)] The derivation of the HJB equation proceeds by formulating an equivalent controlled Fokker-Planck problem with Poisson jumps. The manuscript should state the precise regularity assumptions on the coefficients and the intensity measure that guarantee the measure-valued dynamics remain well-posed after the extension transformation, and should confirm that the open-loop strict controls used in the HJB derivation are consistent with the controls for which the SMP was obtained.

minor comments (2)

Notation for the conditional joint law and its evolution under Poisson jumps should be introduced with a short table or diagram to clarify the distinction between the pre- and post-jump measures.
[Abstract] The abstract refers to 'the connection between the SMP and the HJB equation' without indicating whether this is a verification theorem, a representation of the value function, or merely formal consistency; a one-sentence clarification would help readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and insightful comments on our work concerning mean-field control with joint-law dependence and Poissonian common noise. We address the two major comments below and will incorporate clarifications and additional details in a revised manuscript.

Referee: [Abstract (paragraph on relaxed control and equivalence)] The equivalence between the strong relaxed control formulation and the original strict control formulation after the extension transformation is the load-bearing step that transfers the SMP to the original problem. Because Poissonian common noise produces discontinuities in the conditional joint law at jump times, the manuscript must verify that this equivalence continues to hold without additional regularity on the control set or the intensity measure; otherwise the SMP does not apply to strict controls. The abstract asserts the equivalence is established, but the provided text supplies no explicit conditions, proof outline, or verification that the transformation commutes with the jump discontinuities.

Authors: We agree that the equivalence is central and that the abstract should better signal the conditions under which it holds. The proof appears in Section 4 (Theorem 4.8 and the surrounding arguments), where the extension transformation is shown to preserve the conditional joint law across jumps under Assumptions (A1)–(A3) on the coefficients and intensity measure; these assumptions ensure the map remains measurable and the first-order variation is well-defined without extra regularity on the control set. We will revise the abstract to include a one-sentence summary of these conditions and add a short remark after the statement of the equivalence theorem clarifying that the transformation commutes with the Poisson jumps by construction of the relaxed control space.
revision: yes

Referee: [Abstract (paragraph on HJB and Fokker-Planck)] The derivation of the HJB equation proceeds by formulating an equivalent controlled Fokker-Planck problem with Poisson jumps. The manuscript should state the precise regularity assumptions on the coefficients and the intensity measure that guarantee the measure-valued dynamics remain well-posed after the extension transformation, and should confirm that the open-loop strict controls used in the HJB derivation are consistent with the controls for which the SMP was obtained.

Authors: The well-posedness of the controlled Fokker-Planck equation with jumps is established in Section 5 under the same Assumptions (A1)–(A3) plus a Lipschitz condition on the intensity measure that guarantees unique strong solutions in the space of probability measures. The open-loop strict controls employed for the HJB derivation are exactly those for which the SMP was derived (see the consistency argument in Proposition 5.3). We will add an explicit statement of these regularity assumptions in the abstract and in the introduction to the HJB section, together with a sentence confirming that the control classes coincide.
revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained

The paper adapts standard variation techniques to derive the SMP first in the relaxed control setting (convex domain permits first-order variation), introduces an extension transformation to handle joint-law compatibility under Poisson jumps, then claims equivalence to strict controls. The HJB is obtained by reformulating as a controlled Fokker-Planck problem with measure-valued dynamics. No quoted step reduces by construction to a fitted parameter, self-citation loop, or renamed input; the central claims rest on explicit technical adaptations rather than tautological redefinitions. This is the normal non-circular outcome for a technical extension paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper relies on background results from mean-field control and stochastic analysis whose precise statements are not given in the abstract; no free parameters or invented entities are introduced.

axioms (1)

domain assumption: Standard technical assumptions on the drift, diffusion, and jump coefficients and on the control sets, which guarantee existence of solutions and allow first-order variations in the relaxed formulation.
Invoked implicitly to perform the first-order variation and establish equivalence between relaxed and strict controls.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Constrained mean-field control with singular controls: Existence, stochastic maximum principle and constrained FBSDE
math.OC · 2025-01 · unverdicted · novelty 6.0

Establishes existence of optimal controls for constrained mean-field problems with singular controls and derives associated SMP and constrained FBSDEs using relaxed formulation and Lagrange multipliers.

Extended mean-field control under constraints: The generalized Fritz-John conditions and Lagrangian method
math.OC · 2024-08 · unverdicted · novelty 5.0

The paper derives the stochastic maximum principle for mean-field control under dynamic constraints by embedding the problem in Banach-space optimization and applying generalized Fritz-John conditions to obtain a BSDE...

[1] [1]

Acciaio, J

B. Acciaio, J. Backhoﬀ-Veraguas, and R. Carmona (2017): Ext ended mean ﬁeld control problems: stochastic maximum principle and transport persepctive. SIAM J. Contr. Optim. 57(6), 3666-3693

work page 2017

[2] [2]

Andersson, and B

D. Andersson, and B. Djehiche (2010): A maximum principle for SD Es of mean-ﬁeld type. Appl. Math. Optim. 63, 341-356

work page 2010

[3] [3]

Bahlali (2008): Necessary and suﬃcient optimality conditions f or relaxed and strict control prob- lems

S. Bahlali (2008): Necessary and suﬃcient optimality conditions f or relaxed and strict control prob- lems. SIAM J. Contr. Optim. 47(4), 2078-2095

work page 2008

[4] [4]

Bayraktar, A

E. Bayraktar, A. Cosso, and H. Pham (2018): Randomized dyna mic programming principle and Feynman-Kac representation for optimal control of McKean-Vla sov dynamics. Trans. AMS 370(3), 2115-60

work page 2018

[5] [5]

Bayraktar, I

E. Bayraktar, I. Ekren, and X. Zhang (2023): Comparison of v iscosity solutions for a class of second- order PDEs on the Wasserstein space. Preprint, available at arXiv:2 309.05040

work page 2023

[6] [6]

Bensoussan (1981): Lecture on Stochastic Control, in Nonlinear Filtering and S tochastic Control

A. Bensoussan (1981): Lecture on Stochastic Control, in Nonlinear Filtering and S tochastic Control. Lecture Notes in Math. 972, Proc. Cortona, Springer-Verlag, Be rlin, New York

work page 1981

[7] [7]

Bensoussan, J

A. Bensoussan, J. Frehse, and S. Yam (2015): The master equ ation in mean ﬁeld theory. J. Math. Pures Appl. 103(6), 1441-1474

work page 2015

[8] [8]

L. Bo, T. Li and X. Yu (2022): Centralized systemic risk control in the interbank system: Weak formulation and Gamma-convergence. Stoch. Process. Appl. 150, 622-654

work page 2022

[9] [9]

Buckdahn, B

R. Buckdahn, B. Djehiche, and J. Li (2011): A general maximum principle for SDEs of mean-ﬁeld type. Appl. Math. Optim. 64(2), 197-216

work page 2011

[10] [10]

Buckdahn, J

R. Buckdahn, J. Li, S. Peng, and C. Rainer (2017): Mean-ﬁeld s tochastic diﬀerential equations and associated PDEs. Ann. Probab. 45(2), 824-878

work page 2017

[11] [11]

Buckdahn, Y

R. Buckdahn, Y. Chen, and J. Li (2021): Partial derivative wit h respect to the measure and its application to general controlled mean-ﬁeld systems. Stoch. Process. Appl. 134, 265-307

work page 2021

[12] [12]

Buckdahn, J

R. Buckdahn, J. Li, J.S. Li, and C.Z. Xing (2023): Path-dependin g controlled mean-ﬁeld cou- pled forward-backward SDEs. The associated stochastic maximum principle. Preprint, available at arXiv:2307.14148

work page arXiv 2023

[13] [13]

Burzoni, V

M. Burzoni, V. Ignazio, M. Reppen, and H.M. Soner (2020): Visc osity solutions for controlled McKean-Vlasov jump diﬀusions. SIAM J. Contr. Optim. 58(3), 1676–1699. 43

work page 2020

[14] [14]

Carmona, and F

R. Carmona, and F. Delarue (2015): Forward-backward stoc hastic diﬀerential equations and con- trolled McKean-Vlasov dynamics. Ann. Probab. 43(5), 2647-2700

work page 2015

[15] [15]

Carmona, and F

R. Carmona, and F. Delarue (2017): Probabilistic Theory of Mean Field Games with Applications . Volume I: Mean Field FBSDEs, Control and Games, Springer-Verlag, New York

work page 2017

[16] [16]

Chassagneux, D

J.F. Chassagneux, D. Crisan, and F. Delarue (2022): A probab ilistic approach to classical solutions of the master equation for large population equilibria. Memoirs AMS 280(1379), 1-121

work page 2022

[17] A. Cosso, and H. Pham (2019): Zero-sum stochastic differential games of generalized McKean–Vlasov type. J. Math. Pures Appl. 129, 180-212

[18] A. Cosso, F. Gozzi, I. Kharroubi, H. Pham, and M. Rosestolato (2023): Optimal control of path-dependent McKean–Vlasov SDEs in infinite-dimension. Ann. Appl. Probab. 33(4), 2863-2918

[19] A. Cosso, F. Gozzi, I. Kharroubi, H. Pham, and M. Rosestolato (2024): Master Bellman equation in the Wasserstein space: Uniqueness of viscosity solutions. Trans. AMS 377(1), 31-83

[20] B. Djehiche, H. Tembine, and R. Tempone (2015): A stochastic maximum principle for risk-sensitive mean-field type control. IEEE Trans. Auto. Contr. 60(10), 2640-2649

[21] M.F. Djete (2022): Extended mean field control problem: a propagation of chaos result. Electronic J. Probab. 27, 1-53

[22] M.F. Djete, D. Possamaï, and X. Tan (2022): McKean–Vlasov optimal control: the dynamic programming principle. Ann. Probab. 50(2), 791-833

[23] X. Guo, H. Pham, and X. Wei (2023): Itô's formula for flows of measures on semimartingales. Stoch. Process. Appl. 159, 350-390

[24] X. Guo, and J. Zhang (2024): Itô's formula for flows of conditional measures on semimartingales. Preprint, available at arXiv:2404.11167

[25] M. Hafayed, A. Abba, and S. Abbas (2014): On mean-field stochastic maximum principle for near-optimal controls for Poisson jump diffusion with applications. Inter. J. Dyn. Contr. 2, 262-284

[26] T. Hao (2020): Anticipated mean-field backward stochastic differential equations with jumps. Lithuanian Math. J. 60(3), 359–375

[27] D. Hernández-Hernández, and J.H. Ricalde-Guerrero (2023): Conditional McKean-Vlasov differential equations with common Poissonian noise: Propagation of chaos. Preprint, available at arXiv:2308.11564

[28] D. Hernández-Hernández, and J.H. Ricalde-Guerrero (2024): Mean-field games with common Poissonian noise: A maximum principle approach. Preprint, available at arXiv:2401.10952

[29] M. Huang, R.P. Malhamé, and P.E. Caines (2006): Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun. Inform. Syst. 6(3), 221-252

[30] Y. Jia, and X.Y. Zhou (2023): q-Learning in continuous time. J. Machine Learning Res. 24(161), 1-61

[31] Y. Jia, D. Ouyang, and Y. Zhang (2025): Accuracy of Discretely Sampled Stochastic Policies in Continuous-time Reinforcement Learning. Preprint, available at arXiv:2503.09981

[32] O. Kallenberg (2002): Foundations of Modern Probability. Second edition, Probability and its Applications (New York), Springer-Verlag, New York

[33] M. Laurière, and O. Pironneau (2014): Dynamic programming for mean–field type control. Comptes Rendus Math. 352(9), 707–713

[34] J.M. Lasry, and P.L. Lions (2007): Mean field games. Japanese J. Math. 2(1), 229-260

[35] J. Li (2012): Stochastic maximum principle in the mean-field controls. Automatica 48(2), 366-373

[36] T. Meyer-Brandis, B. Øksendal, and X.Y. Zhou (2012): A mean-field stochastic maximum principle via Malliavin calculus. Stochastics 84(5-6), 643-666

[37] E.J. McShane (1934): Extension of range of functions. Bull. AMS 40(12), 837-842

[38] M.A. Mezerdi (2020): Equations différentielles stochastiques de type McKean-Vlasov et leur contrôle optimal. Analyse numérique. Université de Toulon; Université Mohamed Khider (Biskra, Algérie). Français. NNT: 2020TOUL0014. tel-03278583

[39] J. Ma, and J. Yong (1995): Solvability of forward-backward SDEs and the nodal set of Hamilton-Jacobi-Bellman equations. Chinese Ann. Math. Ser. B 16(3), 279–298. A Chinese summary appears in Chinese Ann. Math. Ser. A 16(4), 532

[40] T. Nie, and K. Yan (2022): Extended mean-field control problem with partial observation. ESAIM: Contr. Optim. Cal. Variat. 28, 17

[41] R.J. McCann (1997): A convexity principle for interacting gases. Adv. Math. 128(1), 153-179

[42] S. Peng (1990): A general stochastic maximum principle for optimal control problems. SIAM J. Contr. Optim. 28(4), 966-979

[43] M. Motte, and H. Pham (2022): Mean-field Markov decision process with common noise and open-loop controls. Ann. Appl. Probab. 32(2), 1421-1458

[44] H. Pham, and X. Wei (2017): Dynamic programming for optimal control of stochastic McKean–Vlasov dynamics. SIAM J. Contr. Optim. 55(2), 1069–1101

[45] C. Villani (2009): Optimal Transport: Old and New. Springer-Verlag, Berlin

[46] Y. Shen, Q. Meng, and P. Shi (2014): Maximum principle for mean-field jump–diffusion stochastic delay differential equations and its application to finance. Automatica 50(6), 1565-1579

[47] H.M. Soner, and Q. Yan (2022): Viscosity solutions for McKean-Vlasov control on a torus. Preprint, available at arXiv:2212.11053

[48] H. Wang, and X.Y. Zhou (2020): Continuous-time mean–variance portfolio selection: A reinforcement learning framework. Math. Finance 30(4), 1273–1308

[49] H. Wang, T. Zariphopoulou, and X.Y. Zhou (2020): Reinforcement learning in continuous time and space: A stochastic control approach. J. Machine Learning Res. 21(1), 8145-8178

[50] X. Wei and X. Yu (2025): Continuous-time q-learning for mean-field control problems. Appl. Math. Optim. 91, 10

[51] C. Wu, and J. Zhang (2020): Viscosity solutions to parabolic master equations and McKean-Vlasov SDEs with closed-loop controls. Ann. Appl. Probab. 30(2), 936–986

[52] J. Yong, and X.Y. Zhou (1999): Stochastic Controls: Hamiltonian Systems and HJB Equations. Appl. Math. Vol. 43, Springer-Verlag, New York

[53] X. Zhang, Z. Sun, and J. Xiong (2018): A general stochastic maximum principle for a Markov regime switching jump-diffusion model of mean-field type. SIAM J. Contr. Optim. 56(4), 2563-2592

[54] J. Zhou, N. Touzi, and J. Zhang (2024): Viscosity solutions for HJB equations on the process space: Application to mean field control with common noise. Preprint, available at arXiv:2401.04920