An Effective Particle Gradient Projection Method for Solving Stochastic and Mean Field Control Problem

Hui Sun

arxiv: 2604.06675 · v1 · submitted 2026-04-08 · 🧮 math.OC

Pith reviewed 2026-05-10 18:14 UTC · model grok-4.3

keywords: stochastic optimal control · mean field control · projection method · randomized neural networks · high-dimensional HJB equations · mesh-free methods · stochastic maximum principle

The pith

A projection algorithm with randomized neural networks solves high-dimensional stochastic optimal control and mean field control problems without backpropagation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a mesh-free numerical method for stochastic optimal control problems and mean field control problems. It relies on a projection algorithm inspired by the stochastic maximum principle and represents controls using randomized neural networks. Updates occur through regression on sampled trajectories rather than by minimizing a loss function via backpropagation. This design targets problems in dimensions of 100 and higher while also addressing the associated high-dimensional Hamilton-Jacobi-Bellman equations. Tests indicate the method typically achieves better performance than direct deep learning approaches on the same tasks.
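To make the regression idea concrete, here is a minimal sketch of a randomized neural network fit. It assumes a single hidden layer with tanh activation whose weights are drawn once and then frozen, so only the output layer is solved by least squares and no backpropagation is involved. The function name `fit_random_feature_net`, the width, the sampling distributions, and the sine target are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def fit_random_feature_net(x, y, width=200, seed=0):
    """Fit only the output layer of a one-hidden-layer network by least squares.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    # Hidden weights and biases are sampled once and never trained.
    W = rng.normal(size=(d, width))
    b = rng.uniform(-np.pi, np.pi, size=width)
    H = np.tanh(x @ W + b)  # hidden features, shape (n_samples, width)
    # Linear least squares for the output weights (SVD-based, small singular values cut off).
    beta, *_ = np.linalg.lstsq(H, y, rcond=1e-10)
    return lambda z: np.tanh(z @ W + b) @ beta

# Toy usage: regress a sine function from samples.
rng = np.random.default_rng(1)
x_train = rng.uniform(-np.pi, np.pi, size=(2000, 1))
y_train = np.sin(x_train[:, 0])
net = fit_random_feature_net(x_train, y_train)

x_test = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
max_err = np.max(np.abs(net(x_test) - np.sin(x_test[:, 0])))
print(f"max abs error on test grid: {max_err:.2e}")
```

Because the hidden layer is fixed, the fit is a linear problem in `beta`, which is why a single `lstsq` call replaces an iterative training loop in this sketch.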

Core claim

The authors introduce a particle gradient projection method powered by randomized neural networks for solving stochastic optimal control problems. The algorithm iteratively refines the control via regression steps drawn from the stochastic maximum principle, avoiding direct error backpropagation to train the networks. This enables effective handling of problems in dimensions 100 and above, as well as mean field control problems. Through the link between control problems and HJB equations, it also gives a procedure for high-dimensional HJB equations and, for a given initial distribution, a pointwise solution of the infinite-dimensional HJ equation associated with mean field control.

What carries the argument

The particle gradient projection algorithm, which updates the control policy through regression on trajectories using randomized neural network approximations derived from the stochastic maximum principle.
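The update described above can be sketched on a toy problem. The example below is an assumption-laden illustration, not the paper's algorithm: it takes the scalar dynamics dX = u dt + sigma dW with cost E[1/2 ∫ u² dt + 1/2 X_T²], for which the adjoint has zero drift and equals X_T along each path, and it substitutes a linear basis [1, x] for the randomized network to keep the code short. The names `simulate` and `gradient_projection`, the step size `rho`, and the bound `u_max` are hypothetical.

```python
import numpy as np

def simulate(coef, x0, dw, dt, u_max):
    """Euler paths under the feedback control u_n(x) = clip(c0 + c1 * x)."""
    n_steps = dw.shape[1]
    xs = np.empty((x0.size, n_steps + 1))
    us = np.empty((x0.size, n_steps))
    xs[:, 0] = x0
    for n in range(n_steps):
        us[:, n] = np.clip(coef[n, 0] + coef[n, 1] * xs[:, n], -u_max, u_max)
        xs[:, n + 1] = xs[:, n] + us[:, n] * dt + dw[:, n]
    return xs, us

def gradient_projection(n_iter=40, n_particles=4000, n_steps=20, T=1.0,
                        sigma=0.5, rho=0.5, u_max=5.0, seed=0):
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    coef = np.zeros((n_steps, 2))  # start from the zero control
    for _ in range(n_iter):
        # Fresh particles each iteration; the noise increments already include sigma.
        x0 = rng.normal(1.0, 0.5, size=n_particles)
        dw = sigma * np.sqrt(dt) * rng.normal(size=(n_particles, n_steps))
        xs, us = simulate(coef, x0, dw, dt, u_max)
        p = xs[:, -1]  # pathwise adjoint for this toy cost: p = g'(X_T) = X_T
        for n in range(n_steps):
            # Gradient step with H_u = u + p, then projection onto [-u_max, u_max].
            target = np.clip(us[:, n] - rho * (us[:, n] + p), -u_max, u_max)
            # Regressing on [1, x] approximates the conditional expectation given X_n.
            basis = np.column_stack([np.ones(n_particles), xs[:, n]])
            coef[n], *_ = np.linalg.lstsq(basis, target, rcond=None)
    return coef, dt

coef, dt = gradient_projection()
print("learned slope at t=0:", coef[0, 1], "| exact discrete-time value: -0.5")
```

For this linear-quadratic toy problem the exact discrete-time feedback slope at step n is -1/(1 + T - n·dt), which gives a simple check on the learned coefficients. The bound `u_max` is chosen wide enough that the projection is rarely active here; it is kept only to show where the projection step sits.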

Load-bearing premise

That the projection algorithm powered by randomized neural networks converges reliably and outperforms backpropagation-based methods, even though the paper provides no convergence proof or detailed error analysis.

What would settle it

A test on a stochastic control problem in dimension 100 or higher where the method produces higher final costs or fails to stabilize compared to a standard deep neural network solver trained by backpropagation.

Figures

Figures reproduced from arXiv: 2604.06675 by Hui Sun.

**Figure 2.** Comparison between the numerical control values and the exact solution.

**Figure 3.** Benchmarking numerical solutions against the exact solutions over …

**Figure 4.** Benchmarking numerical solutions against the exact solutions over …

**Figure 5.** Comparison of the L2 loss between the benchmark method and the proposed method. Left figure: L2 error compared over computational time. Apparently, our proposed approach takes more time per epoch. However, even within the same time, our method reaches a lower L2 error. Mid and Right: comparison of the L2 loss over the number of training epochs. Our proposed method achieves much smaller L2 errors in much fe…

**Figure 6.** Comparing the predicted solution (control) against the exact solution. Left: control function at …

**Figure 7.** Comparison between the control learned (orange) and the exact control function for the mean variance …

**Figure 8.** Comparison between the control learned (orange) and the exact control function for the price impact …

**Figure 9.** A Sine function approximated by the proposed algorithm. Left: comparison between the exact function value and the numerical results. Right: the figure of loss decay.

Conclusion. In this paper, we design descent-based numerical schemes for solving stochastic optimal control and mean field control problems. On six test examples, the algorithm performs well across a range of problem setups and demonstrates ove…

Original abstract

This work puts forward a novel numerical approach for solving the stochastic optimal control problem (SOCP) and the mean field control (MFC) problem using projection algorithm inspired by the stochastic maximum principle (SMP) which is also powered by the randomized neural network. This approach is mesh-free, derivative free and it relies on gradually updating the underlying control via regression. It distinguishes itself from other traditional deep learning methods as it does not require minimizing the loss/cost function via direct error backward propagation to train the neural networks. The methodology designed can effectively solve stochastic optimal control problem in high dimensions ($100$ and above) and it can also be used to solve the mean field control problems. Due to the connection between the HJB equations and SOCP, the designed approach also provides a procedure for solving high dimensional HJB equations. Importantly, the infinite dimensional HJ equation related to the mean field control problem can also be solved in a point-wise sense (given the initial distribution) due to its connection with the Mean Field Control (MFC) problem. Our extensive test results show that the proposed approach typically performs better than the direct deep learning based approaches for solving control problems. We will leave the convergence proof and the extension to Mean Field Games (MFG) as future works.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

The paper offers a projection algorithm with randomized networks for high-dimensional SOCP and MFC that skips backprop, but defers convergence analysis so the performance claims rest on unshown tests.

The core contribution is a mesh-free projection method that iteratively updates controls by regressing onto the stochastic maximum principle condition using randomized neural networks, applied to both stochastic optimal control and mean-field control. It avoids training via direct loss minimization and backpropagation, which is the main distinction from standard deep learning solvers for these problems. The approach also links to high-dimensional HJB equations and provides a pointwise route to the infinite-dimensional HJ equation for MFC given the initial distribution. That framing is useful and the SMP connection is handled cleanly. The claim of handling dimensions 100 and above is the practical hook if it holds.

The soft spots are the missing pieces on convergence of the projection iterations and any approximation error bounds, both explicitly left for future work. Without those, the assertion that the method typically outperforms direct deep learning approaches depends entirely on the test cases, yet no details appear on particle counts, network widths, residual controls, or sensitivity to random seeds. In high dimensions regression errors could accumulate across steps without a priori checks on how the particle discretization interacts with the randomized network fits.

The thinking is straightforward and the method is positioned honestly as an alternative rather than a replacement, but the numerical evidence needs more substance to support the strong claims. This is for researchers developing numerical tools for stochastic and mean-field control who want options beyond full deep optimization pipelines. It deserves peer review so the full experiments and any added analysis can be evaluated properly.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes a particle gradient projection method powered by randomized neural networks to solve stochastic optimal control problems (SOCP) and mean field control (MFC) problems. The approach iteratively updates the control via regression to enforce the stochastic maximum principle (SMP) condition without direct backpropagation of a loss function, claiming to be mesh-free and derivative-free. It asserts effectiveness for high-dimensional problems (dimensions 100 and above), superior performance over direct deep-learning methods based on extensive tests, and applicability to solving high-dimensional HJB equations and infinite-dimensional HJ equations for MFC in a pointwise sense given the initial distribution. Convergence analysis and extension to mean field games are deferred to future work.
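For readers who want the SMP condition in symbols, the following is a generic textbook-style formulation of a controlled diffusion, its adjoint equation, and a gradient projection update. The notation (b, σ, f, g, the step size ρ, and the projection P_U onto the admissible set U) is assumed here for illustration and may differ from the paper's own; in particular, conditioning the gradient on the current state is one common choice for Markov feedback controls rather than a detail confirmed by the summary above.

```latex
\begin{aligned}
dX_t &= b(X_t,u_t)\,dt + \sigma(X_t,u_t)\,dW_t, \qquad X_0 = x_0,\\
J(u) &= \mathbb{E}\Big[\int_0^T f(X_t,u_t)\,dt + g(X_T)\Big],\\
H(x,u,p,q) &= b(x,u)^\top p + \operatorname{tr}\big(\sigma(x,u)^\top q\big) + f(x,u),\\
dp_t &= -\partial_x H(X_t,u_t,p_t,q_t)\,dt + q_t\,dW_t, \qquad p_T = \partial_x g(X_T),\\
u^{k+1}_t &= P_U\Big(u^k_t - \rho\,\mathbb{E}\big[\partial_u H(X^k_t,u^k_t,p^k_t,q^k_t)\,\big|\,X^k_t\big]\Big).
\end{aligned}
```

In this form the conditional expectation in the last line is the quantity a regression step would approximate from sampled trajectories, which is where a function approximator such as a randomized network would enter.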

Significance. If the numerical claims hold and the deferred convergence and error analysis can be supplied, the method could provide a practical alternative for high-dimensional control problems by avoiding full backpropagation and leveraging randomized networks for regression-based projection steps. The explicit links to the SMP and HJB equations offer a theoretically motivated framework that might scale better than standard PINN-style approaches in dimensions where particle methods are feasible. However, without quantitative error bounds or sensitivity studies, the significance remains provisional and tied to the specific test cases reported.

major comments (3)

[§3] Algorithm description: The iterative projection steps that regress the control update via randomized neural networks to satisfy the SMP lack any convergence guarantee or a priori error bound on the residual; the manuscript explicitly defers both the convergence proof and approximation-error analysis to future work. This is load-bearing for the central claim because high-dimensional performance (dimensions 100+) and the assertion of outperforming direct deep-learning methods rest entirely on the reliability of these iterations without control on regression error accumulation or interaction with the particle discretization.
[§4] Numerical experiments: The claim that the approach “typically performs better than the direct deep learning based approaches” is supported only by unquantified test results; no tables or figures report concrete metrics such as relative errors, wall-clock times, sensitivity to network width/particle count/random seeds, or direct head-to-head comparisons with error bars. Without these, the high-dimensional effectiveness assertion cannot be evaluated independently of the deferred analysis.
[§2.2] Connection to HJB/MFC: The statement that the method solves the infinite-dimensional HJ equation for MFC “in a point-wise sense (given the initial distribution)” is asserted via the SMP link but no explicit derivation or equation is supplied showing how the particle-based projection yields a pointwise solution operator; this step is load-bearing for the MFC claim yet remains informal.

minor comments (2)

[§3] Notation for the randomized neural network approximation and the projection operator is introduced without a clear table of symbols or consistent use across sections, making it difficult to track the precise form of the regression step.
The abstract and introduction cite “extensive test results” but the manuscript provides no supplementary material or repository link for the code, random seeds, or full experimental setup, which is standard for reproducibility in numerical optimization papers.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the thorough and constructive report. The comments highlight important aspects of the theoretical foundations, numerical validation, and clarity of the MFC connection. We address each major comment below and indicate the planned revisions.

Referee: §3 (Algorithm description): The iterative projection steps that regress the control update via randomized neural networks to satisfy the SMP lack any convergence guarantee or a priori error bound on the residual; the manuscript explicitly defers both the convergence proof and approximation-error analysis to future work. This is load-bearing for the central claim because high-dimensional performance (dimensions 100+) and the assertion of outperforming direct deep-learning methods rest entirely on the reliability of these iterations without control on regression error accumulation or interaction with the particle discretization.

Authors: We agree that a convergence guarantee and a priori error bounds would strengthen the theoretical foundation of the iterative projection steps. The manuscript is motivated by the stochastic maximum principle, with the regression-based projection designed to enforce the optimality condition at each iteration. As explicitly stated, the full convergence analysis and approximation-error study are deferred to future work. In the revised version we will expand the discussion in §3 to include a qualitative analysis of potential error sources (regression residual, particle discretization, and their interaction) and why the observed empirical stability in high dimensions is consistent with the SMP structure, while clearly reiterating the current limitations.

revision: partial

Referee: §4 (Numerical experiments): The claim that the approach “typically performs better than the direct deep learning based approaches” is supported only by unquantified test results; no tables or figures report concrete metrics such as relative errors, wall-clock times, sensitivity to network width/particle count/random seeds, or direct head-to-head comparisons with error bars. Without these, the high-dimensional effectiveness assertion cannot be evaluated independently of the deferred analysis.

Authors: We accept that the numerical section would benefit from quantitative metrics to allow independent evaluation. Although the original manuscript reports extensive tests across dimensions up to 100+, the presentation was primarily qualitative. In the revision we will add tables and figures that report relative errors, wall-clock times, sensitivity studies with respect to particle number, network width, and random seeds, as well as direct comparisons against baseline deep-learning methods, each accompanied by error bars from repeated runs.

revision: yes

Referee: §2.2 (Connection to HJB/MFC): The statement that the method solves the infinite-dimensional HJ equation for MFC “in a point-wise sense (given the initial distribution)” is asserted via the SMP link but no explicit derivation or equation is supplied showing how the particle-based projection yields a pointwise solution operator; this step is load-bearing for the MFC claim yet remains informal.

Authors: We thank the referee for this observation. The claim follows from the fact that, for a fixed initial distribution, the mean-field control problem reduces to a standard stochastic control problem for a representative particle whose law is approximated by the empirical measure; the projection step then yields a control that satisfies the SMP pointwise for that measure. In the revised manuscript we will insert an explicit derivation in §2.2 that links the particle regression operator to the pointwise solution of the infinite-dimensional Hamilton–Jacobi equation under the given initial measure.

revision: yes
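The phrase "whose law is approximated by the empirical measure" in the last response can be illustrated with a small particle simulation. This is a sketch under stated assumptions: the mean-reverting dynamics dX = -kappa (X - E[X]) dt + sigma dW, the uncontrolled setting, and the name `simulate_mean_field` are chosen only for illustration and do not come from the paper.

```python
import numpy as np

def simulate_mean_field(n_particles=5000, n_steps=200, T=2.0,
                        kappa=2.0, sigma=0.5, seed=0):
    """Euler scheme for a toy McKean-Vlasov SDE using interacting particles."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Samples from the given initial distribution, here N(1, 1).
    x = rng.normal(1.0, 1.0, size=n_particles)
    for _ in range(n_steps):
        m = x.mean()  # the empirical mean stands in for E[X_t]
        noise = sigma * np.sqrt(dt) * rng.normal(size=n_particles)
        x = x - kappa * (x - m) * dt + noise
    return x

x_T = simulate_mean_field()
print("terminal mean:", x_T.mean(), "terminal variance:", x_T.var())
```

For these toy dynamics the mean is conserved in expectation and the variance relaxes toward sigma² / (2·kappa), which gives a quick sanity check on the particle approximation. A controlled version would add a feedback term to the drift and reuse the same empirical statistics at each step.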

standing simulated objections not resolved

Full rigorous convergence proof and a priori error bounds for the iterative randomized-neural-network projection scheme, which the authors have deferred to a separate future work.

Circularity Check

0 steps flagged

No circularity: algorithm and empirical claims rest on independent numerical tests, not self-referential fits or derivations

The paper proposes a mesh-free projection algorithm for SOCP/MFC that updates controls via randomized NN regression to satisfy the stochastic maximum principle, without backpropagation on a loss. Performance claims are supported solely by reported test comparisons against direct deep-learning baselines in high dimensions. No load-bearing step equates a 'prediction' to a fitted parameter by construction, invokes self-citations for uniqueness, or renames known results. Convergence and error analysis are explicitly left for future work, so the derivation chain does not reduce to its inputs. This is a standard honest numerical-methods paper whose central content is algorithmic and externally falsifiable via the tests.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no explicit free parameters, axioms, or invented entities; the method is described at a high level without mathematical details.


[1] [1]

Andersson and B

D. Andersson and B. Djehiche. A maximum principle for sdes of mean-field type.Appl Math Optim, 63:341–356, 2011

work page 2011

[2] [2]

Archibald, F

R. Archibald, F. Bao, Y. Cao, and H. Sun. Numerical analysis for convergence of a sample-wise backprop- agation method for training stochastic neural networks.SIAM J. Numer. Anal., 62(2):593–621, 2024

work page 2024

[3] [3]

Bao and H

F. Bao and H. Sun. Batch sample-wise stochastic optimal control via stochastic maximum principle.arXiv preprint, 2025. arXiv:2505.02688

work page arXiv 2025

[4] [4]

Archibald, F

R. Archibald, F. Bao, Y. Cao, and H. Zhang. A backward sde method for uncertainty quantification in deep learning.Discrete Contin. Dyn. Syst. Ser. S, 15(7):2807–2835, 2022

work page 2022

[5] [5]

W. Cai, S. Fang, and T. Zhou. Soc-martnet: A martingale neural network for the hamilton–jacobi–bellman equation without explicit inf u∈U hin stochastic optimal controls.SIAM J. Sci. Comput., 47(4):795–819, 2025

work page 2025

[6] [6]

Bensoussan

A. Bensoussan. Lecture on stochastic control. InNonlinear Filtering and Stochastic Control, volume 972 ofLecture Notes in Mathematics, pages 1–62. Springer-Verlag, Berlin, New York, 1982

work page 1982

[7] [7]

Biagini, Y

F. Biagini, Y. Hu, B. Øksendal, and A. Sulem. A stochastic maximum principle for processes driven by fractional brownian motion.Stochastic Process. Appl., 100(1-2):233–253, 2002

work page 2002

[8] [8]

Carmona.Lectures on BSDEs, Stochastic Control, and Stochastic Differential Games with Financial Applications

R. Carmona.Lectures on BSDEs, Stochastic Control, and Stochastic Differential Games with Financial Applications. SIAM, Philadelphia, PA, 2016

work page 2016

[9] [9]

Carmona, J

R. Carmona, J. P. Fouque, and L. Sun. Mean field games and systemic risk.Commun. Math. Sci., 13(4):911–933, 2015

work page 2015

[10] [10]

Carmona and M

R. Carmona and M. Lauri` ere. Convergence analysis of machine learning algorithms for the numerical solution of mean field control and games i: The ergodic case.SIAM J. Numer. Anal., 59(3):1455–1485, 2021

work page 2021

[11] [11]

Carmona and M

R. Carmona and M. Lauri` ere. Convergence analysis of machine learning algorithms for the numerical solution of mean field control and games: ii—the finite horizon case.Ann. Appl. Probab., 32(6):4065–4105, 2022

work page 2022

[12] [12]

Carmona and M

R. Carmona and M. Lauri` ere. Deep learning for mean field games and mean field control with applications to finance. In J. J. Hasbrouck and T. J. Sargent, editors,Deep Learning in Economics, pages 369–392. Cambridge University Press, 2023

work page 2023

[13] Beatrice Acciaio, Julio Backhoff-Veraguas, and René Carmona. Extended mean field control problems: Stochastic maximum principle and transport perspective. SIAM Journal on Control and Optimization, 57(6):3666–3693, 2019.

[14] C. Domingo-Enrich, J. Han, B. Amos, J. Bruna, and R. T. Q. Chen. Stochastic optimal control matching. arXiv preprint, 2023. arXiv:2312.02027.

[15] N. Du, J. T. Shi, and W. B. Liu. An effective gradient projection method for stochastic optimal control. Int. J. Numer. Anal. Model., 4(4):757–774, 2013.

[16] W. E, J. Han, and A. Jentzen. Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations. Commun. Math. Stat., 5(4):349–380, 2017.

[17] B. Gong, W. Liu, T. Tang, W. Zhao, and T. Zhou. An efficient gradient projection method for stochastic optimal control problems. SIAM J. Numer. Anal., 55(6):2982–3005, 2017.

[18] Q. Han and S. Ji. A multi-step algorithm for BSDEs based on a predictor-corrector scheme and least-squares Monte Carlo. Methodol. Comput. Appl. Probab., 24(4):2403–2426, 2022.

[19] J. Han and W. E. Deep learning approximation for stochastic control problems. In Advances in Neural Information Processing Systems, Deep Reinforcement Learning Workshop, 2016.

[20] M. Han, M. Laurière, and E. Vanden-Eijnden. A simulation-free deep learning approach to stochastic optimal control. arXiv preprint, 2024. arXiv:2410.05163.

[21] F. B. Hanson. Applied Stochastic Processes and Control for Jump-Diffusions: Modeling, Analysis, and Computation. SIAM, Philadelphia, PA, 2007.

[22] U. G. Haussmann. Some examples of optimal stochastic controls or: The stochastic maximum principle at work. SIAM Rev., 23(2):292–307, 1981.

[23] H. J. Kushner. Numerical methods for stochastic control problems in continuous time. SIAM J. Control Optim., 28(5):999–1026, 1990.

[24] X. Li, D. Verma, and L. Ruthotto. A neural network approach for stochastic optimal control. SIAM J. Sci. Comput., 46(5):535–556, 2024.

[25] Q. Li, L. Chen, C. Tai, and W. E. Maximum principle based algorithms for deep learning. J. Mach. Learn. Res., 18(1):5998–6026, 2018.

[26] M. Min and R. Hu. Signatured deep fictitious play for mean field games with common noise. In Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 7731–7740. PMLR, 2021.

[27] S. Peng. Backward stochastic differential equations and applications to optimal control. Appl. Math. Optim., 27(2):125–144, 1993.

[28] S. Peng. A general stochastic maximum principle for optimal control problems. SIAM J. Control Optim., 28(4):966–979, 1990.

[29] S. Peng and E. Pardoux. Backward stochastic differential equations and quasilinear parabolic partial differential equations. In B. L. Rozovskii and R. B. Sowers, editors, Stochastic Partial Differential Equations and Their Applications, volume 176 of Lecture Notes in Control and Information Sciences, pages 200–217. Springer, Berlin, Heidelberg, 1992.

[30] H. Pham. Continuous-Time Stochastic Control and Optimization with Financial Applications, volume 61 of Stochastic Modelling and Applied Probability. Springer, Berlin, 2009.

[31] H. Pham and X. Warin. Mean-field neural networks-based algorithms for McKean-Vlasov control problems. J. Mach. Learn. Model. Comput., 3(2):176–214, 2024.

[32] H. Pham and X. Warin. Actor-critic learning algorithms for mean-field control with moment neural networks. arXiv preprint, 2023. arXiv:2309.04317.

[33] H. Pham and X. Wei. Bellman equation and viscosity solutions for mean-field stochastic control problem. ESAIM: COCV, 24(1):437–461, 2018.

[34] H. Sun. Meshfree approximation for stochastic optimal control problems. Commun. Math. Res., 37(3):387–420, 2021.

[35] H. M. Soner, J. Teichmann, and Qinxin Yan. Learning algorithms for mean field optimal control. arXiv preprint, 2025. arXiv:2503.17869.

[36] C. Herrera, F. Krach, P. Ruyssen, and J. Teichmann. Optimal stopping via randomized neural networks. Front. Math. Finance, 3(1):31–77, 2025.

[37] J. Yong and X. Y. Zhou. Stochastic Controls: Hamiltonian Systems and HJB Equations, volume 43 of Applications of Mathematics. Springer, New York, 1999.

[38] J. Zhang. Backward Stochastic Differential Equations: From Linear to Fully Nonlinear Theory, volume 86 of Probability Theory and Stochastic Modelling. Springer, 2017.

[39] R. Zhang, Y. Lan, G.-B. Huang, and Z.-B. Xu. Universal approximation of extreme learning machine with adaptive growth of hidden nodes. IEEE Trans. Neural Netw. Learn. Syst., 23(2):365–371, 2012.

[40] W. Zhao, L. Chen, and S. Peng. A new kind of accurate numerical method for backward stochastic differential equations. SIAM J. Sci. Comput., 28(4):1563–1581, 2006.

[41] Tamara G. Kolda and Jackson R. Mayo. An adaptive shifted power method for computing generalized tensor eigenpairs. SIAM Journal on Matrix Analysis and Applications, 35(4):1563–1581, 2014.

[42] SIAM style manual: For journals and books. 2013.

[43] Nick Higham. A call for better indexes. SIAM Blogs, November 2014.

[44] Chengbin Peng, Tamara G. Kolda, and Ali Pinar. Accelerating community detection by using K-core subgraphs. arXiv:1403.2226, March 2014.

[45] Donald E. Woessner, Shanrong Zhang, Matthew E. Merritt, and A. Dean Sherry. Numerical solution of the Bloch equations provides insights into the optimum design of PARACEST agents for MRI. Magnetic Resonance in Medicine, 53(4):790–799, 2005.

[46] M. E. J. Newman. Properties of highly clustered networks. Phys. Rev. E, 68:026121, 2003.

[47] Clawpack Development Team. Clawpack software. Version 5.2.2, 2015.

[48] American Mathematical Society. Mathematics Subject Classification. 2010.

[49] Leslie Lamport. LaTeX: A Document Preparation System. Addison-Wesley, Reading, MA, 1986.

[50] Frank Mittelbach and Michel Goossens. The LaTeX Companion. Addison-Wesley, 2nd edition, 2004.

[51] Gene H. Golub and Charles F. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, 4th edition, 2013.

[52] Paul Dawkins. Paul’s online math notes: Calculus I, notes. 2015.

[53] American Mathematical Society. User’s guide for the amsmath package (version 2.0). 2002.

[54] Michael Downes. Short math guide for LaTeX. 2002.

[55] Christian Feuersänger. Manual for package PGFPLOTS. May 2015.

[56] J. N. Tsitsiklis and B. Van Roy. Regression methods for pricing complex American-style options. IEEE Transactions on Neural Networks, 12(4):694–703, 2001.

[57] R. Carmona and D. Lacker. A probabilistic weak formulation of mean field games and applications. Ann. Appl. Probab., 25(3):1189–1231, 2015.

[58] R. Carmona and F. Delarue. Probabilistic Theory of Mean Field Games with Applications. I, volume 83 of Probability Theory and Stochastic Modelling. Springer, Cham, 2018.

[59] P. Cardaliaguet. Notes from P.-L. Lions’ lectures at the Collège de France. Technical report, 2012.