Nonlinear Stochastic Optimal Control and Optimal Stopping using the Fokker-Planck Transformation

Akan Selim; Ali Pakniyat; Panagiotis Tsiotras; Siddhartha Ganguly

arxiv: 2604.12153 · v1 · submitted 2026-04-14 · 🧮 math.OC


Pith reviewed 2026-05-10 16:43 UTC · model grok-4.3

classification 🧮 math.OC

keywords stochastic optimal control · optimal stopping · Fokker-Planck equation · continuity equation · Stein identities · distributional dynamic programming · variational analysis


The pith

A Fokker-Planck transformation converts nonlinear stochastic optimal control with stopping into deterministic density dynamics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a theoretical framework for nonlinear stochastic optimal control problems that include optimal stopping. It does so by creating a density-based deterministic representation of the underlying diffusion process. For cases with state-independent diffusion, the controlled Fokker-Planck equation becomes a continuity equation driven by a score-corrected velocity field. This produces deterministic characteristic dynamics that match the marginal probability laws of the original stochastic system. Using Stein-type identities, the distributional version of the dynamic programming equation shares the same second-order operator as the stochastic Hamilton-Jacobi-Bellman equation.

Core claim

For state-independent diffusion, the controlled Fokker-Planck equation can be rewritten as a continuity equation driven by a score-corrected velocity field, yielding deterministic characteristic dynamics that reproduce the marginal law of the stochastic system. Leveraging Stein-type identities, the associated distributional dynamic programming equation admits the same second-order differential operator as the distributional stochastic Hamilton-Jacobi-Bellman formulation. This representation is used to formulate an optimal control problem with state-dependent terminal-time assignment and terminal distributional constraints, from which first-order necessary conditions are derived via variational analysis.

What carries the argument

The Fokker-Planck transformation to a continuity equation with score-corrected velocity field, which produces deterministic dynamics whose marginal laws match those of the controlled stochastic diffusion.
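
The transformation can be sketched in a few lines. This is a hedged reconstruction under assumed notation, not the paper's own statement: the controlled diffusion is taken to be dX_t = f(X_t, u_t) dt + σ dW_t with constant σ, D := σσ^T, and ρ_t a strictly positive, differentiable marginal density. The symbols f, σ, D, ρ_t and v are assumptions introduced here and may differ from the paper's.

```latex
\begin{align*}
\partial_t \rho_t
  &= -\nabla\cdot\big(f\,\rho_t\big)
     + \tfrac12 \sum_{i,j} D_{ij}\,\partial_{x_i}\partial_{x_j}\rho_t
     && \text{(controlled Fokker-Planck, } D=\sigma\sigma^\top \text{ constant)} \\
  &= -\nabla\cdot\big(f\,\rho_t\big)
     + \nabla\cdot\big(\tfrac12 D\,\nabla\rho_t\big)
     && \text{(} D \text{ does not depend on } x\text{)} \\
  &= -\nabla\cdot\Big(\rho_t\,\big[\,f - \tfrac12 D\,\nabla\log\rho_t\,\big]\Big)
     && \text{(since } \nabla\rho_t = \rho_t\,\nabla\log\rho_t\text{)}
\end{align*}
% Continuity equation and its characteristic ODE:
\[
\partial_t \rho_t + \nabla\cdot(\rho_t\, v_t) = 0,
\qquad
v_t(x) = f(x,u) - \tfrac12 D\,\nabla\log\rho_t(x),
\qquad
\dot x(t) = v_t\big(x(t)\big).
\]
```

The second line is where state-independence enters: with constant D the second-order term is an exact divergence of ½ D ∇ρ_t, so the whole right-hand side becomes the divergence of a single flux ρ_t v_t.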

If this is right

First-order necessary conditions follow for optimal control problems that include both a common terminal time and state-dependent stopping.
The same second-order operator appears in the distributional dynamic programming equation and the stochastic Hamilton-Jacobi-Bellman equation.
Terminal distributional constraints can be imposed directly on the deterministic density evolution.
Variational analysis yields the optimality conditions without requiring explicit solution of the underlying stochastic trajectories.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The deterministic reformulation could support numerical schemes that evolve densities rather than sample paths.
If the score correction generalizes, the approach might extend to state-dependent diffusion coefficients.
The necessary conditions may be checked on low-dimensional linear-quadratic examples where closed-form stochastic solutions are known.
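
The first extension above (evolving densities rather than sample paths) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and none of it comes from the paper: a one-dimensional linear drift f(x) = -a x standing in for a fixed closed-loop drift, constant σ, a Gaussian initial density, and an explicit conservative finite-difference scheme whose flux is ρ v = -a x ρ - (σ²/2) ∂_x ρ. The names `evolve_density` and `moments` are hypothetical.

```python
import math


def gaussian(x, mean, var):
    """Gaussian density with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def evolve_density(T=1.0, a=1.0, sigma=0.8, m0=1.0, s0=0.25,
                   lo=-4.0, hi=4.0, cells=160, dt=1e-3):
    """Explicit conservative scheme for d_t rho + d_x(rho v) = 0, where
    rho v = -a x rho - (sigma^2 / 2) d_x rho (drift plus score correction)."""
    dx = (hi - lo) / cells
    xs = [lo + (i + 0.5) * dx for i in range(cells)]   # cell centres
    rho = [gaussian(x, m0, s0) for x in xs]
    mass = sum(rho) * dx
    rho = [r / mass for r in rho]                      # normalise to unit mass
    for _ in range(round(T / dt)):
        flux = [0.0] * (cells + 1)                     # zero flux at both walls
        for i in range(1, cells):                      # face i sits between cells i-1 and i
            x_face = lo + i * dx
            rho_face = 0.5 * (rho[i - 1] + rho[i])
            drho = (rho[i] - rho[i - 1]) / dx
            flux[i] = -a * x_face * rho_face - 0.5 * sigma ** 2 * drho
        rho = [rho[i] - dt / dx * (flux[i + 1] - flux[i]) for i in range(cells)]
    return xs, rho, dx


def moments(xs, rho, dx):
    """Total mass, mean and variance of a cell-averaged density."""
    mass = sum(rho) * dx
    mean = sum(x * r for x, r in zip(xs, rho)) * dx / mass
    var = sum((x - mean) ** 2 * r for x, r in zip(xs, rho)) * dx / mass
    return mass, mean, var


if __name__ == "__main__":
    xs, rho, dx = evolve_density()
    print(moments(xs, rho, dx))
```

For this linear drift the exact marginal stays Gaussian with mean m0·exp(-aT) and variance s0·exp(-2aT) + σ²/(2a)·(1 - exp(-2aT)), which gives a closed-form reference for the grid moments. The flux form conserves total mass by construction; the step size is chosen small enough for the explicit diffusion stability limit dt ≤ dx²/σ².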

Load-bearing premise

The derivation assumes state-independent diffusion to rewrite the Fokker-Planck equation as a continuity equation with score-corrected velocity.

What would settle it

Direct numerical comparison showing that the marginal distributions generated by the deterministic continuity equation exactly match the empirical marginals obtained from Monte Carlo simulation of the original controlled stochastic process.
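
A minimal sketch of such a comparison, under assumptions chosen so that the score is available in closed form: a scalar Ornstein-Uhlenbeck process dX = -a X dt + σ dW with Gaussian initial law, used here as a stand-in for a controlled system with a fixed linear feedback. The model, the parameter values and the function names are illustrative assumptions, not the paper's experiment. The same initial samples are pushed through the deterministic characteristic ODE and through an Euler-Maruyama simulation of the SDE, and their moments are compared with the analytic ones.

```python
import math
import random


def ou_moments(t, a, sigma, m0, s0):
    """Closed-form mean and variance of dX = -a X dt + sigma dW, X_0 ~ N(m0, s0)."""
    decay = math.exp(-2.0 * a * t)
    mean = m0 * math.exp(-a * t)
    var = s0 * decay + sigma ** 2 / (2.0 * a) * (1.0 - decay)
    return mean, var


def flow_velocity(x, t, a, sigma, m0, s0):
    """Score-corrected velocity v = f - (sigma^2 / 2) * d/dx log rho_t(x)."""
    mean, var = ou_moments(t, a, sigma, m0, s0)
    score = -(x - mean) / var          # exact score of the Gaussian marginal
    return -a * x - 0.5 * sigma ** 2 * score


def simulate(n=1000, T=1.0, steps=200, a=1.0, sigma=0.8, m0=1.0, s0=0.25, seed=0):
    """Evolve identical initial samples by the deterministic flow and by Euler-Maruyama."""
    rng = random.Random(seed)
    dt = T / steps
    det = [rng.gauss(m0, math.sqrt(s0)) for _ in range(n)]
    sto = list(det)
    for k in range(steps):
        t = k * dt
        det = [x + dt * flow_velocity(x, t, a, sigma, m0, s0) for x in det]
        sto = [x - dt * a * x + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0) for x in sto]
    return det, sto


def mean_var(xs):
    """Sample mean and (biased) sample variance."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / len(xs)


if __name__ == "__main__":
    det, sto = simulate()
    print("analytic     ", ou_moments(1.0, 1.0, 0.8, 1.0, 0.25))
    print("deterministic", mean_var(det))
    print("Monte Carlo  ", mean_var(sto))
```

Agreement here holds only up to time-discretisation and sampling error, so it supports rather than proves the marginal-matching claim. In a nonlinear setting the score would not be known in closed form and would have to be estimated, which this sketch does not attempt.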

Figures

Figures reproduced from arXiv: 2604.12153 by Akan Selim, Ali Pakniyat, Panagiotis Tsiotras, Siddhartha Ganguly.

**Figure 1.** Conceptual roadmap of our article. (C-d) First-order necessary conditions: Building on this reformulation, we formulate an optimal control problem with state-dependent terminal-time assignment and terminal distributional constraints, and analyze it by variational methods in the space of probability measures. This yields a (Pontryagin-type) first-order optimality system, applicable to both the common termin…

read the original abstract

In this paper, we develop a theoretical framework for nonlinear stochastic optimal control problems with optimal stopping by establishing a density-based deterministic representation of the underlying diffusion. For state-independent diffusion, we rewrite the controlled Fokker-Planck equation as a continuity equation driven by a score-corrected velocity field, yielding a deterministic characteristic dynamics that reproduces the marginal law of the stochastic system. Leveraging Stein-type identities, we show that the associated distributional dynamic programming equation admits the same second-order differential operator as the distributional stochastic Hamilton-Jacobi-Bellman formulation. Building on this representation, we formulate an optimal control problem with state-dependent terminal-time assignment and terminal distributional constraints and derive the first-order necessary conditions using variational analysis. We present the conditions both for a common terminal time and for the general case of state-dependent stopping.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

A density-based reformulation for stochastic control with optimal stopping that works cleanly for state-independent diffusion but stays narrow.


The core contribution is a deterministic density formulation for nonlinear stochastic optimal control that includes optimal stopping. For state-independent diffusion they convert the controlled Fokker-Planck equation into a continuity equation whose velocity includes a score correction; the resulting characteristics reproduce the marginal law of the original diffusion. Stein identities then let them equate the second-order operator in the distributional dynamic programming equation with the one from the stochastic HJB. On that basis they pose a problem with state-dependent terminal times and terminal density constraints, then extract first-order necessary conditions by standard variational arguments for both fixed and state-dependent stopping.

Referee Report

1 major / 3 minor

Summary. The manuscript develops a theoretical framework for nonlinear stochastic optimal control problems with optimal stopping by transforming the controlled Fokker-Planck equation into a deterministic continuity equation representation of the diffusion process. For state-independent diffusion coefficients, a score-corrected velocity field is introduced to recover the marginal laws via characteristic dynamics. Stein-type identities are used to establish that the distributional dynamic programming equation shares the same second-order operator as the distributional stochastic Hamilton-Jacobi-Bellman equation. The paper then poses an optimal control problem with state-dependent terminal times and terminal distributional constraints, deriving first-order necessary conditions via variational analysis for both fixed and state-dependent stopping.

Significance. If the central derivations are correct, the work provides a deterministic density-based route to stochastic control and stopping problems, which could aid analysis and computation by shifting focus from sample paths to continuity equations and variational conditions. The explicit invocation of Stein identities to equate operators and the variational treatment of state-dependent stopping times are strengths that offer a clean, non-stochastic derivation path under the stated assumptions. The restriction to state-independent diffusion is acknowledged in the claims and does not undermine the internal logic for the cases treated.

major comments (1)

[§3] (Stein-identity equivalence) The claim that the distributional DP and HJB share the identical second-order operator rests on the Stein identity holding in the chosen function space; the manuscript should explicitly state the regularity assumptions on the densities and test functions that guarantee this identity applies without additional boundary terms.
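
For orientation, the identity at stake can be written out as follows. This is a hedged sketch under assumed notation (density ρ, constant symmetric D = σσ^T, drift f, smooth test function V, score-corrected velocity v = f - ½ D ∇log ρ), not the manuscript's own statement. It shows where the boundary term arises and how the second-order operator is recovered once that term vanishes.

```latex
% Stein-type identity: integration by parts against the score.
\[
\mathbb{E}_{\rho}\big[\nabla\log\rho\cdot\varphi\big]
 = \int \nabla\rho\cdot\varphi \,dx
 = \underbrace{\lim_{R\to\infty}\oint_{|x|=R} \rho\,\varphi\cdot n \,dS}_{\text{boundary term}}
   \;-\; \int \rho\,\nabla\cdot\varphi \,dx
 = -\,\mathbb{E}_{\rho}\big[\nabla\cdot\varphi\big]
 \quad\text{if the boundary term vanishes.}
\]
% Choosing \varphi = D\,\nabla V with D constant and symmetric:
\[
\mathbb{E}_{\rho}\big[-\tfrac12\,(D\,\nabla\log\rho)\cdot\nabla V\big]
 = \tfrac12\,\mathbb{E}_{\rho}\big[\nabla\cdot(D\,\nabla V)\big]
 = \tfrac12\,\mathbb{E}_{\rho}\big[\operatorname{tr}\!\big(D\,\nabla^2 V\big)\big],
\]
\[
\text{so}\qquad
\mathbb{E}_{\rho}\big[\partial_t V + v\cdot\nabla V\big]
 = \mathbb{E}_{\rho}\big[\partial_t V + f\cdot\nabla V
   + \tfrac12\operatorname{tr}\!\big(D\,\nabla^2 V\big)\big].
\]
```

The equality between the first-order transport operator along v and the second-order generator therefore holds in expectation under ρ, and only when ρ D ∇V decays fast enough for the surface integral to vanish, which is the regularity condition the comment asks the authors to state.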

minor comments (3)

[§2] The transition from the Fokker-Planck equation to the continuity equation (around Eq. (8)) would benefit from an explicit statement of the score function's definition and its dependence on the control.
[§4] Notation for the state-dependent terminal time and the associated terminal distributional constraint should be introduced earlier and used consistently in the variational analysis section.
[Introduction] A brief comparison to existing density-based or measure-valued control approaches would help situate the contribution, even if only in the introduction.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive evaluation of the manuscript and the constructive comment. We address the single major comment below and will incorporate the requested clarification in the revised version.


Referee: [§3] (Stein-identity equivalence) The claim that the distributional DP and HJB share the identical second-order operator rests on the Stein identity holding in the chosen function space; the manuscript should explicitly state the regularity assumptions on the densities and test functions that guarantee this identity applies without additional boundary terms.

Authors: We agree that the regularity assumptions underlying the Stein identity should be stated explicitly to ensure the absence of boundary terms. In the revised manuscript we will add a remark in §3 specifying the required conditions: the probability densities are assumed to belong to C^2 with sufficient decay at infinity (e.g., integrable first and second derivatives with Gaussian-type tails), and the test functions are taken from a dense subspace of H^1 such that integration by parts is justified without residual boundary integrals. This addition will make the equivalence between the distributional dynamic programming equation and the distributional stochastic Hamilton-Jacobi-Bellman equation fully rigorous under the stated hypotheses.

revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain


The paper establishes a density-based representation by rewriting the controlled Fokker-Planck equation (for state-independent diffusion) as a continuity equation with score-corrected velocity, then invokes Stein-type identities to equate the second-order operators between the distributional dynamic programming equation and the stochastic HJB formulation, and applies variational analysis to obtain first-order necessary conditions for the optimal stopping problem. These steps draw on standard external mathematical tools (Stein identities, continuity equations, variational calculus) rather than reducing any claimed result to a self-definition, a fitted parameter renamed as prediction, or a load-bearing self-citation chain. No equations or claims in the provided text exhibit the enumerated circular patterns; the derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on the existence of a score function for the density and on the applicability of Stein-type identities to the controlled diffusion; these are standard in probability but their precise invocation here is not expanded in the abstract.

axioms (2)

domain assumption: Stein-type identities hold for the chosen test functions and densities arising from the controlled diffusion. Invoked to equate the distributional dynamic programming operator with the stochastic HJB operator.
domain assumption: The controlled Fokker-Planck equation can be rewritten as a continuity equation for state-independent diffusion. Central step that produces the deterministic characteristic dynamics.



