Adaptive Learning via Off-Model Training and Importance Sampling for Fully Non-Markovian Optimal Stochastic Control. Complete version

Adolfo M.D da Silva; Alberto Ohashi; Dorival Leão; Simone Scotti

arXiv: 2604.13147 · v1 · submitted 2026-04-14 · stat.ML · cs.LG · math.PR

Pith reviewed 2026-05-10 14:17 UTC · model grok-4.3

classification stat.ML · cs.LG · math.PR

keywords stochastic control · non-Markovian processes · importance sampling · deep neural networks · adaptive learning · Monte Carlo methods · dynamic programming · model uncertainty

The pith

A single fixed dataset of trajectories can recover optimal controls for non-Markovian systems with unknown parameters through importance sampling reweighting.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a Monte Carlo method for solving continuous-time optimal control problems whose state dynamics are fully non-Markovian and depend on uncertain parameters. It constructs explicit reference probability laws under which a fixed collection of trajectories is simulated once; dynamic programming operators for any target model are then recovered by reweighting those trajectories with Radon-Nikodym factors. This off-model architecture supports both static approximation of the value function by deep neural networks and an adaptive scheme that updates the control law under parametric uncertainty by adjusting weights rather than regenerating paths. Non-asymptotic error bounds separate Monte Carlo sampling error from model-risk error, and the approach is illustrated on linear-quadratic examples with path-dependent features.
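The reweighting idea can be sketched on a toy problem. The example below is an illustration under stated assumptions, not the paper's construction: the reference law is a driftless Gaussian random walk, the target laws add a constant drift `theta`, and the function names (`simulate_reference_paths`, `likelihood_ratio`, `reweighted_mean`) are hypothetical. Paths are simulated once, and expectations of a path-dependent functional (the running maximum) are then recovered for several values of `theta` by changing only the weights.

```python
import numpy as np

def simulate_reference_paths(n_paths, n_steps, dt, seed=0):
    """Draw increments once under the reference law Q: i.i.d. N(0, dt), no drift."""
    rng = np.random.default_rng(seed)
    return rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))

def likelihood_ratio(increments, theta, dt):
    """dP_theta/dQ when increments are N(theta*dt, dt) under P_theta and N(0, dt) under Q."""
    n_steps = increments.shape[1]
    total = increments.sum(axis=1)
    return np.exp(theta * total - 0.5 * theta**2 * n_steps * dt)

def reweighted_mean(values, weights):
    """Importance-sampling estimate of E_{P_theta}[values] from samples drawn under Q."""
    return float(np.mean(weights * values))

dt, n_steps = 0.05, 20                       # horizon T = 1 (toy choice)
increments = simulate_reference_paths(100_000, n_steps, dt)
paths = np.cumsum(increments, axis=1)
terminal = paths[:, -1]
running_max = np.maximum(paths.max(axis=1), 0.0)   # path-dependent functional

# Recalibrate to several parameter values without simulating new paths.
estimates = {}
for theta in (0.0, 0.25, 0.5):
    w = likelihood_ratio(increments, theta, dt)
    estimates[theta] = (reweighted_mean(terminal, w), reweighted_mean(running_max, w))

for theta, (mean_terminal, mean_max) in estimates.items():
    print(f"theta={theta}: E[X_T] ~ {mean_terminal:.3f}, E[max X] ~ {mean_max:.3f}")
```

In this toy model the terminal mean under drift `theta` is `theta * T`, which gives a simple sanity check on the weights. The paper's setting replaces the Gaussian drift shift with its own dominating laws and weights.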

Core claim

We construct explicit dominating training laws and Radon-Nikodym weights for representative classes of non-Markovian controlled systems. This yields an off-model training architecture in which a fixed synthetic dataset is generated under a reference law, while the dynamic programming operators associated with a target model are recovered by importance sampling. For fixed parameters, non-asymptotic error bounds are established for deep neural network approximation of the embedded dynamic programming equation; for adaptive learning, quantitative estimates separate Monte Carlo approximation error from model-risk error.

What carries the argument

The dominating training law together with its Radon-Nikodym weight, which performs importance sampling to map the reference measure to the target measure inside the embedded backward dynamic programming recursion.
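In generic notation, the mechanism is the abstract Bayes rule for conditional expectations. The symbols below are illustrative assumptions and need not match the paper's: $\mathbb{Q}$ is the reference (training) law, $\mathbb{P}^{\theta}$ the target law, $(\mathcal{F}_n)$ the filtration of the discrete skeleton, and $V_{n+1}$ an integrable $\mathcal{F}_{n+1}$-measurable quantity such as the next-step value. Assuming $\mathbb{P}^{\theta}$ is dominated by $\mathbb{Q}$ on $\mathcal{F}_{n+1}$:

```latex
\[
\mathbb{E}^{\mathbb{P}^{\theta}}\!\left[ V_{n+1} \,\middle|\, \mathcal{F}_n \right]
= \mathbb{E}^{\mathbb{Q}}\!\left[ \frac{L^{\theta}_{n+1}}{L^{\theta}_{n}}\, V_{n+1} \,\middle|\, \mathcal{F}_n \right]
\quad \text{on } \{ L^{\theta}_{n} > 0 \},
\qquad
L^{\theta}_{n} := \frac{d\mathbb{P}^{\theta}}{d\mathbb{Q}}\bigg|_{\mathcal{F}_n}.
\]
```

The one-step ratio $L^{\theta}_{n+1}/L^{\theta}_{n}$ plays the role of the importance weight inside each backward step, so only the weight changes when $\theta$ changes.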

If this is right

Non-asymptotic error bounds hold for deep neural network approximation of the embedded dynamic programming equation when parameters are fixed.
Quantitative estimates separate Monte Carlo sampling error from model-risk error when the control law is adapted to changing parameters.
Recalibration to new parameter values requires only reweighting of the existing training sample.
The method applies directly to path-dependent stochastic differential equations, rough-volatility hedging, and systems driven by fractional Brownian motion.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same off-model structure could reduce simulation cost in any path-dependent control setting where a dominating law with moderate variance can be exhibited.
Error separation offers a practical way to decide how much computational budget to allocate to additional sampling versus parameter estimation.
If dominating laws exist for wider classes of non-Markovian drivers, the architecture would extend beyond the linear-quadratic examples shown.

Load-bearing premise

Explicit dominating training laws and associated Radon-Nikodym weights with controlled variance can be constructed for the representative classes of fully non-Markovian controlled systems.

What would settle it

A concrete non-Markovian controlled system from one of the paper's representative classes in which every candidate dominating law produces importance-sampling weights whose variance grows without bound as the time horizon or discretization level increases.
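To make the failure mode concrete, here is a minimal sketch on a toy Gaussian drift shift, which is an assumption for illustration and not one of the paper's model classes. For a driftless reference walk and a target with constant drift `theta`, the second moment of the likelihood ratio under the reference law equals `exp(theta**2 * T)`: it does not depend on the discretization level but grows exponentially in the horizon `T`. The function names are hypothetical.

```python
import numpy as np

def weight_second_moment_mc(theta, horizon, n_steps, n_paths=200_000, seed=1):
    """Monte Carlo estimate of E_Q[w^2] for the Gaussian drift-shift weight."""
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    total = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps)).sum(axis=1)
    w = np.exp(theta * total - 0.5 * theta**2 * horizon)
    return float(np.mean(w**2))

def weight_second_moment_exact(theta, horizon):
    """Closed form E_Q[w^2] = exp(theta^2 * T) for this toy model."""
    return float(np.exp(theta**2 * horizon))

results = {}
for horizon in (0.5, 1.0, 2.0):
    results[horizon] = (
        weight_second_moment_mc(0.5, horizon, n_steps=8),
        weight_second_moment_exact(0.5, horizon),
    )

for horizon, (mc, exact) in results.items():
    print(f"T={horizon}: Monte Carlo {mc:.3f}, closed form {exact:.3f}")
```

A settling counterexample in the paper's setting would show comparable or worse growth for every admissible dominating law, not just for one convenient choice as here.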

Figures

Figures reproduced from arXiv: 2604.13147 by Adolfo M.D da Silva, Alberto Ohashi, Dorival Leão, Simone Scotti.

**Figure 2.** Empirical VarP&L as a function of the discretization level (rough SV model with E[V_T] = 0.2, r_train = 0.5).

**Figure 3.** Histogram of P&L. Rough volatility. ATM put with S_0 = 100 = K. Number of Monte Carlo samples = 8000. r_train = 0.5.

Original abstract

This paper studies continuous-time stochastic control problems whose controlled states are fully non-Markovian and depend on unknown model parameters. Such problems arise naturally in path-dependent stochastic differential equations, rough-volatility hedging, and systems driven by fractional Brownian motion. Building on the discrete skeleton approach developed in earlier work, we propose a Monte Carlo learning methodology for the associated embedded backward dynamic programming equation. Our main contribution is twofold. First, we construct explicit dominating training laws and Radon-Nikodym weights for several representative classes of non-Markovian controlled systems. This yields an off-model training architecture in which a fixed synthetic dataset is generated under a reference law, while the dynamic programming operators associated with a target model are recovered by importance sampling. Second, we use this structure to design an adaptive update mechanism under parametric model uncertainty, so that repeated recalibration can be performed by reweighting the same training sample rather than regenerating new trajectories. For fixed parameters, we establish non-asymptotic error bounds for the approximation of the embedded dynamic programming equation via deep neural networks. For adaptive learning, we derive quantitative estimates that separate Monte Carlo approximation error from model-risk error. Numerical experiments illustrate both the off-model training mechanism and the adaptive importance-sampling update in structured linear-quadratic examples.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a concrete off-model training setup for non-Markovian control that reuses one fixed dataset via importance sampling and separates Monte Carlo from model-risk error.

The main point is that the authors construct explicit dominating laws and Radon-Nikodym weights for several non-Markovian classes (path-dependent SDEs, rough volatility, fBM-driven) using the discrete skeleton embedding. This lets them generate trajectories once under a reference measure and recover the dynamic programming operators for different target models by reweighting, which supports an adaptive update rule that avoids regenerating samples when parameters change. They also supply non-asymptotic bounds on the DNN approximation of the embedded equation and quantitative estimates that split Monte Carlo error from model-risk error.

The stress-test confirms the weights stay square-integrable with explicit variance control tied to the linear-quadratic structure and Girsanov-type changes, so the separation holds without hidden degeneracy. The numerics stay in structured linear-quadratic examples, which is reasonable for checking the mechanism. The construction is new in its combination of fixed-dataset training with the adaptive reweighting for fully non-Markovian systems. The paper does well at making the off-model architecture practical and at stating the error separation clearly.

The main limitation is that everything rests on being able to build those dominating laws explicitly, so the method applies to the listed representative classes rather than arbitrary non-Markovian problems. In higher dimensions or with larger model deviations the weight variance could still bite, even if the bounds are clean on paper.

This is for researchers who already work with stochastic control in finance or rough-volatility settings and want a way to handle repeated recalibration without fresh Monte Carlo runs each time. A reader focused on Monte Carlo methods for path-dependent problems will find the explicit weights and error split useful. It deserves a serious referee because the core technical pieces are grounded and the contribution on adaptive importance sampling is specific enough to evaluate.

Referee Report

0 major / 3 minor

Summary. The paper develops an off-model Monte Carlo learning method for continuous-time stochastic control problems with fully non-Markovian controlled states and unknown parameters. It constructs explicit dominating training laws and square-integrable Radon-Nikodym weights for representative classes (path-dependent SDEs, rough-volatility models, fBM-driven systems) via discrete skeleton embedding. This permits generation of a single fixed synthetic dataset under a reference measure, with target-model dynamic programming operators recovered by importance sampling. Non-asymptotic error bounds are derived for deep neural network approximation of the embedded backward equation under fixed parameters, together with quantitative estimates that separate Monte Carlo sampling error from model-risk error under adaptive parametric updates. The approach is illustrated on linear-quadratic examples.
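A single backward step of this kind can be sketched as a regression on reweighted targets. The example below is a simplification under stated assumptions: it swaps the deep neural network for an ordinary least-squares fit on a quadratic polynomial basis, uses a one-dimensional Markovian toy state, and takes a Gaussian drift shift as the change of measure. None of these choices come from the paper. Samples of the next state are drawn under the reference law, and multiplying the regression target by the one-step likelihood ratio recovers the conditional expectation under the target law.

```python
import numpy as np

rng = np.random.default_rng(2)
n, dt, theta = 200_000, 0.1, 1.0

x = rng.normal(0.0, 1.0, size=n)              # current states (assumed training distribution)
xi = rng.normal(0.0, np.sqrt(dt), size=n)     # one-step noise under the reference law Q
x_next = x + xi
w = np.exp(theta * xi - 0.5 * theta**2 * dt)  # one-step likelihood ratio dP_theta/dQ

def next_value(y):
    """Stand-in for the value function at the next step (toy choice)."""
    return y**2

# Regress the reweighted target on a polynomial basis of the current state.
features = np.column_stack([np.ones(n), x, x**2])
coef, *_ = np.linalg.lstsq(features, w * next_value(x_next), rcond=None)

# Under P_theta the noise is N(theta*dt, dt), so
# E[(x + xi)^2 | x] = (theta^2 dt^2 + dt) + 2 theta dt * x + x^2.
expected = np.array([theta**2 * dt**2 + dt, 2 * theta * dt, 1.0])
print("fitted:  ", np.round(coef, 3))
print("expected:", np.round(expected, 3))
```

Changing `theta` changes only `w`, so the same `x` and `xi` arrays can be reused for a different target model.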

Significance. If the explicit constructions and variance bounds hold, the work supplies a practical mechanism for adaptive recalibration without trajectory regeneration, while rigorously separating approximation and model-risk contributions. The provision of concrete dominating measures and Radon-Nikodym weights for several non-Markovian classes, together with the non-asymptotic DNN bounds and error-separation estimates, constitutes a concrete advance over generic importance-sampling arguments. These features support reproducible numerical implementation and falsifiable error predictions in structured settings.

minor comments (3)

[Abstract] The statement of non-asymptotic bounds does not indicate the dependence on network width, depth, or time horizon; adding a brief qualitative indication would improve readability without altering the technical content.
[§1, Introduction] The reference to the 'discrete skeleton approach developed in earlier work' would benefit from an explicit citation or pointer to the relevant prior result to allow readers to locate the embedding construction.
[Numerical experiments] While the LQ examples illustrate the mechanism, quantitative reporting of the realized variance of the importance weights (or comparison to the derived bounds) would strengthen the empirical support for the controlled-variance claim.
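The kind of reporting requested in the third comment could look like the sketch below. It computes the sample mean and variance of a weight vector together with the Kish effective sample size, a standard importance-sampling diagnostic. The helper name `weight_diagnostics` is hypothetical, and nothing here is taken from the paper's experiments.

```python
import numpy as np

def weight_diagnostics(weights):
    """Summary statistics for a vector of importance weights."""
    w = np.asarray(weights, dtype=float)
    ess = w.sum() ** 2 / np.sum(w**2)          # Kish effective sample size
    return {
        "mean": float(w.mean()),               # should be near 1 for a likelihood ratio
        "variance": float(w.var(ddof=1)),      # realized weight variance
        "ess": float(ess),
        "ess_fraction": float(ess / w.size),   # share of the sample that is effectively used
    }

uniform = weight_diagnostics(np.ones(100))
degenerate = weight_diagnostics([1.0, 0.0, 0.0, 0.0])
print(uniform)
print(degenerate)
```

An effective sample size close to the nominal sample size indicates well-behaved weights; a value near 1 indicates that a single path dominates the estimate.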

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No specific major comments were listed in the report, so we have no individual points to address. We remain available to incorporate any editorial suggestions or minor clarifications in a revised version.

Circularity Check

0 steps flagged

Minor self-citation to prior discrete skeleton method; central claims remain independent

The derivation relies on explicit constructions of dominating training laws and Radon-Nikodym weights for path-dependent SDEs, rough-volatility, and fBM-driven systems, which are supplied in the paper via the discrete skeleton embedding rather than assumed or fitted. Non-asymptotic DNN approximation bounds and the separation of Monte Carlo versus model-risk errors in the adaptive setting follow directly from these weights and standard concentration arguments, without reducing to self-referential definitions or re-using fitted quantities as predictions. The single reference to earlier discrete skeleton work is not load-bearing for the new off-model architecture or error estimates, which are self-contained against the stated assumptions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on the domain assumption that dominating reference laws with usable Radon-Nikodym derivatives exist and can be written explicitly for the non-Markovian classes studied; no free parameters or new entities are mentioned in the abstract.

axioms (1)

Domain assumption: existence of explicit dominating training laws and Radon-Nikodym weights for representative classes of fully non-Markovian controlled systems.
Required for the off-model training architecture to function without degeneracy.

pith-pipeline@v0.9.0 · 5546 in / 1452 out tokens · 71895 ms · 2026-05-10T14:17:24.385700+00:00 · methodology
