Duality and DeepMartingale for High-Dimensional Optimal Switching: Computable Upper Bounds and Approximation-Expressivity Guarantees

Hoi Ying Wong; Junyan Ye

arxiv: 2604.08080 · v1 · submitted 2026-04-09 · 🧮 math.OC · cs.NA· math.NA· math.PR

Duality and DeepMartingale for High-Dimensional Optimal Switching: Computable Upper Bounds and Approximation-Expressivity Guarantees

Junyan Ye , Hoi Ying Wong This is my paper

Pith reviewed 2026-05-10 17:26 UTC · model grok-4.3

classification 🧮 math.OC cs.NAmath.NAmath.PR

keywords optimal switchingdual methodsmartingale penaltiesdeep learningneural network expressivityhigh-dimensional problemsstochastic controlDoob martingales

0 comments

The pith

Neural networks of size polynomial in dimension can approximate dual upper bounds for high-dimensional optimal switching to arbitrary accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a dual representation for finite-horizon optimal switching using a family of martingale penalties, with the minimal penalty given by Doob martingales of the continuation values. This produces fully computable upper bounds that are extended to a DeepMartingale framework for the switching case, with proven convergence under an upper-bound loss and an L2 surrogate loss. An expressivity analysis then shows that, under stated structural assumptions, neural networks of size at most c d^q ε^{-r} induce dual upper bounds within ε of the true value, with c, q, r independent of dimension and ε. The resulting dual solver therefore avoids the curse of dimensionality. Numerical tests pair the upper bounds with policy-based lower bounds on Brownian and Brownian-Poisson models to produce small gaps and practical hedging strategies.

Core claim

Under the stated structural assumptions, for any target accuracy ε>0 there exist neural networks of size at most c d^q ε^{-r} whose induced dual upper bound approximates the true value within ε, where c, q, and r are independent of d and ε; the dual representation is obtained by introducing martingale penalties whose minimal member is characterized by the Doob martingales of the continuation values, yielding a computable upper bound that can be trained via the extended DeepMartingale method.

What carries the argument

The family of martingale penalties in the dual representation of multiple switching, minimized by the Doob martingales of continuation values, which supplies the computable upper bound approximated by neural networks.

If this is right

The dual upper bound obtained from any neural-network approximation to the martingale penalties is computable and converges to the true value under the training losses.
Convergence holds both for the upper-bound loss and for the L2 surrogate loss.
The learned dual martingale directly supplies a practical delta-hedging strategy.
Pairing the dual upper bounds with deep policy-based lower bounds produces empirical gaps that shrink in high-dimensional Brownian and Brownian-Poisson models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same dual-martingale construction may extend to other classes of stochastic control with discrete interventions.
In applications such as energy storage or portfolio switching, the dimension-free bound would allow direct solution of problems whose state dimension is hundreds.
Verifying the structural assumptions on concrete diffusion models would immediately certify that the reported network-size bound applies.
Relaxing the structural assumptions while preserving polynomial expressivity would enlarge the class of solvable high-dimensional switching problems.

Load-bearing premise

The structural assumptions on the problem that enable a dimension-independent expressivity bound for the neural networks approximating the dual martingales.

What would settle it

A high-dimensional instance satisfying the basic problem setup but not the structural assumptions, in which every neural network whose size grows only polynomially in d leaves an ε-gap in the dual upper bound that cannot be driven to zero.

read the original abstract

We study finite-horizon optimal switching with discrete intervention dates on a general filtration, allowing continuous-time observations between decision dates, and develop a deep-learning-based dual framework with computable upper bounds. We first derive a dual representation for multiple switching by introducing a family of martingale penalties. The minimal penalty is characterized by the Doob martingales of the continuation values, which yields a fully computable upper bound. We then extend DeepMartingale from optimal stopping to optimal switching and establish convergence under both the upper-bound loss and an $L^2$-surrogate loss. We also provide an expressivity analysis: under the stated structural assumptions, for any target accuracy $\varepsilon>0$, there exist neural networks of size at most $c d^{q}\varepsilon^{-r}$ whose induced dual upper bound approximates the true value within $\varepsilon$, where $c$, $q$, and $r$ are independent of $d$ and $\varepsilon$. Hence, the dual solver avoids the curse of dimensionality under the stated structural assumptions. For numerical assessment, we additionally implement a deep policy-based approach to produce feasible lower bounds and empirical upper--lower gaps. Numerical experiments on Brownian and Brownian--Poisson models demonstrate small upper--lower gaps and favorable performance in high dimensions. The learned dual martingale also yields a practical delta-hedging strategy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Paper extends DeepMartingale duality to optimal switching with computable NN upper bounds and dimension-independent expressivity rates under structural assumptions.

read the letter

This paper extends duality methods from optimal stopping to finite-horizon optimal switching by introducing martingale penalties and identifying the minimal one through Doob martingales of the continuation values. That choice yields a fully computable upper bound that neural networks can approximate. They prove convergence under both the direct upper-bound loss and an L2 surrogate, then add an expressivity result: under the stated structural assumptions, network size scales as c d^q epsilon^{-r} with the constants independent of dimension and target accuracy. Numerics pair the dual upper bound with a policy-based lower bound and show small gaps on Brownian and Brownian-Poisson examples, plus a hedging strategy from the learned martingale.

Referee Report

0 major / 3 minor

Summary. The paper develops a dual representation for finite-horizon optimal switching on a general filtration by introducing martingale penalties whose minimal form is given by the Doob decomposition of the continuation-value processes. It extends the DeepMartingale methodology to the switching setting, proves convergence of the resulting deep-learning approximations under both an upper-bound loss and an L²-surrogate loss, and supplies an expressivity result: under stated structural assumptions, neural networks of size at most c d^q ε^{-r} (with c, q, r independent of dimension d and accuracy ε) induce dual upper bounds within ε of the true value. Numerical experiments on Brownian and Brownian–Poisson models compare the learned dual upper bounds against policy-based lower bounds and illustrate small gaps together with a practical delta-hedging interpretation.

Significance. If the stated structural assumptions hold, the dimension-independent neural-network size bound constitutes a genuine advance for high-dimensional stochastic switching problems, where most existing methods suffer from the curse of dimensionality. The derivation from standard martingale theory (Doob decomposition) is non-circular, the convergence statements are conditioned on verifiable losses, and the provision of both theoretical guarantees and reproducible numerical lower-bound comparisons strengthens the contribution. The work therefore supplies a practical, theoretically supported route to computable upper bounds in settings such as energy management and high-dimensional option pricing.

minor comments (3)

The abstract and introduction repeatedly refer to “the stated structural assumptions” that deliver the dimension-free rate c d^q ε^{-r}. A concise, self-contained list or paragraph summarizing these assumptions (e.g., regularity of the payoff processes, properties of the filtration, or boundedness conditions) should appear early in the paper so that readers can immediately judge the scope of the expressivity claim.
In the numerical section, the architecture and training details of the deep policy network used to generate lower bounds are described only at a high level. Adding a short table or paragraph listing layer widths, activation functions, and optimizer hyperparameters would improve reproducibility without lengthening the manuscript.
Notation for the family of martingale penalties and the continuation-value processes is introduced in the duality section; a single consolidated table of symbols (including the dependence on the number of switching dates) would reduce cross-referencing for readers.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our work, the recognition of its significance for high-dimensional stochastic switching, and the recommendation for minor revision. We appreciate the referee's view that the dimension-independent neural-network size bound constitutes a genuine advance when the structural assumptions hold.

Circularity Check

0 steps flagged

No significant circularity in the derivation chain

full rationale

The dual representation for optimal switching is derived from standard martingale theory via a family of martingale penalties, with the minimal penalty characterized directly by the Doob martingales of the continuation values rather than any fit to the target value. The extension of DeepMartingale, convergence statements under the upper-bound and L2-surrogate losses, and the expressivity analysis (neural network size bound c d^q ε^{-r} independent of d) are all explicitly conditioned on the stated structural assumptions and follow from established results in the literature once those hypotheses are granted. No load-bearing step reduces by construction to its own inputs, no self-citation is required to justify the central claims, and the numerical lower-bound approach is presented separately as an empirical complement.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on classical stochastic-process results and neural-network approximation theory; no new free parameters or invented entities are introduced in the abstract.

axioms (2)

standard math Doob martingale decomposition applies to continuation values of the switching problem
Used to characterize the minimal penalty that yields the computable upper bound.
domain assumption Neural networks can approximate the required martingale processes under the stated structural assumptions
Basis for the size bound c d^q ε^{-r} independent of dimension.

pith-pipeline@v0.9.0 · 5555 in / 1275 out tokens · 41961 ms · 2026-05-10T17:26:46.693270+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

5 extracted references · 5 canonical work pages

[1]

A ¨ıd, L

[1]R. A ¨ıd, L. Campi, N. Langren´e, and H. Pham,A probabilistic numerical method for optimal multiple switching problems in high dimension, SIAM J. Financ. Math., 5 (2014), pp. 191–

work page 2014
[2]

Alfonsi, A

[2]A. Alfonsi, A. Kebaier, and J. Lelong,A pure dual approach for hedging bermudan options, Math. Finance, 35 (2025), pp. 745–759. [3]E. Bayraktar, A. Cohen, and A. Nellis,A neural network approach to high-dimensional optimal switching problems with jumps in energy markets, SIAM J. Financ. Math., 14 (2023), pp. 1028–1061. [4]C. Beck, S. Becker, P. Cheridi...

work page 2025
[3]

Hutzenthaler, A

[22]M. Hutzenthaler, A. Jentzen, T. Kruse, and et al.,A proof that rectified deep neural net- works overcome the curse of dimensionality in the numerical approximation of semilinear heat equations, SN Partial Differ. Equ. Appl., 1 (2020). [23]B. Jia and H. Y. Wong,Deep impulse control: application to interest rate intervention, Quant. Finance, 24 (2024), ...

work page 2020
[4]

Ly Vath and H

[28]V. Ly Vath and H. Pham,Explicit solution to an optimal switching problem in the two-regime case, SIAM J. Control Optim., 46 (2007), pp. 395–426. [29]R. MARTYR,Dynamic programming for discrete-time finite-horizon optimal switching prob- lems with negative switching costs, Adv. Appl. Probab., 48 (2016), pp. 832–847. [30]R. Martyr,Finite-horizon optimal ...

work page 2007
[5]

[32]J. A. A. Opschoor, P. C. Petersen, and C. Schwab,Deep ReLU networks and high-order finite element methods, Anal. Appl., 18 (2020), pp. 715–770. [33]M. Raissi, P. Perdikaris, and G. Karniadakis,Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comp...

work page arXiv 2020

[1] [1]

A ¨ıd, L

[1]R. A ¨ıd, L. Campi, N. Langren´e, and H. Pham,A probabilistic numerical method for optimal multiple switching problems in high dimension, SIAM J. Financ. Math., 5 (2014), pp. 191–

work page 2014

[2] [2]

Alfonsi, A

[2]A. Alfonsi, A. Kebaier, and J. Lelong,A pure dual approach for hedging bermudan options, Math. Finance, 35 (2025), pp. 745–759. [3]E. Bayraktar, A. Cohen, and A. Nellis,A neural network approach to high-dimensional optimal switching problems with jumps in energy markets, SIAM J. Financ. Math., 14 (2023), pp. 1028–1061. [4]C. Beck, S. Becker, P. Cheridi...

work page 2025

[3] [3]

Hutzenthaler, A

[22]M. Hutzenthaler, A. Jentzen, T. Kruse, and et al.,A proof that rectified deep neural net- works overcome the curse of dimensionality in the numerical approximation of semilinear heat equations, SN Partial Differ. Equ. Appl., 1 (2020). [23]B. Jia and H. Y. Wong,Deep impulse control: application to interest rate intervention, Quant. Finance, 24 (2024), ...

work page 2020

[4] [4]

Ly Vath and H

[28]V. Ly Vath and H. Pham,Explicit solution to an optimal switching problem in the two-regime case, SIAM J. Control Optim., 46 (2007), pp. 395–426. [29]R. MARTYR,Dynamic programming for discrete-time finite-horizon optimal switching prob- lems with negative switching costs, Adv. Appl. Probab., 48 (2016), pp. 832–847. [30]R. Martyr,Finite-horizon optimal ...

work page 2007

[5] [5]

[32]J. A. A. Opschoor, P. C. Petersen, and C. Schwab,Deep ReLU networks and high-order finite element methods, Anal. Appl., 18 (2020), pp. 715–770. [33]M. Raissi, P. Perdikaris, and G. Karniadakis,Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comp...

work page arXiv 2020