Duality and DeepMartingale for High-Dimensional Optimal Switching: Computable Upper Bounds and Approximation-Expressivity Guarantees
Pith reviewed 2026-05-10 17:26 UTC · model grok-4.3
The pith
Neural networks of size polynomial in dimension can approximate dual upper bounds for high-dimensional optimal switching to arbitrary accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under the stated structural assumptions, for any target accuracy ε>0 there exist neural networks of size at most c d^q ε^{-r} whose induced dual upper bound approximates the true value within ε, where c, q, and r are independent of d and ε; the dual representation is obtained by introducing martingale penalties whose minimal member is characterized by the Doob martingales of the continuation values, yielding a computable upper bound that can be trained via the extended DeepMartingale method.
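To make the scaling in the claim concrete, the following minimal sketch evaluates the bound c d^q ε^{-r} next to the node count of a uniform grid with mesh ε on the unit cube. The function names and the default values of c, q, and r are hypothetical placeholders chosen for illustration; the paper's actual constants are not given in this summary.

```python
import math


def size_bound(d, eps, c=1.0, q=2.0, r=2.0):
    """Evaluate c * d**q * eps**(-r).

    c, q, r are illustrative placeholders, NOT the paper's constants.
    """
    if d <= 0 or eps <= 0:
        raise ValueError("d and eps must be positive")
    return c * d**q * eps ** (-r)


def grid_nodes(d, eps):
    """Node count of a uniform grid with mesh eps on [0, 1]^d (exponential in d)."""
    if d <= 0 or eps <= 0:
        raise ValueError("d and eps must be positive")
    return math.ceil(1.0 / eps) ** d


if __name__ == "__main__":
    for d in (10, 100, 200):
        print(d, size_bound(d, 0.1), grid_nodes(d, 0.1))
```

Doubling d multiplies the polynomial bound by 2^q, whereas the grid count is raised to the second power, which is the contrast behind the phrase "avoids the curse of dimensionality".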
What carries the argument
The family of martingale penalties in the dual representation of multiple switching, minimized by the Doob martingales of continuation values, which supplies the computable upper bound approximated by neural networks.
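The mechanism can be illustrated in the simpler single-stopping case, which is the classical setting that DeepMartingale started from; the sketch below is not the paper's switching formula. It assumes a hypothetical put payoff on a small binomial tree with zero interest rate, builds the Doob martingale of the value process by backward induction, and evaluates the pathwise quantity max_n (g_n - M_n) on every path. All names and parameters are illustrative.

```python
import itertools


def price(n, k, s0=1.0, up=1.2, down=0.8):
    """Node value after k up-moves in n steps of a recombining binomial tree."""
    return s0 * up**k * down ** (n - k)


def payoff(x, strike=1.0):
    """Hypothetical put payoff g(x)."""
    return max(strike - x, 0.0)


def value_tree(n_steps, p=0.5):
    """Backward induction: U is the value process, C the continuation value."""
    U = [[0.0] * (n + 1) for n in range(n_steps + 1)]
    C = [[0.0] * (n + 1) for n in range(n_steps + 1)]
    for k in range(n_steps + 1):
        U[n_steps][k] = payoff(price(n_steps, k))
    for n in range(n_steps - 1, -1, -1):
        for k in range(n + 1):
            C[n][k] = p * U[n + 1][k + 1] + (1 - p) * U[n + 1][k]
            U[n][k] = max(payoff(price(n, k)), C[n][k])
    return U, C


def pathwise_dual(n_steps, use_doob=True):
    """Return (true value, list of max_n (g_n - M_n) over all paths)."""
    U, C = value_tree(n_steps)
    results = []
    for moves in itertools.product((0, 1), repeat=n_steps):
        k, m = 0, 0.0
        best = payoff(price(0, 0))  # g_0 - M_0 with M_0 = 0
        for n, move in enumerate(moves):
            if use_doob:
                # Doob martingale increment: U_{n+1} - E[U_{n+1} | F_n]
                m += U[n + 1][k + move] - C[n][k]
            k += move
            best = max(best, payoff(price(n + 1, k)) - m)
        results.append(best)
    return U[0][0], results


if __name__ == "__main__":
    v0, doob_vals = pathwise_dual(4)
    _, zero_vals = pathwise_dual(4, use_doob=False)
    print(v0, max(doob_vals), sum(zero_vals) / len(zero_vals))
```

With the Doob martingale the pathwise maximum equals the true value on every path, while the zero martingale gives a looser upper bound in expectation (paths are equally likely here because p = 0.5). Any approximate martingale, for example one produced by a trained network, would fall between these two cases.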
If this is right
- The dual upper bound obtained from any neural-network approximation to the martingale penalties is computable and converges to the true value under the training losses.
- Convergence holds both for the upper-bound loss and for the L2 surrogate loss.
- The learned dual martingale directly supplies a practical delta-hedging strategy.
- Pairing the dual upper bounds with deep policy-based lower bounds produces empirical gaps that shrink in high-dimensional Brownian and Brownian-Poisson models.
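The lower-bound side of the last point can be sketched on a toy problem: any feasible switching policy, evaluated exactly, yields a value no larger than the optimum. The two-mode model below (idle versus running, a fixed switching cost, a binomial state) and every name in it are assumptions made for illustration; the paper's deep policy networks and models are not reproduced here.

```python
def price(n, k, s0=1.0, up=1.2, down=0.8):
    """Node value after k up-moves in n steps of a recombining binomial tree."""
    return s0 * up**k * down ** (n - k)


def reward(mode, x):
    """Hypothetical running reward: mode 0 is idle, mode 1 earns the spread x - 1."""
    return x - 1.0 if mode == 1 else 0.0


def switching_value(n_steps, policy=None, cost=0.05, p=0.5):
    """Date-0 value starting in mode 0.

    policy=None maximizes over the next mode (dynamic programming);
    otherwise policy(n, mode, x) -> next mode is evaluated as given.
    """
    # V[mode][k] is the value at the following date; the terminal value is zero.
    V = [[0.0] * (n_steps + 1) for _ in (0, 1)]
    for n in range(n_steps - 1, -1, -1):
        new_V = [[0.0] * (n + 1) for _ in (0, 1)]
        for mode in (0, 1):
            for k in range(n + 1):
                x = price(n, k)

                def total(j):
                    cont = p * V[j][k + 1] + (1 - p) * V[j][k]
                    return reward(j, x) - cost * (j != mode) + cont

                if policy is None:
                    new_V[mode][k] = max(total(0), total(1))
                else:
                    new_V[mode][k] = total(policy(n, mode, x))
        V = new_V
    return V[0][0]


def myopic(n, mode, x):
    """Run whenever the current spread is positive, ignoring switching costs."""
    return 1 if x > 1.0 else 0


def stay_idle(n, mode, x):
    """Never leave mode 0."""
    return 0


if __name__ == "__main__":
    print(switching_value(6), switching_value(6, myopic), switching_value(6, stay_idle))
```

In the paper's setting the exact optimum is unavailable in high dimension, so the role of `switching_value(6)` is played by the dual upper bound, and the reported gap is the difference between that bound and the policy value.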
Where Pith is reading between the lines
- The same dual-martingale construction may extend to other classes of stochastic control with discrete interventions.
- In applications such as energy storage or portfolio switching, the polynomial-in-dimension size bound would make problems with state dimension in the hundreds tractable, provided the structural assumptions hold.
- Verifying the structural assumptions on concrete diffusion models would immediately certify that the reported network-size bound applies.
- Relaxing the structural assumptions while preserving polynomial expressivity would enlarge the class of solvable high-dimensional switching problems.
Load-bearing premise
The structural assumptions on the problem, which enable an expressivity bound that is polynomial in d (with constants independent of d) for the neural networks approximating the dual martingales.
What would settle it
A high-dimensional instance satisfying the basic problem setup but not the structural assumptions, in which every neural network whose size grows only polynomially in d leaves an ε-gap in the dual upper bound that cannot be driven to zero.
Original abstract
We study finite-horizon optimal switching with discrete intervention dates on a general filtration, allowing continuous-time observations between decision dates, and develop a deep-learning-based dual framework with computable upper bounds. We first derive a dual representation for multiple switching by introducing a family of martingale penalties. The minimal penalty is characterized by the Doob martingales of the continuation values, which yields a fully computable upper bound. We then extend DeepMartingale from optimal stopping to optimal switching and establish convergence under both the upper-bound loss and an $L^2$-surrogate loss. We also provide an expressivity analysis: under the stated structural assumptions, for any target accuracy $\varepsilon>0$, there exist neural networks of size at most $c d^{q}\varepsilon^{-r}$ whose induced dual upper bound approximates the true value within $\varepsilon$, where $c$, $q$, and $r$ are independent of $d$ and $\varepsilon$. Hence, the dual solver avoids the curse of dimensionality under the stated structural assumptions. For numerical assessment, we additionally implement a deep policy-based approach to produce feasible lower bounds and empirical upper--lower gaps. Numerical experiments on Brownian and Brownian--Poisson models demonstrate small upper--lower gaps and favorable performance in high dimensions. The learned dual martingale also yields a practical delta-hedging strategy.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper develops a dual representation for finite-horizon optimal switching on a general filtration by introducing martingale penalties whose minimal member is given by the Doob martingales of the continuation-value processes. It extends the DeepMartingale methodology to the switching setting, proves convergence of the resulting deep-learning approximations under both an upper-bound loss and an L²-surrogate loss, and supplies an expressivity result: under stated structural assumptions, neural networks of size at most c d^q ε^{-r} (with c, q, r independent of the dimension d and the accuracy ε) induce dual upper bounds within ε of the true value. Numerical experiments on Brownian and Brownian–Poisson models compare the learned dual upper bounds against policy-based lower bounds and illustrate small gaps together with a practical delta-hedging interpretation.
Significance. If the stated structural assumptions hold, the neural-network size bound, which is polynomial in the dimension with constants independent of it, constitutes a genuine advance for high-dimensional stochastic switching problems, where most existing methods suffer from the curse of dimensionality. The derivation from standard martingale theory (Doob decomposition) is non-circular, the convergence statements are conditioned on verifiable losses, and the provision of both theoretical guarantees and reproducible numerical lower-bound comparisons strengthens the contribution. The work therefore supplies a practical, theoretically supported route to computable upper bounds in settings such as energy management and high-dimensional option pricing.
minor comments (3)
- The abstract and introduction repeatedly refer to “the stated structural assumptions” that deliver the size bound c d^q ε^{-r}, which is polynomial in d. A concise, self-contained list or paragraph summarizing these assumptions (e.g., regularity of the payoff processes, properties of the filtration, or boundedness conditions) should appear early in the paper so that readers can immediately judge the scope of the expressivity claim.
- In the numerical section, the architecture and training details of the deep policy network used to generate lower bounds are described only at a high level. Adding a short table or paragraph listing layer widths, activation functions, and optimizer hyperparameters would improve reproducibility without lengthening the manuscript.
- Notation for the family of martingale penalties and the continuation-value processes is introduced in the duality section; a single consolidated table of symbols (including the dependence on the number of switching dates) would reduce cross-referencing for readers.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our work, the recognition of its significance for high-dimensional stochastic switching, and the recommendation for minor revision. We appreciate the referee's view that the polynomial-in-dimension neural-network size bound constitutes a genuine advance when the structural assumptions hold.
Circularity Check
No significant circularity in the derivation chain
full rationale
The dual representation for optimal switching is derived from standard martingale theory via a family of martingale penalties, with the minimal penalty characterized directly by the Doob martingales of the continuation values rather than by any fit to the target value. The extension of DeepMartingale, the convergence statements under the upper-bound and L2-surrogate losses, and the expressivity analysis (neural-network size bound c d^q ε^{-r}, with c, q, and r independent of d and ε) are all explicitly conditioned on the stated structural assumptions and follow from established results in the literature once those hypotheses are granted. No load-bearing step reduces by construction to its own inputs, no self-citation is required to justify the central claims, and the numerical lower-bound approach is presented separately as an empirical complement.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math: Doob martingale decomposition applies to continuation values of the switching problem
- domain assumption: Neural networks can approximate the required martingale processes under the stated structural assumptions
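For reference, the first axiom rests on the classical Doob decomposition. It is written here for a generic value process in the single-stopping case; the symbols U, M, A, and g are notation assumed for this sketch and are not taken from the paper, whose switching version involves one continuation value per mode.

```latex
% Doob decomposition of a supermartingale (U_n) on dates n = 0, ..., N
\begin{align*}
U_n &= U_0 + M_n - A_n, \\
M_n &= \sum_{k=1}^{n} \bigl( U_k - \mathbb{E}[\,U_k \mid \mathcal{F}_{k-1}\,] \bigr), \\
A_n &= \sum_{k=1}^{n} \bigl( U_{k-1} - \mathbb{E}[\,U_k \mid \mathcal{F}_{k-1}\,] \bigr) \;\ge\; 0 .
\end{align*}
% Consequence for a payoff g_n \le U_n and any martingale M with M_0 = 0:
\begin{equation*}
U_0 \;\le\; \mathbb{E}\Bigl[ \max_{0 \le n \le N} \bigl( g_n - M_n \bigr) \Bigr],
\end{equation*}
% with equality when U is the Snell envelope of g and M is its Doob martingale part.
```

Here M is a martingale started at zero and A is predictable and nondecreasing, so subtracting M from the payoff removes the randomness that an optimal decision rule cannot exploit.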