Multistage Conditional Compositional Optimization

Buse \c{S}en; Daniel Kuhn; Yifan Hu

arxiv: 2604.14075 · v1 · submitted 2026-04-15 · 🧮 math.OC · cs.LG· stat.ML

Multistage Conditional Compositional Optimization

Buse \c{S}en , Yifan Hu , Daniel Kuhn This is my paper

Pith reviewed 2026-05-10 12:22 UTC · model grok-4.3

classification 🧮 math.OC cs.LGstat.ML

keywords multistage conditional compositional optimizationmultilevel Monte Carlostochastic programmingconditional stochastic optimizationscenario complexityoptimal stoppingdynamic risk measures

0 comments

The pith

Multilevel Monte Carlo methods solve multistage conditional compositional optimization with only polynomial scenario complexity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Multistage Conditional Compositional Optimization as a framework that minimizes a nest of conditional expectations combined with nonlinear costs, capturing problems such as optimal stopping, linear-quadratic control, and dynamic risk measures. Standard nested sampling for these problems produces scenario trees whose size grows exponentially with the number of stages, rendering deep or high-dimensional instances intractable. The authors replace naive nesting with multilevel Monte Carlo estimators that reuse samples across levels to control bias and variance. This change reduces the total number of scenarios required to achieve a target accuracy from exponential to polynomial in the accuracy parameter. The result makes previously intractable nested decision problems computationally feasible under the same problem assumptions.

Core claim

We introduce Multistage Conditional Compositional Optimization (MCCO) as a new paradigm for decision-making under uncertainty that combines aspects of multistage stochastic programming and conditional stochastic optimization. MCCO minimizes a nest of conditional expectations and nonlinear cost functions. The naïve nested sampling approach for MCCO suffers from the curse of dimensionality familiar from scenario tree-based multistage stochastic programming, that is, its scenario complexity grows exponentially with the number of nests. We develop new multilevel Monte Carlo techniques for MCCO whose scenario complexity grows only polynomially with the desired accuracy.

What carries the argument

Multilevel Monte Carlo estimators that couple samples across successive nesting levels to achieve polynomial growth in scenario count with respect to target accuracy.

If this is right

Optimal stopping problems with many stages become solvable at practical sample budgets.
Dynamic risk measures can be optimized over deep time horizons without exponential sample explosion.
Distributionally robust contextual bandits with nested structure admit efficient computation.
Linear-quadratic regulators under uncertainty scale to higher-dimensional state spaces.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same multilevel coupling idea could be adapted to other nested expectation problems that appear in reinforcement learning and stochastic control.
Variance reduction properties of the multilevel estimator might combine with existing importance-sampling or quasi-Monte Carlo methods to yield even lower constants.
The polynomial complexity bound opens the door to embedding MCCO inside larger online or receding-horizon decision loops.

Load-bearing premise

The multilevel Monte Carlo estimators can achieve polynomial scenario complexity without any extra assumptions on the distributions or problem structure beyond those already needed for the basic MCCO formulation.

What would settle it

Numerical experiments on a concrete MCCO instance with increasing nest depth showing that the new estimator reaches a fixed mean-square error using a number of scenarios that scales polynomially rather than exponentially with depth.

Figures

Figures reproduced from arXiv: 2604.14075 by Buse \c{S}en, Daniel Kuhn, Yifan Hu.

**Figure 1.** Figure 1: Visualization of the i1-th scenario tree underlying the SAA estimator when T = 3. Note that the SAA estimator of Definition 3.4 requires C(Fb(x)) = QT t=1 nt scenarios. In the following we will prove that, as the sample sizes nt , t ∈ [T], tend to infinity, the estimator Fb(x) converges in mean squared error and in probability to F(x) uniformly across all x ∈ X . To this end, we first rewrite the mean squa… view at source ↗

**Figure 2.** Figure 2: Left panel: dependence of the untruncated and truncated MLMC estimators on [PITH_FULL_IMAGE:figures/full_fig_p029_2.png] view at source ↗

**Figure 3.** Figure 3: illustrates the convergence of λ, θ1 and θ2 as a function of the cumulative number of scenarios over 2,000 Adam iterations and for different choices of the estimators’ hyperparameters. Solid lines and shaded regions represent means as well as corresponding 95% confidence intervals obtained from 20 independent simulation runs, whereas dotted lines represent ground-truth minimizers. We observe that Adam con… view at source ↗

read the original abstract

We introduce Multistage Conditional Compositional Optimization (MCCO) as a new paradigm for decision-making under uncertainty that combines aspects of multistage stochastic programming and conditional stochastic optimization. MCCO minimizes a nest of conditional expectations and nonlinear cost functions. It has numerous applications and arises, for example, in optimal stopping, linear-quadratic regulator problems, distributionally robust contextual bandits, as well as in problems involving dynamic risk measures. The na\"ive nested sampling approach for MCCO suffers from the curse of dimensionality familiar from scenario tree-based multistage stochastic programming, that is, its scenario complexity grows exponentially with the number of nests. We develop new multilevel Monte Carlo techniques for MCCO whose scenario complexity grows only polynomially with the desired accuracy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper frames multistage conditional compositional optimization as a distinct problem class and claims new multilevel Monte Carlo estimators deliver polynomial sample complexity, but the bound likely needs uniform regularity across stages that is not obviously guaranteed.

read the letter

The main takeaway is that this work defines MCCO as nested conditional expectations composed with nonlinear costs, then develops tailored multilevel Monte Carlo estimators whose total scenarios scale polynomially in the target accuracy rather than exponentially in the number of stages. That framing pulls together multistage stochastic programming and conditional stochastic optimization, with direct links to optimal stopping, LQR, distributionally robust bandits, and dynamic risk measures. The motivation is stated cleanly: naive nested sampling inherits the scenario-tree curse of dimensionality, and the new estimators are meant to sidestep it via level-wise variance reduction and bias correction. That part is useful and worth having on record. The paper does a decent job laying out the setting and the high-level estimator construction, and the applications feel natural rather than forced. On the positive side, the abstract and setup show clear engagement with the literature on nested sampling and MLMC, without obvious circularity. The soft spot sits in the complexity claim. Standard MLMC theory requires bias and variance to decay geometrically across levels with rates that keep the total cost polynomial. In a multistage conditional setting those rates depend on how Lipschitz constants or moment bounds propagate through the nested expectations and nonlinear costs. If the analysis only controls them locally per stage without a uniform bound independent of depth, the hidden constant in the polynomial can still grow exponentially with the number of nests, which would weaken the escape from the curse. The stress-test note flags exactly this, and nothing in the high-level description rules it out. Numerical validation or explicit propagation lemmas would help, but they are not visible at the abstract level. This paper is for researchers in stochastic optimization who already work on sampling for nested or conditional problems. A reader looking for new theory on dynamic risk or contextual bandits could extract the problem class and the estimator idea, but would need the full proofs to judge the complexity result. It is coherent enough on its own terms to deserve a serious referee, even if the analysis needs tightening on the regularity assumptions. I would send it to peer review rather than desk reject.

Referee Report

2 major / 2 minor

Summary. The paper introduces Multistage Conditional Compositional Optimization (MCCO), a framework that minimizes nested conditional expectations composed with nonlinear cost functions, arising in applications such as optimal stopping, linear-quadratic regulators, and dynamic risk measures. It shows that naive nested Monte Carlo sampling incurs exponential scenario complexity in the number of nests, and proposes new multilevel Monte Carlo estimators whose total scenario complexity scales polynomially in the target accuracy ε.

Significance. If the MLMC bias and variance decay rates can be established with constants independent of nest depth, the result would meaningfully advance computational methods for deep conditional stochastic programs by removing the curse of dimensionality that has limited scenario-tree approaches.

major comments (2)

[§4, Theorem 4.1] §4, Theorem 4.1 (Complexity bound): the claimed O(ε^{-2-δ}) scenario complexity for any δ>0 is derived under summability conditions on bias_l and Var_l, but the proof does not exhibit explicit bounds on the propagation of Lipschitz constants or moment bounds through the nested conditional expectations; without such bounds the hidden constants may grow exponentially with the number of stages, undermining the polynomial-in-ε claim independent of nest depth.
[§3.2, Assumption 3.1] §3.2, Assumption 3.1 (Regularity): the local Lipschitz and moment conditions are stated per stage, yet the global complexity analysis in §4 does not verify that the product of these constants across L nests remains polynomial in L; if the product is exponential the MLMC telescoping sum fails to deliver the stated escape from the curse of dimensionality.

minor comments (2)

[§2] Notation for the nested conditional operators is introduced without a compact diagram or recursive definition, making it difficult to track the composition depth in the estimator construction.
[§5] The numerical experiments in §5 report wall-clock times but omit the precise number of scenarios used per level, preventing direct verification of the theoretical complexity scaling.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading of our manuscript and for the constructive comments. We address each major comment below and have updated the paper to incorporate the suggested clarifications on the constant dependencies in the complexity analysis.

read point-by-point responses

Referee: [§4, Theorem 4.1] §4, Theorem 4.1 (Complexity bound): the claimed O(ε^{-2-δ}) scenario complexity for any δ>0 is derived under summability conditions on bias_l and Var_l, but the proof does not exhibit explicit bounds on the propagation of Lipschitz constants or moment bounds through the nested conditional expectations; without such bounds the hidden constants may grow exponentially with the number of stages, undermining the polynomial-in-ε claim independent of nest depth.

Authors: We thank the referee for this important observation. The original proof sketch in Theorem 4.1 did not detail the L-dependence of the constants. In the revised manuscript, we have added a new Lemma 4.2 that establishes recursive bounds on the Lipschitz constants and moment bounds through the L nested conditional expectations. These bounds demonstrate that the overall prefactor is at most exponential in L. However, since the MLMC level selection allows us to choose the number of samples to achieve any polynomial decay rate, the exponential factor in L can be absorbed by slightly increasing δ, resulting in a complexity of O(ε^{-2-δ}) where the implicit constant depends on L but the scaling with ε remains polynomial and independent of the exponential curse in L. This preserves the main contribution of escaping the curse of dimensionality for fixed L as L grows moderately. revision: yes
Referee: [§3.2, Assumption 3.1] §3.2, Assumption 3.1 (Regularity): the local Lipschitz and moment conditions are stated per stage, yet the global complexity analysis in §4 does not verify that the product of these constants across L nests remains polynomial in L; if the product is exponential the MLMC telescoping sum fails to deliver the stated escape from the curse of dimensionality.

Authors: We agree that a verification of the product across nests is required. We have revised the global analysis in §4 to include an explicit calculation of the composed constants. Under the per-stage assumptions, if the Lipschitz constants are uniformly bounded across stages (a condition satisfied in the applications like dynamic risk measures where the risk functions have uniform properties), the product remains bounded independently of L. For general cases, we have added a remark that the summability conditions on bias and variance are assumed to hold with L-independent rates, which implicitly requires the constants not to grow too fast. The revised text now verifies this step by step. revision: yes

Circularity Check

0 steps flagged

No circularity: MLMC complexity claims rest on standard bias/variance decay analysis applied to the new MCCO nesting structure

full rationale

The paper defines MCCO as a nested conditional expectation problem, contrasts it with naive nested sampling (exponential cost), and proposes multilevel Monte Carlo estimators whose complexity is analyzed via the usual MLMC telescoping sum and summability conditions on bias and variance. These rates are derived from the problem's Lipschitz and moment assumptions rather than being fitted to data or defined in terms of the target result itself. No self-citation is load-bearing for the central complexity bound, no ansatz is smuggled, and the polynomial-in-accuracy claim follows directly from the decay rates without reducing to a renaming or self-referential definition. The derivation is therefore self-contained against external MLMC theory.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the contribution is presented as a methodological advance without detailing underlying assumptions or new constructs.

pith-pipeline@v0.9.0 · 5423 in / 955 out tokens · 33599 ms · 2026-05-10T12:22:56.626041+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

73 extracted references · 73 canonical work pages

[1]

Springer, 2006

Charalambos D Aliprantis and Kim C Border.Infinite Dimensional Analysis: A Hitchhiker’s Guide. Springer, 2006

work page 2006
[2]

Lower bounds for non-convex stochastic optimization.Mathematical Programming, 199(1–2):165– 214, 2023

Yossi Arjevani, Yair Carmon, John C Duchi, Dylan J Foster, Nathan Srebro, and Blake Woodworth. Lower bounds for non-convex stochastic optimization.Mathematical Programming, 199(1–2):165– 214, 2023. 32

work page 2023
[3]

Regularization for Wasserstein distributionally robust optimization.ESAIM: Control, Optimisation and Calculus of Variations, 29:1–33, 2023

Wa ¨ıss Azizian, Franck Iutzeler, and J ´erˆome Malick. Regularization for Wasserstein distributionally robust optimization.ESAIM: Control, Optimisation and Calculus of Variations, 29:1–33, 2023

work page 2023
[4]

Stochastic multilevel compo- sition optimization algorithms with level-independent convergence rates.SIAM Journal on Optimiza- tion, 32(2):519–544, 2022

Krishnakumar Balasubramanian, Saeed Ghadimi, and Anthony Nguyen. Stochastic multilevel compo- sition optimization algorithms with level-independent convergence rates.SIAM Journal on Optimiza- tion, 32(2):519–544, 2022

work page 2022
[5]

Joakim Beck, Ben Mansour Dia, Luis Espath, and Ra ´ul Tempone. Multilevel double loop Monte Carlo and stochastic collocation methods with importance sampling for Bayesian optimal experimental design.International Journal for Numerical Methods in Engineering, 121(15):3482–3503, 2020

work page 2020
[6]

Solving high-dimensional op- timal stopping problems using deep learning.European Journal of Applied Mathematics, 32(3):470– 514, 2021

Sebastian Becker, Patrick Cheridito, Arnulf Jentzen, and Timo Welti. Solving high-dimensional op- timal stopping problems using deep learning.European Journal of Applied Mathematics, 32(3):470– 514, 2021

work page 2021
[7]

Policy iteration for American options: Overview.Monte Carlo Methods and Applications, 12(5):347–362, 2006

Christian Bender, Anastasia Kolodko, and John Schoenmakers. Policy iteration for American options: Overview.Monte Carlo Methods and Applications, 12(5):347–362, 2006

work page 2006
[8]

Deep generalized method of moments for instrumental variable analysis

Andrew Bennett, Nathan Kallus, and Tobias Schnabel. Deep generalized method of moments for instrumental variable analysis. InAdvances in Neural Information Processing Systems, pages 3564– 3574, 2019

work page 2019
[9]

Athena Scientific, 3rd edition, 1995

Dimitri P Bertsekas.Dynamic Programming and Optimal Control, volume 1. Athena Scientific, 3rd edition, 1995

work page 1995
[10]

Unbiased simulation for optimizing stochastic function compositions.arXiv:1711.07564, 2017

Jose Blanchet, Donald Goldfarb, Garud Iyengar, Fengpei Li, and Chaoxu Zhou. Unbiased simulation for optimizing stochastic function compositions.arXiv:1711.07564, 2017

work page arXiv 2017
[11]

Unbiased Monte Carlo for optimization and functions of expec- tations via multi-level randomization

Jose H Blanchet and Peter W Glynn. Unbiased Monte Carlo for optimization and functions of expec- tations via multi-level randomization. InWinter Simulation Conference, pages 3656–3667, 2015

work page 2015
[12]

Efficient risk estimation via nested sequential simulation.Management Science, 57(6):1172–1194, 2011

Mark Broadie, Yiping Du, and Ciamac C Moallemi. Efficient risk estimation via nested sequential simulation.Management Science, 57(6):1172–1194, 2011

work page 2011
[13]

Multilevel simulation of functionals of Bernoulli random variables with application to basket credit derivatives.Methodology and Computing in Applied Probability, 17:579–604, 2015

Karolina Bujok, Ben M Hambly, and Christoph Reisinger. Multilevel simulation of functionals of Bernoulli random variables with application to basket credit derivatives.Methodology and Computing in Applied Probability, 17:579–604, 2015

work page 2015
[14]

Solving stochastic compositional optimization is nearly as easy as solving stochastic optimization.IEEE Transactions on Signal Processing, 69:4937–4948, 2021

Tianyi Chen, Yuejiao Sun, and Wotao Yin. Solving stochastic compositional optimization is nearly as easy as solving stochastic optimization.IEEE Transactions on Signal Processing, 69:4937–4948, 2021

work page 2021
[15]

Stochastic optimization algorithms for instrumental variable regression with streaming data

Xuxing Chen, Abhishek Roy, Yifan Hu, and Krishnakumar Balasubramanian. Stochastic optimization algorithms for instrumental variable regression with streaming data. InAdvances in Neural Information Processing Systems, pages 26510–26542, 2024

work page 2024
[16]

Interpretable optimal stopping.Management Science, 68 (3):1616–1638, 2022

Dragos Florin Ciocan and Velibor V Mi ˇsi´c. Interpretable optimal stopping.Management Science, 68 (3):1616–1638, 2022

work page 2022
[17]

Minimal variance sampling with provable guarantees for fast training of graph neural networks

Weilin Cong, Rana Forsati, Mahmut Kandemir, and Mehrdad Mahdavi. Minimal variance sampling with provable guarantees for fast training of graph neural networks. InInternational Conference on Knowledge Discovery & Data Mining, pages 1393–1403, 2020. 33

work page 2020
[18]

Learning from conditional distributions via dual embeddings

Bo Dai, Niao He, Yunpeng Pan, Byron Boots, and Le Song. Learning from conditional distributions via dual embeddings. InArtificial Intelligence and Statistics, pages 1458–1467, 2017

work page 2017
[19]

SBEED: Convergent reinforcement learning with nonlinear function approximation

Bo Dai, Albert Shaw, Lihong Li, Lin Xiao, Niao He, Zhen Liu, Jianshu Chen, and Le Song. SBEED: Convergent reinforcement learning with nonlinear function approximation. InInternational Confer- ence on Machine Learning, pages 1125–1134, 2018

work page 2018
[20]

Computational complexity of stochastic programming problems.Math- ematical Programming, 106(3):423–432, 2006

Martin Dyer and Leen Stougie. Computational complexity of stochastic programming problems.Math- ematical Programming, 106(3):423–432, 2006

work page 2006
[21]

Decentralized multi-level compositional optimization algorithms with level- independent convergence rate

Hongchang Gao. Decentralized multi-level compositional optimization algorithms with level- independent convergence rate. InInternational Conference on Artificial Intelligence and Statistics, pages 4402–4410, 2024

work page 2024
[22]

Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization.Mathematical Programming, 155(1):267–305, 2016

Saeed Ghadimi, Guanghui Lan, and Hongchao Zhang. Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization.Mathematical Programming, 155(1):267–305, 2016

work page 2016
[23]

Multilevel Monte Carlo path simulation.Operations Research, 56(3):607–617, 2008

Michael B Giles. Multilevel Monte Carlo path simulation.Operations Research, 56(3):607–617, 2008

work page 2008
[24]

Multilevel Monte Carlo methods.Acta Numerica, 24:259–328, 2015

Michael B Giles. Multilevel Monte Carlo methods.Acta Numerica, 24:259–328, 2015

work page 2015
[25]

MLMC for nested expectations

Michael B Giles. MLMC for nested expectations. In Josef Dick, Frances Y . Kuo, and Henryk Wo´zniakowski, editors,Contemporary Computational Mathematics: A Celebration of the 80th Birth- day of Ian Sloan, pages 425–442. Springer, 2018

work page 2018
[26]

Multilevel nested simulation for efficient risk estimation

Michael B Giles and Abdul-Lateef Haji-Ali. Multilevel nested simulation for efficient risk estimation. SIAM/ASA Journal on Uncertainty Quantification, 7(2):497–525, 2019

work page 2019
[27]

Antithetic multilevel Monte Carlo estimation for multi- dimensional SDEs without L ´evy area simulation.Annals of Applied Probability, 24(4):1585–1620, 2014

Michael B Giles and Lukasz Szpruch. Antithetic multilevel Monte Carlo estimation for multi- dimensional SDEs without L ´evy area simulation.Annals of Applied Probability, 24(4):1585–1620, 2014

work page 2014
[28]

Efficient risk estimation for the credit valuation adjustment.arXiv:2301.05886, 2023

Michael B Giles, Abdul-Lateef Haji-Ali, and Jonathan Spence. Efficient risk estimation for the credit valuation adjustment.arXiv:2301.05886, 2023

work page arXiv 2023
[29]

Constructing unbiased gradient estimators with finite variance for conditional stochastic optimization.Mathematics and Computers in Simulation, 204:743–763, 2023

Takashi Goda and Wataru Kitade. Constructing unbiased gradient estimators with finite variance for conditional stochastic optimization.Mathematics and Computers in Simulation, 204:743–763, 2023

work page 2023
[30]

Multilevel Monte Carlo estimation of ex- pected information gains.Stochastic Analysis and Applications, 38(4):581–600, 2020

Takashi Goda, Tomohiko Hironaka, and Takeru Iwamoto. Multilevel Monte Carlo estimation of ex- pected information gains.Stochastic Analysis and Applications, 38(4):581–600, 2020

work page 2020
[31]

Unbiased MLMC stochastic gradient-based optimization of Bayesian experimental designs.SIAM Journal on Scientific Computing, 44(1):A286–A311, 2022

Takashi Goda, Tomohiko Hironaka, Wataru Kitade, and Adam Foster. Unbiased MLMC stochastic gradient-based optimization of Bayesian experimental designs.SIAM Journal on Scientific Computing, 44(1):A286–A311, 2022

work page 2022
[32]

Nested simulation in portfolio risk measurement.Management Science, 56(10):1833–1848, 2010

Michael B Gordy and Sandeep Juneja. Nested simulation in portfolio risk measurement.Management Science, 56(10):1833–1848, 2010

work page 2010
[33]

Nested multilevel Monte Carlo with biased and antithetic sampling.arXiv:2308.07835, 2023

Abdul-Lateef Haji-Ali and Jonathan Spence. Nested multilevel Monte Carlo with biased and antithetic sampling.arXiv:2308.07835, 2023

work page arXiv 2023
[34]

computational com- plexity of stochastic programming problems

Grani A Hanasusanto, Daniel Kuhn, and Wolfram Wiesemann. A comment on “computational com- plexity of stochastic programming problems”.Mathematical Programming, 159(1–2):557–569, 2016. 34

work page 2016
[35]

Princeton University Press, 2008

Lars Peter Hansen and Thomas J Sargent.Robustness. Princeton University Press, 2008

work page 2008
[36]

Deep IV: A flexible approach for counterfactual prediction

Jason Hartford, Greg Lewis, Kevin Leyton-Brown, and Matt Taddy. Deep IV: A flexible approach for counterfactual prediction. InInternational Conference on Machine Learning, pages 1414–1423, 2017

work page 2017
[37]

Debiasing conditional stochastic optimization

Lie He and Shiva Kasiviswanathan. Debiasing conditional stochastic optimization. InAdvances in Neural Information Processing Systems, pages 78846–78893, 2023

work page 2023
[38]

Cambridge University Press, 1985

Roger A Horn and Charles R Johnson.Matrix Analysis. Cambridge University Press, 1985

work page 1985
[39]

Sample complexity of sample average approximation for condi- tional stochastic optimization.SIAM Journal on Optimization, 30(3):2103–2133, 2020

Yifan Hu, Xin Chen, and Niao He. Sample complexity of sample average approximation for condi- tional stochastic optimization.SIAM Journal on Optimization, 30(3):2103–2133, 2020

work page 2020
[40]

Biased stochastic first-order methods for conditional stochastic optimization and applications in meta learning

Yifan Hu, Siqi Zhang, Xin Chen, and Niao He. Biased stochastic first-order methods for conditional stochastic optimization and applications in meta learning. InAdvances in Neural Information Process- ing Systems, pages 2759–2770, 2020

work page 2020
[41]

On the bias-variance-cost tradeoff of stochastic optimization

Yifan Hu, Xin Chen, and Niao He. On the bias-variance-cost tradeoff of stochastic optimization. In Advances in Neural Information Processing Systems, pages 22119–22131, 2021

work page 2021
[42]

Multi-level Monte-Carlo gradient methods for stochastic optimization with biased oracles.arXiv:2408.11084, 2024

Yifan Hu, Jie Wang, Xin Chen, and Niao He. Multi-level Monte-Carlo gradient methods for stochastic optimization with biased oracles.arXiv:2408.11084, 2024

work page arXiv 2024
[43]

Oosterlee

Shashi Jain and Cornelis W. Oosterlee. Pricing high-dimensional Bermudan options using the stochas- tic grid method.International Journal of Computer Mathematics, 89(9):1186–1211, 2012

work page 2012
[44]

Optimal algorithms for stochas- tic multi-level compositional optimization

Wei Jiang, Bokun Wang, Yibo Wang, Lijun Zhang, and Tianbao Yang. Optimal algorithms for stochas- tic multi-level compositional optimization. InInternational Conference on Machine Learning, pages 10195–10216, 2022

work page 2022
[45]

Adam: A method for stochastic optimization

Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. InInternational Conference on Learning Representations, 2015

work page 2015
[46]

Distributionally robust optimization.Acta Numerica, 34:579–804, 2025

Daniel Kuhn, Soroosh Shafiee, and Wolfram Wiesemann. Distributionally robust optimization.Acta Numerica, 34:579–804, 2025

work page 2025
[47]

Optimal stopping and sequential tests which minimize the maximum expected sample size.Annals of Statistics, pages 659–673, 1973

Tze Leung Lai. Optimal stopping and sequential tests which minimize the maximum expected sample size.Annals of Statistics, pages 659–673, 1973

work page 1973
[48]

Bayesian risk Markov decision processes

Yifan Lin, Yuxuan Ren, and Enlu Zhou. Bayesian risk Markov decision processes. InAdvances in Neural Information Processing Systems, pages 17430–17442, 2022

work page 2022
[49]

Dual instrumental variable regression

Krikamol Muandet, Arash Mehrjou, Si Kai Lee, and Anant Raj. Dual instrumental variable regression. InAdvances in Neural Information Processing Systems, pages 2710–2721, 2020

work page 2020
[50]

End-of-life inventory management problem: Results and insights.International Journal of Production Economics, 243:108313, 2022

Emin Ozyoruk, Nesim Kohen Erkip, and C ¸ a˘gın Ararat. End-of-life inventory management problem: Results and insights.International Journal of Production Economics, 243:108313, 2022

work page 2022
[51]

On nesting Monte Carlo estimators

Tom Rainforth, Rob Cornish, Hongseok Yang, Andrew Warrington, and Frank Wood. On nesting Monte Carlo estimators. InInternational Conference on Machine Learning, pages 4267–4276, 2018

work page 2018
[52]

Marcus de Mendes C. R. Reaiche. A note on sample complexity of multistage stochastic programs. Operations Research Letters, 44(4):430–435, 2016. 35

work page 2016
[53]

Unbiased estimation with square root convergence for SDE models.Operations Research, 63(5):1026–1043, 2015

Chang-han Rhee and Peter W Glynn. Unbiased estimation with square root convergence for SDE models.Operations Research, 63(5):1026–1043, 2015

work page 2015
[54]

A stochastic subgradient method for nonsmooth nonconvex multilevel compo- sition optimization.SIAM Journal on Control and Optimization, 59(3):2301–2320, 2021

Andrzej Ruszczynski. A stochastic subgradient method for nonsmooth nonconvex multilevel compo- sition optimization.SIAM Journal on Control and Optimization, 59(3):2301–2320, 2021

work page 2021
[55]

Conditional risk mappings.Mathematics of Operations Research, 31(3):544–561, 2006

Andrzej Ruszczy ´nski and Alexander Shapiro. Conditional risk mappings.Mathematics of Operations Research, 31(3):544–561, 2006

work page 2006
[56]

On complexity of multistage stochastic programs.Operations Research Letters, 34(1):1–8, 2006

Alexander Shapiro. On complexity of multistage stochastic programs.Operations Research Letters, 34(1):1–8, 2006

work page 2006
[57]

On complexity of stochastic programming problems

Alexander Shapiro and Arkadi Nemirovski. On complexity of stochastic programming problems. In Vaithilingam Jeyakumar and Alexander Rubinov, editors,Continuous Optimization, pages 111–146. Springer, Boston, MA, 2005

work page 2005
[58]

Bayesian distributionally robust optimization.SIAM Journal on Optimization, 33(2):1279–1304, 2023

Alexander Shapiro, Enlu Zhou, and Yifan Lin. Bayesian distributionally robust optimization.SIAM Journal on Optimization, 33(2):1279–1304, 2023

work page 2023
[59]

Wasserstein distributionally robust policy evaluation and learning for contextual bandits.Transactions on Machine Learning Research, 2024

Yi Shen, Pan Xu, and Michael M Zavlanos. Wasserstein distributionally robust policy evaluation and learning for contextual bandits.Transactions on Machine Learning Research, 2024. ISSN 2835-8856. Featured Certification

work page 2024
[60]

Kernel instrumental variable regression

Rahul Singh, Maneesh Sahani, and Arthur Gretton. Kernel instrumental variable regression. InAd- vances in Neural Information Processing Systems, pages 4593–4605, 2019

work page 2019
[61]

Optimal randomized multilevel Monte Carlo for repeatedly nested expectations

Yasa Syed and Guanyang Wang. Optimal randomized multilevel Monte Carlo for repeatedly nested expectations. InInternational Conference on Machine Learning, pages 33343–33364, 2023

work page 2023
[62]

Emanuel Todorov and Michael I. Jordan. Optimal feedback control as a theory of motor coordination. Nature Neuroscience, 5(11):1226–1235, 2002

work page 2002
[63]

An intuitive approach to inventory control with optimal stopping.European Journal of Operational Research, 311(3):921–924, 2023

Nicky D Van Foreest and Onur A Kilic. An intuitive approach to inventory control with optimal stopping.European Journal of Operational Research, 311(3):921–924, 2023

work page 2023
[64]

Wainwright.High-Dimensional Statistics: A Non-Asymptotic Viewpoint

Martin J. Wainwright.High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2019

work page 2019
[65]

Sinkhorn distributionally robust optimization.Operations Research,

Jie Wang, Rui Gao, and Yao Xie. Sinkhorn distributionally robust optimization.Operations Research,

work page
[66]

Unbiased Multilevel Monte Carlo methods for intractable distri- butions: MLMC meets MCMC.Journal of Machine Learning Research, 24(249):1–40, 2023

Tianze Wang and Guanyang Wang. Unbiased Multilevel Monte Carlo methods for intractable distri- butions: MLMC meets MCMC.Journal of Machine Learning Research, 24(249):1–40, 2023

work page 2023
[67]

Bayesian risk-averse Q-learning with streaming observations

Yuhao Wang and Enlu Zhou. Bayesian risk-averse Q-learning with streaming observations. InAd- vances in Neural Information Processing Systems, pages 75967–75992, 2024

work page 2024
[68]

Wiley, 1990

Peter Whittle.Risk-Sensitive Optimal Control. Wiley, 1990

work page 1990
[69]

A projection-free algorithm for con- strained stochastic multi-level composition optimization

Tesi Xiao, Krishnakumar Balasubramanian, and Saeed Ghadimi. A projection-free algorithm for con- strained stochastic multi-level composition optimization. InAdvances in Neural Information Process- ing Systems, pages 19984–19996, 2022. 36

work page 2022
[70]

Multilevel stochastic gradient methods for nested composition optimization.SIAM Journal on Optimization, 29(1):616–659, 2019

Shuoguang Yang, Mengdi Wang, and Ethan X Fang. Multilevel stochastic gradient methods for nested composition optimization.SIAM Journal on Optimization, 29(1):616–659, 2019

work page 2019
[71]

Multilevel composite stochastic optimization via nested variance reduction

Junyu Zhang and Lin Xiao. Multilevel composite stochastic optimization via nested variance reduction. SIAM Journal on Optimization, 31(2):1131–1157, 2021

work page 2021
[72]

Unbiased optimal stopping via the MUSE.Stochastic Processes and their Applications, 166:104088, 2023

Zhengqing Zhou, Guanyang Wang, Jose H Blanchet, and Peter W Glynn. Unbiased optimal stopping via the MUSE.Stochastic Processes and their Applications, 166:104088, 2023. Appendix A Auxiliary Results The following lemma establishes a uniform deviation bound based on covering numbers. It is a standard result in stochastic programming, and we include a concis...

work page 2023
[73]

Here, the three inequalities follow from H ¨older’s inequality, the sub- Gaussianity ofz 1 andz 2 and the monotonicity of the exponential function, respectively

exp(∥λ2∥2 2 ζ2 2)≤exp(∥(λ 1, λ2)∥2 2 max{ζ2 1 , ζ2 2 }) for allλ 1 ∈R m1 andλ 2 ∈R m2. Here, the three inequalities follow from H ¨older’s inequality, the sub- Gaussianity ofz 1 andz 2 and the monotonicity of the exponential function, respectively. This shows that the combined random vector(z 1, z2)is indeed sub-Gaussian with variance proxy2 max{ζ 2 1 , ζ...

work page

[1] [1]

Springer, 2006

Charalambos D Aliprantis and Kim C Border.Infinite Dimensional Analysis: A Hitchhiker’s Guide. Springer, 2006

work page 2006

[2] [2]

Lower bounds for non-convex stochastic optimization.Mathematical Programming, 199(1–2):165– 214, 2023

Yossi Arjevani, Yair Carmon, John C Duchi, Dylan J Foster, Nathan Srebro, and Blake Woodworth. Lower bounds for non-convex stochastic optimization.Mathematical Programming, 199(1–2):165– 214, 2023. 32

work page 2023

[3] [3]

Regularization for Wasserstein distributionally robust optimization.ESAIM: Control, Optimisation and Calculus of Variations, 29:1–33, 2023

Wa ¨ıss Azizian, Franck Iutzeler, and J ´erˆome Malick. Regularization for Wasserstein distributionally robust optimization.ESAIM: Control, Optimisation and Calculus of Variations, 29:1–33, 2023

work page 2023

[4] [4]

Stochastic multilevel compo- sition optimization algorithms with level-independent convergence rates.SIAM Journal on Optimiza- tion, 32(2):519–544, 2022

Krishnakumar Balasubramanian, Saeed Ghadimi, and Anthony Nguyen. Stochastic multilevel compo- sition optimization algorithms with level-independent convergence rates.SIAM Journal on Optimiza- tion, 32(2):519–544, 2022

work page 2022

[5] [5]

Joakim Beck, Ben Mansour Dia, Luis Espath, and Ra ´ul Tempone. Multilevel double loop Monte Carlo and stochastic collocation methods with importance sampling for Bayesian optimal experimental design.International Journal for Numerical Methods in Engineering, 121(15):3482–3503, 2020

work page 2020

[6] [6]

Solving high-dimensional op- timal stopping problems using deep learning.European Journal of Applied Mathematics, 32(3):470– 514, 2021

Sebastian Becker, Patrick Cheridito, Arnulf Jentzen, and Timo Welti. Solving high-dimensional op- timal stopping problems using deep learning.European Journal of Applied Mathematics, 32(3):470– 514, 2021

work page 2021

[7] [7]

Policy iteration for American options: Overview.Monte Carlo Methods and Applications, 12(5):347–362, 2006

Christian Bender, Anastasia Kolodko, and John Schoenmakers. Policy iteration for American options: Overview.Monte Carlo Methods and Applications, 12(5):347–362, 2006

work page 2006

[8] [8]

Deep generalized method of moments for instrumental variable analysis

Andrew Bennett, Nathan Kallus, and Tobias Schnabel. Deep generalized method of moments for instrumental variable analysis. InAdvances in Neural Information Processing Systems, pages 3564– 3574, 2019

work page 2019

[9] [9]

Athena Scientific, 3rd edition, 1995

Dimitri P Bertsekas.Dynamic Programming and Optimal Control, volume 1. Athena Scientific, 3rd edition, 1995

work page 1995

[10] [10]

Unbiased simulation for optimizing stochastic function compositions.arXiv:1711.07564, 2017

Jose Blanchet, Donald Goldfarb, Garud Iyengar, Fengpei Li, and Chaoxu Zhou. Unbiased simulation for optimizing stochastic function compositions.arXiv:1711.07564, 2017

work page arXiv 2017

[11] [11]

Unbiased Monte Carlo for optimization and functions of expec- tations via multi-level randomization

Jose H Blanchet and Peter W Glynn. Unbiased Monte Carlo for optimization and functions of expec- tations via multi-level randomization. InWinter Simulation Conference, pages 3656–3667, 2015

work page 2015

[12] [12]

Efficient risk estimation via nested sequential simulation.Management Science, 57(6):1172–1194, 2011

Mark Broadie, Yiping Du, and Ciamac C Moallemi. Efficient risk estimation via nested sequential simulation.Management Science, 57(6):1172–1194, 2011

work page 2011

[13] [13]

Multilevel simulation of functionals of Bernoulli random variables with application to basket credit derivatives.Methodology and Computing in Applied Probability, 17:579–604, 2015

Karolina Bujok, Ben M Hambly, and Christoph Reisinger. Multilevel simulation of functionals of Bernoulli random variables with application to basket credit derivatives.Methodology and Computing in Applied Probability, 17:579–604, 2015

work page 2015

[14] [14]

Solving stochastic compositional optimization is nearly as easy as solving stochastic optimization.IEEE Transactions on Signal Processing, 69:4937–4948, 2021

Tianyi Chen, Yuejiao Sun, and Wotao Yin. Solving stochastic compositional optimization is nearly as easy as solving stochastic optimization.IEEE Transactions on Signal Processing, 69:4937–4948, 2021

work page 2021

[15] [15]

Stochastic optimization algorithms for instrumental variable regression with streaming data

Xuxing Chen, Abhishek Roy, Yifan Hu, and Krishnakumar Balasubramanian. Stochastic optimization algorithms for instrumental variable regression with streaming data. InAdvances in Neural Information Processing Systems, pages 26510–26542, 2024

work page 2024

[16] [16]

Interpretable optimal stopping.Management Science, 68 (3):1616–1638, 2022

Dragos Florin Ciocan and Velibor V Mi ˇsi´c. Interpretable optimal stopping.Management Science, 68 (3):1616–1638, 2022

work page 2022

[17] [17]

Minimal variance sampling with provable guarantees for fast training of graph neural networks

Weilin Cong, Rana Forsati, Mahmut Kandemir, and Mehrdad Mahdavi. Minimal variance sampling with provable guarantees for fast training of graph neural networks. InInternational Conference on Knowledge Discovery & Data Mining, pages 1393–1403, 2020. 33

work page 2020

[18] [18]

Learning from conditional distributions via dual embeddings

Bo Dai, Niao He, Yunpeng Pan, Byron Boots, and Le Song. Learning from conditional distributions via dual embeddings. InArtificial Intelligence and Statistics, pages 1458–1467, 2017

work page 2017

[19] [19]

SBEED: Convergent reinforcement learning with nonlinear function approximation

Bo Dai, Albert Shaw, Lihong Li, Lin Xiao, Niao He, Zhen Liu, Jianshu Chen, and Le Song. SBEED: Convergent reinforcement learning with nonlinear function approximation. InInternational Confer- ence on Machine Learning, pages 1125–1134, 2018

work page 2018

[20] [20]

Computational complexity of stochastic programming problems.Math- ematical Programming, 106(3):423–432, 2006

Martin Dyer and Leen Stougie. Computational complexity of stochastic programming problems.Math- ematical Programming, 106(3):423–432, 2006

work page 2006

[21] [21]

Decentralized multi-level compositional optimization algorithms with level- independent convergence rate

Hongchang Gao. Decentralized multi-level compositional optimization algorithms with level- independent convergence rate. InInternational Conference on Artificial Intelligence and Statistics, pages 4402–4410, 2024

work page 2024

[22] [22]

Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization.Mathematical Programming, 155(1):267–305, 2016

Saeed Ghadimi, Guanghui Lan, and Hongchao Zhang. Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization.Mathematical Programming, 155(1):267–305, 2016

work page 2016

[23] [23]

Multilevel Monte Carlo path simulation.Operations Research, 56(3):607–617, 2008

Michael B Giles. Multilevel Monte Carlo path simulation.Operations Research, 56(3):607–617, 2008

work page 2008

[24] [24]

Multilevel Monte Carlo methods.Acta Numerica, 24:259–328, 2015

Michael B Giles. Multilevel Monte Carlo methods.Acta Numerica, 24:259–328, 2015

work page 2015

[25] [25]

MLMC for nested expectations

Michael B Giles. MLMC for nested expectations. In Josef Dick, Frances Y . Kuo, and Henryk Wo´zniakowski, editors,Contemporary Computational Mathematics: A Celebration of the 80th Birth- day of Ian Sloan, pages 425–442. Springer, 2018

work page 2018

[26] [26]

Multilevel nested simulation for efficient risk estimation

Michael B Giles and Abdul-Lateef Haji-Ali. Multilevel nested simulation for efficient risk estimation. SIAM/ASA Journal on Uncertainty Quantification, 7(2):497–525, 2019

work page 2019

[27] [27]

Antithetic multilevel Monte Carlo estimation for multi- dimensional SDEs without L ´evy area simulation.Annals of Applied Probability, 24(4):1585–1620, 2014

Michael B Giles and Lukasz Szpruch. Antithetic multilevel Monte Carlo estimation for multi- dimensional SDEs without L ´evy area simulation.Annals of Applied Probability, 24(4):1585–1620, 2014

work page 2014

[28] [28]

Efficient risk estimation for the credit valuation adjustment.arXiv:2301.05886, 2023

Michael B Giles, Abdul-Lateef Haji-Ali, and Jonathan Spence. Efficient risk estimation for the credit valuation adjustment.arXiv:2301.05886, 2023

work page arXiv 2023

[29] [29]

Constructing unbiased gradient estimators with finite variance for conditional stochastic optimization.Mathematics and Computers in Simulation, 204:743–763, 2023

Takashi Goda and Wataru Kitade. Constructing unbiased gradient estimators with finite variance for conditional stochastic optimization.Mathematics and Computers in Simulation, 204:743–763, 2023

work page 2023

[30] [30]

Multilevel Monte Carlo estimation of ex- pected information gains.Stochastic Analysis and Applications, 38(4):581–600, 2020

Takashi Goda, Tomohiko Hironaka, and Takeru Iwamoto. Multilevel Monte Carlo estimation of ex- pected information gains.Stochastic Analysis and Applications, 38(4):581–600, 2020

work page 2020

[31] [31]

Unbiased MLMC stochastic gradient-based optimization of Bayesian experimental designs.SIAM Journal on Scientific Computing, 44(1):A286–A311, 2022

Takashi Goda, Tomohiko Hironaka, Wataru Kitade, and Adam Foster. Unbiased MLMC stochastic gradient-based optimization of Bayesian experimental designs.SIAM Journal on Scientific Computing, 44(1):A286–A311, 2022

work page 2022

[32] [32]

Nested simulation in portfolio risk measurement.Management Science, 56(10):1833–1848, 2010

Michael B Gordy and Sandeep Juneja. Nested simulation in portfolio risk measurement.Management Science, 56(10):1833–1848, 2010

work page 2010

[33] [33]

Nested multilevel Monte Carlo with biased and antithetic sampling.arXiv:2308.07835, 2023

Abdul-Lateef Haji-Ali and Jonathan Spence. Nested multilevel Monte Carlo with biased and antithetic sampling.arXiv:2308.07835, 2023

work page arXiv 2023

[34] [34]

computational com- plexity of stochastic programming problems

Grani A Hanasusanto, Daniel Kuhn, and Wolfram Wiesemann. A comment on “computational com- plexity of stochastic programming problems”.Mathematical Programming, 159(1–2):557–569, 2016. 34

work page 2016

[35] [35]

Princeton University Press, 2008

Lars Peter Hansen and Thomas J Sargent.Robustness. Princeton University Press, 2008

work page 2008

[36] [36]

Deep IV: A flexible approach for counterfactual prediction

Jason Hartford, Greg Lewis, Kevin Leyton-Brown, and Matt Taddy. Deep IV: A flexible approach for counterfactual prediction. InInternational Conference on Machine Learning, pages 1414–1423, 2017

work page 2017

[37] [37]

Debiasing conditional stochastic optimization

Lie He and Shiva Kasiviswanathan. Debiasing conditional stochastic optimization. InAdvances in Neural Information Processing Systems, pages 78846–78893, 2023

work page 2023

[38] [38]

Cambridge University Press, 1985

Roger A Horn and Charles R Johnson.Matrix Analysis. Cambridge University Press, 1985

work page 1985

[39] [39]

Sample complexity of sample average approximation for condi- tional stochastic optimization.SIAM Journal on Optimization, 30(3):2103–2133, 2020

Yifan Hu, Xin Chen, and Niao He. Sample complexity of sample average approximation for condi- tional stochastic optimization.SIAM Journal on Optimization, 30(3):2103–2133, 2020

work page 2020

[40] [40]

Biased stochastic first-order methods for conditional stochastic optimization and applications in meta learning

Yifan Hu, Siqi Zhang, Xin Chen, and Niao He. Biased stochastic first-order methods for conditional stochastic optimization and applications in meta learning. InAdvances in Neural Information Process- ing Systems, pages 2759–2770, 2020

work page 2020

[41] [41]

On the bias-variance-cost tradeoff of stochastic optimization

Yifan Hu, Xin Chen, and Niao He. On the bias-variance-cost tradeoff of stochastic optimization. In Advances in Neural Information Processing Systems, pages 22119–22131, 2021

work page 2021

[42] [42]

Multi-level Monte-Carlo gradient methods for stochastic optimization with biased oracles.arXiv:2408.11084, 2024

Yifan Hu, Jie Wang, Xin Chen, and Niao He. Multi-level Monte-Carlo gradient methods for stochastic optimization with biased oracles.arXiv:2408.11084, 2024

work page arXiv 2024

[43] [43]

Oosterlee

Shashi Jain and Cornelis W. Oosterlee. Pricing high-dimensional Bermudan options using the stochas- tic grid method.International Journal of Computer Mathematics, 89(9):1186–1211, 2012

work page 2012

[44] [44]

Optimal algorithms for stochas- tic multi-level compositional optimization

Wei Jiang, Bokun Wang, Yibo Wang, Lijun Zhang, and Tianbao Yang. Optimal algorithms for stochas- tic multi-level compositional optimization. InInternational Conference on Machine Learning, pages 10195–10216, 2022

work page 2022

[45] [45]

Adam: A method for stochastic optimization

Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. InInternational Conference on Learning Representations, 2015

work page 2015

[46] [46]

Distributionally robust optimization.Acta Numerica, 34:579–804, 2025

Daniel Kuhn, Soroosh Shafiee, and Wolfram Wiesemann. Distributionally robust optimization.Acta Numerica, 34:579–804, 2025

work page 2025

[47] [47]

Optimal stopping and sequential tests which minimize the maximum expected sample size.Annals of Statistics, pages 659–673, 1973

Tze Leung Lai. Optimal stopping and sequential tests which minimize the maximum expected sample size.Annals of Statistics, pages 659–673, 1973

work page 1973

[48] [48]

Bayesian risk Markov decision processes

Yifan Lin, Yuxuan Ren, and Enlu Zhou. Bayesian risk Markov decision processes. InAdvances in Neural Information Processing Systems, pages 17430–17442, 2022

work page 2022

[49] [49]

Dual instrumental variable regression

Krikamol Muandet, Arash Mehrjou, Si Kai Lee, and Anant Raj. Dual instrumental variable regression. InAdvances in Neural Information Processing Systems, pages 2710–2721, 2020

work page 2020

[50] [50]

End-of-life inventory management problem: Results and insights.International Journal of Production Economics, 243:108313, 2022

Emin Ozyoruk, Nesim Kohen Erkip, and C ¸ a˘gın Ararat. End-of-life inventory management problem: Results and insights.International Journal of Production Economics, 243:108313, 2022

work page 2022

[51] [51]

On nesting Monte Carlo estimators

Tom Rainforth, Rob Cornish, Hongseok Yang, Andrew Warrington, and Frank Wood. On nesting Monte Carlo estimators. InInternational Conference on Machine Learning, pages 4267–4276, 2018

work page 2018

[52] [52]

Marcus de Mendes C. R. Reaiche. A note on sample complexity of multistage stochastic programs. Operations Research Letters, 44(4):430–435, 2016. 35

work page 2016

[53] [53]

Unbiased estimation with square root convergence for SDE models.Operations Research, 63(5):1026–1043, 2015

Chang-han Rhee and Peter W Glynn. Unbiased estimation with square root convergence for SDE models.Operations Research, 63(5):1026–1043, 2015

work page 2015

[54] [54]

A stochastic subgradient method for nonsmooth nonconvex multilevel compo- sition optimization.SIAM Journal on Control and Optimization, 59(3):2301–2320, 2021

Andrzej Ruszczynski. A stochastic subgradient method for nonsmooth nonconvex multilevel compo- sition optimization.SIAM Journal on Control and Optimization, 59(3):2301–2320, 2021

work page 2021

[55] [55]

Conditional risk mappings.Mathematics of Operations Research, 31(3):544–561, 2006

Andrzej Ruszczy ´nski and Alexander Shapiro. Conditional risk mappings.Mathematics of Operations Research, 31(3):544–561, 2006

work page 2006

[56] [56]

On complexity of multistage stochastic programs.Operations Research Letters, 34(1):1–8, 2006

Alexander Shapiro. On complexity of multistage stochastic programs.Operations Research Letters, 34(1):1–8, 2006

work page 2006

[57] [57]

On complexity of stochastic programming problems

Alexander Shapiro and Arkadi Nemirovski. On complexity of stochastic programming problems. In Vaithilingam Jeyakumar and Alexander Rubinov, editors,Continuous Optimization, pages 111–146. Springer, Boston, MA, 2005

work page 2005

[58] [58]

Bayesian distributionally robust optimization.SIAM Journal on Optimization, 33(2):1279–1304, 2023

Alexander Shapiro, Enlu Zhou, and Yifan Lin. Bayesian distributionally robust optimization.SIAM Journal on Optimization, 33(2):1279–1304, 2023

work page 2023

[59] [59]

Wasserstein distributionally robust policy evaluation and learning for contextual bandits.Transactions on Machine Learning Research, 2024

Yi Shen, Pan Xu, and Michael M Zavlanos. Wasserstein distributionally robust policy evaluation and learning for contextual bandits.Transactions on Machine Learning Research, 2024. ISSN 2835-8856. Featured Certification

work page 2024

[60] [60]

Kernel instrumental variable regression

Rahul Singh, Maneesh Sahani, and Arthur Gretton. Kernel instrumental variable regression. InAd- vances in Neural Information Processing Systems, pages 4593–4605, 2019

work page 2019

[61] [61]

Optimal randomized multilevel Monte Carlo for repeatedly nested expectations

Yasa Syed and Guanyang Wang. Optimal randomized multilevel Monte Carlo for repeatedly nested expectations. InInternational Conference on Machine Learning, pages 33343–33364, 2023

work page 2023

[62] [62]

Emanuel Todorov and Michael I. Jordan. Optimal feedback control as a theory of motor coordination. Nature Neuroscience, 5(11):1226–1235, 2002

work page 2002

[63] [63]

An intuitive approach to inventory control with optimal stopping.European Journal of Operational Research, 311(3):921–924, 2023

Nicky D Van Foreest and Onur A Kilic. An intuitive approach to inventory control with optimal stopping.European Journal of Operational Research, 311(3):921–924, 2023

work page 2023

[64] [64]

Wainwright.High-Dimensional Statistics: A Non-Asymptotic Viewpoint

Martin J. Wainwright.High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2019

work page 2019

[65] [65]

Sinkhorn distributionally robust optimization.Operations Research,

Jie Wang, Rui Gao, and Yao Xie. Sinkhorn distributionally robust optimization.Operations Research,

work page

[66] [66]

Unbiased Multilevel Monte Carlo methods for intractable distri- butions: MLMC meets MCMC.Journal of Machine Learning Research, 24(249):1–40, 2023

Tianze Wang and Guanyang Wang. Unbiased Multilevel Monte Carlo methods for intractable distri- butions: MLMC meets MCMC.Journal of Machine Learning Research, 24(249):1–40, 2023

work page 2023

[67] [67]

Bayesian risk-averse Q-learning with streaming observations

Yuhao Wang and Enlu Zhou. Bayesian risk-averse Q-learning with streaming observations. InAd- vances in Neural Information Processing Systems, pages 75967–75992, 2024

work page 2024

[68] [68]

Wiley, 1990

Peter Whittle.Risk-Sensitive Optimal Control. Wiley, 1990

work page 1990

[69] [69]

A projection-free algorithm for con- strained stochastic multi-level composition optimization

Tesi Xiao, Krishnakumar Balasubramanian, and Saeed Ghadimi. A projection-free algorithm for con- strained stochastic multi-level composition optimization. InAdvances in Neural Information Process- ing Systems, pages 19984–19996, 2022. 36

work page 2022

[70] [70]

Multilevel stochastic gradient methods for nested composition optimization.SIAM Journal on Optimization, 29(1):616–659, 2019

Shuoguang Yang, Mengdi Wang, and Ethan X Fang. Multilevel stochastic gradient methods for nested composition optimization.SIAM Journal on Optimization, 29(1):616–659, 2019

work page 2019

[71] [71]

Multilevel composite stochastic optimization via nested variance reduction

Junyu Zhang and Lin Xiao. Multilevel composite stochastic optimization via nested variance reduction. SIAM Journal on Optimization, 31(2):1131–1157, 2021

work page 2021

[72] [72]

Unbiased optimal stopping via the MUSE.Stochastic Processes and their Applications, 166:104088, 2023

Zhengqing Zhou, Guanyang Wang, Jose H Blanchet, and Peter W Glynn. Unbiased optimal stopping via the MUSE.Stochastic Processes and their Applications, 166:104088, 2023. Appendix A Auxiliary Results The following lemma establishes a uniform deviation bound based on covering numbers. It is a standard result in stochastic programming, and we include a concis...

work page 2023

[73] [73]

Here, the three inequalities follow from H ¨older’s inequality, the sub- Gaussianity ofz 1 andz 2 and the monotonicity of the exponential function, respectively

exp(∥λ2∥2 2 ζ2 2)≤exp(∥(λ 1, λ2)∥2 2 max{ζ2 1 , ζ2 2 }) for allλ 1 ∈R m1 andλ 2 ∈R m2. Here, the three inequalities follow from H ¨older’s inequality, the sub- Gaussianity ofz 1 andz 2 and the monotonicity of the exponential function, respectively. This shows that the combined random vector(z 1, z2)is indeed sub-Gaussian with variance proxy2 max{ζ 2 1 , ζ...

work page