pith. sign in

arxiv: 2507.11350 · v2 · submitted 2025-07-15 · 🧮 math.OC · cs.SY· eess.SY

Robustness Measures in Distributionally Robust Optimization

Pith reviewed 2026-05-19 04:32 UTC · model grok-4.3

classification 🧮 math.OC cs.SYeess.SY
keywords distributionally robust optimizationworst-case sensitivityregularizationrobustness measureuncertainty setsperformance-robustness tradeoffmodel misspecificationPareto frontier
0
0 comments X

The pith

The regularizer that approximates distributionally robust optimization is exactly the worst-case sensitivity of expected cost to shifts away from the nominal model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that for certain uncertainty sets, the DRO problem reduces to a regularized version of the nominal problem, but the regularizer itself is the worst-case sensitivity of the expected cost to deviations in the probability distribution. This gives the regularizer a direct interpretation as a robustness measure rather than an arbitrary penalty. As a result, DRO is revealed to be a tradeoff between nominal performance and this specific form of robustness, with the uncertainty set determining which aspects of the cost distribution control the sensitivity. The measure supports systematic selection of uncertainty sets and shows that varying their size traces a near Pareto frontier between performance and robustness.

Core claim

The central claim is that the regularizer arising in the approximation of distributionally robust optimization by a regularized nominal problem is identical to the worst-case sensitivity of the expected cost with respect to deviations from the nominal probability model. This equivalence supplies the regularizer with an interpretation as a robustness measure, so that DRO amounts to an explicit performance-robustness tradeoff whose form is fixed by the choice of uncertainty set. The resulting robustness measure identifies features of the cost distribution that govern sensitivity to misspecification, which in turn yields a systematic method for choosing uncertainty sets. Solutions obtained by a

What carries the argument

Worst-case sensitivity (WCS) of the expected cost to deviations from the nominal model; it supplies the explicit robustness measure that equals the regularizer in the DRO approximation.

If this is right

  • DRO solutions trace a near Pareto-optimal performance-robustness frontier when the uncertainty set size is varied.
  • The frontier identifies problem instances where the price of robustness is high and suggests system redesigns to lower that cost.
  • Uncertainty sets can be chosen systematically according to the properties of the cost distribution that affect sensitivity.
  • WCS can be derived explicitly for a collection of standard uncertainty sets used in DRO.
  • The robustness measure reveals which features of a cost distribution make the solution sensitive to model misspecification.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Different uncertainty sets could be compared by the distinct robustness measures they induce on the same cost distribution.
  • The frontier construction might be used to set the uncertainty-set size in applied problems by inspecting the marginal cost of added robustness.
  • The same sensitivity perspective could be tested in sequential or multi-stage decision settings where model updates occur over time.

Load-bearing premise

The approximation of DRO by a regularized nominal problem must hold for the uncertainty sets under consideration.

What would settle it

Compute the worst-case sensitivity directly for a concrete cost function and uncertainty set, then check whether it equals the regularizer coefficient obtained from the DRO approximation; any material mismatch would refute the claimed equivalence.

Figures

Figures reproduced from arXiv: 2507.11350 by Andrew E.B. Lim, Jun-ya Gotoh, Michael Jong Kim.

Figure 3.1
Figure 3.1. Figure 3.1: The figure on the left shows worst-case expected cost as the size (ε) of the uncertainty set increases for decisions x and x ′ . The intercept (m) at ε = 0 is the expected cost under the nominal P; worst-case sensitivity (SP) is its slope. Every mean–sensitivity pair maps to a point on the right. In this example, the expected cost under the nominal is smaller for x (m(x) < m(x ′ )) but is more sensitive … view at source ↗
Figure 3.2
Figure 3.2. Figure 3.2: Solutions of the DRO problem (3.6) are nearly Pareto optimal for the mean–sensitivity problem (3.4). In summary: • (near) Pareto optimal decisions with respect to EP[f(x, Y )] and SP[f(x, ·)] can be gener￾ated by solving the worst-case problem (3.5); DRO gives us a computationally tractable tool for tracing out the mean–sensitivity frontier. • The uncertainty set for the DRO problem defines the sensitivi… view at source ↗
Figure 4
Figure 4. Figure 4: and Table 4.1 that these can be very different. For example, worst-case sensitivity [PITH_FULL_IMAGE:figures/full_fig_p019_4.png] view at source ↗
Figure 4.1
Figure 4.1. Figure 4.1: Worst-case sensitivities are different measures of spread from P(dθ, dy). Another important class is mixture models with populations indexed by θ ∈ {θ1, · · · , θn} and population distribution P(dY |θ = θi) = L θi (dY ). For concreteness, we refer to the marginal distribution ρ(θ) as the posterior and the conditional distribution L θ (dY ) as the likelihood, fully recognizing that it is more general than… view at source ↗
Figure 6.1
Figure 6.1. Figure 6.1: Distribution of the cost under the optimal order quantity for the SAA problem. The cost has a long right tail, which suggests we should choose an uncertainty set with a sensitivity measure that shrinks this tail. We begin by looking at the distribution of the cost under the SAA solution to determine whether there is a need for sensitivity control [PITH_FULL_IMAGE:figures/full_fig_p025_6_1.png] view at source ↗
Figure 6.2
Figure 6.2. Figure 6.2: Distribution of the cost with the modified χ 2 -uncertainty set. Worst-case sensitivity is the standard deviation of the cost distribution. The DRO solution reduces the standard deviation (sensitivity) by shrinking the right tail [PITH_FULL_IMAGE:figures/full_fig_p026_6_2.png] view at source ↗
Figure 6.3
Figure 6.3. Figure 6.3: Distribution of the cost with the “budgeted” uncertainty set. Under this uncertainty set, worst-case sensitivity is the size of the left tail. The solution of the DRO problem shrinks the left tail (sensitivity), which increases the right tail of the distribution [PITH_FULL_IMAGE:figures/full_fig_p026_6_3.png] view at source ↗
Figure 6.4
Figure 6.4. Figure 6.4: Mean–sensitivity frontier for DRO solutions for a χ 2 uncer￾tainty set. though this comes at the cost of a “wider body” and a larger expected cost. Nevertheless, it reduces the sensitivity (standard deviation) as desired. The optimal order quantity for the χ 2 -deviation measure (x(ε) = 44) is larger than for SAA (x(0) = 24). To highlight the importance of the uncertainty set on the DRO solution, [PITH_… view at source ↗
Figure 6
Figure 6. Figure 6: shows the mean–sensitivity frontier generated by solutions of the DRO problem [PITH_FULL_IMAGE:figures/full_fig_p027_6.png] view at source ↗
Figure 6.5
Figure 6.5. Figure 6.5: Tail of the cost distribution of the SAA solution: (a) is the WCS for the TV uncertainty set and (b) the WCS for the budgeted uncer￾tainty set. We first solve the (ambiguity-neutral) minimum CVaR problem (i.e., ε = 0) [PITH_FULL_IMAGE:figures/full_fig_p028_6_5.png] view at source ↗
Figure 6
Figure 6. Figure 6 [PITH_FULL_IMAGE:figures/full_fig_p028_6.png] view at source ↗
Figure 6.6
Figure 6.6. Figure 6.6: The plot on the left shows the CVaR–TV sensitivity frontiers generated by solutions of the robust CVaR90% problem with TV, budgeted, and χ 2 uncertainty sets. All frontiers are different because the robust prob￾lems have different solutions. Not surprisingly, the most efficient frontier is generated with the TV uncertainty set, though robust solutions with the budgeted and χ 2 uncertainty sets also reduc… view at source ↗
Figure 6.7
Figure 6.7. Figure 6.7: Tails of cost distributions of empirical CVaR90% and DRO CVaR90% with the TV uncertainty set (ε = 0.004). The size of the uncer￾tainty set is chosen so that CVaR is approximately 5.91. (a) is TV-sensitivity amd (b) budgeted sensitivity for the SAA and robust CVaR solutions, re￾spectively; (b) and (b ′ ) are the respective budgeted sensitivities. The robust solution reduces TV-sensitivity ((a) versus (a ′… view at source ↗
Figure 6
Figure 6. Figure 6: superimposes the tail of the loss distribution of the solution of the robust CVaR [PITH_FULL_IMAGE:figures/full_fig_p030_6.png] view at source ↗
Figure 6.8
Figure 6.8. Figure 6.8: Tails of cost distributions of empirical CVaR90% and DRO CVaR90% with budgeted uncertainty set (ε = 0.15). The size of the uncer￾tainty set is chosen so that CVaR is approximately 5.91. (a) is TV-sensitivity amd (b) budgeted sensitivity for the SAA and robust CVaR solutions, re￾spectively; (b) and (b ′ ) are the respective budgeted sensitivities. The robust solution reduces TV-sensitivity ((a) versus (a … view at source ↗
read the original abstract

Distributionally Robust Optimization (DRO) is a worst-case approach to decision making when there is model uncertainty. It is also well known that for certain uncertainty sets, DRO is approximated by a regularized nominal problem. We show that the regularizer is not just a penalty function but the worst-case sensitivity (WCS) of the expected cost with respect to deviations from the nominal model, giving it the interpretation of a robustness measure. This has substantial consequences for robust modeling. It shows that DRO is fundamentally a tradeoff between performance and robustness, where the robustness measure is determined by the uncertainty set. The robustness measure reveals properties of a cost distribution that affect sensitivity to misspecification. This leads to a systematic approach to selecting uncertainty sets. The family of DRO solutions obtained by varying the size of the uncertainty set traces a near Pareto-optimal performance--robustness frontier that can be used to select its size. The frontier identifies problem instances where the price of robustness is high and provides insight into effective ways of redesigning the system to reduce this cost. We derive WCS for a collection of commonly used uncertainty sets, and illustrate these ideas in a number of applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper claims that for certain uncertainty sets, DRO can be approximated by a regularized nominal problem whose regularizer equals the worst-case sensitivity (WCS) of the expected cost to deviations from the nominal distribution. This interpretation frames the regularizer as a robustness measure, shows that DRO encodes a performance-robustness tradeoff determined by the uncertainty set, derives explicit WCS expressions for common sets, and introduces a performance-robustness frontier obtained by varying the uncertainty radius to guide set selection and system redesign.

Significance. If the central equivalence holds under the stated conditions, the work supplies a principled way to interpret and select uncertainty sets via their induced robustness measures, moving beyond ad-hoc choices. The explicit WCS derivations for standard sets and the frontier construction are concrete contributions that could inform both theory and applications in robust optimization.

major comments (1)
  1. [§3, Theorem 1] §3, Theorem 1 and surrounding derivation: the claimed exact identification of the regularizer with WCS is shown only after invoking the DRO-to-regularized approximation; the manuscript should state the precise conditions (e.g., twice-differentiability of the cost, small radius, or specific divergence) under which the approximation becomes an equality, because these conditions are load-bearing for the robustness-measure interpretation.
minor comments (2)
  1. [§2] Notation for the nominal distribution P_0 and the uncertainty ball radius is introduced late; moving the definitions to §2 would improve readability.
  2. [Figure 1] Figure 1 (performance-robustness frontier) lacks axis labels on the robustness measure; adding them would make the trade-off visually clearer.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and constructive comment on the conditions underlying the DRO-regularizer equivalence. We address the point below and will revise the manuscript to incorporate the requested clarification.

read point-by-point responses
  1. Referee: [§3, Theorem 1] §3, Theorem 1 and surrounding derivation: the claimed exact identification of the regularizer with WCS is shown only after invoking the DRO-to-regularized approximation; the manuscript should state the precise conditions (e.g., twice-differentiability of the cost, small radius, or specific divergence) under which the approximation becomes an equality, because these conditions are load-bearing for the robustness-measure interpretation.

    Authors: We agree that the identification of the regularizer with the worst-case sensitivity is established within the DRO-to-regularized approximation. The revised manuscript will explicitly delineate the conditions under which the approximation holds with equality or high accuracy, including sufficiently small uncertainty radii, twice continuous differentiability of the cost function with respect to the distribution parameter, and specific divergence choices (e.g., Kullback-Leibler or Wasserstein). This clarification will be added to the statement of Theorem 1 and the surrounding discussion in §3 to strengthen the robustness-measure interpretation. revision: yes

Circularity Check

0 steps flagged

No circularity: WCS interpretation derived from known DRO approximation

full rationale

The paper starts from the established fact that DRO approximates a regularized nominal problem for certain uncertainty sets, then derives that the regularizer equals the worst-case sensitivity (WCS) of expected cost to nominal deviations. This interpretive step and its consequences for performance-robustness tradeoffs are presented as following from the definitions and the approximation, without reducing to self-definition, fitted parameters renamed as predictions, or load-bearing self-citations. The abstract and description indicate an independent derivation chain supported by external literature on the approximation, making the result self-contained rather than tautological.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

Review is based on abstract only; the ledger therefore records only the assumptions explicitly named in the abstract.

axioms (1)
  • domain assumption For certain uncertainty sets, DRO is approximated by a regularized nominal problem.
    Stated as background fact in the first sentence of the abstract.
invented entities (1)
  • Worst-case sensitivity (WCS) no independent evidence
    purpose: To serve as the robustness measure that equals the DRO regularizer.
    Introduced in the abstract as the new interpretive object derived from the regularizer.

pith-pipeline@v0.9.0 · 5738 in / 1297 out tokens · 70179 ms · 2026-05-19T04:32:22.325372+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

53 extracted references · 53 canonical work pages · 1 internal anchor

  1. [1]

    Random House, 1995

    Douglas Adams.The Hitchhiker’s Guide to the Galaxy: A Trilogy in Five Parts. Random House, 1995

  2. [2]

    Distributionally robust project crashing with partial or no correlation information.Networks, 74(1): 79–106, 2019

    Selin Damla Ahipasaoglu, Karthik Natarajan, and Dongjian Shi. Distributionally robust project crashing with partial or no correlation information.Networks, 74(1): 79–106, 2019

  3. [3]

    Improving sample average approximation using distributional robustness.INFORMS Journal on Optimization, 4(1):90–124, 2022

    Edward Anderson and Andy Philpott. Improving sample average approximation using distributional robustness.INFORMS Journal on Optimization, 4(1):90–124, 2022

  4. [4]

    Gah-Yi Ban, Noureddine El Karoui, and Andrew E. B. Lim. Machine learning and portfolio optimization.Management Science, 64(3):1136–1154, 2016. URLhttps: //doi.org/10.1287/mnsc.2016.2644

  5. [5]

    Daniel Bartl, Samuel Drapeau, Jan Obloj, and Johannes Wiesel. Sensitivity analysis of Wasserstein distributionally robust optimization problems.Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 477(20210176), 2021. doi: 10.1098/rspa.2021.0176

  6. [6]

    Robust convex optimization.Mathematics of Operations Research, 23(4):769–805, 1989

    Aharon Ben-Tal and Arkadi Nemirovski. Robust convex optimization.Mathematics of Operations Research, 23(4):769–805, 1989

  7. [7]

    Robust solutions of optimization problems affected by uncertain probabilities

    Aharon Ben-Tal, Dick den Hertog, Anja De Waegenaere, Bertrand Melenberg, and Gijs Rennen. Robust solutions of optimization problems affected by uncertain probabilities. Management Science, 59(2):341–357, 2013

  8. [8]

    Copenhaver

    Dimitris Bertsimas and Martin S. Copenhaver. Characterization of the equivalence of robustification and regularization in linear and matrix regression.European Journal of Operational Research, 270(3):931–942, 2018. DISTRIBUTIONALLY ROBUST OPTIMIZATION IS A MULTI-OBJECTIVE PROBLEM 33

  9. [9]

    The price of robustness.Operations Research, 52 (1):35–53, 2004

    Dimitris Bertsimas and Melvyn Sim. The price of robustness.Operations Research, 52 (1):35–53, 2004

  10. [10]

    Data-driven robust optimiza- tion.Mathematical Programming, 167(2):235–292, 2018

    Dimitris Bertsimas, Vishal Gupta, and Nathan Kallus. Data-driven robust optimiza- tion.Mathematical Programming, 167(2):235–292, 2018

  11. [11]

    Robust sample average approx- imation.Mathematical Programming, 171(1-2):217–282, 2018

    Dimitris Bertsimas, Vishal Gupta, and Nathan Kallus. Robust sample average approx- imation.Mathematical Programming, 171(1-2):217–282, 2018

  12. [12]

    Robust Wasserstein profile inference and applications to machine learning.Journal of Applied Probability, 56(3):830–857, 2019

    Jose Blanchet, Yang Kang, and Karthyek Murthy. Robust Wasserstein profile inference and applications to machine learning.Journal of Applied Probability, 56(3):830–857, 2019

  13. [13]

    Brown, Enrico De Giorgi, and Melvyn Sim

    David B. Brown, Enrico De Giorgi, and Melvyn Sim. Aspirational preferences and their representation by risk measures.Management Science, 58(11):2095–2113, 2012

  14. [14]

    Chen and Johannes O

    Louis L. Chen and Johannes O. Royset. Rockafellian relaxation in optimization under uncertainty: Asymptotically exact formulations.arXiv:2204.04762, 2017. URLhttps: //arxiv.org/abs/2204.04762

  15. [15]

    Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone.BMC Medical Informatics and Decision Making, 20(1):16, 2020

    Davide Chicco and Giuseppe Jurman. Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone.BMC Medical Informatics and Decision Making, 20(1):16, 2020

  16. [16]

    On the heavy-tail behavior of the distributionally robust newsvendor.Operations Research, 69(4):1077–1099, 2021

    Bikramjit Das, Anulekha Dhara, and Karthik Natarajan. On the heavy-tail behavior of the distributionally robust newsvendor.Operations Research, 69(4):1077–1099, 2021

  17. [17]

    Distributionally robust optimization under moment uncer- tainty with application to data-driven problems.Management Science, 58(2):695–612, 2010

    Erick Delage and Yinyu Ye. Distributionally robust optimization under moment uncer- tainty with application to data-driven problems.Management Science, 58(2):695–612, 2010

  18. [18]

    Duchi and Hongseok Namkoong

    John C. Duchi and Hongseok Namkoong. Variance-based regularization with convex objectives.Journal of Machine Learning Research, 20(68):1–55, 2019

  19. [19]

    Duchi, Peter W

    John C. Duchi, Peter W. Glynn, and Hongseok Namkoong. Statistics of robust op- timization: A generalized empirical likelihood approach.Mathematics of Operations Research, 46(3):946–969, 2021

  20. [20]

    Peyman Mohajerin Esfahani and Daniel Kuhn. Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable re- formulations.Mathematical Programming, 171(1-2):115–166, 2018

  21. [21]

    Kenneth R. French. Data library.http://mba.tuck.dartmouth.edu/pages/ faculty/ken.french/data_library.html, n.d

  22. [22]

    Distributionally robust stochastic optimization with Wasserstein distance.Mathematics of Operations Research, 48(2):603–655, 2023

    Rui Gao and Anton Kleywegt. Distributionally robust stochastic optimization with Wasserstein distance.Mathematics of Operations Research, 48(2):603–655, 2023

  23. [23]

    Wasserstein distributionally robust opti- mization and variation regularization.Operations Research, 72(3):1177–1191, 2024

    Rui Gao, Xi Chen, and Anton J Kleywegt. Wasserstein distributionally robust opti- mization and variation regularization.Operations Research, 72(3):1177–1191, 2024. 34 GOTOH, KIM, AND LIM

  24. [24]

    Jun-ya Gotoh, Michael Jong Kim, and Andrew E.B. Lim. Robust empirical optimiza- tion is almost the same as mean–variance optimization.Operations Research Letters, 46(4):448–452, 2018

  25. [25]

    Jun-ya Gotoh, Michael Jong Kim, and Andrew E.B. Lim. Calibration of distribu- tionally robust empirical optimization models.Operations Research, 69(5):1630–1650, 2021

  26. [26]

    Jun-ya Gotoh, Michael Jong Kim, and Andrew E.B. Lim. A data-driven approach to beating SAA out of sample.Operations Research, 73(2):829–841, 2023

  27. [27]

    V. Gupta. Near-Optimal Bayesian Ambiguity Sets for Distributionally Robust Opti- mization.Management Science, 65(9):4242–4260, 2019

  28. [28]

    Hansen and Thomas J

    Lars P. Hansen and Thomas J. Sargent.Robustness. Princeton University Press, 2008

  29. [29]

    Springer-Verlag, 2nd edition, 2009

    Trevor Hastie, Robert Tibshirani, and Jerome Friedman.Elements of Statistical Learn- ing. Springer-Verlag, 2nd edition, 2009

  30. [30]

    Distributionally favorable optimization: A framework for data-driven decision-making with endogenous outliers.SIAM Journal on Optimization, 34(1), 2024

    Nan Jiang and Weijun Xie. Distributionally favorable optimization: A framework for data-driven decision-making with endogenous outliers.SIAM Journal on Optimization, 34(1), 2024. URLhttps://doi.org/10.1137/22M1528094

  31. [31]

    Michael Jong Kim and Andrew E.B. Lim. Robust multi-armed bandit problems.Man- agement Science, 62(1):264–285, 2015

  32. [32]

    Wasserstein Distributionally Robust Optimization: Theory and Applications in Machine Learning

    Daniel Kuhn, Peyman Mohajerin Esfahani, Viet Anh Nguyen, and Soroosh Shafieezadeh-Abadeh. Wasserstein Distributionally Robust Optimization: Theory and Applications in Machine Learning. InINFORMS TutORials in Operations Research, pages 130–166. INFORMS, 2019

  33. [33]

    Distributionally robust optimization, 2025

    Daniel Kuhn, Soroosh Shafiee, and Wolfram Wiesemann. Distributionally robust op- timization.arXiv preprint arXiv:2411.02549, 2024

  34. [34]

    Robust sensitivity analysis for stochastic systems.Mathematics of Oper- ations Research, 41(4):1248–1275, 2016

    Henry Lam. Robust sensitivity analysis for stochastic systems.Mathematics of Oper- ations Research, 41(4):1248–1275, 2016

  35. [35]

    The empirical likelihood approach to quantifying uncer- tainty in sample average approximation.Operations Research Letters, 45(4):301–307, 2017

    Henry Lam and Zhou Enlu. The empirical likelihood approach to quantifying uncer- tainty in sample average approximation.Operations Research Letters, 45(4):301–307, 2017

  36. [36]

    Lim and J

    Andrew E.B. Lim and J. George Shanthikumar. Relative entropy, exponential utility, and robust dynamic pricing.Operations Research, 55(2):198–214, 2007

  37. [37]

    Andrew E.B. Lim, J. George Shanthikumar, and Zuo-Jun Shen. Model uncertainty, robust optimization, and learning.TutORials in Operations Research, 3:66–94, 2006

  38. [38]

    Andrew E.B. Lim, J. George Shanthikumar, and Gah-Yi Vahn. Conditional value-at- risk in portfolio optimization: Coherent but fragile.Operations Research Letters, 39 (3):163–171, 2011. DISTRIBUTIONALLY ROBUST OPTIMIZATION IS A MULTI-OBJECTIVE PROBLEM 35

  39. [39]

    Luenberger.Optimization by Vector Space Methods

    David G. Luenberger.Optimization by Vector Space Methods. John Wiley & Sons, 1997

  40. [40]

    Optimistic distributionally robust optimization for nonparamet- ric likelihood approximation.Advances in Neural Information Processing Systems, 32: 15872 – 15882, 2019

    Viet Anh Nguyen, Soroosh Shafieezadeh-Abadeh, Man Chung Yue, Daniel Kuhn, and Wolfram Wiesemann. Optimistic distributionally robust optimization for nonparamet- ric likelihood approximation.Advances in Neural Information Processing Systems, 32: 15872 – 15882, 2019

  41. [41]

    Optimistic Robust Optimization With Applications To Machine Learning

    Matthew Norton, Akiko Takeda, and Alexander Mafusalov. Optimistic robust op- timization with applications to machine learning.arXiv:1711.07511, 2017. URL https://arxiv.org/abs/1711.07511

  42. [42]

    Dual stochastic dominance and re- lated mean–risk models.SIAM Journal on Optimization, 13(1):60–78, 2002

    W lodzimierz Ogryczak and Andrzej Ruszczynski. Dual stochastic dominance and re- lated mean–risk models.SIAM Journal on Optimization, 13(1):60–78, 2002

  43. [43]

    Petersen, Matthew R

    Ian R. Petersen, Matthew R. James, and Paul Dupuis. Minimax optimal control of stochastic uncertain systems with relative entropy constraints.IEEE Transactions on Automatic Control, 45:398–412, 2000

  44. [44]

    Some remarks on the value-at-risk and the conditional value-at-risk

    Georg Ch Pflug. Some remarks on the value-at-risk and the conditional value-at-risk. InProbabilistic constrained optimization, pages 272–281. Springer, 2000

  45. [45]

    Distributionally robust optimization: A re- view.Open Journal of Mathematical Optimization, 2022

    Hamed Rahimian and Sanjay Mehrotra. Distributionally robust optimization: A re- view.Open Journal of Mathematical Optimization, 2022

  46. [46]

    Controlling risk and demand ambiguity in newsvendor models.European Journal of Operational Research, 279:854–868, 2019

    Hamed Rahimian, Guzin Bayraksan, and Tito Homem-de Mello. Controlling risk and demand ambiguity in newsvendor models.European Journal of Operational Research, 279:854–868, 2019

  47. [47]

    Terry Rockafellar, Johannes O

    R. Terry Rockafellar, Johannes O. Royset, and Sofia I. Miranda. Superquantile regres- sion with applications to buffered reliability, uncertainty quantification, and conditional value-at-risk.European Journal of Operational Research, 234(1):140–154, 2014

  48. [48]

    Conditional value-at-risk for general loss distributions.Journal of Banking & Binance, 26(7):1443–1471, 2002

    R Tyrrell Rockafellar and Stanislav Uryasev. Conditional value-at-risk for general loss distributions.Journal of Banking & Binance, 26(7):1443–1471, 2002

  49. [49]

    Tyrrell Rockafellar, Stan Uryasev, and Michael Zabarankin

    R. Tyrrell Rockafellar, Stan Uryasev, and Michael Zabarankin. Generalized deviations in risk analysis.Finance and Stochastics, 10(1):51–74, 2006

  50. [50]

    Dis- tributionally robust logistic regression

    Soroosh Shafieezadeh-Abadeh, Peyman Mohajerin Esfahani, and Daniel Kuhn. Dis- tributionally robust logistic regression. InAdvances in Neural Information Processing Systems, pages 1576–1584, 2015

  51. [51]

    Bayesian distributionally robust opti- mization.SIAM Journal on Optimization, 33(2):1279–1304, 2023

    Alexander Shapiro, Enlu Zhou, and Yifan Lin. Bayesian distributionally robust opti- mization.SIAM Journal on Optimization, 33(2):1279–1304, 2023

  52. [52]

    Shiffler and Phillip D

    Ronald E. Shiffler and Phillip D. Harsha. Upper and lower bounds for the sample standard deviation.Teaching Statistics, 2(3):84–86, 1980

  53. [53]

    CVaR Deviation off

    Yiu Man Tsang and Karmel S. Shehadeh. On the tradeoff between distributional belief and ambiguity: Conservatism, finite-sample guarantees, and asymptotic properties. 36 GOTOH, KIM, AND LIM INFORMS Journal on Optimization, 2025. URLhttps://doi.org/10.1287/ijoo. 2024.0047. DISTRIBUTIONALLY ROBUST OPTIMIZATION IS A MULTI-OBJECTIVE PROBLEM 37 AppendixA.Proofs...