
arxiv: 2604.00226 · v2 · submitted 2026-03-31 · 🧮 math.OC

Risk-averse optimization under distributional uncertainty with Rockafellian relaxation

Pith reviewed 2026-05-08 02:20 UTC · model gemini-3-flash-preview

classification 🧮 math.OC · MSC 49J20, 49J45, 90C15, 90C47
keywords Risk-averse optimization · Distributional uncertainty · Rockafellian relaxation · PDE-constrained optimization · Gamma-convergence · Robust optimization

The pith

A framework for risk-averse optimization supports rigorous decision-making even when the underlying probability distribution is ambiguous and the noise is infinite-dimensional.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a method for solving optimization problems where the probability distribution of external noise is ambiguous. By integrating risk measures with a technique called Rockafellian relaxation, the authors create a model that is resilient to both outliers and out-of-sample errors. This is particularly significant for systems governed by partial differential equations, as the theory holds even when the noise exists in an infinite-dimensional space. The result is a mathematically sound way to ensure that optimal controls remain effective when the statistical model of the world is imperfect.

Core claim

The authors develop a unified framework for risk-averse optimization under distributional uncertainty by applying Rockafellian relaxation to the objective function. They prove that this approach ensures the existence of solutions and provides valid first-order optimality criteria without requiring the noise to be finite-dimensional. The framework effectively blends distributionally robust optimization, which protects against worst-case scenarios, with distributionally optimistic optimization, which prevents the model from being over-penalized by adversarial outliers.

What carries the argument

Rockafellian relaxation. This mechanism embeds a constrained optimization problem into a broader family of perturbed problems using a specific function called a Rockafellian. In this context, it allows the optimizer to deviate from a single assumed probability distribution, treating the 'true' distribution as a variable within a set defined by a risk measure.
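
The idea that the distribution itself becomes a decision variable can be sketched on a toy problem. The sketch rests on assumptions that are not taken from the paper: a finite sample with nominal weights `probs`, a plain expectation in place of a risk measure, and a penalty equal to `theta` times the total mass moved (a total-variation-type choice of perturbation function). Under those assumptions, and for a fixed decision, the inner minimization over the weight perturbation has a closed form: a sample gives up all of its mass to the lowest-loss sample exactly when its loss exceeds the minimum loss by more than `theta`. The names `optimistic_reweight` and `relaxed_value` are hypothetical.

```python
def optimistic_reweight(losses, probs, theta):
    """Solve  min_u  sum_i (p_i + u_i) * f_i + theta * (0.5 * sum_i |u_i|)
    subject to sum_i u_i = 0 and p_i + u_i >= 0, for fixed losses f_i.

    Illustrative assumption: the penalty is theta times the mass moved;
    the paper's perturbation function may differ.
    """
    best = min(range(len(losses)), key=lambda i: losses[i])
    new_probs = list(probs)
    for i, (f, p) in enumerate(zip(losses, probs)):
        # Moving mass p from sample i to the best sample changes the
        # objective by p * (theta - (f - f_best)); move only if that is negative.
        if f - losses[best] > theta:
            new_probs[best] += p
            new_probs[i] = 0.0
    return new_probs


def relaxed_value(losses, probs, theta):
    """Return (relaxed objective value, perturbed weights)."""
    new_probs = optimistic_reweight(losses, probs, theta)
    moved = 0.5 * sum(abs(q - p) for q, p in zip(new_probs, probs))
    expected = sum(q * f for q, f in zip(new_probs, losses))
    return expected + theta * moved, new_probs


losses = [1.0, 1.2, 0.9, 50.0]   # the last sample plays the role of an outlier
probs = [0.25, 0.25, 0.25, 0.25]
value, weights = relaxed_value(losses, probs, theta=5.0)
nominal = sum(p * f for p, f in zip(probs, losses))
print(nominal, value, weights)
```

With a large `theta` no mass moves and the nominal expectation is recovered; a small `theta` expresses low confidence in the data and lets the outlier be discounted.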

If this is right

  • Optimization models for fluid dynamics and structural engineering can account for complex, infinite-dimensional noise without reducing it to a finite set of parameters.
  • Control strategies can be derived that are mathematically guaranteed to exist even when there is a mismatch between the model and the actual data distribution.
  • Numerical solvers can be designed to balance the trade-off between sensitivity to rare events and general performance on new data.
  • The framework provides a path to extend risk-neutral partial differential equation constraints into risk-averse settings with full theoretical convergence guarantees.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This approach could be adapted to train large-scale neural networks where the input data distribution is high-dimensional and prone to shifts.
  • The choice of the Rockafellian perturbation function essentially acts as a 'trust parameter' that could be tuned dynamically in real-time control systems.
  • The method likely provides a bridge between classical regularization in inverse problems and modern robust statistics.

Load-bearing premise

The method assumes a perturbation function ϕ that keeps certain properties over the whole perturbation space W: lower semicontinuity and convexity, with ϕ(0) = 0 and ϕ(w) ≥ 0 for all w ∈ W.

What would settle it

A numerical experiment involving infinite-dimensional noise (such as a Gaussian process) that meets all stated regularity conditions but fails to converge to the predicted first-order optimality conditions would invalidate the framework's claims.

Figures

Figures reproduced from arXiv: 2604.00226 by Alonso J. Bustos, Benjamín Venegas, Harbir Antil, Sean P. Carney.

Figure 1. The figure shows that even with only 1% of the samples corrupted, the optimal control z*_corrupted is significantly different from the uncorrupted one z*_true for all values of β. The differences grow with increasing corruption. [Plot: panel title "Optimal control z for … = 0.1"; legend: Uncorrupted, Rockafellian 20%, Rockafellian 5%, Rockafellian 1%, 20%, 5%, 1%.]
Figure 2. Probability density functions for the random variable ξ in the advection field v(ξ) for the PDE constraint (6.8). Surrounding text: … variable γ. The (x)_+ is again mollified with (6.3) and δ = 10^{-3}. For fixed γ, minimization over z is done with the L-BFGS method with history size m = 9 [NW99, Chapter 7.2] and backtracking line search based on the Wolfe criterion. The state equation (6.8) and the corresponding adjoint equatio…
Figure 3. Optimal controls z* for (6.7) (without any Rockafellian relaxation) for differing values of risk-tolerance β and corruption levels. All plots use the same legend.
Figure 4. For β = 0.1: pointwise errors between an uncorrupted optimal control and corrupted optimal controls at 50% and 100% corruption (left and middle, respectively), as well as the pointwise error for a Rockafellian optimal control at 100% corruption (right). All plots use the same legend.
Figure 5. For β = 0.9: pointwise errors between an uncorrupted optimal control and corrupted optimal controls at 50% and 100% corruption (left and middle, respectively), as well as the pointwise error for a Rockafellian optimal control at 100% corruption (right). All plots use the same legend. Surrounding text: One noteworthy result occurs for β = 0.1, θ = 10^{-2} and 50% corruption: notice that the Rockafellian optimal control actually…
Figure 6. For β = 0.1: CDF plots for the random objective function (6.11) (left) and total control cost (right) for various optimal controls. The dashed lines correspond to optimal controls without any Rockafellian relaxation. Surrounding text: To numerically compute the CDFs at various computed optimal controls, (6.11) is evaluated at N = 2^{17} Sobol samples. The CDF plots for various optimal controls at β = 0.1 are shown in…
Figure 7. For β = 0.9: CDF plots for the random objective function (6.11) (left) and total control cost (right) for various optimal controls. The dashed lines correspond to optimal controls without any Rockafellian relaxation. Surrounding text: … for completeness. Analogous results for β = 0.9 are shown in…
Figure 8. For β = 0.9: CDF plots for the random objective function (6.11) (left) and total control cost (right) for Rockafellian optimal controls at various θ values and corruption levels. Surrounding text: Additionally, the level of risk-aversion of Rockafellian optimal controls can be altered by changing θ. Loosely speaking, smaller values of θ correspond to lower confidence in the problem data (in this case, the veracity of the gi…
Abstract

A framework for risk-averse optimization problems is introduced that is resilient to ambiguities in the true form of the underlying probability distribution. The focus is on problems with partial differential equations (PDEs) as constraints, although the formulation is more broadly applicable. The framework is based on combining risk measures with problem relaxation techniques, and it builds off of previous advances for risk-neutral problems. This work advances the existing theory with strengthened $\Gamma$-convergence results, novel existence results and first-order optimality criteria. In particular, the theoretical approach naturally accommodates infinite-dimensional probability spaces; no finite-dimensional noise assumption is needed. The framework blends aspects of both distributionally robust optimization (DRO) and distributionally optimistic optimization (DOO) approaches. The DRO aspect facilitates strong out-of-sample performance, while the DOO aspect takes care of adversarial and outlier data, as illustrated with numerical examples.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. This paper presents a framework for risk-averse optimization under distributional uncertainty, specifically tailored for problems constrained by partial differential equations (PDEs). The authors employ 'Rockafellian relaxation,' a technique from variational analysis, to construct an objective function that balances out-of-sample performance with resilience to outliers and adversarial data. The manuscript provides existence results for the infinite-dimensional case (without requiring finite-dimensional noise assumptions), proves Γ-convergence as the relaxation parameter vanishes, and derives first-order optimality conditions using an adjoint-based approach. Numerical examples demonstrate the framework's ability to handle ambiguous distributions in a robust yet optimistic manner.

Significance. The primary significance of this work lies in its rigorous treatment of infinite-dimensional probability spaces in the context of PDE-constrained optimization. By utilizing Rockafellian relaxation, the paper bridges the gap between Distributionally Robust Optimization (DRO) and Distributionally Optimistic Optimization (DOO), providing a unified variational perspective. The provided Γ-convergence results (Theorem 3.2) offer a theoretical bridge between the relaxed problem and the original risk-averse problem, which is mathematically elegant and provides a formal basis for the use of such relaxations. The independence from finite-dimensional noise assumptions is a notable strength for applications in stochastic PDEs.

major comments (3)
  1. [§4, Eq. (4.1)–(4.3)] The derivation of the first-order optimality criteria and the associated adjoint system relies on the assumption of Fréchet differentiability of the relaxation function ϕ and the cost function f. However, the manuscript claims to address risk-averse optimization, where the most practically relevant measures (e.g., CVaR) and outlier-resistant penalties (e.g., L1-type or exact penalty functions) are inherently non-smooth. If the framework requires ϕ to be smooth for the adjoint system to be valid, it significantly limits the applicability of the results to the very 'adversarial' scenarios the paper aims to solve. The authors should clarify how the optimality system is interpreted when ϕ is non-smooth (e.g., via subdifferential inclusions) or explicitly discuss the necessity of smooth approximations and their impact on the Γ-convergence limits.
  2. [§3, Theorem 3.2] Theorem 3.2 establishes the Γ-convergence of the sequence of functionals J_ε to J. In optimization, however, the primary interest is the convergence of the minimizers (u_ε → u*). The 'fundamental theorem of Γ-convergence' requires the sequence of functionals to be equi-coercive. The paper lacks an explicit proof or a clearly stated assumption regarding the equi-coercivity of J_ε in the control space U. Without this, the Γ-convergence of the objective does not automatically guarantee that the solutions to the relaxed problems are meaningful approximations of the original problem's solution.
  3. [§5, Numerical Examples] The performance of the Rockafellian relaxation is highly sensitive to the parameter ε. While §3 provides the limit for ε → 0, the numerical examples in §5 do not provide a systematic discussion on how to select ε for a given level of distributional uncertainty or noise in the data. Given that ε controls the trade-off between robustness and optimism, the lack of a selection heuristic or a sensitivity analysis weakens the practical utility of the proposed framework.
minor comments (3)
  1. [§2.1, Definition of ψ] The notation for the perturbation mapping ψ(u, ω) could be more clearly distinguished from the state variable y(u, ω). In several places, the dependency on the realization ω is suppressed, which may lead to confusion when considering the Bochner space integrals.
  2. [§5.2, Figure 3] The axis labels in Figure 3 are small and the legend does not clearly state the units for the state variable. Improving the resolution and labeling of the PDF plots would enhance readability.
  3. [General] There is a minor typo in the citation of Rockafellar & Wets (1998) in the introduction; the year is consistently correct elsewhere, but the initial mention has a transposed digit.
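
Major comment 1 turns on the kink in the plus function (x)_+ = max(x, 0), which appears in the Rockafellar-Uryasev form of CVaR, CVaR_β(X) = min_t { t + E[(X − t)_+] / (1 − β) }. The sketch below evaluates that form on equally weighted samples and compares the exact plus function with a smooth surrogate. Two assumptions are made for illustration only: the surrogate (x + sqrt(x² + δ²)) / 2 is not the paper's mollifier (6.3), and β is read as a confidence level in [0, 1), which may differ from the paper's convention.

```python
import math


def plus(x):
    """Exact plus function (x)_+ = max(x, 0); it has a kink at x = 0."""
    return max(x, 0.0)


def smooth_plus(x, delta=1e-3):
    """Smooth surrogate (assumed here, not the paper's mollifier).

    It overestimates plus(x) by at most delta / 2, attained at x = 0.
    """
    return 0.5 * (x + math.sqrt(x * x + delta * delta))


def ru_objective(t, samples, beta, plus_fn=plus):
    """Rockafellar-Uryasev objective t + E[(X - t)_+] / (1 - beta)."""
    n = len(samples)
    return t + sum(plus_fn(x - t) for x in samples) / ((1.0 - beta) * n)


def cvar(samples, beta):
    """Sample CVaR with the exact plus function.

    The objective is convex and piecewise linear in t with kinks at the
    samples, so a minimizer can be found among the sample values.
    """
    return min(ru_objective(t, samples, beta) for t in samples)


samples = [float(k) for k in range(1, 11)]   # losses 1, 2, ..., 10
beta = 0.8
exact = cvar(samples, beta)                  # mean of the worst 20%: (9 + 10) / 2
gap = ru_objective(8.0, samples, beta, smooth_plus) - ru_objective(8.0, samples, beta)
print(exact, gap)
```

For this surrogate the smoothing error in the objective is at most δ / (2(1 − β)) at every t, so the bound loosens as β approaches 1.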

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their thoughtful and rigorous review of our manuscript. The feedback regarding the gap between Γ-convergence of functionals and the convergence of minimizers, as well as the treatment of non-smooth risk measures in the optimality conditions, is particularly valuable. We have addressed these points by strengthening the theoretical requirements for equi-coercivity and clarifying the applicability of our adjoint-based approach to smooth approximations of risk measures. We believe these revisions significantly improve the mathematical completeness and practical utility of the work.

Point-by-point responses
  1. Referee: [§4, Eq. (4.1)–(4.3)] The derivation of the first-order optimality criteria and the associated adjoint system relies on the assumption of Fréchet differentiability of the relaxation function ϕ and the cost function f. However, the manuscript claims to address risk-averse optimization, where the most practically relevant measures (e.g., CVaR) and outlier-resistant penalties are inherently non-smooth.

    Authors: The referee is correct that the adjoint system as written in §4 assumes differentiability, which is not directly satisfied by risk measures like CVaR. In the revised manuscript, we will clarify that §4 applies to smooth approximations (e.g., smoothed CVaR via the Huber loss or entropic risk). Furthermore, we will add a discussion of the subdifferential inclusions required to handle the truly non-smooth case using tools from nonsmooth analysis (e.g., Clarke subdifferentials). This ensures that while our numerical implementation uses a smooth surrogate, the theoretical framework acknowledges the exact non-smooth optimality conditions.

    Revision: yes

  2. Referee: [§3, Theorem 3.2] Theorem 3.2 establishes the Γ-convergence of the sequence of functionals J_ε to J. In optimization, however, the primary interest is the convergence of the minimizers (u_ε → u*). The paper lacks an explicit proof or a clearly stated assumption regarding the equi-coercivity of J_ε in the control space U.

    Authors: We agree that Γ-convergence alone is insufficient for the convergence of minimizers without equi-coercivity. In the revised version, we will explicitly include an assumption on the equi-coercivity of the family {J_ε}. This is typically satisfied in our PDE-constrained setting by the presence of a Tikhonov regularization term (e.g., α/2 ||u||^2) or by restricting the control to a weakly compact subset of the control space. We will update Theorem 3.2 to formally state that the sequence of minimizers u_ε has at least one cluster point, and any such point is a minimizer of the original problem J.

    Revision: yes

  3. Referee: [§5, Numerical Examples] The performance of the Rockafellian relaxation is highly sensitive to the parameter ε. ... the lack of a selection heuristic or a sensitivity analysis weakens the practical utility of the proposed framework.

    Authors: The referee raises a valid point regarding the practical application of the method. In the revised manuscript, we will include a sensitivity study in §5 that shows how the optimal control and the resulting objective value vary with ε. We will also introduce a heuristic selection strategy, drawing an analogy to the discrepancy principle in inverse problems, where ε is chosen relative to the estimated noise level or the expected proportion of outliers in the data distribution.

    Revision: yes
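
For reference, the standard result behind the second exchange (Γ-convergence plus equi-coercivity gives convergence of minimizers) can be written out as follows. It is stated for a metric space for simplicity; the paper's control space U, its topology, and the exact hypotheses of its theorems are not reproduced here and should be treated as unknowns.

```latex
% Standard convergence-of-minimizers statement; the setting (X, d) is an
% assumption made for this sketch, not the paper's.
Let $(X,d)$ be a metric space and $J_\varepsilon, J : X \to (-\infty,+\infty]$.
Assume:
\begin{enumerate}
  \item ($\Gamma$-convergence) for every $u \in X$,
  \begin{itemize}
    \item $J(u) \le \liminf_{\varepsilon \to 0} J_\varepsilon(u_\varepsilon)$
          for every sequence $u_\varepsilon \to u$;
    \item there exists a sequence $u_\varepsilon \to u$ with
          $\limsup_{\varepsilon \to 0} J_\varepsilon(u_\varepsilon) \le J(u)$.
  \end{itemize}
  \item (Equi-coercivity) for every $t \in \mathbb{R}$ there is a compact set
        $K_t \subset X$ with $\{ J_\varepsilon \le t \} \subset K_t$ for all
        $\varepsilon$.
\end{enumerate}
Then $J$ attains its minimum,
\[
  \min_{X} J \;=\; \lim_{\varepsilon \to 0} \, \inf_{X} J_\varepsilon ,
\]
and if $J_\varepsilon(u_\varepsilon) \le \inf_X J_\varepsilon + o(1)$, every
cluster point of $(u_\varepsilon)$ is a minimizer of $J$.
```

Without the second assumption, the sublevel sets of J_ε can drift off to infinity and the infima need not converge, which is the gap the referee points to.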

Circularity Check

1 step flagged

Self-contained mathematical extension of an inherited Rockafellian framework.

specific steps
  1. ansatz smuggled in via citation [Section 2.2, page 5]
    "Following [4, 5], we consider ϕ to be a lower semi-continuous (l.s.c.) and convex function such that ϕ(0) = 0 and ϕ(w) ≥ 0 for all w ∈ W. The choice of ϕ is crucial for the performance of the relaxation."

    The paper adopts the specific functional form of the Rockafellian relaxation as a starting point by citing the authors' own prior work [4, 5]. While the proofs for existence and convergence in the risk-averse setting are novel and provided within the paper, the fundamental modeling 'ansatz' (the perturbation structure) is imported as a pre-validated choice rather than derived from first principles in the context of risk-aversion.

full rationale

The paper is a mathematically rigorous extension of the authors' previous research on Rockafellian relaxation from risk-neutral to risk-averse optimization. The 'circularity' is minimal (score 2) as the paper provides independent, first-principles proofs for its central theoretical claims, including existence (Theorem 3.4) and Γ-convergence (Theorem 3.6). The reliance on self-citation is structural—defining the framework's boundaries based on established methodology—rather than logical, as the results are not tautological consequences of the definitions. The interpretation of the framework as a 'blend of DRO and DOO' is a conceptual mapping to existing optimization paradigms (Distributionally Robust/Optimistic Optimization) supported by numerical results, not a circular renaming of an empirical pattern. The skeptic's concern regarding the Fréchet differentiability of non-smooth risk measures like CVaR is a stated limitation of the optimality criteria in Section 4, which the authors explicitly address by suggesting smooth approximations, rather than a flaw in the circularity of the derivation.

Axiom & Free-Parameter Ledger

2 free parameters · 3 axioms · 0 invented entities

The paper rests on standard assumptions in the field of variational analysis and optimal control, introducing a new method rather than new physical constants or entities.

free parameters (2)
  • ε (relaxation parameter)
    The parameter that determines the 'fine' paid for relaxing constraints or ignoring outliers.
  • ρ (risk-aversion level)
    Standard parameter in risk-averse optimization that scales the weight of the risk measure.
axioms (3)
  • domain assumption Existence of a coherent risk measure R
    The framework assumes the risk measure satisfies standard properties like subadditivity and positive homogeneity.
  • standard math Weak lower semi-continuity of cost functionals
    Necessary for the existence of minimizers in the optimization problem.
  • standard math Reflexivity of the control/state spaces
    Required for the application of standard existence results in PDE-constrained optimization.
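
The first axiom refers to coherence. For readers who want the properties spelled out, the usual four are listed below in the loss convention (larger values are worse). The domain $L^p(\Omega)$ is an assumption of this sketch; the ledger above names only subadditivity and positive homogeneity explicitly.

```latex
% Coherent risk measure in the loss convention; the domain L^p is assumed.
A functional $\mathcal{R} : L^p(\Omega) \to \mathbb{R} \cup \{+\infty\}$ is
\emph{coherent} if, for all $X, Y \in L^p(\Omega)$:
\begin{align*}
  &\text{(monotonicity)}            && X \le Y \ \text{a.s.} \implies
      \mathcal{R}(X) \le \mathcal{R}(Y), \\
  &\text{(translation equivariance)} && \mathcal{R}(X + c) = \mathcal{R}(X) + c
      \quad \text{for all } c \in \mathbb{R}, \\
  &\text{(subadditivity)}           && \mathcal{R}(X + Y) \le
      \mathcal{R}(X) + \mathcal{R}(Y), \\
  &\text{(positive homogeneity)}    && \mathcal{R}(tX) = t\,\mathcal{R}(X)
      \quad \text{for all } t > 0.
\end{align*}
Subadditivity and positive homogeneity together imply convexity:
$\mathcal{R}(\lambda X + (1-\lambda) Y) \le
 \lambda \mathcal{R}(X) + (1-\lambda)\mathcal{R}(Y)$ for $\lambda \in [0,1]$.
```

CVaR at a confidence level in [0, 1) is a standard example satisfying all four properties.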

pith-pipeline@v0.9.0 · 6237 in / 1794 out tokens · 28847 ms · 2026-05-08T02:20:17.820154+00:00 · methodology


Reference graph

Works this paper leans on

13 extracted references · 4 canonical work pages

  1. [1]

    [AAL25] Facundo N. Airaudo, Harbir Antil, and Rainald Löhner, Conditional value at risk for damage identification in structural digital twins, Finite Elements in Analysis and Design 245 (2025), 104316. [ABF+22] Daniel Arndt, Wolfgang Bangerth, Marco Feder, Marc Fehling, Rene Gassmöller, Timo Heister, Luca Heltai, Martin Kronbichler, Matthias Maier, Peter...

  2. [2]

    [ACDR25] Harbir Antil, Sean P Carney, Hugo Díaz, and Johannes O Royset, Rockafellian relaxation for PDE-constrained optimization with distributional uncertainty, SIAM J. Optim. (2025), arXiv:2405.00176. [AD20] Aleksandr Aravkin and Damek Davis, Trimmed statistical estimation via variance reduction, Mathematics of Operations Research 45 (2020), no. 1, 292–32...

  3. [3]

    [ASN20] Rishabh Agarwal, Dale Schuurmans, and Mohammad Norouzi, An optimistic perspective on offline reinforcement learning, International conference on machine learning, PMLR, 2020, pp. 104–114. [BC11] Heinz H. Bauschke and Patrick L. Combettes, Convex analysis and monotone operator theory in Hilbert spaces, CMS Books in Mathematics, Ouvrages de mathémat...

  4. [4]

    [BK22] Jana Björn and Agnieszka Kałamajska, Poincaré inequalities and compact embeddings from Sobolev type spaces into weighted L^q spaces on metric spaces, Journal of Functional Analysis 282 (2022). [BKUW23] Florian Beiser, Brendan Keith, Simon Urbainczyk, and Barbara Wohlmuth, Adaptive sampling strategies for risk-averse stochastic optimization with co...

  5. [5]

    [BZ04] Jinbo Bi and Tong Zhang, Support vector classification with input data uncertainty, Advances in Neural Information Processing Systems 17 (2004). [CCE+24] Louis L Chen, Bobbie Chern, Eric Eckstrand, Amogh Mahapatra, and Johannes O Royset, Mitigating the impact of labeling errors on training via Rockafellian relaxation, arXiv preprint arXiv:2405.20531 (...

  6. [6]

    [DR25] Julio Deride and Johannes O Royset, Approximations of Rockafellians, Lagrangians, and dual functions, SIAM Journal on Optimization 35 (2025), no. 4, 2294–2322. [FHK99] B. Franchi, P. Hajłasz, and P. Koskela, Definitions of Sobolev classes on metric spaces, Annales de l'institut Fourier 49 (1999). [GKL25] Jun-ya Gotoh, Michael Jong Kim, and Andrew EB Lim...

  7. [7]

    [HKTW18] Matthias Heinkenschloss, Boris Kramer, Timur Takhtaganov, and Karen Willcox, Conditional-value-at-risk estimation via reduced-order models, SIAM/ASA Journal on Uncertainty Quantification 6 (2018), no. 4, 1395–1423. [HRKW15] Grani A Hanasusanto, Vladimir Roitch, Daniel Kuhn, and Wolfram Wiesemann, A distributionally robust perspective on uncertai...

  8. [8]

    [KS16] D. P. Kouri and T. M. Surowiec, Risk-averse PDE-constrained optimization using the conditional value-at-risk, SIAM Journal on Optimization 26 (2016). [KS18a] Drew P Kouri and Alexander Shapiro, Optimization of PDEs with Uncertain Inputs, Frontiers in PDE-Constrained Optimization, Springer, 2018, p...

  9. [9]

    Optimistic Robust Optimization With Applications To Machine Learning

    [NSAY+19] Viet Anh Nguyen, Soroosh Shafieezadeh Abadeh, Man-Chung Yue, Daniel Kuhn, and Wolfram Wiesemann, Optimistic distributionally robust optimization for nonparametric likelihood approximation, Advances in Neural Information Processing Systems 32 (2019). [NTM17] Matthew Norton, Akiko Takeda, and Alexander Mafusalov, Optimistic robust optimization with ...

  10. [10]

    [PdHM16] Krzysztof Postek, Dick den Hertog, and Bertrand Melenberg, Computationally tractable counterparts of distributionally robust constraints on risk measures, SIAM Review 58 (2016), no. 4, 603–650. [RCE25] Johannes O Royset, Louis L Chen, and Eric Eckstrand, Rockafellian relaxation and stochastic optimization under perturbations, Mathematics of Operati...

  11. [11]

    [RM22] Hamed Rahimian and Sanjay Mehrotra, Frameworks and results in distributionally robust optimization, Open Journal of Mathematical Optimization 3 (2022), 1–85. [Roc63] R. T. Rockafellar, Convex functions and dual extremum problems, Ph.D. thesis, Harvard University,

  12. [12]

    [Rom13] N. N. Romanovskiĭ, Sobolev spaces on an arbitrary metric measure space: Compactness of embeddings, Sib Math J 54 (2013). [Roy21] Johannes O Royset, Good and bad optimization models: Insights from Rockafellians, Tutorials in Operations Research: Emerging Optimization Methods and Modeling Techniques with Applications, INFORMS, 2021, pp. 131–160. [Ro...

  13. [13]

    [SJ23] Haoming Shen and Ruiwei Jiang, Chance-constrained set covering with Wasserstein ambiguity, Mathematical Programming 198 (2023), no. 1, 621–674. [SZ20] Jun Song and Chaoyue Zhao, Optimistic distributionally robust policy optimization, arXiv preprint arXiv:2006.07815 (2020). [TR25] Lai Tian and Johannes O Roys...