pith. sign in

arxiv: 2605.16685 · v1 · pith:3ZUYUYCWnew · submitted 2026-05-15 · 🧮 math.OC · cs.GT

Black-Box Followers, White-Box Leaders: Partial Zeroth-Order Methods for MPECs

Pith reviewed 2026-05-20 15:46 UTC · model grok-4.3

classification 🧮 math.OC cs.GT
keywords mathematical programs with equilibrium constraintszeroth-order optimizationpartial gradientsGoldstein subdifferentialbilevel optimizationrouting gamessecurity gamesvariance reduction
0
0 comments X

The pith

PZOS algorithm combines exact leader gradients with zeroth-order follower estimates for lower variance in MPECs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops methods for mathematical programs with equilibrium constraints where the leader knows their own cost function but must query the followers' response at points. It proposes using zeroth-order tools selectively on the unknown response while using exact gradients on the known part. The resulting PZOS method is shown to have strictly lower variance than full black-box zeroth-order approaches and converges to stationary points. Applications to routing and security games confirm practical benefits in speed and performance.

Core claim

The paper claims that by deploying zeroth-order tools only on the followers' unknown response mapping and using exact partial gradients for the leader's cost in a chain-rule fashion, the PZOS algorithm achieves a strictly lower variance bound than black-box baselines and converges to both standard and partial Goldstein stationary points in MPECs.

What carries the argument

The PZOS algorithm that applies a chain-rule-inspired update combining exact partial gradients of the leader's cost with zeroth-order Jacobian estimates of the followers' response.

If this is right

  • Convergence to partial Goldstein stationary points tailored to the composite structure.
  • Improved convergence speed and lower estimator variance in numerical tests on toll optimization and security games.
  • Robust performance even when limited to few queries per iteration.
  • Better objective values compared to black-box methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The selective use of information could extend to other optimization problems with partial model knowledge.
  • Defining a partial subdifferential may help in analyzing non-smooth bilevel problems more precisely.
  • Further analysis could explore how the variance reduction affects scalability to larger problems.

Load-bearing premise

The followers' response mapping is queryable at chosen points and the composite objective admits a well-defined partial Goldstein subdifferential.

What would settle it

A calculation showing that the variance bound for PZOS is not strictly lower than the black-box method, or an experiment where the algorithm fails to converge to the defined stationary points under the stated conditions.

Figures

Figures reproduced from arXiv: 2605.16685 by Dario Paccagnan, Miriam Fischer.

Figure 1
Figure 1. Figure 1: Empirical second moments as a function of problem dimension. Lemma 1 (Second-moment bound). Under Assumptions 1 and 2, let gt be the estimator from (4) and let get be the black-box estimator from (2). Then, E[∥gt∥ 2 | xt] ≤ k2 dx L 2 fL 2 y + L 2 f (1 + 2Ly), E[∥get∥ 2 | xt] ≤ k2 dx L 2 fL 2 y + k2 dx L 2 f (1 + 2Ly), where k2 ≥ 1 is a universal constant from [24, Lem. 10], and the first bound is no larger… view at source ↗
Figure 2
Figure 2. Figure 2: Algorithm performance on heterogeneous routing games. Median normalized objective trajectories with in￾terquartile range (dark shading) and 10th– 90th percentile range (light shading). For a given instance, both algorithms are run with the same starting point and noise directions vt ∼ U(S dx ), so that per￾formance differences are not attributable to sampling varia￾tion. Since objective values depend on pr… view at source ↗
Figure 3
Figure 3. Figure 3: Normalized objective value (higher is better) achieved by the baseline [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Effect of batch size Q on algorithm performance for heterogeneous routing games: PZOS at Q = 1 compared against ZOS at Q ∈ {1, 2, 4}, showing median normalized objective with IQR (dark shading) and 10th–90th percentile bands (light shading), on the same 225 runs as Sec. 5.1. 5.2 Security games We consider a Stackelberg security game in which a defender allocates investments across n targets to minimize the… view at source ↗
Figure 5
Figure 5. Figure 5: Algorithm performance on secu￾rity games. Median normalized objective with interquartile range (dark shading) and 10th–90th percentile (light shading). Baseline and Setup. We compare PZOS against ZOS, the same baseline as in Sec. 5.1, and normalize trajectories as described therein. We use 427 randomly generated instances with n ∈ [2, 300] targets, vi , wi ∼ U([1, 5]), bi ∼ U([0.1, 0.5]), c D i , cA i ∼ U(… view at source ↗
Figure 6
Figure 6. Figure 6: Normalized objective value (lower is better) achieved by the baseline [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Normalized objective value (higher is better) for the heterogeneous routing game of [PITH_FULL_IMAGE:figures/full_fig_p013_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Normalized objective value (higher is better) for smoothing parameter [PITH_FULL_IMAGE:figures/full_fig_p013_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Algorithm performance over heterogeneous routing games from Section 5.1 with non [PITH_FULL_IMAGE:figures/full_fig_p014_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Plot of Figure 1 with error bars. Mean [PITH_FULL_IMAGE:figures/full_fig_p014_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Graphs of F(x) = f(x, y∗ (x)) for Examples 1 and 2. Left: F(x) = (x − 1)2 + |x| is non-differentiable at x0 = 0 (red dot). For δ ≥ 1 2 , the Goldstein subdifferential reaches the minimizer z = 1 2 (green diamond) and contains zero, while the partial Goldstein subdifferential equals [−3, −1] for all δ > 0. Right: F(x) = |x| − (x − 1)2 is non-differentiable at x0 = 0 (red dot). The non-differentiability of … view at source ↗
read the original abstract

We study mathematical programs with equilibrium constraints, in which a leader knows their own cost function, but lacks a model of the followers' response. Instead, the leader can only query this response at specific points. While this setting precludes the use of gradient-based methods, existing zeroth-order approaches treat the composed objective entirely as a black box, deploying zeroth-order tools across both the leader and follower. Such approaches are inefficient, as they discard information the leader already possesses about their own cost function. In this work we instead propose to deploy zeroth-order tools only where they are truly needed: to handle the unknown, non-smooth followers' response. Specifically, we first propose PZOS, an algorithm that combines exact partial gradients of the leader's cost with zeroth-order Jacobian estimates of the followers' response in a chain-rule-inspired manner, and establish that it achieves a strictly lower variance bound than the black-box baseline. Second, we introduce the partial Goldstein subdifferential, a stationarity notion tailored to this composite structure, and prove convergence of our algorithm to both standard and partial Goldstein stationary points. Finally, we validate our method on two application domains -- toll optimization in routing games and defense-attack investment in security games -- demonstrating consistent improvements over black-box baselines in convergence speed, objective value, and estimator variance, with robust performance even under few queries per iteration.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes PZOS, a partial zeroth-order algorithm for mathematical programs with equilibrium constraints (MPECs) in which the leader has exact knowledge of their own cost but only query access to the followers' response. PZOS combines exact partial gradients of the leader objective with zeroth-order Jacobian estimates of the follower mapping via a chain-rule construction, claims a strictly lower variance bound than fully black-box zeroth-order baselines, introduces the partial Goldstein subdifferential as a tailored stationarity notion, proves convergence to both standard and partial Goldstein points, and reports empirical gains on toll optimization in routing games and defense-attack investment in security games.

Significance. If the variance reduction and convergence results hold under the stated assumptions, the work offers a principled way to exploit partial white-box structure in composite zeroth-order problems, which could improve sample efficiency in bilevel and equilibrium-constrained optimization. The empirical validation on two distinct game-theoretic domains provides concrete evidence of faster convergence and lower estimator variance. The partial Goldstein subdifferential is a novel technical device whose utility, if rigorously justified, would be of interest to the mathematical programming community.

major comments (2)
  1. [§4, variance theorem] §4 (Variance Analysis) and the statement of the main variance theorem: the claim that PZOS achieves a strictly lower variance bound than the black-box baseline rests on a chain-rule decomposition that combines exact leader partials with the ZO Jacobian estimate. When the follower response is set-valued or non-differentiable, the partial Goldstein subdifferential may not admit the same one-sided directional properties required for the bound to remain strict; the black-box and partial methods could then target inequivalent stationarity notions, undermining the strict comparison. Please supply the precise statement of the variance lemma and the additional assumptions (if any) that guarantee the inequality holds for non-smooth responses.
  2. [Definition 3.2 and convergence theorem] Definition 3.2 (partial Goldstein subdifferential) and the convergence theorem that follows: the stationarity notion is introduced specifically for the composite structure, yet the proof that PZOS converges to both standard Goldstein and partial Goldstein points does not clarify whether the partial notion is strictly weaker or whether the algorithm can converge to a partial point that fails to be a standard stationary point. A concrete counter-example or equivalence result under mild conditions on the response mapping would strengthen the claim.
minor comments (2)
  1. [Experimental section] The abstract states that the method is validated 'with robust performance even under few queries per iteration,' but the experimental section does not report the exact number of queries per iteration or the corresponding variance reduction factor; adding a table with these quantities would improve reproducibility.
  2. [Notation section] Notation for the leader's partial gradient and the estimated Jacobian should be introduced once and used consistently; several places appear to reuse symbols without redefinition.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below, providing clarifications based on the paper's assumptions and indicating revisions where appropriate to improve clarity.

read point-by-point responses
  1. Referee: [§4, variance theorem] §4 (Variance Analysis) and the statement of the main variance theorem: the claim that PZOS achieves a strictly lower variance bound than the black-box baseline rests on a chain-rule decomposition that combines exact leader partials with the ZO Jacobian estimate. When the follower response is set-valued or non-differentiable, the partial Goldstein subdifferential may not admit the same one-sided directional properties required for the bound to remain strict; the black-box and partial methods could then target inequivalent stationarity notions, undermining the strict comparison. Please supply the precise statement of the variance lemma and the additional assumptions (if any) that guarantee the inequality holds for non-smooth responses.

    Authors: The variance bound in Theorem 4.1 is established under the manuscript's core assumptions, including Lipschitz continuity of the follower response mapping (Assumption 3.1) and the use of measurable selections for set-valued responses. The partial Goldstein subdifferential is defined via a chain-rule construction in the Clarke sense, which preserves the necessary directional properties for the variance comparison even in the non-smooth case. Because the leader partial gradient is computed exactly, its variance contribution is identically zero, yielding a strictly lower bound than the full black-box estimator regardless of non-smoothness in the follower component. We will add the explicit statement of the supporting variance lemma (currently Lemma 4.2) together with a dedicated paragraph listing all assumptions required for the strict inequality in the revised manuscript. revision: yes

  2. Referee: [Definition 3.2 and convergence theorem] Definition 3.2 (partial Goldstein subdifferential) and the convergence theorem that follows: the stationarity notion is introduced specifically for the composite structure, yet the proof that PZOS converges to both standard Goldstein and partial Goldstein points does not clarify whether the partial notion is strictly weaker or whether the algorithm can converge to a partial point that fails to be a standard stationary point. A concrete counter-example or equivalence result under mild conditions on the response mapping would strengthen the claim.

    Authors: Under the Lipschitz continuity assumption on the follower mapping (Assumption 3.1), the partial Goldstein subdifferential coincides with the standard Goldstein subdifferential of the composed objective. This follows from the chain rule holding in the Clarke subdifferential for Lipschitz functions, so the two stationarity notions are equivalent in the setting of the paper. Consequently, any point satisfying the partial condition also satisfies the standard condition, and the convergence theorem already guarantees both. We will insert a short proposition (or remark) in Section 3 establishing this equivalence under the stated assumptions, thereby clarifying that the partial notion is not strictly weaker here. revision: yes

Circularity Check

0 steps flagged

No circularity: variance bound and convergence derived from problem structure without reduction to self-defined inputs or self-citations.

full rationale

The paper's central claims rest on a mathematical derivation of a variance bound for PZOS that combines exact leader partials with ZO estimates of the follower Jacobian, plus convergence to partial Goldstein stationary points. These steps are presented as direct consequences of the composite objective and the stationarity definition introduced in the paper, without any quoted reduction to a fitted parameter, renamed empirical pattern, or load-bearing self-citation whose validity depends on the current result. The partial Goldstein subdifferential is defined explicitly to handle the non-smooth follower response, and the variance comparison is stated as strictly lower than the black-box baseline by construction of the update rule. No equations or sections in the provided text exhibit a self-definitional loop or a prediction that is statistically forced by prior fitting within the same work. The derivation therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

Based on abstract only: the paper relies on standard convergence analysis for zeroth-order methods and introduces a new stationarity concept; no explicit free parameters or invented physical entities are mentioned.

axioms (1)
  • domain assumption Existence of a well-defined partial Goldstein subdifferential for the composite objective
    Invoked when defining the stationarity notion that the algorithm targets.
invented entities (1)
  • partial Goldstein subdifferential no independent evidence
    purpose: Stationarity notion tailored to the leader-follower composite structure
    New concept introduced to prove convergence of PZOS

pith-pipeline@v0.9.0 · 5778 in / 1338 out tokens · 35359 ms · 2026-05-20T15:46:02.651634+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages

  1. [1]

    Optnet: Differentiable optimization as a layer in neural networks

    Brandon Amos and J Zico Kolter. Optnet: Differentiable optimization as a layer in neural networks. InInternational conference on machine learning, pages 136–145. PMLR, 2017

  2. [2]

    Set-valued maps and viability theory

    Jean-Pierre Aubin and Arrigo Cellina.Differential inclusions. Set-valued maps and viability theory. Springer, 1984

  3. [3]

    Interior-point algorithms, penalty methods and equilibrium problems.Computational Optimization and Applications, 34 (2):155–182, 2006

    Hande Y Benson, Arun Sen, David F Shanno, and Robert J Vanderbei. Interior-point algorithms, penalty methods and equilibrium problems.Computational Optimization and Applications, 34 (2):155–182, 2006

  4. [4]

    Faster gradient-free algorithms for nonsmooth nonconvex stochastic optimization

    Lesi Chen, Jing Xu, and Luo Luo. Faster gradient-free algorithms for nonsmooth nonconvex stochastic optimization. InInternational Conference on Machine Learning, pages 5219–5233. PMLR, 2023

  5. [5]

    SIAM, 1990

    Frank H Clarke.Optimization and nonsmooth analysis. SIAM, 1990

  6. [6]

    On the computation of equilibria in monotone and potential stochastic hierarchical games.Mathematical Programming, 198(2):1227–1285, 2023

    Shisheng Cui and Uday V Shanbhag. On the computation of equilibria in monotone and potential stochastic hierarchical games.Mathematical Programming, 198(2):1227–1285, 2023

  7. [7]

    Complexity guarantees for an implicit smoothing-enabled method for stochastic MPECs.Mathematical Programming, 198(2):1153– 1225, 2023

    Shisheng Cui, Uday V Shanbhag, and Farzad Yousefian. Complexity guarantees for an implicit smoothing-enabled method for stochastic MPECs.Mathematical Programming, 198(2):1153– 1225, 2023

  8. [8]

    A two-sided relaxation scheme for mathematical programs with equilibrium constraints.SIAM Journal on Optimization, 16(2):587–609, 2005

    Victor DeMiguel, Michael P Friedlander, Francisco J Nogales, and Stefan Scholtes. A two-sided relaxation scheme for mathematical programs with equilibrium constraints.SIAM Journal on Optimization, 16(2):587–609, 2005

  9. [9]

    Continuous-time best-response and related dynamics in tullock contests with convex costs.arXiv preprint arXiv:2402.08541, 2024

    Edith Elkind, Abheek Ghosh, and Paul W Goldberg. Continuous-time best-response and related dynamics in tullock contests with convex costs.arXiv preprint arXiv:2402.08541, 2024

  10. [10]

    Flaxman, Adam Tauman Kalai, and H

    Abraham D. Flaxman, Adam Tauman Kalai, and H. Brendan McMahan. Online convex optimization in the bandit setting: gradient descent without a gradient. InProceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’05, page 385–394, USA, 2005. Society for Industrial and Applied Mathematics. ISBN 0898715857

  11. [11]

    Local convergence of SQP methods for mathematical programs with equilibrium constraints.SIAM Journal on Optimization, 17(1):259–286, 2006

    Roger Fletcher, Sven Leyffer, Danny Ralph, and Stefan Scholtes. Local convergence of SQP methods for mathematical programs with equilibrium constraints.SIAM Journal on Optimization, 17(1):259–286, 2006

  12. [12]

    Big hype: Best intervention in games via distributed hypergradient descent

    Panagiotis D Grontas, Giuseppe Belgioioso, Carlo Cenedese, Marta Fochesato, John Lygeros, and Florian Dörfler. Big hype: Best intervention in games via distributed hypergradient descent. IEEE Transactions on Automatic Control, 69(12):8338–8353, 2024

  13. [13]

    Defending against multiple different attackers.European Journal of Operational Research, 211(2):370–384, 2011

    Kjell Hausken and Vicki M Bier. Defending against multiple different attackers.European Journal of Operational Research, 211(2):370–384, 2011

  14. [14]

    Theoretical and numerical com- parison of relaxation methods for mathematical programs with complementarity constraints

    Tim Hoheisel, Christian Kanzow, and Alexandra Schwartz. Theoretical and numerical com- parison of relaxation methods for mathematical programs with complementarity constraints. Mathematical Programming, 137(1):257–288, 2013

  15. [15]

    A tullock-contest-based approach for cyber security investments.Annals of Operations Research, 320(1):61–84, 2023

    David Iliaev, Sigal Oren, and Ella Segev. A tullock-contest-based approach for cyber security investments.Annals of Operations Research, 320(1):61–84, 2023. 10

  16. [16]

    Oracle complexity in nonsmooth nonconvex optimization

    Guy Kornowski and Ohad Shamir. Oracle complexity in nonsmooth nonconvex optimization. Journal of Machine Learning Research, 23(314):1–44, 2022

  17. [17]

    An algorithm with optimal dimension-dependence for zero- order nonsmooth nonconvex stochastic optimization.Journal of Machine Learning Research, 25(122):1–14, 2024

    Guy Kornowski and Ohad Shamir. An algorithm with optimal dimension-dependence for zero- order nonsmooth nonconvex stochastic optimization.Journal of Machine Learning Research, 25(122):1–14, 2024

  18. [18]

    Gradient-free methods for deterministic and stochastic nonsmooth nonconvex optimization.Advances in Neural Information Processing Systems, 35:26160–26175, 2022

    Tianyi Lin, Zeyu Zheng, and Michael Jordan. Gradient-free methods for deterministic and stochastic nonsmooth nonconvex optimization.Advances in Neural Information Processing Systems, 35:26160–26175, 2022

  19. [19]

    Lindberg and Leonid Engelson

    P.O. Lindberg and Leonid Engelson. Tolled multi-class traffic equilibria and toll sensitivities. EURO Journal on Transportation and Logistics, 4(2):197–222, 2015. ISSN 2192-4376

  20. [20]

    Zeroth-order methods for constrained nonconvex nonsmooth stochastic optimization

    Zhuanghua Liu, Cheng Chen, Luo Luo, and Bryan Kian Hsiang Low. Zeroth-order methods for constrained nonconvex nonsmooth stochastic optimization. In Ruslan Salakhutdinov, Zico Kolter, Katherine Heller, Adrian Weller, Nuria Oliver, Jonathan Scarlett, and Felix Berkenkamp, editors,Proceedings of the 41st International Conference on Machine Learning, volume 2...

  21. [21]

    Random gradient-free minimization of convex functions

    Yurii Nesterov and Vladimir Spokoiny. Random gradient-free minimization of convex functions. Foundations of Computational Mathematics, 17(2):527–566, 2017

  22. [22]

    On the constant positive linear dependence condition and its application to SQP methods.SIAM Journal on Optimization, 10(4):963–981, 2000

    Liqun Qi and Zengxin Wei. On the constant positive linear dependence condition and its application to SQP methods.SIAM Journal on Optimization, 10(4):963–981, 2000

  23. [23]

    An interior point method for mathematical programs with complementarity constraints (mpccs).SIAM Journal on Optimization, 15(3): 720–750, 2005

    Arvind U Raghunathan and Lorenz T Biegler. An interior point method for mathematical programs with complementarity constraints (mpccs).SIAM Journal on Optimization, 15(3): 720–750, 2005

  24. [24]

    An optimal algorithm for bandit and zero-order convex optimization with two-point feedback.Journal of Machine Learning Research, 18(52):1–11, 2017

    Ohad Shamir. An optimal algorithm for bandit and zero-order convex optimization with two-point feedback.Journal of Machine Learning Research, 18(52):1–11, 2017

  25. [25]

    On the convergence theory for Hessian-free bilevel algorithms.Advances in Neural Information Processing Systems, 35:4136–4149, 2022

    Daouda Sow, Kaiyi Ji, and Yingbin Liang. On the convergence theory for Hessian-free bilevel algorithms.Advances in Neural Information Processing Systems, 35:4136–4149, 2022

  26. [26]

    Cambridge University Press, 1996

    Rangarajan K Sundaram.A first course in optimization theory. Cambridge University Press, 1996

  27. [27]

    A zeroth-order stochastic implicit method for bilevel-structured actor-critic schemes.Science China Information Sciences, 68(5):150204, 2025

    Haochen Tao, Shisheng Cui, Zhuo Li, and Jian Sun. A zeroth-order stochastic implicit method for bilevel-structured actor-critic schemes.Science China Information Sciences, 68(5):150204, 2025

  28. [28]

    Efficient rent seeking

    Gordon Tullock. Efficient rent seeking. In James M. Buchanan, Robert D. Tollison, and Gordon Tullock, editors,Toward a Theory of the Rent-Seeking Society, pages 97–112. Texas A&M University Press, College Station, TX, 1980

  29. [29]

    Cambridge University Press, 2nd edition, 2018

    Roman Vershynin.High-dimensional probability: An introduction with applications in data science, volume 47. Cambridge University Press, 2nd edition, 2018

  30. [30]

    The multi-class, multi-criteria traffic network equilibrium and systems optimum problem.Transportation Research Part B: Methodological, 38(1):1–15, 2004

    Hai Yang and Hai-Jun Huang. The multi-class, multi-criteria traffic network equilibrium and systems optimum problem.Transportation Research Part B: Methodological, 38(1):1–15, 2004

  31. [31]

    On stochastic gradient and subgradient methods with adaptive steplength sequences.Automatica, 48(1):56–67, 2012

    Farzad Yousefian, Angelia Nedi´c, and Uday V Shanbhag. On stochastic gradient and subgradient methods with adaptive steplength sequences.Automatica, 48(1):56–67, 2012

  32. [32]

    Complexity of finding stationary points of nonconvex nonsmooth functions

    Jingzhao Zhang, Hongzhou Lin, Stefanie Jegelka, Suvrit Sra, and Ali Jadbabaie. Complexity of finding stationary points of nonconvex nonsmooth functions. InInternational Conference on Machine Learning, pages 11173–11182. PMLR, 2020. 11 Technical appendices and supplementary material A Supplementary algorithms and graphs Algorithm 2Zero-Order Smoothing (ZOS...

  33. [33]

    Goldstein subdifferential.For δ >0 , F is differentiable on B(0, δ)\ {0} with F ′(z) = 2z−1 for z >0 and F ′(z) = 2z−3 for z <0

    = 3 4. Goldstein subdifferential.For δ >0 , F is differentiable on B(0, δ)\ {0} with F ′(z) = 2z−1 for z >0 and F ′(z) = 2z−3 for z <0 . Collecting Clarke gradients over B(0, δ), the Goldstein subdifferential atx 0 = 0equals ∂G δ F(0) = conv ∪z∈B(x,δ) ∂F(z) = [−2δ−3,2δ−1]. This interval contains zero if and only ifδ≥ 1 2: -δ < 1 2: all elements are negati...

  34. [34]

    Hence x= 0 is not a (δ, ε)-Goldstein stationary point for δ∈(0,1) and ε <1 , nor for δ∈[1, 3 2) andε <3−2δ

    = 0, so0∈∂ G δ F(0). Hence x= 0 is not a (δ, ε)-Goldstein stationary point for δ∈(0,1) and ε <1 , nor for δ∈[1, 3 2) andε <3−2δ. Partial Goldstein subdifferential.The Jacobian ofy ∗ at differentiable pointsz /∈ {0,1}is J y∗(z) =    (−1,−1) ⊤, z <0, (1,−1) ⊤,0< z <1, (1,1) ⊤, z >1, with Clarke Jacobians ∂y ∗(0) = conv{(−1,−1) ⊤,(1,−1) ⊤} and ∂y ∗(1) = c...