pith. sign in

arxiv: 2505.22916 · v2 · pith:A2CJY3RCnew · submitted 2025-05-28 · 🧮 math.OC

On the Resolution of Stochastic MPECs over Networks: Distributed Implicit Zeroth-Order Gradient Tracking Methods

Pith reviewed 2026-05-22 01:12 UTC · model grok-4.3

classification 🧮 math.OC
keywords stochastic MPECdistributed optimizationzeroth-order methodsgradient trackingvariational inequalitynonconvex optimization
0
0 comments X

The pith

Distributed zeroth-order gradient tracking methods solve stochastic MPECs over networks with optimal complexity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops fully iterative distributed methods for stochastic mathematical programs with equilibrium constraints (SMPECs) where the global objective is shared across a network of agents. It proposes single-stage and two-stage zeroth-order gradient tracking algorithms that approximate the gradient of the implicit objective by using evaluations of the lower-level variational inequality solutions. Under the assumption of unique lower-level solutions and Lipschitz continuity, these methods achieve the best-known complexity bounds for nonsmooth nonconvex stochastic optimization in the exact case, and extend this achievement to the inexact two-stage setting for the first time.

Core claim

Under the assumption that the parametrized VI is uniquely solvable for every upper-level decision, the resulting implicit problem is generally neither convex nor smooth. The proposed single-stage and two-stage zeroth-order distributed gradient tracking optimization methods, where the gradient of a smoothed implicit objective function is approximated using two (possibly inexact) evaluations of the lower-level VI solutions, achieve the best-known complexity bound for centralized nonsmooth nonconvex stochastic optimization in the exact setting of both problems, and for the first time in the inexact setting of the distributed two-stage SMPEC.

What carries the argument

Zeroth-order distributed gradient tracking methods that approximate the gradient of the smoothed implicit objective using evaluations of the lower-level VI solutions.

If this is right

  • In the exact setting, the methods match the complexity of centralized approaches despite the distributed network constraint.
  • The two-stage inexact setting now has the same complexity guarantee as the exact case for the first time.
  • For the single-stage inexact setting, the dimension dependence is improved compared to prior centralized results.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The reduction from distributed SMPECs to implicit nonsmooth problems may apply to other network bilevel structures with unique lower-level solutions.
  • The zeroth-order approximation technique could be tested on larger-scale network instances to check if communication costs remain controlled at the claimed rate.

Load-bearing premise

The parametrized variational inequality has a unique solution for every choice of upper-level decision variables.

What would settle it

Finding an instance of the parametrized VI that has multiple solutions for some upper-level decision, rendering the implicit objective ill-defined, or a numerical test where the observed convergence rate fails to match the claimed bound under the stated assumptions.

Figures

Figures reproduced from arXiv: 2505.22916 by Farzad Yousefian, Mohammadjavad Ebrahimi, Uday V. Shanbhag.

Figure 1
Figure 1. Figure 1: Comparison of Algorithm 1 with ZSOL1s ncvx in terms of the sample averaged global objective function and consensus error five different networks, each with 20 nodes. 32 [PITH_FULL_IMAGE:figures/full_fig_p032_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Comparison of Algorithm 3 with ZSOL2s ncvx in terms of the sample averaged global objective function and consensus error five different networks, each with 20 nodes. 34 [PITH_FULL_IMAGE:figures/full_fig_p034_2.png] view at source ↗
read the original abstract

The mathematical program with equilibrium constraints (MPEC) is a powerful yet challenging class of constrained optimization problems, where the constraints are characterized by a parametrized variational inequality (VI) problem. While efficient algorithms for addressing MPECs and their stochastic variants (SMPECs) have been recently presented, distributed SMPECs over networks pose significant challenges. This work aims to develop fully iterative methods with complexity guarantees for resolving distributed SMPECs in two problem settings: (1) distributed single-stage SMPECs and (2) distributed two-stage SMPECs. In both cases, the global objective function is distributed among a network of agents that communicate cooperatively. Under the assumption that the parametrized VI is uniquely solvable, the resulting implicit problem in upper-level decisions is generally neither convex nor smooth. Under some standard assumptions, including the uniqueness of the solution to the VI problems and the Lipschitz continuity of the implicit global objective function, we propose single-stage and two-stage zeroth-order distributed gradient tracking optimization methods where the gradient of a smoothed implicit objective function is approximated using two (possibly inexact) evaluations of the lower-level VI solutions. In the exact setting of both the single-stage and two-stage problems, we achieve the best-known complexity bound for centralized nonsmooth nonconvex stochastic optimization. This complexity bound is also achieved (for the first time) for our method in addressing the inexact setting of the distributed two-stage SMPEC. In addressing the inexact setting of the single-stage problem, we derive an overall complexity bound, improving the dependence on the dimension compared to the existing results for the centralized SMPECs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes single-stage and two-stage zeroth-order distributed gradient tracking methods for resolving stochastic MPECs over networks. Under the assumption that the parametrized variational inequality is uniquely solvable for each upper-level decision, the problem reduces to an implicit nonsmooth nonconvex stochastic optimization problem. The methods approximate the gradient of a smoothed implicit objective using two evaluations of the lower-level VI solutions and achieve the best-known complexity bounds for centralized nonsmooth nonconvex stochastic optimization in the exact setting, with the bound also achieved for the first time in the inexact two-stage distributed setting.

Significance. If the complexity bounds hold under the stated assumptions, the work is significant for extending centralized SMPEC results to distributed networked agents for the first time in the inexact two-stage case. It provides practical fully iterative methods with explicit rates, using gradient tracking to handle communication and zeroth-order smoothing for the implicit nonsmooth objective.

major comments (2)
  1. The central claim that the proposed methods achieve the best-known complexity bound for centralized nonsmooth nonconvex stochastic optimization (abstract and complexity analysis sections) requires an explicit statement of the target bound (including dependence on epsilon, dimension, and network parameters such as the spectral gap) and a step-by-step verification that distributed tracking errors and inexactness tolerances do not degrade it.
  2. Assumption on unique solvability of the parametrized VI for every upper-level decision (stated in abstract and used to define the implicit objective) is load-bearing for the entire reduction; the paper should supply sufficient conditions (e.g., uniform strong monotonicity of the stochastic VI mapping) to ensure this holds across the parameter space in the distributed stochastic setting, as local violations would invalidate the implicit problem and all subsequent rates.
minor comments (2)
  1. Clarify the dependence of the smoothing parameter on the iteration index and its interaction with the step-size sequence in the complexity derivations.
  2. Add an explicit reference or table comparing the derived rates to the specific centralized result being matched.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive feedback on our manuscript. We address each of the major comments below and outline the revisions we plan to make.

read point-by-point responses
  1. Referee: The central claim that the proposed methods achieve the best-known complexity bound for centralized nonsmooth nonconvex stochastic optimization (abstract and complexity analysis sections) requires an explicit statement of the target bound (including dependence on epsilon, dimension, and network parameters such as the spectral gap) and a step-by-step verification that distributed tracking errors and inexactness tolerances do not degrade it.

    Authors: We agree that an explicit statement of the target complexity bound will enhance clarity. In the revised manuscript, we will explicitly state the best-known complexity bound from the centralized literature for nonsmooth nonconvex stochastic optimization. We will also add a step-by-step verification in the complexity analysis sections demonstrating that the distributed gradient tracking errors and inexactness tolerances are controlled via appropriate choices of step sizes and smoothing parameters so that they are absorbed into the leading term without degrading the bound, including the dependence on the network spectral gap. revision: yes

  2. Referee: Assumption on unique solvability of the parametrized VI for every upper-level decision (stated in abstract and used to define the implicit objective) is load-bearing for the entire reduction; the paper should supply sufficient conditions (e.g., uniform strong monotonicity of the stochastic VI mapping) to ensure this holds across the parameter space in the distributed stochastic setting, as local violations would invalidate the implicit problem and all subsequent rates.

    Authors: The unique solvability assumption is indeed central to the reduction to the implicit problem. While presented as a standing assumption, we acknowledge the value of supplying sufficient conditions. In the revision, we will add a remark providing such conditions, for example uniform strong monotonicity of the stochastic VI mapping (with modulus independent of the upper-level parameter and the stochastic realization) together with Lipschitz continuity, which ensures uniqueness uniformly over the parameter space and is compatible with the distributed stochastic setting. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation grounded in external complexity results

full rationale

The paper explicitly states the uniqueness of the parametrized VI solution as a standing assumption that renders the implicit objective well-defined, rather than deriving or fitting this property from its own outputs. Complexity claims are positioned as matching or improving upon best-known centralized bounds from prior literature on nonsmooth nonconvex stochastic optimization, without any reduction of the target bound to quantities fitted from the paper's experiments or self-referential definitions. No load-bearing self-citations, uniqueness theorems imported from the same authors, or ansatzes smuggled via citation are present in the provided text. The analysis chain for the zeroth-order gradient tracking methods therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claims rest on uniqueness of the lower-level VI solution for every upper-level parameter and on Lipschitz continuity of the resulting implicit objective; these are standard domain assumptions rather than new entities or fitted constants.

free parameters (2)
  • smoothing parameter
    Used to approximate the nonsmooth implicit objective; its value is chosen to balance bias and variance but is not numerically fitted in the abstract.
  • step-size sequence
    Standard diminishing or constant step sizes typical in stochastic zeroth-order methods; not specified numerically here.
axioms (2)
  • domain assumption The parametrized variational inequality admits a unique solution for every upper-level decision.
    Invoked to ensure the implicit upper-level objective is well-defined; stated in the abstract as the key structural assumption.
  • domain assumption The implicit global objective function is Lipschitz continuous.
    Required for the gradient-tracking analysis; listed among the standard assumptions in the abstract.

pith-pipeline@v0.9.0 · 5841 in / 1616 out tokens · 33712 ms · 2026-05-22T01:12:39.235741+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

64 extracted references · 64 canonical work pages · 1 internal anchor

  1. [1]

    Askeland, T

    M. Askeland, T. Burandt, and S. A. Gabriel , A stochastic mpec approach for grid tariff design with demand-side flexibility, Energy Systems, 14 (2023), pp. 707–729

  2. [2]

    D. P. Bertsekas , Stochastic optimization problems with nondifferentiable cost functionals with an application in stochastic programming, in Proceedings of the 1972 IEEE Conference on Decision and Control and 11th Symposium on Adaptive Processes, IEEE, 1972, pp. 555–559

  3. [3]

    Boyd and L

    S. Boyd and L. V andenberghe , Convex optimization, Cambridge university press, 2004

  4. [4]

    F. H. Clarke, Y. S. Ledyaev, R. J. Stern, and P. R. Wolenski , Nonsmooth analysis and control theory, vol. 178, Springer Science & Business Media, 2008

  5. [5]

    Cui and U

    S. Cui and U. V. Shanbhag , On the computation of equilibria in monotone and potential stochastic hierarchical games, Mathematical Programming, 198 (2023), pp. 1227–1285

  6. [6]

    S. Cui, U. V. Shanbhag, and F. Yousefian , Complexity guarantees for an implicit smoothing-enabled method for stochastic MPECs, Mathematical Programming, 198 (2023), pp. 1153–1225

  7. [7]

    Dirkse, M

    S. Dirkse, M. Ferris, and A. Meeraus , Mathematical programs with equilibrium con- straints: Automatic reformulation and solution via constraint optimization, tech.rep., Technical Report NA-02/11, Oxford University Computing Laboratory, 2002

  8. [8]

    Ebrahimi, Y

    M. Ebrahimi, Y. Qiu, S. Cui, and F. Yousefian , Regularized federated methods with universal guarantees for simple bilevel optimization, arXiv preprint arXiv:2503.08634, (2025)

  9. [9]

    Ebrahimi, U

    M. Ebrahimi, U. V. Shanbhag, and F. Yousefian , Distributed gradient tracking meth- ods with guarantees for computing a solution to stochastic mpecs, in 2024 American Control Conference (ACC), 2024, pp. 2182–2187. 35

  10. [10]

    Evgrafov and M

    A. Evgrafov and M. Patriksson , On the existence of solutions to stochastic mathematical programs with equilibrium constraints, Journal of Optimization Theory and Applications, 121 (2004), pp. 65–76

  11. [11]

    F acchinei, H

    F. F acchinei, H. Jiang, and L. Qi , A smoothing method for mathematical programs with equilibrium constraints, Mathematical programming, 85 (1999), p. 107

  12. [12]

    M. L. Flegel and C. Kanzow , Abadie-type constraint qualification for mathematical pro- grams with equilibrium constraints, Journal of Optimization Theory and Applications, 124 (2005), pp. 595–614

  13. [13]

    Fletcher, S

    R. Fletcher, S. Leyffer, D. Ralph, and S. Scholtes , Local convergence of sqp methods for mathematical programs with equilibrium constraints, SIAM Journal on Optimization, 17 (2006), pp. 259–286

  14. [14]

    Fukushima and J.-S

    M. Fukushima and J.-S. Pang , Some feasibility issues in mathematical programs with equi- librium constraints, SIAM Journal on Optimization, 8 (1998), pp. 673–681

  15. [15]

    Ghiasv and, A

    S. Ghiasv and, A. Reisizadeh, M. Alizadeh, and R. Pedarsani,Communication-efficient and decentralized federated minimax optimization, in 2024 60th Annual Allerton Conference on Communication, Control, and Computing, IEEE, 2024, pp. 01–07

  16. [16]

    , Robust decentralized learning with local updates and gradient tracking, IEEE Transactions on Networking, (2025)

  17. [17]

    Goldstein, Optimization of Lipschitz continuous functions, Mathematical Programming, 13 (1977), pp

    A. Goldstein, Optimization of Lipschitz continuous functions, Mathematical Programming, 13 (1977), pp. 14–22

  18. [18]

    Guo, G.-H

    L. Guo, G.-H. Lin, and J. J. Ye ,Solving mathematical programs with equilibrium constraints, Journal of Optimization Theory and Applications, 166 (2015), pp. 234–256

  19. [19]

    Hajinezhad, M

    D. Hajinezhad, M. Hong, and A. Garcia , Zone: Zeroth-order nonconvex multiagent opti- mization over networks, IEEE Transactions on Automatic Control, 64 (2019), pp. 3995–4010

  20. [20]

    B. F. Hobbs, C. B. Metzler, and J.-S. Pang , Strategic gaming analysis for electric power systems: An mpec approach, IEEE transactions on power systems, 15 (2000), pp. 638–645

  21. [21]

    R. A. Horn and C. R. Johnson , Matrix analysis, Cambridge university press, 2012

  22. [22]

    Jalilzadeh, F

    A. Jalilzadeh, F. Yousefian, and M. Ebrahimi , Stochastic approximation for estimating the price of stability in stochastic nash games, ACM Transactions on Modeling and Computer Simulation (TOMACS), DOI: 10.1145/3632525

  23. [23]

    J. Y. Jane,Necessary and sufficient optimality conditions for mathematical programs with equi- librium constraints, Journal of Mathematical Analysis and Applications, 307 (2005), pp. 350– 369

  24. [24]

    Kanzow and A

    C. Kanzow and A. Schw artz , Mathematical programs with equilibrium constraints: en- hanced fritz john-conditions, new constraint qualifications, and improved exact penalty results, SIAM Journal on Optimization, 20 (2010), pp. 2730–2753

  25. [25]

    H. D. Kaushik and F. Yousefian , A method with convergence rates for optimization problems with variational inequality constraints, SIAM Journal on Optimization, 31 (2021), pp. 2171–2198. 36

  26. [26]

    Kornowski and O

    G. Kornowski and O. Shamir , Oracle complexity in nonsmooth nonconvex optimization, Advances in Neural Information Processing Systems, 34 (2021), pp. 324–334

  27. [27]

    , An algorithm with optimal dimension-dependence for zero-order nonsmooth nonconvex stochastic optimization, Journal of Machine Learning Research, 25 (2024), pp. 1–14

  28. [28]

    Lakshmanan and D

    H. Lakshmanan and D. P. De F arias , Decentralized resource allocation in dynamic net- works of agents, SIAM Journal on Optimization, 19 (2008), pp. 911–940

  29. [29]

    La wphongpanich and D

    S. La wphongpanich and D. W. Hearn , An mpec approach to second-best toll pricing, Mathematical programming, 101 (2004), pp. 33–55

  30. [30]

    G.-H. Lin, X. Chen, and M. Fukushima ,New restricted ncp functions and their applications to stochastic ncp and stochastic mpec, Optimization, 56 (2007), pp. 641–653

  31. [31]

    G.-H. Lin, X. Chen, and M. Fukushima , Solving stochastic mathematical programs with equilibrium constraints via approximation and smoothing implicit programming with penaliza- tion, Mathematical Programming, 116 (2009), pp. 343–368

  32. [32]

    Lin and M

    G.-H. Lin and M. Fukushima , Stochastic equilibrium problems and stochastic mathematical programs with equilibrium constraints: A survey, Pacific Journal of Optimization, 6 (2010), pp. 455–482

  33. [33]

    T. Lin, Z. Zheng, and M. Jordan , Gradient-free methods for deterministic and stochastic nonsmooth nonconvex optimization, Advances in Neural Information Processing Systems, 35 (2022), pp. 26160–26175

  34. [34]

    H. Liu, W. Yu, W. X. Zheng, A. Nedić, and Y. Zhu , Distributed constrained optimization algorithms with linear convergence rate over time-varying unbalanced graphs, Automatica, 159 (2024), p. 111346

  35. [35]

    Luo, J.-S

    Z.-Q. Luo, J.-S. Pang, and D. Ralph , Mathematical programs with equilibrium constraints, Cambridge University Press, 1996

  36. [36]

    Marrinan, U

    L. Marrinan, U. V. Shanbhag, and F. Yousefian , Zeroth-order gradient and quasi-newton methods for nonsmooth nonconvex stochastic optimization , arXiv preprint arXiv:2401.08665, (2023)

  37. [37]

    Mhanna and M

    E. Mhanna and M. Assaad , Single point-based distributed zeroth-order optimization with a non-convex stochastic objective function, in International Conference on Machine Learning, PMLR, 2023, pp. 24701–24719

  38. [38]

    J. J. Moreau, Propriétés des applications «prox», Comptes rendus hebdomadaires des séances de l’Académie des sciences, 256 (1963), pp. 1069–1071

  39. [39]

    Nesterov and V

    Y. Nesterov and V. Spokoiny , Random gradient-free minimization of convex functions, Foundations of Computational Mathematics, 17 (2017), pp. 527–566

  40. [40]

    J. V. Outrata, A generalized mathematical program with equilibrium constraints, SIAM Jour- nal on Control and Optimization, 38 (2000), pp. 1623–1638

  41. [41]

    Patriksson and L

    M. Patriksson and L. Wynter , Stochastic mathematical programs with equilibrium con- straints, Operations research letters, 25 (1999), pp. 159–167. 37

  42. [42]

    Pu and A

    S. Pu and A. Nedić , Distributed stochastic gradient tracking methods, Mathematical Pro- gramming, 187 (2021), pp. 409–457

  43. [43]

    S. Pu, W. Shi, J. Xu, and A. Nedić , Push–pull gradient methods for distributed optimization in networks, IEEE Transactions on Automatic Control, 66 (2020), pp. 1–16

  44. [44]

    Y. Qiu, U. Shanbhag, and F. Yousefian , Zeroth-order methods for nondifferentiable, nonconvex, and hierarchical federated optimization, Advances in Neural Information Processing Systems, 36 (2023)

  45. [45]

    Y. Qiu, U. V. Shanbhag, and F. Yousefian , Zeroth-order federated methods for stochastic mpecs and nondifferentiable nonconvex hierarchical optimization, Mathematics of Operations Research (under review), arXiv preprint arXiv:2309.13024v3, (2024)

  46. [46]

    Qu and N

    G. Qu and N. Li , Harnessing smoothness to accelerate distributed optimization, IEEE Trans- actions on Control of Network Systems, 5 (2017), pp. 1245–1260

  47. [47]

    R. T. Rockafellar , Convex functions and dual extremum problems, PhD thesis, Harvard University, 1963

  48. [48]

    Saadatniaki, R

    F. Saadatniaki, R. Xin, and U. A. Khan , Decentralized optimization over time-varying directed graphs with row and column-stochastic matrices, IEEE Transactions on Automatic Control, 65 (2020), pp. 4769–4780

  49. [49]

    Samadi and F

    S. Samadi and F. Yousefian , Improved guarantees for optimal Nash equilibrium seeking and bilevel variational inequalities, SIAM Journal on Optimization, 35 (2025), pp. 369–399

  50. [50]

    Scholtes and M

    S. Scholtes and M. Stöhr , Exact penalization of mathematical programs with equilibrium constraints, SIAM Journal on Control and Optimization, 37 (1999), pp. 617–652

  51. [51]

    Shapiro and H

    A. Shapiro and H. Xu , Stochastic mathematical programs with equilibrium constraints, mod- elling and sample average approximation, Optimization, 57 (2008), pp. 395–418

  52. [52]

    H. D. Sherali, A. L. Soyster, and F. H. Murphy , Stackelberg-nash-cournot equilibria: characterizations and computations, Operations Research, 31 (1983), pp. 253–276

  53. [53]

    Steffensen and M

    S. Steffensen and M. Ulbrich , A new relaxation scheme for mathematical programs with equilibrium constraints, SIAM Journal on Optimization, 20 (2010), pp. 2504–2539

  54. [54]

    V. Steklov, Sur les expressions asymptotiques decertaines fonctions dfinies par les quations diffrentielles du second ordre et leers applications au problme du dvelopement d’une fonction arbitraire en sries procdant suivant les diverses fonctions, Comm. Charkov Math. Soc, 2 (1907), pp. 97–199

  55. [55]

    Y. Sun, G. Scutari, and A. Daneshmand , Distributed optimization based on gradient tracking revisited: Enhancing convergence rate via surrogation, SIAM Journal on Optimization, 32 (2022), pp. 354–385

  56. [56]

    Y. Tang, J. Zhang, and N. Li , Distributed zero-order algorithms for nonconvex multiagent optimization, IEEE Transactions on Control of Network Systems, 8 (2020), pp. 269–281

  57. [57]

    M. J. W ainwright, High-dimensional statistics: A non-asymptotic viewpoint, vol. 48, Cam- bridge university press, 2019. 38

  58. [58]

    Wogrin, E

    S. Wogrin, E. Centeno, and J. Barquín , Generation capacity expansion in liberalized electricity markets: A stochastic MPEC approach, IEEE Transactions on Power Systems, 26 (2011), pp. 2526–2532

  59. [59]

    Xin and U

    R. Xin and U. A. Khan , A linear algorithm for optimization over directed graphs with geo- metric convergence, IEEE Control Systems Letters, 2 (2018), pp. 315–320

  60. [60]

    R. Xin, U. A. Khan, and S. Kar , An improved convergence analysis for decentralized on- line stochastic non-convex optimization, IEEE Transactions on Signal Processing, 69 (2021), pp. 1842–1858

  61. [61]

    Xu and J

    H. Xu and J. J. Ye , Necessary optimality conditions for two-stage stochastic programs with equilibrium constraints, SIAM Journal on Optimization, 20 (2010), pp. 1685–1715

  62. [62]

    Yousefian, A

    F. Yousefian, A. Nedić, and U. V. Shanbhag , On stochastic gradient and subgradient methods with adaptive steplength sequences, Automatica, 48 (2012), pp. 56–67

  63. [63]

    Zhang, H

    J. Zhang, H. Lin, S. Jegelka, S. Sra, and A. Jadbabaie ,Complexity of finding stationary points of nonconvex nonsmooth functions, in International Conference on Machine Learning, PMLR, 2020, pp. 11173–11182. 7 Appendix Proof of Lemma 1.(i) The first equation can be shown in a similar vein to [6, Lemma 1]. To show the second equation, we have n 2η Ev∈S [(h...

  64. [64]

    For k + 2, we have2 π · (k+2)!! (k+1)!! = 2 π · k+2 k+1 · k!! (k−1)!!

    Suppose the inequality holds for some even k ≥ 2, i.e., 2 π · k!! (k−1)!! ≤ √ k. For k + 2, we have2 π · (k+2)!! (k+1)!! = 2 π · k+2 k+1 · k!! (k−1)!! . Invoking the inductive hypothesis, we obtain 2 π · (k+2)!! (k+1)!! ≤ k+2 k+1 · √ k. Recall that we showedk+2 k+1 · √ k ≤ √ k + 2, completing the proof.□ 39 Proof of Lemma 8.Invoking the gradient tracking ...