Restarted Accelerated Primal-Dual Algorithms with Adaptive Stepsizes for Nonlinear Conic Constrained Convex Optimization

Jinxin Wang; Necdet Serhat Aybat

arxiv: 2605.29291 · v1 · pith:AVXHYVNWnew · submitted 2026-05-28 · 🧮 math.OC

Restarted Accelerated Primal-Dual Algorithms with Adaptive Stepsizes for Nonlinear Conic Constrained Convex Optimization

Necdet Serhat Aybat , Jinxin Wang This is my paper

Pith reviewed 2026-06-29 06:25 UTC · model grok-4.3

classification 🧮 math.OC

keywords primal-dual methodsconic programmingmetric subregularitylinear convergencequadratic growthQCQPadaptive stepsizesrestarts

0 comments

The pith

Restarted accelerated primal-dual algorithms achieve global linear convergence for nonlinear conic convex optimization under metric subregularity of the KKT mapping.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops restarted versions of accelerated primal-dual methods with adaptive and non-monotone backtracking stepsizes for convex problems with nonlinear conic constraints. These algorithms handle minimax reformulations that have non-bilinear terms, which standard methods cannot address directly. Under the assumption that the KKT mapping is metrically subregular, the methods are shown to have a quadratic growth property on a smoothed duality gap, leading to global linear convergence rates. The approach requires only first-order oracles and is demonstrated on quadratically constrained quadratic programs and kernel matrix learning tasks.

Core claim

For convex nonlinear conic programs, the restarted accelerated primal-dual algorithm with backtracking establishes global linear convergence by proving that metric subregularity of the KKT mapping implies quadratic growth of the self-centered smoothed duality gap; this holds even when extending metric subregularity conditions to certain nonconvex problems over polyhedral cones.

What carries the argument

The rAPDB algorithm combining fixed-frequency or adaptive restarts with monotone or non-monotone adaptive stepsize search, which adapts to local curvature in the non-bilinear minimax reformulation.

If this is right

Global linear convergence rates are guaranteed for the proposed methods on problems satisfying the metric subregularity condition.
The algorithms are first-order and thus scalable to large instances with GPU acceleration.
Sufficient conditions ensure metric subregularity for nonconvex problems with convex polyhedral cones.
Numerical tests confirm strong performance on random QCQPs and kernel matrix learning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The non-monotone stepsize strategy may offer practical advantages in ill-conditioned problems beyond what linear convergence theory predicts.
Similar restart schemes could apply to other accelerated methods in conic programming with nonlinear couplings.
The quadratic growth property might be leveraged to derive rates for related duality gap measures in broader optimization settings.

Load-bearing premise

The KKT mapping of the optimization problem satisfies metric subregularity.

What would settle it

An instance of a nonlinear conic program where the KKT mapping is metrically subregular but the self-centered smoothed duality gap fails to exhibit quadratic growth, or where the algorithm convergence rate is observed to be sublinear.

Figures

Figures reproduced from arXiv: 2605.29291 by Jinxin Wang, Necdet Serhat Aybat.

**Figure 1.** Figure 1: Convergence performance of rAPDB, rAPDB-ada, APDB, and EGM on randomly generated QCQP instances with m = 10 and n = 1000. The plots display the relative suboptimality and constraint violation against the total number of gradient evaluations. The top row corresponds to monotone step-size search, i.e., cnm = 0, while the bottom row corresponds to non-monotone step-size search, i.e., cnm = 1. Implementation d… view at source ↗

**Figure 2.** Figure 2: Convergence performance of rAPDB, rAPDB-ada, APDB, and EGM on the kernel matrix learning problem instances with n = 4, 000, measured by the relative error against iteration count and CPU time. 5.2 Kernel matrix learning: ℓ2-norm soft margin SVM Suppose we are given a training set consisting of feature vectors {ai} n i=1 ⊂ R d and the corresponding labels {bi} n i=1 ⊂ {−1, +1}. Consider m ∈ Z+ different emb… view at source ↗

**Figure 3.** Figure 3: Convergence performance of rAPDB, rAPDB-ada, APDB, and EGM on the kernel matrix learning problem instances with n = 10, 000, measured by the relative error against iteration count and CPU time. 30 [PITH_FULL_IMAGE:figures/full_fig_p030_3.png] view at source ↗

read the original abstract

We propose restarted accelerated primal-dual algorithms with (non-monotone) backtracking (rAPDB) for convex nonlinear conic programs, with quadratically constrained quadratic programs (QCQPs) as a special case. Unlike linear and quadratic programs, these problems give rise to convex-concave minimax reformulations with non-bilinear coupling terms; therefore, the existing primal-dual methods for bilinear couplings are not applicable. To address this challenge, we build on the accelerated primal-dual method with adaptive stepsize search -- as it adapts to the local curvature -- and develop both fixed-frequency and adaptive restart schemes, incorporating both monotone and non-monotone adaptive step-size search strategies. The resulting algorithms require only first-order information and matrix-vector products, making them suitable for large-scale and GPU-accelerated implementation. Under metric subregularity of the KKT mapping, we prove a quadratic growth property for a self-centered smoothed duality gap and establish global linear convergence of the proposed restarted methods. We also establish sufficient conditions under which the metric subregularity holds even for general nonconvex problems over convex polyhedral cones. These results are new and may be of independent interest. Numerical experiments on random QCQPs and kernel matrix learning instances show that the proposed methods, especially with non-monotone adaptive stepsizes and GPU acceleration, achieve strong practical performance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper adds restart schemes and non-monotone backtracking to accelerated primal-dual methods for nonlinear conic problems and proves linear convergence under metric subregularity of the KKT map.

read the letter

The central point is that these restarted methods (rAPDB) achieve global linear convergence for convex nonlinear conic programs by establishing a quadratic growth property on a smoothed duality gap, but only when the KKT mapping satisfies metric subregularity.

What is new is the handling of non-bilinear coupling terms that rule out standard bilinear primal-dual schemes. The authors combine an existing adaptive-stepsize primal-dual base with both fixed-frequency and adaptive restart rules, and they test both monotone and non-monotone line searches. They also supply sufficient conditions under which metric subregularity holds for certain nonconvex problems over polyhedral cones; those conditions look like a modest but useful side result.

The algorithmic side is practical. The methods stay first-order and only need matrix-vector products, which fits large-scale and GPU settings. The experiments on random QCQPs and kernel matrix learning instances indicate that the non-monotone adaptive variants are competitive in wall-clock time.

The main limitation is the subregularity assumption itself. It is external to the problem class for general nonlinear cones, even though the paper gives some cases where it is satisfied. Without it the quadratic-growth step and the linear-rate claim do not go through. The numerical tests are on synthetic data and do not directly measure the predicted linear rates, so they mainly show practical speed rather than rate validation.

This work is aimed at researchers who already work on primal-dual first-order methods for conic programs and want variants that handle nonlinear couplings. A reader looking for new solvers for QCQPs would get usable algorithms and some theory to check.

I would send it to peer review. The algorithmic extensions and the subregularity conditions are concrete enough that referees can evaluate the proofs and the practical claims directly.

Referee Report

2 major / 2 minor

Summary. The paper proposes restarted accelerated primal-dual algorithms with monotone and non-monotone adaptive backtracking stepsizes (rAPDB) for convex nonlinear conic programs (with QCQPs as a special case). It proves a quadratic growth property for a self-centered smoothed duality gap and global linear convergence of the restarted methods under the assumption of metric subregularity of the KKT mapping. Sufficient conditions for metric subregularity are also established for certain nonconvex problems over convex polyhedral cones. The algorithms use only first-order information and are tested numerically on random QCQPs and kernel matrix learning problems, showing strong practical performance especially with non-monotone stepsizes and GPU acceleration.

Significance. If the derivations hold, the work extends primal-dual convergence analysis beyond bilinear saddle-point problems to nonlinear couplings and supplies conditional linear-rate guarantees together with sufficient conditions for metric subregularity that may be of independent interest. The adaptive restart and backtracking schemes, combined with first-order and matrix-vector product requirements, position the methods for large-scale implementation.

major comments (2)

[Abstract; main convergence theorems] Abstract and the statements of the main convergence theorems: global linear convergence is established only after deriving quadratic growth from metric subregularity of the KKT mapping. While sufficient conditions are supplied for polyhedral cones (including some nonconvex cases), no analogous conditions or verification procedure is given for general nonlinear cones; the assumption therefore remains external and problem-dependent for the primary problem class stated in the title.
[Numerical experiments] Numerical experiments section: the reported runs demonstrate practical speed but contain no diagnostic checking whether metric subregularity holds on the test instances, nor any measured convergence rates that could corroborate the predicted linear regime. Without such checks the experiments do not directly support the linear-rate claim.

minor comments (2)

[Preliminaries / notation] The precise definition and properties of the 'self-centered smoothed duality gap' should be stated with an equation number at the first appearance rather than deferred.
[Introduction] A short discussion contrasting the new restart schemes with existing restart techniques for accelerated methods would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below and indicate whether revisions will be made.

read point-by-point responses

Referee: [Abstract; main convergence theorems] Abstract and the statements of the main convergence theorems: global linear convergence is established only after deriving quadratic growth from metric subregularity of the KKT mapping. While sufficient conditions are supplied for polyhedral cones (including some nonconvex cases), no analogous conditions or verification procedure is given for general nonlinear cones; the assumption therefore remains external and problem-dependent for the primary problem class stated in the title.

Authors: We agree that the linear convergence result is conditional on metric subregularity of the KKT mapping. The paper's main contribution is the development of the restarted algorithms together with the proof that this assumption implies quadratic growth of the self-centered smoothed duality gap and global linear convergence. Sufficient conditions are derived for the important subclass of convex polyhedral cones (covering QCQPs and other practical cases, including certain nonconvex problems). For general nonlinear cones, universal sufficient conditions are inherently problem-dependent because they rely on the specific geometry of the cone and the constraint functions; providing them would require case-by-case analysis that lies beyond the scope of the present work. We will revise the abstract and introduction to clarify the scope of the sufficient conditions and to state explicitly that the linear-rate guarantee holds under the metric-subregularity assumption, with the polyhedral case serving as a verifiable subclass. revision: partial
Referee: [Numerical experiments] Numerical experiments section: the reported runs demonstrate practical speed but contain no diagnostic checking whether metric subregularity holds on the test instances, nor any measured convergence rates that could corroborate the predicted linear regime. Without such checks the experiments do not directly support the linear-rate claim.

Authors: We acknowledge that the current numerical section does not include explicit verification of metric subregularity or quantitative rate measurements. Direct numerical verification of metric subregularity is difficult because it requires an accurate estimate of the solution set. In the revision we will add log-scale plots of the smoothed duality gap versus iteration count for the QCQP and kernel-matrix instances, together with estimated linear rates extracted from the later iterations. These additions will provide visual and quantitative support for the linear regime predicted by the theory. revision: yes

Circularity Check

0 steps flagged

No circularity; convergence rests on external metric subregularity assumption

full rationale

The derivation chain establishes quadratic growth and linear convergence explicitly under the external hypothesis of metric subregularity of the KKT mapping (abstract and theorem statements). Sufficient conditions for subregularity on polyhedral cones are supplied as independent results, not derived from the algorithm or prior self-citations. No self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citation chains appear; the claims remain self-contained against the stated assumptions and do not reduce to their own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review performed on abstract only; full derivation details unavailable.

axioms (1)

domain assumption Metric subregularity of the KKT mapping
Invoked to prove quadratic growth of the smoothed duality gap and global linear convergence.

pith-pipeline@v0.9.1-grok · 5777 in / 1210 out tokens · 27656 ms · 2026-06-29T06:25:17.350828+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

58 extracted references · 6 canonical work pages

[1]

Second-order cone programming.Mathematical program- ming, 95(1):3–51, 2003

Farid Alizadeh and Donald Goldfarb. Second-order cone programming.Mathematical program- ming, 95(1):3–51, 2003

2003
[2]

Degenerate nonlinear programming with a quadratic growth condition.SIAM Journal on Optimization, 10(4):1116–1135, 2000

Mihai Anitescu. Degenerate nonlinear programming with a quadratic growth condition.SIAM Journal on Optimization, 10(4):1116–1135, 2000

2000
[3]

Practical large-scale linear programming using primal-dual hybrid gradient

David Applegate, Mateo Díaz, Oliver Hinder, Haihao Lu, Miles Lubin, Brendan O’Donoghue, and Warren Schudy. Practical large-scale linear programming using primal-dual hybrid gradient. Advances in Neural Information Processing Systems, 34:20243–20257, 2021

2021
[4]

PDLP: A practical first-order method for large-scale linear programming

David Applegate, Mateo Díaz, Oliver Hinder, Haihao Lu, Miles Lubin, Brendan O’Donoghue, and Warren Schudy. PDLP: A practical first-order method for large-scale linear programming. Mathematical Programming Computation, pages 1–45, 2026

2026
[5]

Fasterfirst-orderprimal-dualmeth- ods for linear programming using restarts and sharpness.Mathematical Programming, 201(1):133– 184, 2023

DavidApplegate, OliverHinder, HaihaoLu, andMilesLubin. Fasterfirst-orderprimal-dualmeth- ods for linear programming using restarts and sharpness.Mathematical Programming, 201(1):133– 184, 2023

2023
[6]

Parameter-free FISTA by adaptive restart and backtracking.SIAM Journal on Optimiza- tion, 34(4):3259–3285, 2024

Jean-François Aujol, Luca Calatroni, Charles Dossal, Hippolyte Labarrière, and Aude Ronde- pierre. Parameter-free FISTA by adaptive restart and backtracking.SIAM Journal on Optimiza- tion, 34(4):3259–3285, 2024

2024
[7]

A distributed ADMM-like method for resource sharing over time-varying networks.SIAM Journal on Optimization, 29(4):3036–3068, 2019

Necdet Serhat Aybat and Erfan Yazdandoost Hamedani. A distributed ADMM-like method for resource sharing over time-varying networks.SIAM Journal on Optimization, 29(4):3036–3068, 2019

2019
[8]

Projecting onto the intersection of a cone and a sphere.SIAM Journal on Optimization, 28(3):2158–2188, 2018

Heinz H Bauschke, Minh N Bui, and Xianfu Wang. Projecting onto the intersection of a cone and a sphere.SIAM Journal on Optimization, 28(3):2158–2188, 2018

2018
[9]

SIAM, 2001

Aharon Ben-Tal and Arkadi Nemirovski.Lectures on modern convex optimization: analysis, algorithms, and engineering applications. SIAM, 2001

2001
[10]

Springer Science & Business Media, 2000

J Frederic Bonnans and Alexander Shapiro.Perturbation Analysis of Optimization Problems. Springer Science & Business Media, 2000

2000
[11]

Cambridge University Press, 2004

Stephen Boyd and Lieven Vandenberghe.Convex optimization. Cambridge University Press, 2004

2004
[12]

Burke and Abraham Engle

James V. Burke and Abraham Engle. Strong metric (sub)regularity of Karush-Kuhn-Tucker map- pings for piecewise linear-quadratic convex-composite optimization and the quadratic convergence of Newton’s method.Mathematics of Operations Research, 45(3):1164–1192, 2020

2020
[13]

A first-order primal-dual algorithm for convex problems with applications to imaging.Journal of Mathematical Imaging and Vision, 40(1):120–145, 2011

Antonin Chambolle and Thomas Pock. A first-order primal-dual algorithm for convex problems with applications to imaging.Journal of Mathematical Imaging and Vision, 40(1):120–145, 2011

2011
[14]

On the ergodic convergence rates of a first-order primal– dual algorithm.Mathematical Programming, 159(1):253–287, 2016

Antonin Chambolle and Thomas Pock. On the ergodic convergence rates of a first-order primal– dual algorithm.Mathematical Programming, 159(1):253–287, 2016

2016
[15]

HPR-LP: An implementation of an HPR method for solving linear programming.Mathematical Programming Computation, pages 1–28, 2025

Kaihuang Chen, Defeng Sun, Yancheng Yuan, Guojun Zhang, and Xinyuan Zhao. HPR-LP: An implementation of an HPR method for solving linear programming.Mathematical Programming Computation, pages 1–28, 2025

2025
[16]

HPR-QP: A dual Halpern Peaceman-Rachford method for solving large-scale convex composite quadratic programming.arXiv:2507.02470, 2025

Kaihuang Chen, Defeng Sun, Yancheng Yuan, Guojun Zhang, and Xinyuan Zhao. HPR-QP: A dual Halpern Peaceman-Rachford method for solving large-scale convex composite quadratic programming.arXiv:2507.02470, 2025. 32

work page arXiv 2025
[17]

On the relationships among GPU-accelerated first-order methods for solving linear programming

Kaihuang Chen, Defeng Sun, Yancheng Yuan, Guojun Zhang, and Xinyuan Zhao. On the relationships among GPU-accelerated first-order methods for solving linear programming. arXiv:2509.23903, 2025

work page arXiv 2025
[18]

An exponential cone programming approach for managing electric vehicle charging.Operations Research, 72(5):2215–2240, 2024

Li Chen, Long He, and Yangfang Zhou. An exponential cone programming approach for managing electric vehicle charging.Operations Research, 72(5):2215–2240, 2024

2024
[19]

Alocalnearlylinearlyconvergentfirst-ordermethodfornonsmooth functions with quadratic growth.Foundations of Computational Mathematics, 25(3):943–1024, 2025

DamekDavisandLiweiJiang. Alocalnearlylinearlyconvergentfirst-ordermethodfornonsmooth functions with quadratic growth.Foundations of Computational Mathematics, 25(3):943–1024, 2025

2025
[20]

A survey of direct methods for sparse linear systems.Acta Numerica, 25:383–566, 2016

Timothy A Davis, Sivasankaran Rajamanickam, and Wissam M Sid-Lakhdar. A survey of direct methods for sparse linear systems.Acta Numerica, 25:383–566, 2016

2016
[21]

CVXPY: A Python-embedded modeling language for convex optimization.Journal of Machine Learning Research, 17(83):1–5, 2016

Steven Diamond and Stephen Boyd. CVXPY: A Python-embedded modeling language for convex optimization.Journal of Machine Learning Research, 17(83):1–5, 2016

2016
[22]

Active set identification and rapid convergence for degenerate primal-dual problems.arXiv:2602.10436, 2026

Mateo Díaz, Pedro Izquierdo Lehmann, Haihao Lu, and Jinwen Yang. Active set identification and rapid convergence for degenerate primal-dual problems.arXiv:2602.10436, 2026

work page arXiv 2026
[23]

Asen L Dontchev and R Tyrrell Rockafellar.Implicit functions and solution mappings, volume
[24]

Error bounds, quadratic growth, and linear conver- gence of proximal methods.Mathematics of Operations Research, 43(3):919–948, 2018

Dmitriy Drusvyatskiy and Adrian S Lewis. Error bounds, quadratic growth, and linear conver- gence of proximal methods.Mathematics of Operations Research, 43(3):919–948, 2018

2018
[25]

Springer, New York, 2003

Francisco Facchinei and Jong-Shi Pang.Finite-Dimensional Variational Inequalities and Com- plementarity Problems. Springer, New York, 2003

2003
[26]

Quadratic error bound of the smoothed gap and the restarted averaged primal- dual hybrid gradient.Open Journal of Mathematical Optimization, 4:1–34, 2023

Olivier Fercoq. Quadratic error bound of the smoothed gap and the restarted averaged primal- dual hybrid gradient.Open Journal of Mathematical Optimization, 4:1–34, 2023

2023
[27]

Projection onto the exponential cone: A univariate root-finding problem

Henrik A Friberg. Projection onto the exponential cone: A univariate root-finding problem. Optimization Methods and Software, 38(3):457–473, 2023

2023
[28]

Computational optimal transport with applications to data sciences.Foundations and Trends®in Machine Learning, 11(5-6):355–607, 2019

Peyré Gabriel and Cuturi Marco. Computational optimal transport with applications to data sciences.Foundations and Trends®in Machine Learning, 11(5-6):355–607, 2019

2019
[29]

Sido: A pharmacology dataset.http://www.causality.inf.ethz.ch/data/ SIDO.html, 2008

Isabelle Guyon. Sido: A pharmacology dataset.http://www.causality.inf.ethz.ch/data/ SIDO.html, 2008. Accessed: 2008

2008
[30]

A primal-dual algorithm with line search for general convex-concave saddle point problems.SIAM Journal on Optimization, 31(2):1299– 1329, 2021

Erfan Yazdandoost Hamedani and Necdet Serhat Aybat. A primal-dual algorithm with line search for general convex-concave saddle point problems.SIAM Journal on Optimization, 31(2):1299– 1329, 2021

2021
[31]

A restarted primal-dual hybrid conjugate gradient method for large-scale quadratic programming

Yicheng Huang, Wanyu Zhang, Hongpei Li, Dongdong Ge, Huikang Liu, and Yinyu Ye. A restarted primal-dual hybrid conjugate gradient method for large-scale quadratic programming. INFORMS Journal on Computing, 2025

2025
[32]

Nonconvex optimization for regression with fairness constraints

Junpei Komiyama, Akiko Takeda, Junya Honda, and Hajime Shimao. Nonconvex optimization for regression with fairness constraints. InInternational Conference on Machine Learning, pages 2737–2746. PMLR, 2018

2018
[33]

Learning the kernel matrix with semidefinite programming.Journal of Machine Learning Research, 5(Jan):27–72, 2004

Gert RG Lanckriet, Nello Cristianini, Peter Bartlett, Laurent El Ghaoui, and Michael I Jor- dan. Learning the kernel matrix with semidefinite programming.Journal of Machine Learning Research, 5(Jan):27–72, 2004

2004
[34]

Error bounds, PL condition, and quadratic growth for weakly convex functions, and linear convergences of proximal point methods

Feng-Yi Liao, Lijun Ding, and Yang Zheng. Error bounds, PL condition, and quadratic growth for weakly convex functions, and linear convergences of proximal point methods. In6th Annual Learning for Dynamics & Control Conference, pages 993–1005. PMLR, 2024. 33

2024
[35]

PDCS: A primal-dual large-scale conic programming solver with GPU enhancements.arXiv:2505.00311, 2025

Zhenwei Lin, Zikai Xiong, Dongdong Ge, and Yinyu Ye. PDCS: A primal-dual large-scale conic programming solver with GPU enhancements.arXiv:2505.00311, 2025

work page arXiv 2025
[36]

A technical note on the implementation and use of PDCS.arXiv:2603.15504, 2026

Zhenwei Lin, Zikai Xiong, Dongdong Ge, and Yinyu Ye. A technical note on the implementation and use of PDCS.arXiv:2603.15504, 2026

work page arXiv 2026
[37]

cuPDLP.jl: A GPU implementation of restarted primal-dual hybrid gradient for linear programming in Julia.Operations Research, 73(6):3440–3452, 2025

Haihao Lu and Jinwen Yang. cuPDLP.jl: A GPU implementation of restarted primal-dual hybrid gradient for linear programming in Julia.Operations Research, 73(6):3440–3452, 2025

2025
[38]

An overview of GPU-based first-order methods for linear programming and extensions.arXiv preprint arXiv:2506.02174, 2025

Haihao Lu and Jinwen Yang. An overview of GPU-based first-order methods for linear program- ming and extensions.arXiv:2506.02174, 2025

work page arXiv 2025
[39]

A practical and optimal first-order method for large-scale convex quadratic programming.Mathematical Programming, 215(1):771–808, 2026

Haihao Lu and Jinwen Yang. A practical and optimal first-order method for large-scale convex quadratic programming.Mathematical Programming, 215(1):771–808, 2026

2026
[40]

Semidefi- nite relaxation of quadratic optimization problems.IEEE Signal Processing Magazine, 27(3):20– 34, 2010

Zhi-Quan Luo, Wing-Kin Ma, Anthony Man-Cho So, Yinyu Ye, and Shuzhong Zhang. Semidefi- nite relaxation of quadratic optimization problems.IEEE Signal Processing Magazine, 27(3):20– 34, 2010

2010
[41]

A first-order primal-dual algorithm with linesearch.SIAM Journal on Optimization, 28(1):411–432, 2018

Yura Malitsky and Thomas Pock. A first-order primal-dual algorithm with linesearch.SIAM Journal on Optimization, 28(1):411–432, 2018

2018
[42]

Angelia Nedic and Asuman Ozdaglar. Ch. cooperative distributed multi-agent optimization. Convex optimization in signal processing and communications, page 340, 2010

2010
[43]

Operator splitting for a homogeneous embedding of the linear comple- mentarity problem.SIAM Journal on Optimization, 31(3):1999–2023, 2021

Brendan O’Donoghue. Operator splitting for a homogeneous embedding of the linear comple- mentarity problem.SIAM Journal on Optimization, 31(3):1999–2023, 2021

1999
[44]

Conic optimization via operator splitting and homogeneous self-dual embedding.Journal of Optimization Theory and Applica- tions, 169(3):1042–1068, June 2016

Brendan O’Donoghue, Eric Chu, Neal Parikh, and Stephen Boyd. Conic optimization via operator splitting and homogeneous self-dual embedding.Journal of Optimization Theory and Applica- tions, 169(3):1042–1068, June 2016

2016
[45]

Adaptive restart for accelerated gradient schemes

Brendan O’donoghue and Emmanuel Candes. Adaptive restart for accelerated gradient schemes. Foundations of Computational Mathematics, 15(3):715–732, 2015

2015
[46]

Proximal algorithms.Foundations and Trends in Optimization, 1(3):127–239, 2014

Neal Parikh and Stephen Boyd. Proximal algorithms.Foundations and Trends in Optimization, 1(3):127–239, 2014

2014
[47]

Quadratically constrained quadratic programming: Some applications and a method for solution.Zeitschrift für Operations Research, 26(1):105–119, 1982

E Phan-huy Hao. Quadratically constrained quadratic programming: Some applications and a method for solution.Zeitschrift für Operations Research, 26(1):105–119, 1982

1982
[48]

Fast convergence to non-isolated minima: Four equivalent conditions forC 2 functions.Mathematical Programming, 213(1):151–199, 2025

Quentin Rebjock and Nicolas Boumal. Fast convergence to non-isolated minima: Four equivalent conditions forC 2 functions.Mathematical Programming, 213(1):151–199, 2025

2025
[49]

A simple nearly optimal restart scheme for speeding up first-order methods.Foundations of Computational Mathematics, 22(1):211–256, 2022

James Renegar and Benjamin Grimmer. A simple nearly optimal restart scheme for speeding up first-order methods.Foundations of Computational Mathematics, 22(1):211–256, 2022

2022
[50]

Strongly regular generalized equations.Mathematics of Operations Re- search, 5(1):43–62, 1980

Stephen M Robinson. Strongly regular generalized equations.Mathematics of Operations Re- search, 5(1):43–62, 1980

1980
[51]

Springer Science & Business Media, 2009

R Tyrrell Rockafellar and Roger J-B Wets.Variational Analysis. Springer Science & Business Media, 2009

2009
[52]

Sharpness, restart and acceleration.SIAM Journal on Optimization, 30(1):262–289, 2020

Vincent Roulet and Alexandre d’Aspremont. Sharpness, restart and acceleration.SIAM Journal on Optimization, 30(1):262–289, 2020

2020
[53]

OSQP: An operator splitting solver for quadratic programs

Bartolomeo Stellato, Goran Banjac, Paul Goulart, Alberto Bemporad, and Stephen Boyd. OSQP: An operator splitting solver for quadratic programs. In2018 UKACC 12th international confer- ence on control (CONTROL), pages 339–339. IEEE, 2018. 34

2018
[54]

Cambridge university press, 1996

Rangarajan K Sundaram.A first course in optimization theory. Cambridge university press, 1996

1996
[55]

Rest-Katyusha: Exploiting the solution’s structure via scheduled restart schemes.Advances in Neural Information Processing Systems, 31, 2018

Junqi Tang, Mohammad Golbabaee, Francis Bach, et al. Rest-Katyusha: Exploiting the solution’s structure via scheduled restart schemes.Advances in Neural Information Processing Systems, 31, 2018

2018
[56]

Solving Stackelberg pre- diction game with least squares loss via spherically constrained least squares reformulation

Jiali Wang, Wen Huang, Rujun Jiang, Xudong Li, and Alex L Wang. Solving Stackelberg pre- diction game with least squares loss via spherically constrained least squares reformulation. In International Conference on Machine Learning, pages 22665–22679. PMLR, 2022

2022
[57]

Learning the kernel matrix in discriminant analysis via quadratically constrained quadratic programming

Jieping Ye, Shuiwang Ji, and Jianhui Chen. Learning the kernel matrix in discriminant analysis via quadratically constrained quadratic programming. InProceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 854–863, 2007

2007
[58]

Metric subregularity and constraint qualifications for convex generalized equations in Banach spaces.SIAM Journal on Optimization, 18(2):437–460, 2007

Xi Yin Zheng and Kung Fu Ng. Metric subregularity and constraint qualifications for convex generalized equations in Banach spaces.SIAM Journal on Optimization, 18(2):437–460, 2007. 7 Appendix 7.1 Proof of Theorem 4.1 SinceK⊂Rm is polyhedral, there existsA∈R¯m×mfor some¯m∈Z+ such that K={y∈Rm :Ay≥0};(72) therefore, for anyx∈Rn, g(x)∈−K ⇐⇒Ag(x)≤0.(73) Defin...

2007

[1] [1]

Second-order cone programming.Mathematical program- ming, 95(1):3–51, 2003

Farid Alizadeh and Donald Goldfarb. Second-order cone programming.Mathematical program- ming, 95(1):3–51, 2003

2003

[2] [2]

Degenerate nonlinear programming with a quadratic growth condition.SIAM Journal on Optimization, 10(4):1116–1135, 2000

Mihai Anitescu. Degenerate nonlinear programming with a quadratic growth condition.SIAM Journal on Optimization, 10(4):1116–1135, 2000

2000

[3] [3]

Practical large-scale linear programming using primal-dual hybrid gradient

David Applegate, Mateo Díaz, Oliver Hinder, Haihao Lu, Miles Lubin, Brendan O’Donoghue, and Warren Schudy. Practical large-scale linear programming using primal-dual hybrid gradient. Advances in Neural Information Processing Systems, 34:20243–20257, 2021

2021

[4] [4]

PDLP: A practical first-order method for large-scale linear programming

David Applegate, Mateo Díaz, Oliver Hinder, Haihao Lu, Miles Lubin, Brendan O’Donoghue, and Warren Schudy. PDLP: A practical first-order method for large-scale linear programming. Mathematical Programming Computation, pages 1–45, 2026

2026

[5] [5]

Fasterfirst-orderprimal-dualmeth- ods for linear programming using restarts and sharpness.Mathematical Programming, 201(1):133– 184, 2023

DavidApplegate, OliverHinder, HaihaoLu, andMilesLubin. Fasterfirst-orderprimal-dualmeth- ods for linear programming using restarts and sharpness.Mathematical Programming, 201(1):133– 184, 2023

2023

[6] [6]

Parameter-free FISTA by adaptive restart and backtracking.SIAM Journal on Optimiza- tion, 34(4):3259–3285, 2024

Jean-François Aujol, Luca Calatroni, Charles Dossal, Hippolyte Labarrière, and Aude Ronde- pierre. Parameter-free FISTA by adaptive restart and backtracking.SIAM Journal on Optimiza- tion, 34(4):3259–3285, 2024

2024

[7] [7]

A distributed ADMM-like method for resource sharing over time-varying networks.SIAM Journal on Optimization, 29(4):3036–3068, 2019

Necdet Serhat Aybat and Erfan Yazdandoost Hamedani. A distributed ADMM-like method for resource sharing over time-varying networks.SIAM Journal on Optimization, 29(4):3036–3068, 2019

2019

[8] [8]

Projecting onto the intersection of a cone and a sphere.SIAM Journal on Optimization, 28(3):2158–2188, 2018

Heinz H Bauschke, Minh N Bui, and Xianfu Wang. Projecting onto the intersection of a cone and a sphere.SIAM Journal on Optimization, 28(3):2158–2188, 2018

2018

[9] [9]

SIAM, 2001

Aharon Ben-Tal and Arkadi Nemirovski.Lectures on modern convex optimization: analysis, algorithms, and engineering applications. SIAM, 2001

2001

[10] [10]

Springer Science & Business Media, 2000

J Frederic Bonnans and Alexander Shapiro.Perturbation Analysis of Optimization Problems. Springer Science & Business Media, 2000

2000

[11] [11]

Cambridge University Press, 2004

Stephen Boyd and Lieven Vandenberghe.Convex optimization. Cambridge University Press, 2004

2004

[12] [12]

Burke and Abraham Engle

James V. Burke and Abraham Engle. Strong metric (sub)regularity of Karush-Kuhn-Tucker map- pings for piecewise linear-quadratic convex-composite optimization and the quadratic convergence of Newton’s method.Mathematics of Operations Research, 45(3):1164–1192, 2020

2020

[13] [13]

A first-order primal-dual algorithm for convex problems with applications to imaging.Journal of Mathematical Imaging and Vision, 40(1):120–145, 2011

Antonin Chambolle and Thomas Pock. A first-order primal-dual algorithm for convex problems with applications to imaging.Journal of Mathematical Imaging and Vision, 40(1):120–145, 2011

2011

[14] [14]

On the ergodic convergence rates of a first-order primal– dual algorithm.Mathematical Programming, 159(1):253–287, 2016

Antonin Chambolle and Thomas Pock. On the ergodic convergence rates of a first-order primal– dual algorithm.Mathematical Programming, 159(1):253–287, 2016

2016

[15] [15]

HPR-LP: An implementation of an HPR method for solving linear programming.Mathematical Programming Computation, pages 1–28, 2025

Kaihuang Chen, Defeng Sun, Yancheng Yuan, Guojun Zhang, and Xinyuan Zhao. HPR-LP: An implementation of an HPR method for solving linear programming.Mathematical Programming Computation, pages 1–28, 2025

2025

[16] [16]

HPR-QP: A dual Halpern Peaceman-Rachford method for solving large-scale convex composite quadratic programming.arXiv:2507.02470, 2025

Kaihuang Chen, Defeng Sun, Yancheng Yuan, Guojun Zhang, and Xinyuan Zhao. HPR-QP: A dual Halpern Peaceman-Rachford method for solving large-scale convex composite quadratic programming.arXiv:2507.02470, 2025. 32

work page arXiv 2025

[17] [17]

On the relationships among GPU-accelerated first-order methods for solving linear programming

Kaihuang Chen, Defeng Sun, Yancheng Yuan, Guojun Zhang, and Xinyuan Zhao. On the relationships among GPU-accelerated first-order methods for solving linear programming. arXiv:2509.23903, 2025

work page arXiv 2025

[18] [18]

An exponential cone programming approach for managing electric vehicle charging.Operations Research, 72(5):2215–2240, 2024

Li Chen, Long He, and Yangfang Zhou. An exponential cone programming approach for managing electric vehicle charging.Operations Research, 72(5):2215–2240, 2024

2024

[19] [19]

Alocalnearlylinearlyconvergentfirst-ordermethodfornonsmooth functions with quadratic growth.Foundations of Computational Mathematics, 25(3):943–1024, 2025

DamekDavisandLiweiJiang. Alocalnearlylinearlyconvergentfirst-ordermethodfornonsmooth functions with quadratic growth.Foundations of Computational Mathematics, 25(3):943–1024, 2025

2025

[20] [20]

A survey of direct methods for sparse linear systems.Acta Numerica, 25:383–566, 2016

Timothy A Davis, Sivasankaran Rajamanickam, and Wissam M Sid-Lakhdar. A survey of direct methods for sparse linear systems.Acta Numerica, 25:383–566, 2016

2016

[21] [21]

CVXPY: A Python-embedded modeling language for convex optimization.Journal of Machine Learning Research, 17(83):1–5, 2016

Steven Diamond and Stephen Boyd. CVXPY: A Python-embedded modeling language for convex optimization.Journal of Machine Learning Research, 17(83):1–5, 2016

2016

[22] [22]

Active set identification and rapid convergence for degenerate primal-dual problems.arXiv:2602.10436, 2026

Mateo Díaz, Pedro Izquierdo Lehmann, Haihao Lu, and Jinwen Yang. Active set identification and rapid convergence for degenerate primal-dual problems.arXiv:2602.10436, 2026

work page arXiv 2026

[23] [23]

Asen L Dontchev and R Tyrrell Rockafellar.Implicit functions and solution mappings, volume

[24] [24]

Error bounds, quadratic growth, and linear conver- gence of proximal methods.Mathematics of Operations Research, 43(3):919–948, 2018

Dmitriy Drusvyatskiy and Adrian S Lewis. Error bounds, quadratic growth, and linear conver- gence of proximal methods.Mathematics of Operations Research, 43(3):919–948, 2018

2018

[25] [25]

Springer, New York, 2003

Francisco Facchinei and Jong-Shi Pang.Finite-Dimensional Variational Inequalities and Com- plementarity Problems. Springer, New York, 2003

2003

[26] [26]

Quadratic error bound of the smoothed gap and the restarted averaged primal- dual hybrid gradient.Open Journal of Mathematical Optimization, 4:1–34, 2023

Olivier Fercoq. Quadratic error bound of the smoothed gap and the restarted averaged primal- dual hybrid gradient.Open Journal of Mathematical Optimization, 4:1–34, 2023

2023

[27] [27]

Projection onto the exponential cone: A univariate root-finding problem

Henrik A Friberg. Projection onto the exponential cone: A univariate root-finding problem. Optimization Methods and Software, 38(3):457–473, 2023

2023

[28] [28]

Computational optimal transport with applications to data sciences.Foundations and Trends®in Machine Learning, 11(5-6):355–607, 2019

Peyré Gabriel and Cuturi Marco. Computational optimal transport with applications to data sciences.Foundations and Trends®in Machine Learning, 11(5-6):355–607, 2019

2019

[29] [29]

Sido: A pharmacology dataset.http://www.causality.inf.ethz.ch/data/ SIDO.html, 2008

Isabelle Guyon. Sido: A pharmacology dataset.http://www.causality.inf.ethz.ch/data/ SIDO.html, 2008. Accessed: 2008

2008

[30] [30]

A primal-dual algorithm with line search for general convex-concave saddle point problems.SIAM Journal on Optimization, 31(2):1299– 1329, 2021

Erfan Yazdandoost Hamedani and Necdet Serhat Aybat. A primal-dual algorithm with line search for general convex-concave saddle point problems.SIAM Journal on Optimization, 31(2):1299– 1329, 2021

2021

[31] [31]

A restarted primal-dual hybrid conjugate gradient method for large-scale quadratic programming

Yicheng Huang, Wanyu Zhang, Hongpei Li, Dongdong Ge, Huikang Liu, and Yinyu Ye. A restarted primal-dual hybrid conjugate gradient method for large-scale quadratic programming. INFORMS Journal on Computing, 2025

2025

[32] [32]

Nonconvex optimization for regression with fairness constraints

Junpei Komiyama, Akiko Takeda, Junya Honda, and Hajime Shimao. Nonconvex optimization for regression with fairness constraints. InInternational Conference on Machine Learning, pages 2737–2746. PMLR, 2018

2018

[33] [33]

Learning the kernel matrix with semidefinite programming.Journal of Machine Learning Research, 5(Jan):27–72, 2004

Gert RG Lanckriet, Nello Cristianini, Peter Bartlett, Laurent El Ghaoui, and Michael I Jor- dan. Learning the kernel matrix with semidefinite programming.Journal of Machine Learning Research, 5(Jan):27–72, 2004

2004

[34] [34]

Error bounds, PL condition, and quadratic growth for weakly convex functions, and linear convergences of proximal point methods

Feng-Yi Liao, Lijun Ding, and Yang Zheng. Error bounds, PL condition, and quadratic growth for weakly convex functions, and linear convergences of proximal point methods. In6th Annual Learning for Dynamics & Control Conference, pages 993–1005. PMLR, 2024. 33

2024

[35] [35]

PDCS: A primal-dual large-scale conic programming solver with GPU enhancements.arXiv:2505.00311, 2025

Zhenwei Lin, Zikai Xiong, Dongdong Ge, and Yinyu Ye. PDCS: A primal-dual large-scale conic programming solver with GPU enhancements.arXiv:2505.00311, 2025

work page arXiv 2025

[36] [36]

A technical note on the implementation and use of PDCS.arXiv:2603.15504, 2026

Zhenwei Lin, Zikai Xiong, Dongdong Ge, and Yinyu Ye. A technical note on the implementation and use of PDCS.arXiv:2603.15504, 2026

work page arXiv 2026

[37] [37]

cuPDLP.jl: A GPU implementation of restarted primal-dual hybrid gradient for linear programming in Julia.Operations Research, 73(6):3440–3452, 2025

Haihao Lu and Jinwen Yang. cuPDLP.jl: A GPU implementation of restarted primal-dual hybrid gradient for linear programming in Julia.Operations Research, 73(6):3440–3452, 2025

2025

[38] [38]

An overview of GPU-based first-order methods for linear programming and extensions.arXiv preprint arXiv:2506.02174, 2025

Haihao Lu and Jinwen Yang. An overview of GPU-based first-order methods for linear program- ming and extensions.arXiv:2506.02174, 2025

work page arXiv 2025

[39] [39]

A practical and optimal first-order method for large-scale convex quadratic programming.Mathematical Programming, 215(1):771–808, 2026

Haihao Lu and Jinwen Yang. A practical and optimal first-order method for large-scale convex quadratic programming.Mathematical Programming, 215(1):771–808, 2026

2026

[40] [40]

Semidefi- nite relaxation of quadratic optimization problems.IEEE Signal Processing Magazine, 27(3):20– 34, 2010

Zhi-Quan Luo, Wing-Kin Ma, Anthony Man-Cho So, Yinyu Ye, and Shuzhong Zhang. Semidefi- nite relaxation of quadratic optimization problems.IEEE Signal Processing Magazine, 27(3):20– 34, 2010

2010

[41] [41]

A first-order primal-dual algorithm with linesearch.SIAM Journal on Optimization, 28(1):411–432, 2018

Yura Malitsky and Thomas Pock. A first-order primal-dual algorithm with linesearch.SIAM Journal on Optimization, 28(1):411–432, 2018

2018

[42] [42]

Angelia Nedic and Asuman Ozdaglar. Ch. cooperative distributed multi-agent optimization. Convex optimization in signal processing and communications, page 340, 2010

2010

[43] [43]

Operator splitting for a homogeneous embedding of the linear comple- mentarity problem.SIAM Journal on Optimization, 31(3):1999–2023, 2021

Brendan O’Donoghue. Operator splitting for a homogeneous embedding of the linear comple- mentarity problem.SIAM Journal on Optimization, 31(3):1999–2023, 2021

1999

[44] [44]

Conic optimization via operator splitting and homogeneous self-dual embedding.Journal of Optimization Theory and Applica- tions, 169(3):1042–1068, June 2016

Brendan O’Donoghue, Eric Chu, Neal Parikh, and Stephen Boyd. Conic optimization via operator splitting and homogeneous self-dual embedding.Journal of Optimization Theory and Applica- tions, 169(3):1042–1068, June 2016

2016

[45] [45]

Adaptive restart for accelerated gradient schemes

Brendan O’donoghue and Emmanuel Candes. Adaptive restart for accelerated gradient schemes. Foundations of Computational Mathematics, 15(3):715–732, 2015

2015

[46] [46]

Proximal algorithms.Foundations and Trends in Optimization, 1(3):127–239, 2014

Neal Parikh and Stephen Boyd. Proximal algorithms.Foundations and Trends in Optimization, 1(3):127–239, 2014

2014

[47] [47]

Quadratically constrained quadratic programming: Some applications and a method for solution.Zeitschrift für Operations Research, 26(1):105–119, 1982

E Phan-huy Hao. Quadratically constrained quadratic programming: Some applications and a method for solution.Zeitschrift für Operations Research, 26(1):105–119, 1982

1982

[48] [48]

Fast convergence to non-isolated minima: Four equivalent conditions forC 2 functions.Mathematical Programming, 213(1):151–199, 2025

Quentin Rebjock and Nicolas Boumal. Fast convergence to non-isolated minima: Four equivalent conditions forC 2 functions.Mathematical Programming, 213(1):151–199, 2025

2025

[49] [49]

A simple nearly optimal restart scheme for speeding up first-order methods.Foundations of Computational Mathematics, 22(1):211–256, 2022

James Renegar and Benjamin Grimmer. A simple nearly optimal restart scheme for speeding up first-order methods.Foundations of Computational Mathematics, 22(1):211–256, 2022

2022

[50] [50]

Strongly regular generalized equations.Mathematics of Operations Re- search, 5(1):43–62, 1980

Stephen M Robinson. Strongly regular generalized equations.Mathematics of Operations Re- search, 5(1):43–62, 1980

1980

[51] [51]

Springer Science & Business Media, 2009

R Tyrrell Rockafellar and Roger J-B Wets.Variational Analysis. Springer Science & Business Media, 2009

2009

[52] [52]

Sharpness, restart and acceleration.SIAM Journal on Optimization, 30(1):262–289, 2020

Vincent Roulet and Alexandre d’Aspremont. Sharpness, restart and acceleration.SIAM Journal on Optimization, 30(1):262–289, 2020

2020

[53] [53]

OSQP: An operator splitting solver for quadratic programs

Bartolomeo Stellato, Goran Banjac, Paul Goulart, Alberto Bemporad, and Stephen Boyd. OSQP: An operator splitting solver for quadratic programs. In2018 UKACC 12th international confer- ence on control (CONTROL), pages 339–339. IEEE, 2018. 34

2018

[54] [54]

Cambridge university press, 1996

Rangarajan K Sundaram.A first course in optimization theory. Cambridge university press, 1996

1996

[55] [55]

Rest-Katyusha: Exploiting the solution’s structure via scheduled restart schemes.Advances in Neural Information Processing Systems, 31, 2018

Junqi Tang, Mohammad Golbabaee, Francis Bach, et al. Rest-Katyusha: Exploiting the solution’s structure via scheduled restart schemes.Advances in Neural Information Processing Systems, 31, 2018

2018

[56] [56]

Solving Stackelberg pre- diction game with least squares loss via spherically constrained least squares reformulation

Jiali Wang, Wen Huang, Rujun Jiang, Xudong Li, and Alex L Wang. Solving Stackelberg pre- diction game with least squares loss via spherically constrained least squares reformulation. In International Conference on Machine Learning, pages 22665–22679. PMLR, 2022

2022

[57] [57]

Learning the kernel matrix in discriminant analysis via quadratically constrained quadratic programming

Jieping Ye, Shuiwang Ji, and Jianhui Chen. Learning the kernel matrix in discriminant analysis via quadratically constrained quadratic programming. InProceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 854–863, 2007

2007

[58] [58]

Metric subregularity and constraint qualifications for convex generalized equations in Banach spaces.SIAM Journal on Optimization, 18(2):437–460, 2007

Xi Yin Zheng and Kung Fu Ng. Metric subregularity and constraint qualifications for convex generalized equations in Banach spaces.SIAM Journal on Optimization, 18(2):437–460, 2007. 7 Appendix 7.1 Proof of Theorem 4.1 SinceK⊂Rm is polyhedral, there existsA∈R¯m×mfor some¯m∈Z+ such that K={y∈Rm :Ay≥0};(72) therefore, for anyx∈Rn, g(x)∈−K ⇐⇒Ag(x)≤0.(73) Defin...

2007