On the convergence of doubly stochastic Primal-Dual Hybrid Gradient Method

Huikang Liu; Yiheng Xiao

arxiv: 2605.17883 · v1 · pith:PEVFN4MRnew · submitted 2026-05-18 · 🧮 math.OC

On the convergence of doubly stochastic Primal-Dual Hybrid Gradient Method

Yiheng Xiao , Huikang Liu This is my paper

Pith reviewed 2026-05-20 09:41 UTC · model grok-4.3

classification 🧮 math.OC

keywords primal-dual hybrid gradientsaddle-point problemsstochastic optimizationblock coordinate updatesconvergence ratesconvex-concave problemsrestarted methodsergodic convergence

0 comments

The pith

A doubly stochastic primal-dual hybrid gradient method achieves O(1/K) ergodic convergence for convex saddle-point problems with randomized block updates on both sides.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a doubly stochastic primal-dual hybrid gradient method for block-structured convex-concave saddle-point problems, where both primal and dual variables are updated in random blocks to reduce computational cost on large problems. It proves an O(1/K) rate for the expected restricted primal-dual gap in the general convex case when blockwise step sizes meet specific conditions. A restarted version of the method is shown to converge linearly under a quadratic growth condition measured by the smoothed primal-dual gap. This approach unifies and extends existing primal-dual hybrid gradient and stochastic variants by allowing independent randomization on each side.

Core claim

The doubly stochastic primal-dual hybrid gradient method performs randomized block updates on both primal and dual variables and establishes an O(1/K) ergodic convergence rate for the expected restricted primal-dual gap under suitable blockwise step-size conditions in the general convex setting; its restarted variant achieves linear convergence of the outer iterates under a quadratic growth condition in the smoothed primal-dual gap.

What carries the argument

The doubly stochastic primal-dual hybrid gradient method (DSPDHG), which extends classical PDHG by independently selecting random blocks for primal and dual updates at each iteration.

If this is right

The method reduces exactly to deterministic PDHG when every block is selected at each step.
The method reduces to one-sided stochastic variants when randomization occurs on only the primal or only the dual side.
A restarted outer-loop version delivers linear convergence once a quadratic growth condition holds for the smoothed gap.
Practical runs with standard step sizes match or exceed the speed of PDHG, SPDHG, and their restarted forms on large instances.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same block-randomization idea could be applied to other primal-dual splitting schemes to obtain similar rates without full-vector updates.
The analysis suggests that two-sided stochasticity preserves the same asymptotic rate as one-sided stochasticity when step sizes are chosen blockwise.
Linear convergence under quadratic growth may extend to problems where the gap satisfies a local error bound rather than strict quadratic growth.

Load-bearing premise

Suitable blockwise step-size conditions exist that make the convergence rate hold even when blocks are chosen randomly.

What would settle it

Numerical runs on a standard convex-concave saddle-point test problem where the measured expected restricted primal-dual gap after K iterations fails to decrease proportionally to 1/K when the stated blockwise step-size rules are followed.

Figures

Figures reproduced from arXiv: 2605.17883 by Huikang Liu, Yiheng Xiao.

**Figure 2.** Figure 2: Constraint violation and relative error for MPC problems with [PITH_FULL_IMAGE:figures/full_fig_p023_2.png] view at source ↗

read the original abstract

We study a block-structured class of convex-concave saddle-point problems in which both the primal and dual variables admit natural separable decompositions. Motivated by large-scale applications where a full update on either side can be computationally expensive, we propose a doubly stochastic primal--dual hybrid gradient method (DSPDHG) that performs randomized block updates on both primal and dual variables.The method extends classical PDHG and stochastic PDHG (SPDHG) schemes in a unified manner:it reduces to deterministic PDHG when all blocks are selected and to one-sided stochastic variants when only one side is randomized. For the general convex setting, we establish an $\mathcal{O}(1/K)$ ergodic convergence rate for the expected restricted primal--dual gap under suitable blockwise step-size conditions. We further analyze a restarted variant of DSPDHG under a quadratic growth condition in terms of the smoothed primal-dual gap. Under this regularity assumption, we prove linear convergence of the restarted outer iterates. Numerical evidence is provided to show that restarted DSPDHG with standard step sizes demonstrates competitive practical performance compared with PDHG, SPDHG, and their restarted variants.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper adds a natural doubly stochastic extension to PDHG that randomizes blocks on both sides, with O(1/K) ergodic rates and linear rates under quadratic growth, but the blockwise step-size conditions need closer checking for practicality.

read the letter

The paper introduces DSPDHG, a primal-dual method that draws random blocks for updates on both the primal and dual variables at each iteration. It sits between full deterministic PDHG and the existing one-sided stochastic versions, recovering each as special cases. The main results are an O(1/K) ergodic bound on the expected restricted primal-dual gap for general convex-concave problems and linear convergence of a restarted version under quadratic growth, measured on the smoothed gap. The numerics compare it against PDHG, SPDHG, and their restarted forms and show it holds up with ordinary step-size choices. The proofs appear to follow standard convex-analysis arguments for primal-dual schemes, and the citation list covers the relevant prior work on PDHG and its stochastic variants. The analysis is honest about the assumptions and does not hide behind fitted parameters. The soft spot is the blockwise step-size rule. The abstract states that suitable conditions on the per-block steps deliver the rate, yet it is not obvious whether those conditions stay parameter-free or scale cleanly with the number of blocks and the Lipschitz constant of the coupling term. If the conditions force steps to shrink with more blocks or require knowledge that is hard to estimate, the practical advantage over simpler one-sided methods shrinks. This is a minor but real caveat rather than a fatal gap. The work is aimed at researchers who already use primal-dual methods on large saddle-point problems in imaging or machine learning and want a way to randomize both sides without losing the convergence theory. A reader familiar with the PDHG literature will find the extension straightforward to follow and the claims specific enough to verify. It deserves a serious referee because the extension is clean, the rates are stated precisely, and the numerics give something concrete to check. I would send it to review with the request that the step-size conditions be made fully explicit and tested on partitions with varying numbers of blocks.

Referee Report

2 major / 3 minor

Summary. The manuscript proposes the doubly stochastic primal-dual hybrid gradient (DSPDHG) method for block-structured convex-concave saddle-point problems, performing randomized block updates on both primal and dual variables. It reduces to classical PDHG when all blocks are selected and to one-sided stochastic variants otherwise. For the general convex case the paper claims an O(1/K) ergodic rate on the expected restricted primal-dual gap under suitable blockwise step-size conditions; under an additional quadratic-growth assumption it further claims linear convergence of the restarted outer iterates measured in the smoothed primal-dual gap. Numerical experiments are presented to illustrate practical competitiveness with PDHG, SPDHG and their restarted versions.

Significance. If the stated rates are rigorously established with explicit, practical step-size rules, the work supplies a unified convergence theory for doubly stochastic primal-dual schemes that is directly relevant to large-scale saddle-point problems. The linear-rate result under quadratic growth and the provision of numerical evidence are positive features; the main significance hinges on whether the blockwise step-size conditions can be satisfied without knowledge of unknown constants or without forcing steps that shrink with the number of blocks.

major comments (2)

[§3] §3 (General convex convergence): the O(1/K) ergodic bound on the expected restricted primal-dual gap is stated to hold under 'suitable blockwise step-size conditions' for randomized block selections. The explicit form of these conditions (in terms of block Lipschitz constants of the coupling operator, block probabilities, or norms of the blocks) is not given, nor is it shown that feasible choices exist that remain independent of the number of blocks and do not degrade the rate below O(1/K). This is load-bearing for the central claim.
[§4] §4 (Restarted variant): the linear convergence statement for the restarted outer iterates is proved under a quadratic-growth condition on the smoothed primal-dual gap. It is unclear whether the quadratic-growth assumption is imposed on the original or the smoothed problem and how the smoothing parameter enters the linear rate; without this clarification the result cannot be directly compared with existing restarted PDHG analyses.

minor comments (3)

[Abstract] The abstract and introduction should explicitly state whether the blockwise step-size rule reduces to the standard PDHG rule when each side consists of a single block.
[Notation section] Notation for the restricted primal-dual gap versus the smoothed gap should be made uniform across theorems and proofs.
[Numerical experiments] Numerical section: the step sizes actually used in the experiments should be reported and compared with the theoretical blockwise conditions.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below and will incorporate clarifications and explicit derivations into the revised version.

read point-by-point responses

Referee: [§3] §3 (General convex convergence): the O(1/K) ergodic bound on the expected restricted primal-dual gap is stated to hold under 'suitable blockwise step-size conditions' for randomized block selections. The explicit form of these conditions (in terms of block Lipschitz constants of the coupling operator, block probabilities, or norms of the blocks) is not given, nor is it shown that feasible choices exist that remain independent of the number of blocks and do not degrade the rate below O(1/K). This is load-bearing for the central claim.

Authors: We agree that the step-size conditions merit a more explicit treatment. In the current manuscript, Assumption 3.1 states the blockwise conditions in terms of the local Lipschitz constants L_{ij} of the coupling operator blocks and the selection probabilities p_i, q_j; Theorem 3.2 then derives the O(1/K) ergodic rate on the expected restricted gap under these conditions. To address the concern directly, the revision will include a new remark (or corollary) that constructs concrete, practical step-size rules: for example, set the primal step sizes τ_i = 1/(2 max_j L_{ij}/p_i) and dual step sizes σ_j = 1/(2 max_i L_{ij}/q_j). These choices depend only on per-block quantities and the local probabilities; because the probabilities sum to one, the resulting constants remain independent of the total number of blocks. Consequently the ergodic rate stays O(1/K) without degradation. We will also verify that these rules reduce correctly to the classical PDHG step-size condition when all blocks are selected with probability one. revision: yes
Referee: [§4] §4 (Restarted variant): the linear convergence statement for the restarted outer iterates is proved under a quadratic-growth condition on the smoothed primal-dual gap. It is unclear whether the quadratic-growth assumption is imposed on the original or the smoothed problem and how the smoothing parameter enters the linear rate; without this clarification the result cannot be directly compared with existing restarted PDHG analyses.

Authors: The quadratic-growth assumption (Assumption 4.1) is imposed directly on the smoothed primal-dual gap function G_ε, not on the original unsmoothed gap; this is stated explicitly in Section 4 and used in Theorem 4.2. The smoothing parameter ε appears in the linear rate through the strong-convexity modulus of G_ε, which is at least ε/2 in the appropriate norm; the contraction factor of the restarted outer iterates is then of the form 1 − Θ(min{μ, ε}), where μ is the quadratic-growth constant. In the revision we will add a short comparison paragraph noting that this dependence is analogous to the smoothing parameter in restarted PDHG analyses (e.g., the rate degrades gracefully as ε → 0 but remains linear for any fixed positive ε). We will also clarify that the restarted scheme is applied to the smoothed problem while the inner DSPDHG iterations use the original (unsmoothed) operator. revision: yes

Circularity Check

0 steps flagged

Derivation is self-contained via standard convex analysis of block updates.

full rationale

The paper establishes the O(1/K) ergodic rate for the expected restricted primal-dual gap directly from the DSPDHG update rules under explicitly stated blockwise step-size conditions, and derives linear convergence of the restarted variant from the quadratic growth assumption on the smoothed gap. These steps rely on standard Lyapunov-style arguments for primal-dual methods and do not reduce to fitted parameters, self-definitions, or load-bearing self-citations. The blockwise conditions are part of the theorem hypotheses rather than presupposed results, and the analysis extends classical PDHG/SPDHG without circular renaming or ansatz smuggling. The derivation remains independent of the target rates.

Axiom & Free-Parameter Ledger

0 free parameters · 3 axioms · 0 invented entities

The central claims rest on standard assumptions of convexity-concavity and block separability of the variables, plus the quadratic growth condition for the linear rate; no new entities are introduced and step sizes are conditioned rather than freely fitted.

axioms (3)

domain assumption The problem is convex-concave with natural separable block decompositions for primal and dual variables.
Invoked to motivate the block-structured class and enable randomized block updates.
domain assumption Suitable blockwise step-size conditions hold.
Required for the O(1/K) ergodic convergence rate in the general convex setting.
domain assumption Quadratic growth condition on the smoothed primal-dual gap.
Used to establish linear convergence for the restarted variant.

pith-pipeline@v0.9.0 · 5727 in / 1601 out tokens · 39203 ms · 2026-05-20T09:41:22.612496+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

For the general convex setting, we establish an O(1/K) ergodic convergence rate for the expected restricted primal-dual gap under suitable blockwise step-size conditions.
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We further analyze a restarted variant of DSPDHG under a quadratic growth condition in terms of the smoothed primal-dual gap.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages · 1 internal anchor

[1]

cuhallar: A gpu accelerated low-rank augmented lagrangian method for large-scale semidefinite programming.arXiv preprint arXiv:2505.13719, 2025

Jacob M Aguirre, Diego Cifuentes, Vincent Guigues, Renato DC Monteiro, Victor Hugo Nascimento, and Arnesh Sujanani. cuhallar: A gpu accelerated low-rank augmented lagrangian method for large-scale semidefinite programming.arXiv preprint arXiv:2505.13719, 2025

work page arXiv 2025
[2]

Random extrapolation for primal-dual coordinate descent

Ahmet Alacaoglu, Olivier Fercoq, and V olkan Cevher. Random extrapolation for primal-dual coordinate descent. InInternational conference on machine learning, pages 191–201. PMLR, 2020

work page 2020
[3]

On the convergence of stochastic primal-dual hybrid gradient.SIAM Journal on Optimization, 32(2):1288–1318, 2022

Ahmet Alacaoglu, Olivier Fercoq, and V olkan Cevher. On the convergence of stochastic primal-dual hybrid gradient.SIAM Journal on Optimization, 32(2):1288–1318, 2022

work page 2022
[4]

Practical large-scale linear programming using primal-dual hybrid gradient.Advances in Neural Information Processing Systems, 34:20243–20257, 2021

David Applegate, Mateo Díaz, Oliver Hinder, Haihao Lu, Miles Lubin, Brendan O’Donoghue, and Warren Schudy. Practical large-scale linear programming using primal-dual hybrid gradient.Advances in Neural Information Processing Systems, 34:20243–20257, 2021

work page 2021
[5]

Ehrhardt, Peter Richtárik, and Carola-Bibiane Schönlieb

Antonin Chambolle, Matthias J. Ehrhardt, Peter Richtárik, and Carola-Bibiane Schönlieb. Stochastic primal-dual hybrid gradient algorithm with arbitrary sampling and imaging applications.SIAM Journal on Optimization, 28(4):2783–2808, 2018

work page 2018
[6]

A first-order primal-dual algorithm for convex problems with applications to imaging.Journal of mathematical imaging and vision, 40:120–145, 2011

Antonin Chambolle and Thomas Pock. A first-order primal-dual algorithm for convex problems with applications to imaging.Journal of mathematical imaging and vision, 40:120–145, 2011

work page 2011
[7]

On the ergodic convergence rates of a first-order primal–dual algorithm

Antonin Chambolle and Thomas Pock. On the ergodic convergence rates of a first-order primal–dual algorithm. Mathematical Programming, 159(1):253–287, 2016

work page 2016
[8]

Libsvm: A library for support vector machines.ACM Trans

Chih-Chung Chang and Chih-Jen Lin. Libsvm: A library for support vector machines.ACM Trans. Intell. Syst. Technol., 2(3), May 2011

work page 2011
[9]

Hpr-lp: An implementation of an hpr method for solving linear programming.arXiv preprint arXiv:2408.12179, 2024

Kaihuang Chen, Defeng Sun, Yancheng Yuan, Guojun Zhang, and Xinyuan Zhao. Hpr-lp: An implementation of an hpr method for solving linear programming.arXiv preprint arXiv:2408.12179, 2024

work page arXiv 2024
[10]

Hpr-qp: A dual halpern peaceman-rachford method for solving large-scale convex composite quadratic programming.arXiv preprint arXiv:2507.02470, 2025

Kaihuang Chen, Defeng Sun, Yancheng Yuan, Guojun Zhang, and Xinyuan Zhao. Hpr-qp: A dual halpern peaceman-rachford method for solving large-scale convex composite quadratic programming.arXiv preprint arXiv:2507.02470, 2025

work page arXiv 2025
[11]

New understandings and computation on augmented lagrangian methods for low-rank semidefinite programming.arXiv preprint arXiv:2505.15775, 2025

Lijun Ding, Haihao Lu, and Jinwen Yang. New understandings and computation on augmented lagrangian methods for low-rank semidefinite programming.arXiv preprint arXiv:2505.15775, 2025

work page arXiv 2025
[12]

Quadratic error bound of the smoothed gap and the restarted averaged primal-dual hybrid gradient

Olivier Fercoq. Quadratic error bound of the smoothed gap and the restarted averaged primal-dual hybrid gradient. arXiv preprint arXiv:2206.03041, 2022

work page arXiv 2022
[13]

Cardinal optimizer (copt) user guide.arXiv preprint arXiv:2208.14314, 2022

Dongdong Ge, Qi Huangfu, Zizhuo Wang, Jian Wu, and Yinyu Ye. Cardinal optimizer (copt) user guide.arXiv preprint arXiv:2208.14314, 2022

work page arXiv 2022
[14]

A low-rank ADMM splitting approach for semidefinite programming.arXiv preprint arXiv:2403.09133, 2024

Qiushi Han, Chenxi Li, Zhenwei Lin, Caihua Chen, Qi Deng, Dongdong Ge, Huikang Liu, and Yinyu Ye. A low-rank admm splitting approach for semidefinite programming.arXiv preprint arXiv:2403.09133, 2024

work page arXiv 2024
[15]

Accelerating low-rank factorization-based semidefinite programming algorithms on gpu.arXiv preprint arXiv:2407.15049, 2024

Qiushi Han, Zhenwei Lin, Hanwen Liu, Caihua Chen, Qi Deng, Dongdong Ge, and Yinyu Ye. Accelerating low-rank factorization-based semidefinite programming algorithms on gpu.arXiv preprint arXiv:2407.15049, 2024

work page arXiv 2024
[16]

Restarted primal-dual hybrid conjugate gradient method for large-scale quadratic programming.arXiv preprint arXiv:2405.16160, 2024

Yicheng Huang, Wanyu Zhang, Hongpei Li, Dongdong Ge, Huikang Liu, and Yinyu Ye. Restarted primal-dual hybrid conjugate gradient method for large-scale quadratic programming.arXiv preprint arXiv:2405.16160, 2024

work page arXiv 2024
[17]

A practical GPU-enhanced matrix- free primal-dual method for large-scale conic programs, 2025

Zhenwei Lin, Zikai Xiong, Dongdong Ge, and Yinyu Ye. Pdcs: A primal-dual large-scale conic programming solver with gpu enhancements.arXiv preprint arXiv:2505.00311, 2025. 24 On the Convergence of Doubly Stochastic PDHGA PREPRINT

work page arXiv 2025
[18]

Pdhcg: A scalable first-order method for large-scale competitive market equilibrium computation, 2025

Huikang Liu, Yicheng Huang, Hongpei Li, Dongdong Ge, and Yinyu Ye. Pdhcg: A scalable first-order method for large-scale competitive market equilibrium computation, 2025

work page 2025
[19]

Haihao Lu and Jinwen Yang. cupdlp. jl: A gpu implementation of restarted primal-dual hybrid gradient for linear programming in julia.arXiv preprint arXiv:2311.12180, 2023

work page arXiv 2023
[20]

arXiv preprint arXiv:2407.19689

Haihao Lu and Jinwen Yang. Pdot: A practical primal-dual algorithm and a gpu-based solver for optimal transport. arXiv preprint arXiv:2407.19689, 2024

work page arXiv 2024
[21]

Restarted Halpern PDHG for linear programming

Haihao Lu and Jinwen Yang. Restarted halpern pdhg for linear programming.arXiv preprint arXiv:2407.16144, 2024

work page arXiv 2024
[22]

A practical and optimal first-order method for large-scale convex quadratic programming.Mathematical Programming, pages 1–38, 2025

Haihao Lu and Jinwen Yang. A practical and optimal first-order method for large-scale convex quadratic programming.Mathematical Programming, pages 1–38, 2025

work page 2025
[23]

Linear convergence of a unified primal–dual algorithm for convex–concave saddle point problems with quadratic growth

Cody Melcher, Afrooz Jalilzadeh, and Erfan Yazdandoost Hamedani. Linear convergence of a unified primal–dual algorithm for convex–concave saddle point problems with quadratic growth. 2025

work page 2025
[24]

Doubly Stochastic Primal-Dual Coordinate Method for Bilinear Saddle-Point Problem

Adams Wei Yu, Qihang Lin, and Tianbao Yang. Doubly stochastic primal-dual coordinate method for bilinear saddle-point problem.arXiv preprint arXiv:1508.03390, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015
[25]

Solving large multicommodity network flow problems on gpus.arXiv preprint arXiv:2501.17996, 2025

Fangzhao Zhang and Stephen Boyd. Solving large multicommodity network flow problems on gpus.arXiv preprint arXiv:2501.17996, 2025

work page arXiv 2025
[26]

Hot: An efficient halpern accelerating algorithm for optimal transport problems.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

Guojun Zhang, Zhexuan Gu, Yancheng Yuan, and Defeng Sun. Hot: An efficient halpern accelerating algorithm for optimal transport problems.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025. 25

work page 2025

[1] [1]

cuhallar: A gpu accelerated low-rank augmented lagrangian method for large-scale semidefinite programming.arXiv preprint arXiv:2505.13719, 2025

Jacob M Aguirre, Diego Cifuentes, Vincent Guigues, Renato DC Monteiro, Victor Hugo Nascimento, and Arnesh Sujanani. cuhallar: A gpu accelerated low-rank augmented lagrangian method for large-scale semidefinite programming.arXiv preprint arXiv:2505.13719, 2025

work page arXiv 2025

[2] [2]

Random extrapolation for primal-dual coordinate descent

Ahmet Alacaoglu, Olivier Fercoq, and V olkan Cevher. Random extrapolation for primal-dual coordinate descent. InInternational conference on machine learning, pages 191–201. PMLR, 2020

work page 2020

[3] [3]

On the convergence of stochastic primal-dual hybrid gradient.SIAM Journal on Optimization, 32(2):1288–1318, 2022

Ahmet Alacaoglu, Olivier Fercoq, and V olkan Cevher. On the convergence of stochastic primal-dual hybrid gradient.SIAM Journal on Optimization, 32(2):1288–1318, 2022

work page 2022

[4] [4]

Practical large-scale linear programming using primal-dual hybrid gradient.Advances in Neural Information Processing Systems, 34:20243–20257, 2021

David Applegate, Mateo Díaz, Oliver Hinder, Haihao Lu, Miles Lubin, Brendan O’Donoghue, and Warren Schudy. Practical large-scale linear programming using primal-dual hybrid gradient.Advances in Neural Information Processing Systems, 34:20243–20257, 2021

work page 2021

[5] [5]

Ehrhardt, Peter Richtárik, and Carola-Bibiane Schönlieb

Antonin Chambolle, Matthias J. Ehrhardt, Peter Richtárik, and Carola-Bibiane Schönlieb. Stochastic primal-dual hybrid gradient algorithm with arbitrary sampling and imaging applications.SIAM Journal on Optimization, 28(4):2783–2808, 2018

work page 2018

[6] [6]

A first-order primal-dual algorithm for convex problems with applications to imaging.Journal of mathematical imaging and vision, 40:120–145, 2011

Antonin Chambolle and Thomas Pock. A first-order primal-dual algorithm for convex problems with applications to imaging.Journal of mathematical imaging and vision, 40:120–145, 2011

work page 2011

[7] [7]

On the ergodic convergence rates of a first-order primal–dual algorithm

Antonin Chambolle and Thomas Pock. On the ergodic convergence rates of a first-order primal–dual algorithm. Mathematical Programming, 159(1):253–287, 2016

work page 2016

[8] [8]

Libsvm: A library for support vector machines.ACM Trans

Chih-Chung Chang and Chih-Jen Lin. Libsvm: A library for support vector machines.ACM Trans. Intell. Syst. Technol., 2(3), May 2011

work page 2011

[9] [9]

Hpr-lp: An implementation of an hpr method for solving linear programming.arXiv preprint arXiv:2408.12179, 2024

Kaihuang Chen, Defeng Sun, Yancheng Yuan, Guojun Zhang, and Xinyuan Zhao. Hpr-lp: An implementation of an hpr method for solving linear programming.arXiv preprint arXiv:2408.12179, 2024

work page arXiv 2024

[10] [10]

Hpr-qp: A dual halpern peaceman-rachford method for solving large-scale convex composite quadratic programming.arXiv preprint arXiv:2507.02470, 2025

Kaihuang Chen, Defeng Sun, Yancheng Yuan, Guojun Zhang, and Xinyuan Zhao. Hpr-qp: A dual halpern peaceman-rachford method for solving large-scale convex composite quadratic programming.arXiv preprint arXiv:2507.02470, 2025

work page arXiv 2025

[11] [11]

New understandings and computation on augmented lagrangian methods for low-rank semidefinite programming.arXiv preprint arXiv:2505.15775, 2025

Lijun Ding, Haihao Lu, and Jinwen Yang. New understandings and computation on augmented lagrangian methods for low-rank semidefinite programming.arXiv preprint arXiv:2505.15775, 2025

work page arXiv 2025

[12] [12]

Quadratic error bound of the smoothed gap and the restarted averaged primal-dual hybrid gradient

Olivier Fercoq. Quadratic error bound of the smoothed gap and the restarted averaged primal-dual hybrid gradient. arXiv preprint arXiv:2206.03041, 2022

work page arXiv 2022

[13] [13]

Cardinal optimizer (copt) user guide.arXiv preprint arXiv:2208.14314, 2022

Dongdong Ge, Qi Huangfu, Zizhuo Wang, Jian Wu, and Yinyu Ye. Cardinal optimizer (copt) user guide.arXiv preprint arXiv:2208.14314, 2022

work page arXiv 2022

[14] [14]

A low-rank ADMM splitting approach for semidefinite programming.arXiv preprint arXiv:2403.09133, 2024

Qiushi Han, Chenxi Li, Zhenwei Lin, Caihua Chen, Qi Deng, Dongdong Ge, Huikang Liu, and Yinyu Ye. A low-rank admm splitting approach for semidefinite programming.arXiv preprint arXiv:2403.09133, 2024

work page arXiv 2024

[15] [15]

Accelerating low-rank factorization-based semidefinite programming algorithms on gpu.arXiv preprint arXiv:2407.15049, 2024

Qiushi Han, Zhenwei Lin, Hanwen Liu, Caihua Chen, Qi Deng, Dongdong Ge, and Yinyu Ye. Accelerating low-rank factorization-based semidefinite programming algorithms on gpu.arXiv preprint arXiv:2407.15049, 2024

work page arXiv 2024

[16] [16]

Restarted primal-dual hybrid conjugate gradient method for large-scale quadratic programming.arXiv preprint arXiv:2405.16160, 2024

Yicheng Huang, Wanyu Zhang, Hongpei Li, Dongdong Ge, Huikang Liu, and Yinyu Ye. Restarted primal-dual hybrid conjugate gradient method for large-scale quadratic programming.arXiv preprint arXiv:2405.16160, 2024

work page arXiv 2024

[17] [17]

A practical GPU-enhanced matrix- free primal-dual method for large-scale conic programs, 2025

Zhenwei Lin, Zikai Xiong, Dongdong Ge, and Yinyu Ye. Pdcs: A primal-dual large-scale conic programming solver with gpu enhancements.arXiv preprint arXiv:2505.00311, 2025. 24 On the Convergence of Doubly Stochastic PDHGA PREPRINT

work page arXiv 2025

[18] [18]

Pdhcg: A scalable first-order method for large-scale competitive market equilibrium computation, 2025

Huikang Liu, Yicheng Huang, Hongpei Li, Dongdong Ge, and Yinyu Ye. Pdhcg: A scalable first-order method for large-scale competitive market equilibrium computation, 2025

work page 2025

[19] [19]

Haihao Lu and Jinwen Yang. cupdlp. jl: A gpu implementation of restarted primal-dual hybrid gradient for linear programming in julia.arXiv preprint arXiv:2311.12180, 2023

work page arXiv 2023

[20] [20]

arXiv preprint arXiv:2407.19689

Haihao Lu and Jinwen Yang. Pdot: A practical primal-dual algorithm and a gpu-based solver for optimal transport. arXiv preprint arXiv:2407.19689, 2024

work page arXiv 2024

[21] [21]

Restarted Halpern PDHG for linear programming

Haihao Lu and Jinwen Yang. Restarted halpern pdhg for linear programming.arXiv preprint arXiv:2407.16144, 2024

work page arXiv 2024

[22] [22]

A practical and optimal first-order method for large-scale convex quadratic programming.Mathematical Programming, pages 1–38, 2025

Haihao Lu and Jinwen Yang. A practical and optimal first-order method for large-scale convex quadratic programming.Mathematical Programming, pages 1–38, 2025

work page 2025

[23] [23]

Linear convergence of a unified primal–dual algorithm for convex–concave saddle point problems with quadratic growth

Cody Melcher, Afrooz Jalilzadeh, and Erfan Yazdandoost Hamedani. Linear convergence of a unified primal–dual algorithm for convex–concave saddle point problems with quadratic growth. 2025

work page 2025

[24] [24]

Doubly Stochastic Primal-Dual Coordinate Method for Bilinear Saddle-Point Problem

Adams Wei Yu, Qihang Lin, and Tianbao Yang. Doubly stochastic primal-dual coordinate method for bilinear saddle-point problem.arXiv preprint arXiv:1508.03390, 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015

[25] [25]

Solving large multicommodity network flow problems on gpus.arXiv preprint arXiv:2501.17996, 2025

Fangzhao Zhang and Stephen Boyd. Solving large multicommodity network flow problems on gpus.arXiv preprint arXiv:2501.17996, 2025

work page arXiv 2025

[26] [26]

Hot: An efficient halpern accelerating algorithm for optimal transport problems.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

Guojun Zhang, Zhexuan Gu, Yancheng Yuan, and Defeng Sun. Hot: An efficient halpern accelerating algorithm for optimal transport problems.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025. 25

work page 2025