pith. sign in

arxiv: 2606.19050 · v1 · pith:WEY3MJM4new · submitted 2026-06-17 · 🧮 math.OC

First-Order Methods for Solving Convex (Strongly) Concave Minimax Problems with Functional Constraints

Pith reviewed 2026-06-26 20:00 UTC · model grok-4.3

classification 🧮 math.OC
keywords minimax optimizationfunctional constraintsproximal augmented Lagrangianfirst-order methodsconvex-concave problemsKKT points
0
0 comments X

The pith

A proximal augmented Lagrangian method solves functional-constrained convex-(strongly-)concave minimax problems to ε-accuracy with Õ(ε^{-1}) first-order complexity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops first-order methods for minimax problems in which the inner maximization is subject to functional constraints. It exploits strong duality to fold those constraints into the objective, yielding an inexact primal function whose gradients can be approximated by solving an auxiliary maximization. This reformulation supports a proximal augmented Lagrangian method whose subproblems are solved by an inexact accelerated proximal gradient scheme. The resulting algorithm produces an ε-KKT point together with a primal ε-optimal solution at the stated rates.

Core claim

We design a proximal augmented Lagrangian method (PALM) for convex-(strongly-)concave minimax problems with functional constraints. By exploiting strong duality to incorporate the constraints into the objective, we obtain inexact gradients and solve each subproblem via an inexact accelerated proximal gradient scheme, returning an ε-KKT point and primal ε-optimal solution with Õ(ε^{-1}) first-order oracle and iteration complexity in the convex-strongly-concave case and Õ(ε^{-3/2}) dual complexity in the convex-concave case.

What carries the argument

proximal augmented Lagrangian method (PALM) whose subproblems are solved by an inexact accelerated proximal gradient scheme that tolerates approximate gradients from auxiliary maximization

Load-bearing premise

Strong duality holds so the functional constraints on the inner maximization can be moved into the objective without altering the solution set.

What would settle it

A concrete convex-concave minimax instance with functional constraints in which strong duality fails or the dual function is not sufficiently smooth, so that the claimed iteration bounds no longer hold.

read the original abstract

Minimax problems arise in many applications, including robust learning and Stackelberg models. Most existing methods for minimax problems address unconstrained or projection-friendly settings, while functional constrained minimax problems remain far less explored. We study a class of convex-(strongly-)concave minimax problems with functional constraints. By exploiting strong duality, we incorporate the inner-maximization functional constraints into the objective. This allows us to efficiently obtain inexact gradients of the primal function of the reformulation and to design a proximal augmented Lagrangian method (PALM). Each PALM subproblem is solved by an inexact accelerated proximal gradient scheme to handle inexact gradients arising from approximately solving an auxiliary maximization subproblem. We show that the proposed method returns an $\varepsilon$-KKT point and a primal $\varepsilon$-optimal solution, with $\tilde{\mathcal{O}}(\varepsilon^{-1})$ first-order oracle and iteration complexity in the convex-strongly-concave case. For the convex-concave case, the complexity remains the same for the primal gradient evaluation but increases to $\tilde{\mathcal{O}}(\varepsilon^{-\frac{3}{2}})$ for the dual part.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript develops first-order methods for convex-(strongly)concave minimax problems with functional constraints. By invoking strong duality to incorporate the inner-maximization constraints into the objective, the authors obtain an equivalent saddle-point problem that admits inexact gradients; they then solve it via a proximal augmented Lagrangian method (PALM) whose subproblems are handled by an inexact accelerated proximal gradient scheme. The central claims are that the algorithm returns an ε-KKT point together with a primal ε-optimal solution, with Õ(ε^{-1}) first-order oracle and iteration complexity in the convex-strongly-concave regime and Õ(ε^{-3/2}) dual complexity in the convex-concave regime.

Significance. If the duality reformulation is rigorously justified, the paper supplies the first explicit first-order complexity guarantees for a practically relevant class of constrained minimax problems that arise in robust learning and Stackelberg games. The use of inexact gradients arising from auxiliary maximization subproblems is a technically natural extension of PALM, and the stated rates are competitive with the unconstrained setting.

major comments (1)
  1. [Problem reformulation and duality step (abstract and §2)] The equivalence between the original constrained problem and the dualized reformulation (invoked to enable the PALM scheme and inexact-gradient analysis) requires strong duality to hold with zero gap for every fixed primal variable x. No constraint qualification (Slater, relative-interior, or otherwise) is stated in the abstract or indicated in the problem setup; without it, an ε-KKT point of the reformulated problem need not be feasible for the original functional constraints, undermining the claim that the returned point solves the original instance.
minor comments (1)
  1. [Abstract] The abstract states complexity results without listing the precise assumptions (smoothness, strong-concavity modulus, constraint qualification) under which they hold; moving a compact assumption list to the introduction would improve readability.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address the major comment point by point below.

read point-by-point responses
  1. Referee: The equivalence between the original constrained problem and the dualized reformulation (invoked to enable the PALM scheme and inexact-gradient analysis) requires strong duality to hold with zero gap for every fixed primal variable x. No constraint qualification (Slater, relative-interior, or otherwise) is stated in the abstract or indicated in the problem setup; without it, an ε-KKT point of the reformulated problem need not be feasible for the original functional constraints, undermining the claim that the returned point solves the original instance.

    Authors: We agree that a suitable constraint qualification is required to rigorously justify strong duality (and thus zero duality gap) for every fixed primal variable x, ensuring the reformulation is equivalent and that ε-KKT points of the reformulated problem remain feasible for the original constraints. While the manuscript invokes strong duality to obtain the reformulation, we acknowledge that an explicit statement of the constraint qualification (e.g., Slater's condition) was not included in the abstract or highlighted at the outset of the problem setup. We will revise the manuscript to state this assumption explicitly in the problem formulation section and update the abstract accordingly. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper derives complexity bounds for a proximal augmented Lagrangian method (PALM) applied to a reformulated saddle-point problem obtained via strong duality. The abstract and description contain no equations, fitted parameters, or self-citations that reduce the claimed Õ(ε^{-1}) or Õ(ε^{-3/2}) rates to inputs by construction. The derivation relies on standard inexact proximal gradient analysis and duality, which are external to the target result and do not exhibit self-definition, renaming of known results, or load-bearing self-citation chains. The method is self-contained against external benchmarks in convex optimization theory.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review performed on abstract only; ledger entries are inferred from the reformulation step described.

axioms (1)
  • domain assumption Strong duality holds for the considered class of convex-(strongly-)concave minimax problems with functional constraints.
    Invoked to fold inner-max constraints into the primal objective and enable inexact gradient access.

pith-pipeline@v0.9.1-grok · 5731 in / 1256 out tokens · 24814 ms · 2026-06-26T20:00:45.399495+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

300 extracted references · 9 linked inside Pith

  1. [1]

    2026 , note =

    Inexact Nesterov's Accelerated Proximal Gradient Method , author =. 2026 , note =

  2. [2]

    arXiv preprint arXiv:2502.19764 , year=

    Inexact Moreau Envelope Lagrangian Method for Non-Convex Constrained Optimization under Local Error Bound Conditions on Constraint Functions , author=. arXiv preprint arXiv:2502.19764 , year=

  3. [3]

    arXiv preprint arXiv:2502.17602 , year=

    A stochastic smoothing framework for nonconvex-nonconcave min-sum-max problems with applications to wasserstein distributionally robust optimization , author=. arXiv preprint arXiv:2502.17602 , year=

  4. [4]

    arXiv preprint arXiv:2501.19214 , year=

    A single-loop SPIDER-type stochastic subgradient method for expectation-constrained nonconvex nonsmooth optimization , author=. arXiv preprint arXiv:2501.19214 , year=

  5. [5]

    arXiv preprint arXiv:2412.14291 , year=

    Projected gradient methods for nonconvex and stochastic optimization: new complexities and auto-conditioned stepsizes , author=. arXiv preprint arXiv:2412.14291 , year=

  6. [6]

    arXiv preprint arXiv:2307.07605 , year=

    First-order Methods for Affinely Constrained Composite Non-convex Non-smooth Problems: Lower Complexity Bound and Near-optimal Methods , author=. arXiv preprint arXiv:2307.07605 , year=

  7. [7]

    arXiv preprint arXiv:2307.07113 , year=

    Variance-reduced accelerated methods for decentralized stochastic double-regularized nonconvex strongly-concave minimax problems , author=. arXiv preprint arXiv:2307.07113 , year=

  8. [8]

    Mathematical Programming Computation , pages=

    Damped proximal augmented Lagrangian method for weakly-convex problems with convex constraints , author=. Mathematical Programming Computation , pages=. 2026 , publisher=

  9. [9]

    Transaction on Machine Learning , year=

    Compressed Decentralized Momentum Stochastic Gradient Methods for Nonconvex Optimization , author=. Transaction on Machine Learning , year=

  10. [10]

    Mathematics of Operations Research , year=

    Lower Complexity Bounds of First-Order Methods for Affinely Constrained Composite Nonconvex Problems , author=. Mathematics of Operations Research , year=

  11. [11]

    Proceedings of the AAAI Conference on Artificial Intelligence , volume=

    Jointly improving the sample and communication complexities in decentralized stochastic minimax optimization , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

  12. [12]

    Computational Optimization and Applications , volume=

    Stochastic inexact augmented Lagrangian method for nonconvex expectation constrained optimization , author=. Computational Optimization and Applications , volume=. 2024 , publisher=

  13. [13]

    SIAM Journal on Optimization , volume=

    Decentralized gradient descent maximization method for composite nonconvex strongly-concave minimax problems , author=. SIAM Journal on Optimization , volume=. 2024 , publisher=

  14. [14]

    International Conference on Machine Learning , volume=

    Compressed Decentralized Proximal Stochastic Gradient Method for Nonconvex Composite Problems with Heterogeneous Data , author=. International Conference on Machine Learning , volume=

  15. [15]

    Mathematical Programming Computation , pages=

    Parallel and distributed asynchronous adaptive stochastic gradient methods , author=. Mathematical Programming Computation , pages=. 2023 , publisher=

  16. [16]

    Proceedings of the AAAI Conference on Artificial Intelligence , volume=

    Proximal stochastic recursive momentum methods for nonconvex composite decentralized optimization , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

  17. [17]

    IEEE Transactions on Signal Processing , volume=

    A decentralized primal-dual framework for non-convex smooth consensus optimization , author=. IEEE Transactions on Signal Processing , volume=. 2023 , publisher=

  18. [18]

    SIAM Journal on Optimization , volume=

    Reducing the complexity of two classes of optimization problems by inexact accelerated proximal gradient method , author=. SIAM Journal on Optimization , volume=. 2023 , publisher=

  19. [19]

    Journal of Optimization Theory and Applications , volume=

    Momentum-based variance-reduced proximal stochastic gradient method for composite nonconvex stochastic optimization , author=. Journal of Optimization Theory and Applications , volume=. 2023 , publisher=

  20. [20]

    International Conference on Medical Image Computing and Computer-Assisted Intervention , pages=

    Spectral adversarial mixup for few-shot unsupervised domain adaptation , author=. International Conference on Medical Image Computing and Computer-Assisted Intervention , pages=. 2023 , organization=

  21. [21]

    Proceedings of the AAAI Conference on Artificial Intelligence , volume=

    When neural networks fail to generalize? a model sensitivity perspective , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

  22. [22]

    Journal of the Franklin Institute , volume=

    Correntropy based model predictive controller with multi-constraints for robust path trajectory tracking of self-driving vehicle , author=. Journal of the Franklin Institute , volume=. 2023 , publisher=

  23. [23]

    IEEE Transactions on Neural Networks and Learning Systems , volume=

    Correntropy-based low-rank matrix factorization with constraint graph learning for image clustering , author=. IEEE Transactions on Neural Networks and Learning Systems , volume=. 2023 , publisher=

  24. [24]

    Mathematical Programming Computation , volume=

    Adaptive Primal-Dual Stochastic Gradient Method for Expectation-constrained Convex Stochastic Programs , author=. Mathematical Programming Computation , volume=

  25. [25]

    SIAM Journal on Imaging Sciences , volume=

    Distributed stochastic inertial-accelerated methods with delayed derivatives for nonconvex problems , author=. SIAM Journal on Imaging Sciences , volume=. 2022 , publisher=

  26. [26]

    First-order methods for problems with

    Xu, Yangyang , journal=. First-order methods for problems with. 2022 , publisher=

  27. [27]

    Computational Optimization and Applications , volume=

    Complexity of an inexact proximal-point penalty method for constrained smooth non-convex optimization , author=. Computational Optimization and Applications , volume=. 2022 , publisher=

  28. [28]

    Proceedings of the 36th AAAI Conference on Artificial Intelligence , pages=

    Zeroth-order Optimization for Composite Problems with Functional Constraints , author=. Proceedings of the 36th AAAI Conference on Artificial Intelligence , pages=

  29. [29]

    International Conference on Artificial Intelligence and Statistics , pages=

    Rate-improved inexact augmented Lagrangian method for constrained nonconvex optimization , author=. International Conference on Artificial Intelligence and Statistics , pages=. 2021 , organization=

  30. [30]

    Mathematical Programming , volume=

    Iteration complexity of inexact augmented lagrangian methods for constrained convex programming , author=. Mathematical Programming , volume=. 2021 , publisher=

  31. [31]

    INFORMS Journal on Optimization , volume=

    Augmented Lagrangian--Based First-Order Methods for Convex-Constrained Programs with Weakly Convex Objective , author=. INFORMS Journal on Optimization , volume=. 2021 , publisher=

  32. [32]

    Mathematical Programming , volume=

    Lower complexity bounds of first-order methods for convex-concave bilinear saddle-point problems , author=. Mathematical Programming , volume=. 2021 , publisher=

  33. [33]

    INFORMS Journal on Optimization , volume=

    First-order methods for constrained convex programming based on linearized augmented Lagrangian function , author=. INFORMS Journal on Optimization , volume=. 2021 , publisher=

  34. [34]

    INFORMS Journal on Optimization , volume=

    Katyusha acceleration for convex finite-sum compositional optimization , author=. INFORMS Journal on Optimization , volume=. 2021 , publisher=

  35. [35]

    SIAM Journal on Optimization , volume=

    Primal-dual stochastic gradient method for convex programs with many functional constraints , author=. SIAM Journal on Optimization , volume=. 2020 , publisher=

  36. [36]

    2020 IEEE 11th Sensor Array and Multichannel Signal Processing Workshop (SAM) , pages=

    Greedy coordinate descent method on non-negative quadratic programming , author=. 2020 IEEE 11th Sensor Array and Multichannel Signal Processing Workshop (SAM) , pages=. 2020 , organization=

  37. [37]

    IEEE transactions on neural networks and learning systems , volume=

    Maximum correntropy criterion-based robust semisupervised concept factorization for image representation , author=. IEEE transactions on neural networks and learning systems , volume=. 2020 , publisher=

  38. [38]

    Computational Optimization and Applications , volume=

    Markov chain block coordinate descent , author=. Computational Optimization and Applications , volume=. 2020 , publisher=

  39. [39]

    Journal of the Operations Research Society of China , pages=

    On the Convergence of Asynchronous Parallel Iteration with Unbounded Delays , author=. Journal of the Operations Research Society of China , pages=. 2019 , publisher=

  40. [40]

    Journal of the Operations Research Society of China , volume=

    Randomized Primal-Dual Proximal Block Coordinate Updates , author=. Journal of the Operations Research Society of China , volume=

  41. [41]

    Computational Optimization and Applications , pages=

    Asynchronous parallel primal-dual block coordinate update methods for affinely constrained convex programs , author=. Computational Optimization and Applications , pages=

  42. [42]

    IEEE Transactions on Circuits and Systems for Video Technology , pages=

    Maximum Correntropy Criterion based Sparse Subspace Learning for Unsupervised Feature Selection , author=. IEEE Transactions on Circuits and Systems for Video Technology , pages=

  43. [43]

    Computational Optimization and Applications , volume=

    Accelerated Primal-Dual Proximal Block Coordinate Updating Methods for Constrained Convex Optimization , author=. Computational Optimization and Applications , volume=

  44. [44]

    Xu, Yangyang , journal=. Hybrid. 2018 , publisher=

  45. [45]

    Advances in Neural Information Processing Systems , pages=

    A Block Coordinate Ascent Algorithm for Mean-Variance Optimization , author=. Advances in Neural Information Processing Systems , pages=

  46. [46]

    2018 , publisher=

    Oliveira, Danilo Elias and Wolkowicz, Henry and Xu, Yangyang , journal=. 2018 , publisher=

  47. [47]

    2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pages=

    Robust PCA via dictionary based outlier pursuit , author=. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pages=. 2018 , organization=

  48. [48]

    Linear and Multilinear Algebra , volume=

    On the convergence of higher-order orthogonal iteration , author=. Linear and Multilinear Algebra , volume=. 2018 , publisher=

  49. [49]

    Journal of Scientific Computing , volume=

    A globally convergent algorithm for nonconvex optimization based on block coordinate update , author=. Journal of Scientific Computing , volume=. 2017 , publisher=

  50. [50]

    SIAM Journal on Optimization , volume=

    Accelerated first-order primal-dual proximal methods for linearly constrained composite convex programming , author=. SIAM Journal on Optimization , volume=. 2017 , publisher=

  51. [51]

    Journal of Computational Mathematics , pages=

    Fast algorithms for higher-order singular value decomposition from incomplete data , author=. Journal of Computational Mathematics , pages=. 2017 , publisher=

  52. [52]

    Annals of Mathematical Sciences and Applications , volume=

    Coordinate-friendly structures, algorithms and applications , author=. Annals of Mathematical Sciences and Applications , volume=. 2016 , publisher=

  53. [53]

    SIAM Journal on Scientific Computing , volume=

    ARock: an algorithmic framework for asynchronous parallel coordinate updates , author=. SIAM Journal on Scientific Computing , volume=. 2016 , publisher=

  54. [54]

    arXiv preprint arXiv:1610.00040 , year=

    A primer on coordinate descent algorithms , author=. arXiv preprint arXiv:1610.00040 , year=

  55. [55]

    Pattern Recognition , volume=

    Global and local structure preserving sparse subspace learning: An iterative approach to unsupervised feature selection , author=. Pattern Recognition , volume=. 2016 , publisher=

  56. [56]

    Inverse Problems and Imaging , volume=

    A fast patch-dictionary method for whole image recovery , author=. Inverse Problems and Imaging , volume=. 2016 , publisher=

  57. [57]

    Pattern Analysis and Applications , volume=

    Proximal gradient method for huberized support vector machine , author=. Pattern Analysis and Applications , volume=. 2016 , publisher=

  58. [58]

    SIAM Journal on Optimization , volume=

    Block stochastic gradient iteration for convex and nonconvex optimization , author=. SIAM Journal on Optimization , volume=. 2015 , publisher=

  59. [59]

    Inverse Problems and Imaging , volume=

    Parallel matrix factorization for low-rank tensor completion , author=. Inverse Problems and Imaging , volume=. 2015 , publisher=

  60. [60]

    Mathematical Programming Computation , volume=

    Alternating proximal gradient method for sparse nonnegative Tucker decomposition , author=. Mathematical Programming Computation , volume=. 2015 , publisher=

  61. [61]

    Inverse Problems and Imaging , volume=

    Learning circulant sensing kernels , author=. Inverse Problems and Imaging , volume=. 2014 , publisher=

  62. [62]

    arXiv preprint arXiv:1404.4104 , year=

    Sparse bilinear logistic regression , author=. arXiv preprint arXiv:1404.4104 , year=

  63. [63]

    SIAM Journal on imaging sciences , volume=

    A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion , author=. SIAM Journal on imaging sciences , volume=. 2013 , publisher=

  64. [64]

    SIAM Journal on Numerical Analysis , volume=

    Improved iteratively reweighted least squares for unconstrained smoothed ell\_q minimization , author=. SIAM Journal on Numerical Analysis , volume=. 2013 , publisher=

  65. [65]

    2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pages=

    Decentralized low-rank matrix completion , author=. 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , pages=. 2012 , organization=

  66. [66]

    Frontiers of Mathematics in China , volume=

    An alternating direction algorithm for matrix completion with nonnegative factors , author=. Frontiers of Mathematics in China , volume=. 2012 , publisher=

  67. [67]

    SIAM journal on imaging sciences , volume=

    A fast iterative shrinkage-thresholding algorithm for linear inverse problems , author=. SIAM journal on imaging sciences , volume=. 2009 , publisher=

  68. [68]

    SIAM Journal on Optimization , volume=

    Proximally guided stochastic subgradient method for nonsmooth, nonconvex problems , author=. SIAM Journal on Optimization , volume=. 2019 , publisher=

  69. [69]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

    Regression shrinkage and selection via the lasso , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 1996 , publisher=

  70. [70]

    Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=

    Model selection and estimation in regression with grouped variables , author=. Journal of the Royal Statistical Society Series B: Statistical Methodology , volume=. 2006 , publisher=

  71. [71]

    Preprint, arXiv:2308.06470 , year=

    On the Optimal Lower and Upper Complexity Bounds for a Class of Composite Optimization Problems , author=. Preprint, arXiv:2308.06470 , year=

  72. [72]

    Li and P

    Z. Li and P. Chen and S. Liu and S. Lu and Y. Xu , booktitle=. Rate-improved inexact augmented. 2021 , organization=

  73. [73]

    R. T. Rockafellar , journal=. Augmented. 1976 , publisher=

  74. [74]

    2025 , journal=

    Lower Complexity Bounds of First-order Methods for Affinely Constrained Composite Non-convex Problems , author=. 2025 , journal=

  75. [75]

    Dahal and W

    H. Dahal and W. Liu and Y. Xu , journal=. Damped Proximal Augmented

  76. [76]

    Nonlinear Analysis: Theory, Methods & Applications , volume=

    Directional derivative of a minimax function , author=. Nonlinear Analysis: Theory, Methods & Applications , volume=. 1985 , publisher=

  77. [77]

    SIAM Journal on Optimization , volume=

    First-order methods for problems with O (1) functional constraints can have almost the same convergence rate as for unconstrained problems , author=. SIAM Journal on Optimization , volume=. 2022 , publisher=

  78. [78]

    IEEE transactions on Information Theory , volume=

    Analysis of multiresolution image denoising schemes using generalized Gaussian and complexity priors , author=. IEEE transactions on Information Theory , volume=. 1999 , publisher=

  79. [79]

    Mathematical Programming , volume=

    Efficiency of minimizing compositions of convex functions and smooth maps , author=. Mathematical Programming , volume=. 2019 , publisher=

  80. [80]

    Journal of Scientific Computing , volume=

    Global convergence of ADMM in nonconvex nonsmooth optimization , author=. Journal of Scientific Computing , volume=. 2019 , publisher=

Showing first 80 references.