pith. sign in

arxiv: 2505.01824 · v2 · pith:L5COH5UNnew · submitted 2025-05-03 · 🧮 math.OC · cs.SY· eess.SY

Smoothness of the Augmented Lagrangian Dual in Convex Optimization

Pith reviewed 2026-05-22 16:22 UTC · model grok-4.3

classification 🧮 math.OC cs.SYeess.SY
keywords augmented Lagrangiandual smoothnessconvex optimizationlinear constraintsdual functionexistence of minimizerquadratic penalty
0
0 comments X

The pith

Under the condition that the standard dual has nonempty domain, the augmented Lagrangian dual is 1/ρ-smooth everywhere.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper considers linearly constrained convex minimization problems and defines the standard dual function φ together with its augmented version φ_ρ that includes a quadratic penalty term. It proves that whenever dom φ is nonempty, φ_ρ is smooth with modulus exactly 1/ρ at every point, and the inner minimization defining φ_ρ always attains its minimum for any multiplier. These conclusions replace stronger requirements such as coercivity or bounded level sets that appear in earlier analyses of augmented Lagrangian methods.

Core claim

Under the fundamental condition that dom φ ≠ ∅, φ_ρ is 1/ρ-smooth everywhere and the solution to min_x L_ρ(x, λ) exists for any λ ∈ R^p. These properties are shown for any closed proper convex f and any linear map A, substantially weakening the assumptions commonly used to guarantee smoothness and existence in the augmented dual.

What carries the argument

The augmented Lagrangian dual φ_ρ(λ) obtained by infimizing the augmented Lagrangian L_ρ(x, λ) = f(x) + ⟨λ, Ax − b⟩ + (ρ/2)‖Ax − b‖², whose quadratic term supplies both the uniform smoothness and the existence of the inner minimizer.

If this is right

  • Dual ascent or proximal-point algorithms applied to φ_ρ may employ a fixed step size ρ without local adjustment.
  • The x-subproblem in augmented Lagrangian or ADMM iterations is guaranteed to be solvable for every dual iterate.
  • Convergence proofs for augmented-Lagrangian schemes can proceed with only the nonempty-domain hypothesis instead of interior-point or strong-convexity assumptions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same quadratic augmentation may yield analogous smoothness for problems with inequality constraints once the dual is suitably lifted.
  • Practical checks for nonempty dom φ remain difficult, yet the result clarifies the precise boundary between well-behaved and ill-behaved duals.
  • Smoothness with explicit modulus could be used to derive explicit iteration complexity bounds for first-order dual methods.

Load-bearing premise

There exists at least one multiplier for which the ordinary dual function φ takes a finite value.

What would settle it

Take any convex f and A such that dom φ is nonempty, compute or estimate the gradient of φ_ρ at two points, and check whether its difference quotient exceeds 1/ρ.

read the original abstract

This paper focuses on the general linearly constrained optimization problem: $\min_{x \in \mathbb{R}^d} f(x) \ \text{s.t.} \ Ax = b$, where $f: \mathbb{R}^d \rightarrow \mathbb{R} \cup \{+\infty\}$ is a closed proper convex function, $A \in \mathbb{R}^{p \times d}$, and $b \in \mathbb{R}^p$. We define the standard dual function $\phi(\lambda) = \inf_x \{f(x) + \langle \lambda, A x - b \rangle\}$, the augmented Lagrangian $\mathcal{L}_{\rho}(x, \lambda) = f(x) + \langle \lambda, Ax - b \rangle + \frac{\rho}{2}\|Ax - b\|^2$ ($\rho > 0$), and the augmented Lagrangian dual function $\phi_{\rho}(\lambda) = \inf_x \mathcal{L}_{\rho}(x, \lambda)$. Under the fundamental condition that $\text{dom} \ \phi \neq \emptyset$, we establish that: (1) $\phi_{\rho}$ is $\frac{1}{\rho}$-smooth everywhere; and (2) the solution to $\min_{x \in \mathbb{R}^d} \mathcal{L}_{\rho}(x, \lambda)$ exists for any $\lambda \in \mathbb{R}^p$. These theoretical findings substantially weaken the stringent assumptions typically imposed in the literature to ensure such properties.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper considers the linearly constrained convex problem min f(x) s.t. Ax=b with f closed proper convex. It defines the standard dual φ(λ)=inf_x {f(x)+⟨λ,Ax-b⟩} and the augmented dual φ_ρ(λ)=inf_x L_ρ(x,λ) where L_ρ is the augmented Lagrangian. Under the sole assumption dom φ ≠ ∅ it claims that φ_ρ is 1/ρ-smooth on all of R^p and that the infimum defining φ_ρ(λ) is attained for every λ.

Significance. If the claims were correct they would relax common coercivity or strong-convexity assumptions used to guarantee smoothness and attainment in augmented-Lagrangian theory, potentially simplifying convergence analyses for first-order methods.

major comments (2)
  1. [Abstract and §3] Abstract and §3 (main theorem): the claim that dom φ ≠ ∅ implies attainment of min_x L_ρ(x,λ) for every λ is false. Counter-example: d=2, p=1, A=[1 0], b=0, f(x1,x2)=½x1² + exp(x2). Then φ(λ) is finite for all λ (dom φ = R), yet L_ρ(x,λ) = (½+ρ/2)x1² + λ x1 + exp(x2) has infimum equal to the completed-square value over x1 but is never attained as x2→-∞ along ker(A). This directly falsifies claim (2).
  2. [§4] §4 (proof of smoothness): the argument that 1/ρ-smoothness follows from the same dom φ ≠ ∅ condition appears to rely on the same attainment step used for claim (2); once attainment fails the subdifferential argument for smoothness cannot be completed for arbitrary λ.
minor comments (1)
  1. Notation: the paper uses φ_ρ for the augmented dual; it would help to explicitly relate it to the standard Moreau envelope or proximal mapping of the dual.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and the explicit counterexample, which has helped us identify an error in one of our claims. We acknowledge that the attainment result does not hold under the sole assumption dom φ ≠ ∅. However, we maintain that the smoothness result is correct and can be established by a revised argument that does not depend on attainment. We will update the manuscript to correct the inaccurate claim, replace the flawed proof, and clarify the conditions needed for attainment.

read point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (main theorem): the claim that dom φ ≠ ∅ implies attainment of min_x L_ρ(x,λ) for every λ is false. Counter-example: d=2, p=1, A=[1 0], b=0, f(x1,x2)=½x1² + exp(x2). Then φ(λ) is finite for all λ (dom φ = R), yet L_ρ(x,λ) = (½+ρ/2)x1² + λ x1 + exp(x2) has infimum equal to the completed-square value over x1 but is never attained as x2→-∞ along ker(A). This directly falsifies claim (2).

    Authors: We have verified the counterexample and agree that it is valid: the infimum of L_ρ is approached as x2 → -∞ but is never attained for any finite x. This shows that claim (2) is false under only dom φ ≠ ∅. We will revise the manuscript to remove this claim from the abstract and main theorem, and we will add a remark explaining that attainment requires additional assumptions such as coercivity of f along ker(A). revision: yes

  2. Referee: [§4] §4 (proof of smoothness): the argument that 1/ρ-smoothness follows from the same dom φ ≠ ∅ condition appears to rely on the same attainment step used for claim (2); once attainment fails the subdifferential argument for smoothness cannot be completed for arbitrary λ.

    Authors: We agree that the current proof of smoothness is invalid because it relies on attainment of the infimum in x. Nevertheless, the 1/ρ-smoothness statement itself remains true. We will replace the proof with a corrected argument that expresses φ_ρ via an auxiliary convex function ψ(y) = inf{f(x) : Ax = y} and recognizes φ_ρ as a composition involving the Moreau envelope of ψ. The gradient of φ_ρ is then given by a proximal mapping, which is nonexpansive and yields the desired 1/ρ-Lipschitz continuity without requiring attainment in the original variables. We will also confirm the result on the referee's counterexample, where φ_ρ is explicitly quadratic and hence smooth. revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation applies standard convex analysis to definitions.

full rationale

The paper derives the 1/ρ-smoothness of φ_ρ and attainment of the infimum in L_ρ(x, λ) directly from the definitions of φ and φ_ρ together with the assumption dom φ ≠ ∅, using properties of closed proper convex functions and infimal projections. No self-definitional reductions, fitted parameters presented as predictions, or load-bearing self-citations appear; the central claims rest on external convex-analysis facts rather than reducing to the paper's own inputs by construction. The derivation is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper relies on standard convex-analysis background and one domain condition; no free parameters or new entities are introduced.

axioms (2)
  • domain assumption f is a closed proper convex function from R^d to R union +∞
    This is the basic setup stated for the primal problem.
  • domain assumption dom φ ≠ ∅
    Invoked explicitly as the fundamental condition that enables both the smoothness and attainment results.

pith-pipeline@v0.9.0 · 5802 in / 1327 out tokens · 94145 ms · 2026-05-22T16:22:22.388420+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Augmented Lagrangian Method for Last-Iterate Convergence for Constrained MDPs

    cs.LG 2026-05 unverdicted novelty 6.0

    An inexact augmented Lagrangian method with projected Q-ascent yields global last-iterate convergence guarantees for constrained MDP policy optimization, extending from tabular to log-linear and non-linear policies.

  2. Accelerated Decentralized Constraint-Coupled Optimization: A Dual$^2$ Approach

    math.OC 2025-05 unverdicted novelty 6.0

    The Dual² approach produces iD2A and MiD2A gradient methods that achieve asymptotic convergence under milder conditions on the public function and linear rates with reduced communication and computation complexity.

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages · cited by 2 Pith papers

  1. [1]

    Multiplier and gradient methods,

    M. R. Hestenes, “Multiplier and gradient methods,” Journal of Optimization Theory and Applications , vol. 4, no. 5, pp. 303–320, 1969

  2. [2]

    A method for nonlinear constraints in mini mization problems,

    M. J. Powell, “A method for nonlinear constraints in mini mization problems,” Optimization, pp. 283–298, 1969. 6

  3. [3]

    Computationa l complexity of inexact gradient augmented lagrangian methods: Application to constrained mpc,

    V . Nedelcu, I. Necoara, and Q. Tran-Dinh, “Computationa l complexity of inexact gradient augmented lagrangian methods: Application to constrained mpc,” SIAM Journal on Control and Optimization , vol. 52, no. 5, pp. 3109–3134, 2014

  4. [4]

    Iteration-complexity of first -order augmented lagrangian methods for convex programming,

    G. Lan and R. D. Monteiro, “Iteration-complexity of first -order augmented lagrangian methods for convex programming,” Mathematical Programming , vol. 155, no. 1, pp. 511–547, 2016

  5. [5]

    First-order m ethods of smooth convex optimization with inexact oracle,

    O. Devolder, F. Glineur, and Y . Nesterov, “First-order m ethods of smooth convex optimization with inexact oracle,” Mathematical Programming , vol. 146, no. 1, pp. 37–75, 2014

  6. [6]

    Convergence rates of in exact proximal-gradient methods for convex optimization,

    M. Schmidt, N. Roux, and F. Bach, “Convergence rates of in exact proximal-gradient methods for convex optimization,” Advances in Neural Information Processing Systems , vol. 24, 2011

  7. [7]

    D. P . Bertsekas, Nonlinear Programming . Athena Scientific, 2016, vol. 4

  8. [8]

    Smooth minimization of non-smooth functi ons,

    Y . Nesterov, “Smooth minimization of non-smooth functi ons,” Mathematical Programming, vol. 103, pp. 127– 152, 2005

  9. [9]

    On the linear convergence of the al ternating direction method of multipliers,

    M. Hong and Z.-Q. Luo, “On the linear convergence of the al ternating direction method of multipliers,” Mathematical Programming , vol. 162, no. 1, pp. 165–199, 2017

  10. [10]

    V andenberghe, Optimization Methods for Large-Scale Systems

    L. V andenberghe, Optimization Methods for Large-Scale Systems . Lecture Slides, UCLA, 2022. [Online]. Available: https://www.seas.ucla.edu/∼ vandenbe/236C

  11. [11]

    Nesterov, Lectures on Convex Optimization

    Y . Nesterov, Lectures on Convex Optimization . Springer, 2018, vol. 137

  12. [12]

    On the acceleration of augmented lagr angian method for linearly constrained optimiza- tion,

    B. He and X. Y uan, “On the acceleration of augmented lagr angian method for linearly constrained optimiza- tion,” Optimization Online , vol. 3, 2010

  13. [13]

    Accelerated bregma n method for linearly constrained–minimization,

    M. Kang, S. Y un, H. Woo, and M. Kang, “Accelerated bregma n method for linearly constrained–minimization,” Journal of Scientific Computing , vol. 56, no. 3, pp. 515–534, 2013

  14. [14]

    Davis, Mathematical Programming I

    D. Davis, Mathematical Programming I . Lecture Notes, Cornell University, 2016. [Online]. Avail able: https://people.orie.cornell.edu/dsd95/orie63002016.html

  15. [15]

    Hiriart-Urruty and C

    J.-B. Hiriart-Urruty and C. Lemar´ echal, Fundamentals of Convex Analysis . Springer Science & Business Media, 2004

  16. [16]

    R. T. Rockafellar, Convex Analysis. Princeton University Press, 1970