Smoothness of the Augmented Lagrangian Dual in Convex Optimization

Jingwang Li; Vincent Lau

arxiv: 2505.01824 · v2 · pith:L5COH5UNnew · submitted 2025-05-03 · 🧮 math.OC · cs.SY· eess.SY

Smoothness of the Augmented Lagrangian Dual in Convex Optimization

Jingwang Li , Vincent Lau This is my paper

Pith reviewed 2026-05-22 16:22 UTC · model grok-4.3

classification 🧮 math.OC cs.SYeess.SY

keywords augmented Lagrangiandual smoothnessconvex optimizationlinear constraintsdual functionexistence of minimizerquadratic penalty

0 comments

The pith

Under the condition that the standard dual has nonempty domain, the augmented Lagrangian dual is 1/ρ-smooth everywhere.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper considers linearly constrained convex minimization problems and defines the standard dual function φ together with its augmented version φ_ρ that includes a quadratic penalty term. It proves that whenever dom φ is nonempty, φ_ρ is smooth with modulus exactly 1/ρ at every point, and the inner minimization defining φ_ρ always attains its minimum for any multiplier. These conclusions replace stronger requirements such as coercivity or bounded level sets that appear in earlier analyses of augmented Lagrangian methods.

Core claim

Under the fundamental condition that dom φ ≠ ∅, φ_ρ is 1/ρ-smooth everywhere and the solution to min_x L_ρ(x, λ) exists for any λ ∈ R^p. These properties are shown for any closed proper convex f and any linear map A, substantially weakening the assumptions commonly used to guarantee smoothness and existence in the augmented dual.

What carries the argument

The augmented Lagrangian dual φ_ρ(λ) obtained by infimizing the augmented Lagrangian L_ρ(x, λ) = f(x) + ⟨λ, Ax − b⟩ + (ρ/2)‖Ax − b‖², whose quadratic term supplies both the uniform smoothness and the existence of the inner minimizer.

If this is right

Dual ascent or proximal-point algorithms applied to φ_ρ may employ a fixed step size ρ without local adjustment.
The x-subproblem in augmented Lagrangian or ADMM iterations is guaranteed to be solvable for every dual iterate.
Convergence proofs for augmented-Lagrangian schemes can proceed with only the nonempty-domain hypothesis instead of interior-point or strong-convexity assumptions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same quadratic augmentation may yield analogous smoothness for problems with inequality constraints once the dual is suitably lifted.
Practical checks for nonempty dom φ remain difficult, yet the result clarifies the precise boundary between well-behaved and ill-behaved duals.
Smoothness with explicit modulus could be used to derive explicit iteration complexity bounds for first-order dual methods.

Load-bearing premise

There exists at least one multiplier for which the ordinary dual function φ takes a finite value.

What would settle it

Take any convex f and A such that dom φ is nonempty, compute or estimate the gradient of φ_ρ at two points, and check whether its difference quotient exceeds 1/ρ.

read the original abstract

This paper focuses on the general linearly constrained optimization problem: $\min_{x \in \mathbb{R}^d} f(x) \ \text{s.t.} \ Ax = b$, where $f: \mathbb{R}^d \rightarrow \mathbb{R} \cup \{+\infty\}$ is a closed proper convex function, $A \in \mathbb{R}^{p \times d}$, and $b \in \mathbb{R}^p$. We define the standard dual function $\phi(\lambda) = \inf_x \{f(x) + \langle \lambda, A x - b \rangle\}$, the augmented Lagrangian $\mathcal{L}_{\rho}(x, \lambda) = f(x) + \langle \lambda, Ax - b \rangle + \frac{\rho}{2}\|Ax - b\|^2$ ($\rho > 0$), and the augmented Lagrangian dual function $\phi_{\rho}(\lambda) = \inf_x \mathcal{L}_{\rho}(x, \lambda)$. Under the fundamental condition that $\text{dom} \ \phi \neq \emptyset$, we establish that: (1) $\phi_{\rho}$ is $\frac{1}{\rho}$-smooth everywhere; and (2) the solution to $\min_{x \in \mathbb{R}^d} \mathcal{L}_{\rho}(x, \lambda)$ exists for any $\lambda \in \mathbb{R}^p$. These theoretical findings substantially weaken the stringent assumptions typically imposed in the literature to ensure such properties.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper claims both smoothness and attainment for the augmented dual under only dom φ ≠ ∅, but a counterexample shows attainment fails when the objective recedes along ker(A).

read the letter

Hi, the main point is that the central claim does not hold. The abstract states that dom φ nonempty implies both that φ_ρ is 1/ρ-smooth everywhere and that min_x L_ρ(x, λ) is attained for every λ. The counterexample with f(x₁, x₂) = ½x₁² + exp(x₂), A = [1 0], b = 0 works: φ is finite for all λ so the domain condition holds, yet L_ρ approaches its infimum as x₂ → -∞ without ever reaching it because the exponential term never attains its lower bound. This directly falsifies the existence statement under the stated hypothesis alone. The smoothness part might survive on its own, but the paper bundles the two results together as following from the same minimal condition. The attempt to drop strong convexity or interior-point assumptions is a reasonable direction and could matter for convergence analysis of ADMM-type methods if the smoothness result can be isolated and proved correctly. The flaw here is not minor; it affects a load-bearing part of the stated theorem, and the abstract presents both properties as new under the single condition. The proof appears to have overlooked recession directions in the kernel. This is aimed at specialists in convex optimization who track dual properties for algorithm design. A reader hunting for weaker conditions on augmented Lagrangians might skim it for ideas, but the error means the contribution is not reliable without fixes. I would not send this to peer review until the existence claim is corrected or the assumptions are strengthened to guarantee attainment.

Referee Report

2 major / 1 minor

Summary. The paper considers the linearly constrained convex problem min f(x) s.t. Ax=b with f closed proper convex. It defines the standard dual φ(λ)=inf_x {f(x)+⟨λ,Ax-b⟩} and the augmented dual φ_ρ(λ)=inf_x L_ρ(x,λ) where L_ρ is the augmented Lagrangian. Under the sole assumption dom φ ≠ ∅ it claims that φ_ρ is 1/ρ-smooth on all of R^p and that the infimum defining φ_ρ(λ) is attained for every λ.

Significance. If the claims were correct they would relax common coercivity or strong-convexity assumptions used to guarantee smoothness and attainment in augmented-Lagrangian theory, potentially simplifying convergence analyses for first-order methods.

major comments (2)

[Abstract and §3] Abstract and §3 (main theorem): the claim that dom φ ≠ ∅ implies attainment of min_x L_ρ(x,λ) for every λ is false. Counter-example: d=2, p=1, A=[1 0], b=0, f(x1,x2)=½x1² + exp(x2). Then φ(λ) is finite for all λ (dom φ = R), yet L_ρ(x,λ) = (½+ρ/2)x1² + λ x1 + exp(x2) has infimum equal to the completed-square value over x1 but is never attained as x2→-∞ along ker(A). This directly falsifies claim (2).
[§4] §4 (proof of smoothness): the argument that 1/ρ-smoothness follows from the same dom φ ≠ ∅ condition appears to rely on the same attainment step used for claim (2); once attainment fails the subdifferential argument for smoothness cannot be completed for arbitrary λ.

minor comments (1)

Notation: the paper uses φ_ρ for the augmented dual; it would help to explicitly relate it to the standard Moreau envelope or proximal mapping of the dual.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and the explicit counterexample, which has helped us identify an error in one of our claims. We acknowledge that the attainment result does not hold under the sole assumption dom φ ≠ ∅. However, we maintain that the smoothness result is correct and can be established by a revised argument that does not depend on attainment. We will update the manuscript to correct the inaccurate claim, replace the flawed proof, and clarify the conditions needed for attainment.

read point-by-point responses

Referee: [Abstract and §3] Abstract and §3 (main theorem): the claim that dom φ ≠ ∅ implies attainment of min_x L_ρ(x,λ) for every λ is false. Counter-example: d=2, p=1, A=[1 0], b=0, f(x1,x2)=½x1² + exp(x2). Then φ(λ) is finite for all λ (dom φ = R), yet L_ρ(x,λ) = (½+ρ/2)x1² + λ x1 + exp(x2) has infimum equal to the completed-square value over x1 but is never attained as x2→-∞ along ker(A). This directly falsifies claim (2).

Authors: We have verified the counterexample and agree that it is valid: the infimum of L_ρ is approached as x2 → -∞ but is never attained for any finite x. This shows that claim (2) is false under only dom φ ≠ ∅. We will revise the manuscript to remove this claim from the abstract and main theorem, and we will add a remark explaining that attainment requires additional assumptions such as coercivity of f along ker(A). revision: yes
Referee: [§4] §4 (proof of smoothness): the argument that 1/ρ-smoothness follows from the same dom φ ≠ ∅ condition appears to rely on the same attainment step used for claim (2); once attainment fails the subdifferential argument for smoothness cannot be completed for arbitrary λ.

Authors: We agree that the current proof of smoothness is invalid because it relies on attainment of the infimum in x. Nevertheless, the 1/ρ-smoothness statement itself remains true. We will replace the proof with a corrected argument that expresses φ_ρ via an auxiliary convex function ψ(y) = inf{f(x) : Ax = y} and recognizes φ_ρ as a composition involving the Moreau envelope of ψ. The gradient of φ_ρ is then given by a proximal mapping, which is nonexpansive and yields the desired 1/ρ-Lipschitz continuity without requiring attainment in the original variables. We will also confirm the result on the referee's counterexample, where φ_ρ is explicitly quadratic and hence smooth. revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation applies standard convex analysis to definitions.

full rationale

The paper derives the 1/ρ-smoothness of φ_ρ and attainment of the infimum in L_ρ(x, λ) directly from the definitions of φ and φ_ρ together with the assumption dom φ ≠ ∅, using properties of closed proper convex functions and infimal projections. No self-definitional reductions, fitted parameters presented as predictions, or load-bearing self-citations appear; the central claims rest on external convex-analysis facts rather than reducing to the paper's own inputs by construction. The derivation is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper relies on standard convex-analysis background and one domain condition; no free parameters or new entities are introduced.

axioms (2)

domain assumption f is a closed proper convex function from R^d to R union +∞
This is the basic setup stated for the primal problem.
domain assumption dom φ ≠ ∅
Invoked explicitly as the fundamental condition that enables both the smoothness and attainment results.

pith-pipeline@v0.9.0 · 5802 in / 1327 out tokens · 94145 ms · 2026-05-22T16:22:22.388420+00:00 · methodology

discussion (0)

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Augmented Lagrangian Method for Last-Iterate Convergence for Constrained MDPs
cs.LG 2026-05 unverdicted novelty 6.0

An inexact augmented Lagrangian method with projected Q-ascent yields global last-iterate convergence guarantees for constrained MDP policy optimization, extending from tabular to log-linear and non-linear policies.
Accelerated Decentralized Constraint-Coupled Optimization: A Dual$^2$ Approach
math.OC 2025-05 unverdicted novelty 6.0

The Dual² approach produces iD2A and MiD2A gradient methods that achieve asymptotic convergence under milder conditions on the public function and linear rates with reduced communication and computation complexity.

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages · cited by 2 Pith papers

[1]

Multiplier and gradient methods,

M. R. Hestenes, “Multiplier and gradient methods,” Journal of Optimization Theory and Applications , vol. 4, no. 5, pp. 303–320, 1969

work page 1969
[2]

A method for nonlinear constraints in mini mization problems,

M. J. Powell, “A method for nonlinear constraints in mini mization problems,” Optimization, pp. 283–298, 1969. 6

work page 1969
[3]

Computationa l complexity of inexact gradient augmented lagrangian methods: Application to constrained mpc,

V . Nedelcu, I. Necoara, and Q. Tran-Dinh, “Computationa l complexity of inexact gradient augmented lagrangian methods: Application to constrained mpc,” SIAM Journal on Control and Optimization , vol. 52, no. 5, pp. 3109–3134, 2014

work page 2014
[4]

Iteration-complexity of ﬁrst -order augmented lagrangian methods for convex programming,

G. Lan and R. D. Monteiro, “Iteration-complexity of ﬁrst -order augmented lagrangian methods for convex programming,” Mathematical Programming , vol. 155, no. 1, pp. 511–547, 2016

work page 2016
[5]

First-order m ethods of smooth convex optimization with inexact oracle,

O. Devolder, F. Glineur, and Y . Nesterov, “First-order m ethods of smooth convex optimization with inexact oracle,” Mathematical Programming , vol. 146, no. 1, pp. 37–75, 2014

work page 2014
[6]

Convergence rates of in exact proximal-gradient methods for convex optimization,

M. Schmidt, N. Roux, and F. Bach, “Convergence rates of in exact proximal-gradient methods for convex optimization,” Advances in Neural Information Processing Systems , vol. 24, 2011

work page 2011
[7]

D. P . Bertsekas, Nonlinear Programming . Athena Scientiﬁc, 2016, vol. 4

work page 2016
[8]

Smooth minimization of non-smooth functi ons,

Y . Nesterov, “Smooth minimization of non-smooth functi ons,” Mathematical Programming, vol. 103, pp. 127– 152, 2005

work page 2005
[9]

On the linear convergence of the al ternating direction method of multipliers,

M. Hong and Z.-Q. Luo, “On the linear convergence of the al ternating direction method of multipliers,” Mathematical Programming , vol. 162, no. 1, pp. 165–199, 2017

work page 2017
[10]

V andenberghe, Optimization Methods for Large-Scale Systems

L. V andenberghe, Optimization Methods for Large-Scale Systems . Lecture Slides, UCLA, 2022. [Online]. Available: https://www.seas.ucla.edu/∼ vandenbe/236C

work page 2022
[11]

Nesterov, Lectures on Convex Optimization

Y . Nesterov, Lectures on Convex Optimization . Springer, 2018, vol. 137

work page 2018
[12]

On the acceleration of augmented lagr angian method for linearly constrained optimiza- tion,

B. He and X. Y uan, “On the acceleration of augmented lagr angian method for linearly constrained optimiza- tion,” Optimization Online , vol. 3, 2010

work page 2010
[13]

Accelerated bregma n method for linearly constrained–minimization,

M. Kang, S. Y un, H. Woo, and M. Kang, “Accelerated bregma n method for linearly constrained–minimization,” Journal of Scientiﬁc Computing , vol. 56, no. 3, pp. 515–534, 2013

work page 2013
[14]

Davis, Mathematical Programming I

D. Davis, Mathematical Programming I . Lecture Notes, Cornell University, 2016. [Online]. Avail able: https://people.orie.cornell.edu/dsd95/orie63002016.html

work page 2016
[15]

Hiriart-Urruty and C

J.-B. Hiriart-Urruty and C. Lemar´ echal, Fundamentals of Convex Analysis . Springer Science & Business Media, 2004

work page 2004
[16]

R. T. Rockafellar, Convex Analysis. Princeton University Press, 1970

work page 1970

[1] [1]

Multiplier and gradient methods,

M. R. Hestenes, “Multiplier and gradient methods,” Journal of Optimization Theory and Applications , vol. 4, no. 5, pp. 303–320, 1969

work page 1969

[2] [2]

A method for nonlinear constraints in mini mization problems,

M. J. Powell, “A method for nonlinear constraints in mini mization problems,” Optimization, pp. 283–298, 1969. 6

work page 1969

[3] [3]

Computationa l complexity of inexact gradient augmented lagrangian methods: Application to constrained mpc,

V . Nedelcu, I. Necoara, and Q. Tran-Dinh, “Computationa l complexity of inexact gradient augmented lagrangian methods: Application to constrained mpc,” SIAM Journal on Control and Optimization , vol. 52, no. 5, pp. 3109–3134, 2014

work page 2014

[4] [4]

Iteration-complexity of ﬁrst -order augmented lagrangian methods for convex programming,

G. Lan and R. D. Monteiro, “Iteration-complexity of ﬁrst -order augmented lagrangian methods for convex programming,” Mathematical Programming , vol. 155, no. 1, pp. 511–547, 2016

work page 2016

[5] [5]

First-order m ethods of smooth convex optimization with inexact oracle,

O. Devolder, F. Glineur, and Y . Nesterov, “First-order m ethods of smooth convex optimization with inexact oracle,” Mathematical Programming , vol. 146, no. 1, pp. 37–75, 2014

work page 2014

[6] [6]

Convergence rates of in exact proximal-gradient methods for convex optimization,

M. Schmidt, N. Roux, and F. Bach, “Convergence rates of in exact proximal-gradient methods for convex optimization,” Advances in Neural Information Processing Systems , vol. 24, 2011

work page 2011

[7] [7]

D. P . Bertsekas, Nonlinear Programming . Athena Scientiﬁc, 2016, vol. 4

work page 2016

[8] [8]

Smooth minimization of non-smooth functi ons,

Y . Nesterov, “Smooth minimization of non-smooth functi ons,” Mathematical Programming, vol. 103, pp. 127– 152, 2005

work page 2005

[9] [9]

On the linear convergence of the al ternating direction method of multipliers,

M. Hong and Z.-Q. Luo, “On the linear convergence of the al ternating direction method of multipliers,” Mathematical Programming , vol. 162, no. 1, pp. 165–199, 2017

work page 2017

[10] [10]

V andenberghe, Optimization Methods for Large-Scale Systems

L. V andenberghe, Optimization Methods for Large-Scale Systems . Lecture Slides, UCLA, 2022. [Online]. Available: https://www.seas.ucla.edu/∼ vandenbe/236C

work page 2022

[11] [11]

Nesterov, Lectures on Convex Optimization

Y . Nesterov, Lectures on Convex Optimization . Springer, 2018, vol. 137

work page 2018

[12] [12]

On the acceleration of augmented lagr angian method for linearly constrained optimiza- tion,

B. He and X. Y uan, “On the acceleration of augmented lagr angian method for linearly constrained optimiza- tion,” Optimization Online , vol. 3, 2010

work page 2010

[13] [13]

Accelerated bregma n method for linearly constrained–minimization,

M. Kang, S. Y un, H. Woo, and M. Kang, “Accelerated bregma n method for linearly constrained–minimization,” Journal of Scientiﬁc Computing , vol. 56, no. 3, pp. 515–534, 2013

work page 2013

[14] [14]

Davis, Mathematical Programming I

D. Davis, Mathematical Programming I . Lecture Notes, Cornell University, 2016. [Online]. Avail able: https://people.orie.cornell.edu/dsd95/orie63002016.html

work page 2016

[15] [15]

Hiriart-Urruty and C

J.-B. Hiriart-Urruty and C. Lemar´ echal, Fundamentals of Convex Analysis . Springer Science & Business Media, 2004

work page 2004

[16] [16]

R. T. Rockafellar, Convex Analysis. Princeton University Press, 1970

work page 1970