Parallel Sequential Quadratic Programming with Overlapping Graph Decomposition and Exact Augmented Lagrangian
Pith reviewed 2026-05-24 03:54 UTC · model grok-4.3
The pith
Overlapping graph decomposition in SQP produces descent directions for the exact augmented Lagrangian, ensuring global convergence of parallel solves for large graph-structured nonlinear programs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By leveraging overlapping graph decomposition to compute an approximate Newton direction in parallel, the method performs a backtracking line search on the exact augmented Lagrangian. Built on the exponential decay of sensitivity of gsNLPs, the approximate direction is shown to be a descent direction, yielding global convergence with local linear rate that improves exponentially in overlap size, for sufficiently large overlaps, under mild regularity conditions on the graph topology.
What carries the argument
overlapping graph decomposition to compute parallel approximate Newton directions, which remain descent directions for the exact augmented Lagrangian due to exponential sensitivity decay
If this is right
- Global convergence holds when overlaps are sufficiently large.
- The local linear convergence rate improves exponentially with increasing overlap size.
- The guarantees extend to dynamic programs on linear graphs as a special case.
- The method applies to PDE-constrained optimization, multi-stage stochastic optimization, and general network optimization.
- Standard mild regularity conditions on graph topology suffice along with the sensitivity decay property.
Where Pith is reading between the lines
- Similar overlapping techniques might improve convergence in other parallel optimization algorithms for networked systems.
- Testing the method on graphs with varying connectivity could reveal minimal overlap thresholds for different topologies.
- Extensions to inexact solves or stochastic settings could further enhance scalability for very large problems.
- The approach suggests that sensitivity decay properties can be exploited more broadly in decomposition methods for structured optimization.
Load-bearing premise
The exponential decay of sensitivity of the graph-structured nonlinear programs with respect to the graph topology ensures that the approximate Newton direction remains a descent direction for the merit function.
What would settle it
A counterexample gsNLP on a graph where sensitivity does not decay exponentially, for which the method with any finite overlap fails to produce a descent direction or converge globally.
read the original abstract
In this paper, we address the challenge of solving large-scale graph-structured nonlinear programs (gsNLPs) in a scalable manner. GsNLPs are problems in which the objective and constraint functions are associated with nodes on a graph and depend on the variables of adjacent nodes. This graph-structured formulation encompasses various specific instances, such as dynamic optimization, PDE-constrained optimization, multi-stage stochastic optimization, and general network optimization. By leveraging the sequential quadratic programming (SQP) framework, we propose a globally convergent overlapping graph decomposition method to solve large-scale gsNLPs under standard mild regularity conditions on the graph topology. In each iteration, we perform an overlapping graph decomposition to compute an approximate Newton direction in a parallel environment. Then, we select a suitable stepsize and update the primal-dual iterate by performing a backtracking line search on an exact augmented Lagrangian merit function. Built on the exponential decay of sensitivity of gsNLPs, we show that the approximate Newton direction is a descent direction of the augmented Lagrangian, which leads to global convergence with local linear convergence rate. In particular, global convergence is achieved for sufficiently large overlaps, and the local linear convergence rate improves exponentially in terms of the overlap size. Our results match existing state-of-the-art guarantees established for dynamic programs (which simply correspond to linear graphs). We validate the theory on two PDE-constrained problems: a semilinear elliptic problem and a boundary heating problem.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a parallel SQP algorithm for graph-structured nonlinear programs (gsNLPs) that uses overlapping graph decomposition to compute approximate Newton directions in parallel. Global convergence is established via a backtracking line search on an exact augmented Lagrangian merit function, under mild graph-topology regularity conditions; the local linear convergence rate is shown to improve exponentially with overlap size. The results recover known guarantees for dynamic programs on linear graphs and are illustrated on two PDE-constrained test problems.
Significance. If the central convergence argument holds, the work supplies the first general-graph extension of overlap-based parallel SQP with both global convergence and an explicit exponential dependence of the local rate on overlap size. This is a substantive theoretical advance for scalable structured optimization, directly applicable to dynamic, stochastic, and PDE-constrained problems.
minor comments (4)
- The abstract and introduction state that global convergence holds 'for sufficiently large overlaps,' but the precise quantitative threshold (in terms of graph diameter or spectral radius) is not stated explicitly; adding a corollary or remark that isolates this threshold would improve readability.
- Notation for the overlapping subproblems (e.g., the definition of the local gsNLP on each subgraph) should be introduced once in §2 and used consistently; several later sections re-define the same quantities with slightly different symbols.
- Figure 1 (schematic of the overlapping decomposition) would benefit from an explicit legend indicating which nodes belong to the overlap sets; the current caption is insufficient for readers unfamiliar with the graph notation.
- The numerical section reports iteration counts and CPU times but does not tabulate the observed linear rates versus overlap size; adding a small table or plot of rate versus overlap would directly support the exponential-improvement claim.
Simulated Author's Rebuttal
We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No major comments were provided in the report.
Circularity Check
No significant circularity identified
full rationale
The derivation relies on the exponential decay of sensitivities in gsNLPs, which the paper states follows directly from the graph structure under mild regularity conditions on topology. This property is invoked to establish that the approximate Newton direction is a descent direction for the exact augmented Lagrangian, enabling the standard SQP line-search convergence argument. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear in the provided chain; the central claims remain independent of the target results once the graph-induced decay is granted. The analysis is internally consistent with classical SQP theory and matches known results for linear graphs without reducing to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption exponential decay of sensitivity of gsNLPs with respect to distant nodes
- domain assumption standard mild regularity conditions on the graph topology
Reference graph
Works this paper leans on
- [1]
-
[2]
A. Beccuti, T. Geyer, and M. Morari , Temporal Lagrangian decomposition of model predictive control for hybrid systems , in 2004 43rd IEEE Conference on Decision and Control (CDC) (IEEE Cat. No.04CH37601), IEEE, 2004
work page 2004
-
[3]
D. P. Bertsekas, Constrained optimization and Lagrange multiplier methods, no. 4 in Athena scientific optimization and computation series, Athena Scientific, Belmont, Mass., 1996
work page 1996
-
[4]
L. T. Biegler , Nonlinear Programming, Society for Industrial and Applied Mathematics, Jan. 2010
work page 2010
-
[5]
L. T. Biegler, O. Ghattas, M. Heinkenschloss, and B. van Bloemen W aanders , Large-Scale PDE-Constrained Optimization: An Introduction , Springer Berlin Heidelberg, 2003
work page 2003
-
[6]
T. F. Chan and B. F. Smith , Domain decomposition and multigrid algorithms for elliptic problems on unstructured meshes, Contemporary Mathematics, 180 (1994), pp. 175–175
work page 1994
-
[7]
T. F. Chan and J. Zou , Additive schwarz domain decomposition methods for elliptic problems on unstructured meshes, Numerical Algorithms, 8 (1994), pp. 329–346. 24 RUNXIN NI, SEN NA, SUNGHO SHIN, AND MIHAI ANITESCU
work page 1994
-
[8]
J. C. Dunn and D. P. Bertsekas , Efficient dynamic programming implementations of newton’s method for unconstrained optimal control problems , Journal of Optimization Theory and Appli- cations, 63 (1989), pp. 23–38
work page 1989
-
[9]
I. Dunning, J. Huchette, and M. Lubin , Jump: A modeling language for mathematical optimization , SIAM Review, 59 (2017), pp. 295–320
work page 2017
-
[10]
E. Ghadimi, A. Teixeira, I. Shames, and M. Johansson , Optimal parameter selection for the al- ternating direction method of multipliers (ADMM): Quadratic problems , IEEE Transactions on Automatic Control, 60 (2015), pp. 644–658
work page 2015
-
[11]
N. I. M. Gould and P. L. Toint , SQP Methods for Large-Scale Nonlinear Programming , Springer US, 2000, pp. 149–178
work page 2000
-
[12]
I. Hong, S. Na, M. W. Mahoney, and M. Kolar , Constrained optimization via exact augmented lagrangian and randomized iterative sketching , International Conference on Machine Learning, (2023)
work page 2023
-
[13]
J. R. Jackson and I. E. Grossmann , Temporal decomposition scheme for nonlinear multisite produc- tion planning and distribution models , Industrial & Engineering Chemistry Research, 42 (2003), pp. 3045–3055
work page 2003
- [14]
-
[15]
S. Na , Global convergence of online optimization for nonlinear model predictive control , Advances in Neural Information Processing Systems, 34 (2021), pp. 12441–12453
work page 2021
- [16]
- [17]
-
[18]
S. Na, M. Anitescu, and M. Kolar , An adaptive stochastic sequential quadratic programming with differentiable exact augmented lagrangians, Mathematical Programming, 199 (2022), pp. 721–791
work page 2022
-
[19]
S. Na, M. Anitescu, and M. Kolar , A fast temporal decomposition procedure for long-horizon nonlinear dynamic programming, Mathematics of Operations Research, (2023)
work page 2023
-
[20]
S. Na, S. Shin, M. Anitescu, and V. M. Zavala , On the convergence of overlapping schwarz de- composition for nonlinear optimal control , IEEE Transactions on Automatic Control, 67 (2022), pp. 5996–6011
work page 2022
- [21]
-
[22]
M. V. F. Pereira and L. M. V. G. Pinto , Multi-stage stochastic optimization applied to energy planning, Mathematical Programming, 52 (1991), pp. 359–375
work page 1991
-
[23]
S. Shin, M. Anitescu, and V. M. Zavala , Overlapping Schwarz decomposition for constrained qua- dratic programs, in 2020 59th IEEE Conference on Decision and Control (CDC), IEEE, Dec. 2020
work page 2020
-
[24]
S. Shin, M. Anitescu, and V. M. Zavala , Exponential decay of sensitivity in graph-structured non- linear programs, SIAM Journal on Optimization, 32 (2022), pp. 1156–1183
work page 2022
-
[25]
S. Shin, Y. Lin, G. Qu, A. Wierman, and M. Anitescu , Near-optimal distributed linear-quadratic regulator for networked systems, SIAM Journal on Control and Optimization, 61 (2023), pp. 1113– 1135
work page 2023
-
[26]
S. Shin, V. M. Zavala, and M. Anitescu , Decentralized schemes with overlap for solving graph- structured optimization problems , IEEE Transactions on Control of Network Systems, 7 (2020), pp. 1225–1236
work page 2020
-
[27]
A. W ¨achter and L. T. Biegler , On the implementation of an interior-point filter line-search algo- rithm for large-scale nonlinear programming, Mathematical Programming, 106 (2005), pp. 25–57
work page 2005
- [28]
-
[29]
V. M. Zavala , New architectures for hierarchical predictive control , IFAC-PapersOnLine, 49 (2016), pp. 43–48
work page 2016
-
[30]
A. Zlotnik, M. Chertkov, and S. Backhaus , Optimal control of transient flow in natural gas networks, in 2015 54th IEEE Conference on Decision and Control (CDC), IEEE, Dec. 2015. Government License: The submitted manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory (“Argonne”). Argonne, a U.S. Department of Energy ...
work page 2015
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.