pith. sign in

arxiv: 2402.17170 · v3 · pith:2V46QRUQnew · submitted 2024-02-27 · 🧮 math.OC

Parallel Sequential Quadratic Programming with Overlapping Graph Decomposition and Exact Augmented Lagrangian

Pith reviewed 2026-05-24 03:54 UTC · model grok-4.3

classification 🧮 math.OC
keywords graph-structured nonlinear programsoverlapping graph decompositionsequential quadratic programmingaugmented Lagrangianglobal convergencelocal linear convergencePDE-constrained optimizationparallel optimization
0
0 comments X

The pith

Overlapping graph decomposition in SQP produces descent directions for the exact augmented Lagrangian, ensuring global convergence of parallel solves for large graph-structured nonlinear programs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a parallel method for solving large-scale graph-structured nonlinear programs by decomposing the graph with overlaps and computing approximate Newton directions within a sequential quadratic programming framework. It proves that these directions are descent directions for an exact augmented Lagrangian merit function, provided the overlaps are large enough, leading to global convergence. The local convergence rate improves exponentially as the overlap size increases. This approach applies to problems like PDE-constrained optimization and dynamic optimization, matching known guarantees for linear graphs. A sympathetic reader would care because it enables scalable, parallel computation for problems that arise in many applications without requiring special structure beyond graph topology.

Core claim

By leveraging overlapping graph decomposition to compute an approximate Newton direction in parallel, the method performs a backtracking line search on the exact augmented Lagrangian. Built on the exponential decay of sensitivity of gsNLPs, the approximate direction is shown to be a descent direction, yielding global convergence with local linear rate that improves exponentially in overlap size, for sufficiently large overlaps, under mild regularity conditions on the graph topology.

What carries the argument

overlapping graph decomposition to compute parallel approximate Newton directions, which remain descent directions for the exact augmented Lagrangian due to exponential sensitivity decay

If this is right

  • Global convergence holds when overlaps are sufficiently large.
  • The local linear convergence rate improves exponentially with increasing overlap size.
  • The guarantees extend to dynamic programs on linear graphs as a special case.
  • The method applies to PDE-constrained optimization, multi-stage stochastic optimization, and general network optimization.
  • Standard mild regularity conditions on graph topology suffice along with the sensitivity decay property.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar overlapping techniques might improve convergence in other parallel optimization algorithms for networked systems.
  • Testing the method on graphs with varying connectivity could reveal minimal overlap thresholds for different topologies.
  • Extensions to inexact solves or stochastic settings could further enhance scalability for very large problems.
  • The approach suggests that sensitivity decay properties can be exploited more broadly in decomposition methods for structured optimization.

Load-bearing premise

The exponential decay of sensitivity of the graph-structured nonlinear programs with respect to the graph topology ensures that the approximate Newton direction remains a descent direction for the merit function.

What would settle it

A counterexample gsNLP on a graph where sensitivity does not decay exponentially, for which the method with any finite overlap fails to produce a descent direction or converge globally.

read the original abstract

In this paper, we address the challenge of solving large-scale graph-structured nonlinear programs (gsNLPs) in a scalable manner. GsNLPs are problems in which the objective and constraint functions are associated with nodes on a graph and depend on the variables of adjacent nodes. This graph-structured formulation encompasses various specific instances, such as dynamic optimization, PDE-constrained optimization, multi-stage stochastic optimization, and general network optimization. By leveraging the sequential quadratic programming (SQP) framework, we propose a globally convergent overlapping graph decomposition method to solve large-scale gsNLPs under standard mild regularity conditions on the graph topology. In each iteration, we perform an overlapping graph decomposition to compute an approximate Newton direction in a parallel environment. Then, we select a suitable stepsize and update the primal-dual iterate by performing a backtracking line search on an exact augmented Lagrangian merit function. Built on the exponential decay of sensitivity of gsNLPs, we show that the approximate Newton direction is a descent direction of the augmented Lagrangian, which leads to global convergence with local linear convergence rate. In particular, global convergence is achieved for sufficiently large overlaps, and the local linear convergence rate improves exponentially in terms of the overlap size. Our results match existing state-of-the-art guarantees established for dynamic programs (which simply correspond to linear graphs). We validate the theory on two PDE-constrained problems: a semilinear elliptic problem and a boundary heating problem.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 4 minor

Summary. The paper proposes a parallel SQP algorithm for graph-structured nonlinear programs (gsNLPs) that uses overlapping graph decomposition to compute approximate Newton directions in parallel. Global convergence is established via a backtracking line search on an exact augmented Lagrangian merit function, under mild graph-topology regularity conditions; the local linear convergence rate is shown to improve exponentially with overlap size. The results recover known guarantees for dynamic programs on linear graphs and are illustrated on two PDE-constrained test problems.

Significance. If the central convergence argument holds, the work supplies the first general-graph extension of overlap-based parallel SQP with both global convergence and an explicit exponential dependence of the local rate on overlap size. This is a substantive theoretical advance for scalable structured optimization, directly applicable to dynamic, stochastic, and PDE-constrained problems.

minor comments (4)
  1. The abstract and introduction state that global convergence holds 'for sufficiently large overlaps,' but the precise quantitative threshold (in terms of graph diameter or spectral radius) is not stated explicitly; adding a corollary or remark that isolates this threshold would improve readability.
  2. Notation for the overlapping subproblems (e.g., the definition of the local gsNLP on each subgraph) should be introduced once in §2 and used consistently; several later sections re-define the same quantities with slightly different symbols.
  3. Figure 1 (schematic of the overlapping decomposition) would benefit from an explicit legend indicating which nodes belong to the overlap sets; the current caption is insufficient for readers unfamiliar with the graph notation.
  4. The numerical section reports iteration counts and CPU times but does not tabulate the observed linear rates versus overlap size; adding a small table or plot of rate versus overlap would directly support the exponential-improvement claim.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The derivation relies on the exponential decay of sensitivities in gsNLPs, which the paper states follows directly from the graph structure under mild regularity conditions on topology. This property is invoked to establish that the approximate Newton direction is a descent direction for the exact augmented Lagrangian, enabling the standard SQP line-search convergence argument. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear in the provided chain; the central claims remain independent of the target results once the graph-induced decay is granted. The analysis is internally consistent with classical SQP theory and matches known results for linear graphs without reducing to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on two domain assumptions about the problem class; no free parameters or new postulated entities are introduced in the abstract.

axioms (2)
  • domain assumption exponential decay of sensitivity of gsNLPs with respect to distant nodes
    Invoked to establish that the parallel approximate Newton direction is a descent direction for the augmented Lagrangian merit function.
  • domain assumption standard mild regularity conditions on the graph topology
    Required to guarantee global convergence for sufficiently large overlaps.

pith-pipeline@v0.9.0 · 5795 in / 1489 out tokens · 41855 ms · 2026-05-24T03:54:04.779649+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages

  1. [1]

    Antil, D

    H. Antil, D. P. Kouri, and D. Ridzal , Alesqp: An augmented lagrangian equality-constrained sqp method for optimization with general constraints , SIAM Journal on Optimization, 33 (2023), pp. 237–266

  2. [2]

    Beccuti, T

    A. Beccuti, T. Geyer, and M. Morari , Temporal Lagrangian decomposition of model predictive control for hybrid systems , in 2004 43rd IEEE Conference on Decision and Control (CDC) (IEEE Cat. No.04CH37601), IEEE, 2004

  3. [3]

    D. P. Bertsekas, Constrained optimization and Lagrange multiplier methods, no. 4 in Athena scientific optimization and computation series, Athena Scientific, Belmont, Mass., 1996

  4. [4]

    L. T. Biegler , Nonlinear Programming, Society for Industrial and Applied Mathematics, Jan. 2010

  5. [5]

    L. T. Biegler, O. Ghattas, M. Heinkenschloss, and B. van Bloemen W aanders , Large-Scale PDE-Constrained Optimization: An Introduction , Springer Berlin Heidelberg, 2003

  6. [6]

    T. F. Chan and B. F. Smith , Domain decomposition and multigrid algorithms for elliptic problems on unstructured meshes, Contemporary Mathematics, 180 (1994), pp. 175–175

  7. [7]

    T. F. Chan and J. Zou , Additive schwarz domain decomposition methods for elliptic problems on unstructured meshes, Numerical Algorithms, 8 (1994), pp. 329–346. 24 RUNXIN NI, SEN NA, SUNGHO SHIN, AND MIHAI ANITESCU

  8. [8]

    J. C. Dunn and D. P. Bertsekas , Efficient dynamic programming implementations of newton’s method for unconstrained optimal control problems , Journal of Optimization Theory and Appli- cations, 63 (1989), pp. 23–38

  9. [9]

    Dunning, J

    I. Dunning, J. Huchette, and M. Lubin , Jump: A modeling language for mathematical optimization , SIAM Review, 59 (2017), pp. 295–320

  10. [10]

    Ghadimi, A

    E. Ghadimi, A. Teixeira, I. Shames, and M. Johansson , Optimal parameter selection for the al- ternating direction method of multipliers (ADMM): Quadratic problems , IEEE Transactions on Automatic Control, 60 (2015), pp. 644–658

  11. [11]

    N. I. M. Gould and P. L. Toint , SQP Methods for Large-Scale Nonlinear Programming , Springer US, 2000, pp. 149–178

  12. [12]

    I. Hong, S. Na, M. W. Mahoney, and M. Kolar , Constrained optimization via exact augmented lagrangian and randomized iterative sketching , International Conference on Machine Learning, (2023)

  13. [13]

    J. R. Jackson and I. E. Grossmann , Temporal decomposition scheme for nonlinear multisite produc- tion planning and distribution models , Industrial & Engineering Chemistry Research, 42 (2003), pp. 3045–3055

  14. [14]

    Kozma, C

    A. Kozma, C. Conte, and M. Diehl , Benchmarking large-scale distributed convex quadratic pro- gramming algorithms, Optimization Methods and Software, 30 (2014), pp. 191–214

  15. [15]

    Na , Global convergence of online optimization for nonlinear model predictive control , Advances in Neural Information Processing Systems, 34 (2021), pp

    S. Na , Global convergence of online optimization for nonlinear model predictive control , Advances in Neural Information Processing Systems, 34 (2021), pp. 12441–12453

  16. [16]

    Na and M

    S. Na and M. Anitescu , Exponential decay in the sensitivity analysis of nonlinear dynamic program- ming, SIAM Journal on Optimization, 30 (2020), pp. 1527–1554

  17. [17]

    Na and M

    S. Na and M. Anitescu , Superconvergence of online optimization for model predictive control, IEEE Transactions on Automatic Control, 68 (2023), pp. 1383–1398

  18. [18]

    S. Na, M. Anitescu, and M. Kolar , An adaptive stochastic sequential quadratic programming with differentiable exact augmented lagrangians, Mathematical Programming, 199 (2022), pp. 721–791

  19. [19]

    S. Na, M. Anitescu, and M. Kolar , A fast temporal decomposition procedure for long-horizon nonlinear dynamic programming, Mathematics of Operations Research, (2023)

  20. [20]

    S. Na, S. Shin, M. Anitescu, and V. M. Zavala , On the convergence of overlapping schwarz de- composition for nonlinear optimal control , IEEE Transactions on Automatic Control, 67 (2022), pp. 5996–6011

  21. [21]

    Nocedal and S

    J. Nocedal and S. J. Wright , Numerical optimization, Springer, 2006

  22. [22]

    M. V. F. Pereira and L. M. V. G. Pinto , Multi-stage stochastic optimization applied to energy planning, Mathematical Programming, 52 (1991), pp. 359–375

  23. [23]

    S. Shin, M. Anitescu, and V. M. Zavala , Overlapping Schwarz decomposition for constrained qua- dratic programs, in 2020 59th IEEE Conference on Decision and Control (CDC), IEEE, Dec. 2020

  24. [24]

    S. Shin, M. Anitescu, and V. M. Zavala , Exponential decay of sensitivity in graph-structured non- linear programs, SIAM Journal on Optimization, 32 (2022), pp. 1156–1183

  25. [25]

    S. Shin, Y. Lin, G. Qu, A. Wierman, and M. Anitescu , Near-optimal distributed linear-quadratic regulator for networked systems, SIAM Journal on Control and Optimization, 61 (2023), pp. 1113– 1135

  26. [26]

    S. Shin, V. M. Zavala, and M. Anitescu , Decentralized schemes with overlap for solving graph- structured optimization problems , IEEE Transactions on Control of Network Systems, 7 (2020), pp. 1225–1236

  27. [27]

    W ¨achter and L

    A. W ¨achter and L. T. Biegler , On the implementation of an interior-point filter line-search algo- rithm for large-scale nonlinear programming, Mathematical Programming, 106 (2005), pp. 25–57

  28. [28]

    Xu and M

    W. Xu and M. Anitescu , Exponentially accurate temporal decomposition for long-horizon linear- quadratic dynamic optimization , SIAM Journal on Optimization, 28 (2018), pp. 2541–2573

  29. [29]

    V. M. Zavala , New architectures for hierarchical predictive control , IFAC-PapersOnLine, 49 (2016), pp. 43–48

  30. [30]

    Zlotnik, M

    A. Zlotnik, M. Chertkov, and S. Backhaus , Optimal control of transient flow in natural gas networks, in 2015 54th IEEE Conference on Decision and Control (CDC), IEEE, Dec. 2015. Government License: The submitted manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory (“Argonne”). Argonne, a U.S. Department of Energy ...