pith. sign in

arxiv: 2604.14803 · v1 · submitted 2026-04-16 · 🧮 math.OC

Anderson Acceleration for Linearly Converging SQP-Type Methods

Pith reviewed 2026-05-10 11:19 UTC · model grok-4.3

classification 🧮 math.OC
keywords Anderson accelerationsequential quadratic programmingSQP methodsconstrained optimizationoptimal controlconvergence accelerationinexact methodsnonlinear programming
0
0 comments X

The pith

Anderson acceleration improves local convergence rates for a family of inexact SQP methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that Anderson acceleration, a technique for speeding fixed-point iterations, can be applied to sequential quadratic programming (SQP) steps in constrained nonlinear optimization. It establishes faster local convergence near the solution for a general class of SQP-type methods, including inexact variants, and adds a heuristic to counteract slower progress when iterates start far from the optimum. The approach is demonstrated in optimal control examples through implementation in the acados software framework. Sympathetic readers would care because SQP remains a standard solver for problems with constraints, and even modest speed-ups in iteration count translate directly to reduced solve times in repeated or real-time applications.

Core claim

The central claim is that the local convergence behavior of a general family of (inexact) SQP-type methods can benefit from Anderson acceleration, with a simple heuristic introduced to alleviate slower convergence farther from the solution.

What carries the argument

Anderson acceleration applied to the sequence of iterates generated by SQP steps, treated as a fixed-point iteration.

If this is right

  • Local convergence rates of linearly converging SQP methods improve when Anderson acceleration is applied to their iterates.
  • The same acceleration applies to inexact SQP variants without altering the underlying quadratic subproblem structure.
  • A simple heuristic stabilizes performance when starting points lie far from the solution while preserving the accelerated local behavior.
  • Implementation in existing SQP solvers such as those in acados yields measurable reductions in iteration count on optimal control tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same acceleration pattern may transfer to other optimization algorithms that admit a fixed-point interpretation, such as certain interior-point or augmented Lagrangian methods.
  • In real-time model predictive control, the reduction in outer iterations could lower the overall latency enough to enable higher sampling rates.
  • Global convergence guarantees would require additional safeguards on the mixing parameters used by Anderson acceleration.

Load-bearing premise

That the SQP iteration map behaves sufficiently like a fixed-point iteration for Anderson acceleration to accelerate it without introducing instability or divergence.

What would settle it

Numerical tests on standard optimal control problems in which adding Anderson acceleration either fails to reduce iteration counts near the solution or causes divergence in at least one SQP variant.

Figures

Figures reproduced from arXiv: 2604.14803 by David Kiessling, Jonathan Frey, Katrin Baumg\"artner, Moritz Diehl.

Figure 1
Figure 1. Figure 1: KKT residuals for SQP with exact Hessian, GGN, [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
Figure 3
Figure 3. Figure 3: Left: Convergence of zero-order iterations with [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗
read the original abstract

Although Anderson acceleration (AA) is known to speed up fixed-point iterations, it is rarely applied in constrained optimization, in particular sequential quadratic programming (SQP). We show that the local convergence behavior of a general family of (inexact) SQP-type methods can benefit from AA and introduce a simple heuristic to alleviate slower convergence farther from the solution. The method is implemented in the software framework acados. Numerical examples from optimal control illustrate consistent improvements in convergence of different SQP-type methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that Anderson acceleration (AA) can improve the local convergence behavior of a general family of inexact SQP-type methods for constrained optimization problems. It represents SQP iterations as fixed-point maps to which AA applies, introduces a simple heuristic to mitigate slower convergence away from the solution, and demonstrates consistent practical improvements via numerical examples from optimal control implemented in the acados framework.

Significance. If the local convergence analysis and heuristic are rigorously justified, the work would provide a lightweight, structure-preserving acceleration technique for SQP solvers that could be valuable in real-time optimal control and other constrained optimization settings. The open implementation in acados and the numerical illustrations on multiple SQP variants constitute practical strengths.

major comments (3)
  1. [§3.2] §3.2, the fixed-point representation of inexact SQP: the central claim requires that each SQP step defines a map x^{k+1}=G(x^k) to which AA applies while preserving the linear contraction rate. However, because the QP subproblem data depend nonlinearly on x^k and the solution must satisfy linearized constraints, it is unclear whether the convex combination produced by AA yields a valid subsequent QP or remains in the region where the contraction factor is guaranteed. No explicit bound is given showing that the accelerated map remains contractive near the solution.
  2. [§4.3] §4.3, Theorem 2 on local convergence: the proof sketch assumes the heuristic (which switches or modifies the AA mixing far from the solution) leaves the asymptotic rate unchanged, yet no perturbation analysis or neighborhood argument is supplied demonstrating that the heuristic does not degrade the linear rate or introduce instability once iterates enter the local basin.
  3. [§5] §5, numerical examples: while the reported iteration counts and CPU times show improvement, the tables do not include the contraction factors or residual norms that would allow direct verification of the claimed local linear rate improvement; without these, it is difficult to confirm that the observed speedup is due to the theoretical mechanism rather than the heuristic alone.
minor comments (2)
  1. [§2] Notation for the Anderson mixing coefficients and the memory length m is introduced without a clear reference to the standard AA formulation; a brief reminder of the update formula would improve readability.
  2. [§5] The abstract states 'consistent improvements' but the text does not quantify how often the heuristic is activated versus pure AA; a short table or sentence on activation frequency would clarify the practical role of the heuristic.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive report. The comments identify important points where the analysis and presentation can be clarified and strengthened. We address each major comment below and will incorporate the suggested revisions into the next version of the manuscript.

read point-by-point responses
  1. Referee: [§3.2] §3.2, the fixed-point representation of inexact SQP: the central claim requires that each SQP step defines a map x^{k+1}=G(x^k) to which AA applies while preserving the linear contraction rate. However, because the QP subproblem data depend nonlinearly on x^k and the solution must satisfy linearized constraints, it is unclear whether the convex combination produced by AA yields a valid subsequent QP or remains in the region where the contraction factor is guaranteed. No explicit bound is given showing that the accelerated map remains contractive near the solution.

    Authors: We agree that an explicit neighborhood argument would make the application of Anderson acceleration to the SQP fixed-point map more rigorous. In the revised manuscript we will add a short lemma establishing that, for iterates sufficiently close to the solution, the convex combination produced by AA remains inside the open set on which the underlying SQP map G is contractive. The argument relies on the local Lipschitz continuity of the QP data and the fact that the feasible set of the linearized constraints is preserved under small perturbations of the right-hand side. revision: yes

  2. Referee: [§4.3] §4.3, Theorem 2 on local convergence: the proof sketch assumes the heuristic (which switches or modifies the AA mixing far from the solution) leaves the asymptotic rate unchanged, yet no perturbation analysis or neighborhood argument is supplied demonstrating that the heuristic does not degrade the linear rate or introduce instability once iterates enter the local basin.

    Authors: The referee correctly notes that the interaction between the heuristic and the local linear rate requires a more precise statement. We will revise the proof of Theorem 2 to include a perturbation argument: once all previous iterates lie inside a sufficiently small ball around the solution, the heuristic is automatically deactivated (or its modification is bounded by an arbitrarily small term), so the asymptotic contraction factor of the accelerated iteration coincides with that of the unaccelerated map. revision: yes

  3. Referee: [§5] §5, numerical examples: while the reported iteration counts and CPU times show improvement, the tables do not include the contraction factors or residual norms that would allow direct verification of the claimed local linear rate improvement; without these, it is difficult to confirm that the observed speedup is due to the theoretical mechanism rather than the heuristic alone.

    Authors: We will augment Section 5 with additional columns (or a supplementary table) that report the observed contraction factors ||x^{k+1}−x^*||/||x^k−x^*|| and the KKT residual norms for both the plain and accelerated variants on each example. This will allow direct numerical verification of the local linear-rate improvement once the iterates enter the basin identified in the analysis. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation applies known AA to SQP without self-referential reduction

full rationale

The paper represents SQP-type iterations as fixed-point maps x^{k+1}=G(x^k) to which Anderson acceleration is applied, then analyzes local linear convergence rates under standard assumptions on the underlying SQP contraction. No equation or theorem reduces by construction to a fitted parameter or self-citation; the heuristic for far-from-solution iterates is presented as an empirical safeguard without claiming it leaves the local contraction factor invariant by definition. The central result rests on independent fixed-point theory and numerical examples rather than tautological renaming or imported uniqueness theorems from the same authors.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract contains no explicit free parameters, axioms, or invented entities; the contribution rests on standard fixed-point acceleration and SQP theory without additional postulates.

pith-pipeline@v0.9.0 · 5376 in / 1096 out tokens · 49028 ms · 2026-05-10T11:19:25.096467+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages · 1 internal anchor

  1. [1]

    , " * write output.state after.block = add.period write newline

    ENTRY address author booktitle chapter doi edition editor eid howpublished institution journal key month note number organization pages publisher school series title type url volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION init.state.consts #0 'before.all := #1 'mid.sent...

  2. [2]

    write newline

    " write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...

  3. [3]

    Anderson, D.G. (1965). Iterative procedures for nonlinear integral equations. J. ACM (JACM), 12(4)

  4. [4]

    Baumg \"a rtner, K., Zanelli, A., and Diehl, M. (2019). Zero-order moving horizon estimation. In IEEE Conf. Decis. Control

  5. [5]

    and Diehl, M

    Baum \- g \"a rtner, K. and Diehl, M. (2022). The extended G auss- N ewton method for nonconvex loss functions and its application to time-optimal model predictive control. In Proc. Amer. Control Conf. (ACC)

  6. [6]

    Baumgärtner, K., Frey, J., Hashemi, R., and Diehl, M. (2021). Zero-order moving horizon estimation for large-scale nonlinear processes. Comput. & Chem. Eng., 154

  7. [7]

    Bock, H.G., Diehl, M., Kostina, E.A., and Schl\"oder, J.P. (2007). Constrained optimal feedback control of systems governed by large differential algebraic equations. In Real-Time Online PDE-Constrained Optim. SIAM

  8. [8]

    Delaiss \'e , N., Demeester, T., Haelterman, R., and Degroote, J. (2023). Quasi-newton methods for partitioned simulation of fluid--structure interaction reviewed in the generalized B royden framework. Arch. Comput. Methods Eng., 30(5)

  9. [9]

    Evans, C., Pollock, S., Rebholz, L.G., and Xiao, M. (2020). A proof that anderson acceleration improves the convergence rate in linearly converging fixed-point methods (but not in those converging quadratically). SIAM J. Numer. Anal., 58(1)

  10. [10]

    (2024 a )

    Frey, J., Gao, Y., Messerer, F., Lahr, A., Zeilinger, M.N., and Diehl, M. (2024 a ). Efficient zero-order robust optimization for real-time model predictive control with acados. In Proc. Eur. Control Conf. (ECC)

  11. [11]

    (2024 b )

    Frey, J., Nurkanovi\'c, A., and Diehl, M. (2024 b ). Advanced-step real-time iterations with four levels – new error bounds and fast implementation in acados. IEEE Control Syst. Lett

  12. [12]

    Frison, G., Kouzoupis, D., Sartor, T., Zanelli, A., and Diehl, M. (2018). BLASFEO : Basic linear algebra subroutines for embedded optimization. ACM Trans. Math. Softw. (TOMS), 44(4)

  13. [13]

    Garstka, M., Cannon, M., and Goulart, P. (2022). Safeguarded anderson acceleration for parametric nonexpansive operators. In 2022 Eur. Control Conf. (ECC)

  14. [14]

    Gros, S., Zanon, M., Quirynen, R., Bemporad, A., and Diehl, M. (2020). From linear to nonlinear MPC : bridging the gap via the real-time iteration. Int. J. Control, 93(1)

  15. [15]

    Homburger, H., Frey, J., Wirtensohn, S., Diehl, M., and Reuter, J. (2025). Millisecond NMPC for swing-up and stabilization of the F uruta pendulum in real world. IEEE Trans. Control Syst. Technol

  16. [16]

    Kiessling, D. (2025). Algorithmic Advances in Feasible-Iterate and SQP Methods for Direct Optimal Control . Ph.D. thesis, KU Leuven

  17. [17]

    Kiessling, D., Pas, P., Astudillo, A., Patrinos, P., and Swevers, J. (2023). Anderson accelerated feasible sequential linear programming. IFAC-PapersOnLine, 56(2). 22nd IFAC World Congress

  18. [18]

    Kiessling, D., Zanelli, A., Nurkanović, A., Gillis, J., Diehl, M., Zeilinger, M., Pipeleers, G., and Swevers, J. (2022). A feasible sequential linear programming algorithm with application to time-optimal path planning problems. In 2022 IEEE 61st Conf. Decis. Control (CDC)

  19. [19]

    Lahr, A., Zanelli, A., Carron, A., and Zeilinger, M.N. (2023). Zero-order optimization for G aussian process-based model predictive control. Eur. J. Control, 74

  20. [20]

    and Todorov, E

    Li, W. and Todorov, E. (2004). Iterative linear quadratic regulator design for nonlinear biological movement systems. In Proc. 1st Int. Conf. Inform. Control, Automat. Robot

  21. [21]

    Messerer, F., Baumg \"a rtner, K., and Diehl, M. (2021). Survey of sequential convex programming and generalized G auss- N ewton methods. ESAIM: Proc. Surveys, 71

  22. [22]

    and Wright, S.J

    Nocedal, J. and Wright, S.J. (2006). Numerical Optimization. Springer Ser. Operations Res. Financial Eng. Springer, 2 edition

  23. [23]

    Numerow, L., Zanelli, A., Carron, A., and Zeilinger, M.N. (2024). Inherently robust suboptimal MPC for autonomous racing with anytime feasible SQP . IEEE Robot. Automat. Lett., 9(7)

  24. [24]

    Nurkanovi\'c, A., Zanelli, A., Albrecht, S., and Diehl, M. (2019). T he A dvanced S tep R eal T ime I teration for N M P C . In Proc. IEEE Conf. Decis. Control (CDC)

  25. [25]

    Ostrowski, A.M. (1973). Solutions of Equations in Euclidean and Banach Spaces. Academic Press, New York and London

  26. [26]

    Krylov Subspace Acceleration for First-Order Splitting Methods in Convex Quadratic Programming

    Pereira, G.B. and Goulart, P.J. (2025). Krylov subspace acceleration for first-order splitting methods in convex quadratic programming. arXiv preprint arXiv:2511.06323

  27. [27]

    and Rebholz, L.G

    Pollock, S. and Rebholz, L.G. (2021). Anderson acceleration for contractive and noncontractive operators. IMA J. Numer. Anal., 41(4)

  28. [28]

    Tassa, Y., Mansard, N., and Todorov, E. (2014). Control-limited differential dynamic programming. In IEEE Int. Conf. Robot. Automat

  29. [29]

    and Li, W

    Todorov, E. and Li, W. (2005). A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems. In Proc. Amer. Control Conf. (ACC)

  30. [30]

    Verschueren, R., Frison, G., Kouzoupis, D., Frey, J., van Duijkeren, N., Zanelli, A., Novoselnik, B., Albin, T., Quirynen, R., and Diehl, M. (2021). acados -- a modular open-source framework for fast embedded optimal control. Math. Program. Computation

  31. [31]

    Verschueren, R., van Duijkeren, N., Quirynen, R., and Diehl, M. (2016). Exploiting convexity in direct optimal control: a sequential convex quadratic programming method. In Proc. IEEE Conf. Decis. Control (CDC)

  32. [32]

    Verschueren, R., Zanon, M., Quirynen, R., and Diehl, M. (2017). A sparsity preserving convexification procedure for indefinite quadratic programs arising in direct optimal control. SIAM J. Optim., 27(3)

  33. [33]

    and Ni, P

    Walker, H.F. and Ni, P. (2011). Anderson acceleration for fixed-point iterations. SIAM J. Numer. Anal., 49(4)

  34. [34]

    Zanelli, A., Quirynen, R., and Diehl, M. (2016). An efficient inexact NMPC scheme with stability and feasibility guarantees. In Proc. 10th IFAC Symp. Nonlinear Control Syst. Monterey, CA, USA

  35. [35]

    Zanelli, A., Frey, J., Messerer, F., and Diehl, M. (2021). Zero-order robust nonlinear model predictive control with ellipsoidal uncertainty sets. Proc. IFAC Conf. Nonlinear Model Predictive Control (NMPC)

  36. [36]

    Zhang, J., O'Donoghue, B., and Boyd, S. (2020). Globally convergent type- I A nderson acceleration for nonsmooth fixed-point iterations. SIAM J. Optim., 30(4)