Anderson Acceleration for Linearly Converging SQP-Type Methods
Pith reviewed 2026-05-10 11:19 UTC · model grok-4.3
The pith
Anderson acceleration improves local convergence rates for a family of inexact SQP methods.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that the local convergence behavior of a general family of (inexact) SQP-type methods can benefit from Anderson acceleration, with a simple heuristic introduced to alleviate slower convergence farther from the solution.
What carries the argument
Anderson acceleration applied to the sequence of iterates generated by SQP steps, treated as a fixed-point iteration.
If this is right
- Local convergence rates of linearly converging SQP methods improve when Anderson acceleration is applied to their iterates.
- The same acceleration applies to inexact SQP variants without altering the underlying quadratic subproblem structure.
- A simple heuristic stabilizes performance when starting points lie far from the solution while preserving the accelerated local behavior.
- Implementation in existing SQP solvers such as those in acados yields measurable reductions in iteration count on optimal control tasks.
Where Pith is reading between the lines
- The same acceleration pattern may transfer to other optimization algorithms that admit a fixed-point interpretation, such as certain interior-point or augmented Lagrangian methods.
- In real-time model predictive control, the reduction in outer iterations could lower the overall latency enough to enable higher sampling rates.
- Global convergence guarantees would require additional safeguards on the mixing parameters used by Anderson acceleration.
Load-bearing premise
That the SQP iteration map behaves sufficiently like a fixed-point iteration for Anderson acceleration to accelerate it without introducing instability or divergence.
What would settle it
Numerical tests on standard optimal control problems in which adding Anderson acceleration either fails to reduce iteration counts near the solution or causes divergence in at least one SQP variant.
Figures
read the original abstract
Although Anderson acceleration (AA) is known to speed up fixed-point iterations, it is rarely applied in constrained optimization, in particular sequential quadratic programming (SQP). We show that the local convergence behavior of a general family of (inexact) SQP-type methods can benefit from AA and introduce a simple heuristic to alleviate slower convergence farther from the solution. The method is implemented in the software framework acados. Numerical examples from optimal control illustrate consistent improvements in convergence of different SQP-type methods.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that Anderson acceleration (AA) can improve the local convergence behavior of a general family of inexact SQP-type methods for constrained optimization problems. It represents SQP iterations as fixed-point maps to which AA applies, introduces a simple heuristic to mitigate slower convergence away from the solution, and demonstrates consistent practical improvements via numerical examples from optimal control implemented in the acados framework.
Significance. If the local convergence analysis and heuristic are rigorously justified, the work would provide a lightweight, structure-preserving acceleration technique for SQP solvers that could be valuable in real-time optimal control and other constrained optimization settings. The open implementation in acados and the numerical illustrations on multiple SQP variants constitute practical strengths.
major comments (3)
- [§3.2] §3.2, the fixed-point representation of inexact SQP: the central claim requires that each SQP step defines a map x^{k+1}=G(x^k) to which AA applies while preserving the linear contraction rate. However, because the QP subproblem data depend nonlinearly on x^k and the solution must satisfy linearized constraints, it is unclear whether the convex combination produced by AA yields a valid subsequent QP or remains in the region where the contraction factor is guaranteed. No explicit bound is given showing that the accelerated map remains contractive near the solution.
- [§4.3] §4.3, Theorem 2 on local convergence: the proof sketch assumes the heuristic (which switches or modifies the AA mixing far from the solution) leaves the asymptotic rate unchanged, yet no perturbation analysis or neighborhood argument is supplied demonstrating that the heuristic does not degrade the linear rate or introduce instability once iterates enter the local basin.
- [§5] §5, numerical examples: while the reported iteration counts and CPU times show improvement, the tables do not include the contraction factors or residual norms that would allow direct verification of the claimed local linear rate improvement; without these, it is difficult to confirm that the observed speedup is due to the theoretical mechanism rather than the heuristic alone.
minor comments (2)
- [§2] Notation for the Anderson mixing coefficients and the memory length m is introduced without a clear reference to the standard AA formulation; a brief reminder of the update formula would improve readability.
- [§5] The abstract states 'consistent improvements' but the text does not quantify how often the heuristic is activated versus pure AA; a short table or sentence on activation frequency would clarify the practical role of the heuristic.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and constructive report. The comments identify important points where the analysis and presentation can be clarified and strengthened. We address each major comment below and will incorporate the suggested revisions into the next version of the manuscript.
read point-by-point responses
-
Referee: [§3.2] §3.2, the fixed-point representation of inexact SQP: the central claim requires that each SQP step defines a map x^{k+1}=G(x^k) to which AA applies while preserving the linear contraction rate. However, because the QP subproblem data depend nonlinearly on x^k and the solution must satisfy linearized constraints, it is unclear whether the convex combination produced by AA yields a valid subsequent QP or remains in the region where the contraction factor is guaranteed. No explicit bound is given showing that the accelerated map remains contractive near the solution.
Authors: We agree that an explicit neighborhood argument would make the application of Anderson acceleration to the SQP fixed-point map more rigorous. In the revised manuscript we will add a short lemma establishing that, for iterates sufficiently close to the solution, the convex combination produced by AA remains inside the open set on which the underlying SQP map G is contractive. The argument relies on the local Lipschitz continuity of the QP data and the fact that the feasible set of the linearized constraints is preserved under small perturbations of the right-hand side. revision: yes
-
Referee: [§4.3] §4.3, Theorem 2 on local convergence: the proof sketch assumes the heuristic (which switches or modifies the AA mixing far from the solution) leaves the asymptotic rate unchanged, yet no perturbation analysis or neighborhood argument is supplied demonstrating that the heuristic does not degrade the linear rate or introduce instability once iterates enter the local basin.
Authors: The referee correctly notes that the interaction between the heuristic and the local linear rate requires a more precise statement. We will revise the proof of Theorem 2 to include a perturbation argument: once all previous iterates lie inside a sufficiently small ball around the solution, the heuristic is automatically deactivated (or its modification is bounded by an arbitrarily small term), so the asymptotic contraction factor of the accelerated iteration coincides with that of the unaccelerated map. revision: yes
-
Referee: [§5] §5, numerical examples: while the reported iteration counts and CPU times show improvement, the tables do not include the contraction factors or residual norms that would allow direct verification of the claimed local linear rate improvement; without these, it is difficult to confirm that the observed speedup is due to the theoretical mechanism rather than the heuristic alone.
Authors: We will augment Section 5 with additional columns (or a supplementary table) that report the observed contraction factors ||x^{k+1}−x^*||/||x^k−x^*|| and the KKT residual norms for both the plain and accelerated variants on each example. This will allow direct numerical verification of the local linear-rate improvement once the iterates enter the basin identified in the analysis. revision: yes
Circularity Check
No significant circularity; derivation applies known AA to SQP without self-referential reduction
full rationale
The paper represents SQP-type iterations as fixed-point maps x^{k+1}=G(x^k) to which Anderson acceleration is applied, then analyzes local linear convergence rates under standard assumptions on the underlying SQP contraction. No equation or theorem reduces by construction to a fitted parameter or self-citation; the heuristic for far-from-solution iterates is presented as an empirical safeguard without claiming it leaves the local contraction factor invariant by definition. The central result rests on independent fixed-point theory and numerical examples rather than tautological renaming or imported uniqueness theorems from the same authors.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
, " * write output.state after.block = add.period write newline
ENTRY address author booktitle chapter doi edition editor eid howpublished institution journal key month note number organization pages publisher school series title type url volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION init.state.consts #0 'before.all := #1 'mid.sent...
-
[2]
" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...
-
[3]
Anderson, D.G. (1965). Iterative procedures for nonlinear integral equations. J. ACM (JACM), 12(4)
work page 1965
-
[4]
Baumg \"a rtner, K., Zanelli, A., and Diehl, M. (2019). Zero-order moving horizon estimation. In IEEE Conf. Decis. Control
work page 2019
-
[5]
Baum \- g \"a rtner, K. and Diehl, M. (2022). The extended G auss- N ewton method for nonconvex loss functions and its application to time-optimal model predictive control. In Proc. Amer. Control Conf. (ACC)
work page 2022
-
[6]
Baumgärtner, K., Frey, J., Hashemi, R., and Diehl, M. (2021). Zero-order moving horizon estimation for large-scale nonlinear processes. Comput. & Chem. Eng., 154
work page 2021
-
[7]
Bock, H.G., Diehl, M., Kostina, E.A., and Schl\"oder, J.P. (2007). Constrained optimal feedback control of systems governed by large differential algebraic equations. In Real-Time Online PDE-Constrained Optim. SIAM
work page 2007
-
[8]
Delaiss \'e , N., Demeester, T., Haelterman, R., and Degroote, J. (2023). Quasi-newton methods for partitioned simulation of fluid--structure interaction reviewed in the generalized B royden framework. Arch. Comput. Methods Eng., 30(5)
work page 2023
-
[9]
Evans, C., Pollock, S., Rebholz, L.G., and Xiao, M. (2020). A proof that anderson acceleration improves the convergence rate in linearly converging fixed-point methods (but not in those converging quadratically). SIAM J. Numer. Anal., 58(1)
work page 2020
- [10]
- [11]
-
[12]
Frison, G., Kouzoupis, D., Sartor, T., Zanelli, A., and Diehl, M. (2018). BLASFEO : Basic linear algebra subroutines for embedded optimization. ACM Trans. Math. Softw. (TOMS), 44(4)
work page 2018
-
[13]
Garstka, M., Cannon, M., and Goulart, P. (2022). Safeguarded anderson acceleration for parametric nonexpansive operators. In 2022 Eur. Control Conf. (ECC)
work page 2022
-
[14]
Gros, S., Zanon, M., Quirynen, R., Bemporad, A., and Diehl, M. (2020). From linear to nonlinear MPC : bridging the gap via the real-time iteration. Int. J. Control, 93(1)
work page 2020
-
[15]
Homburger, H., Frey, J., Wirtensohn, S., Diehl, M., and Reuter, J. (2025). Millisecond NMPC for swing-up and stabilization of the F uruta pendulum in real world. IEEE Trans. Control Syst. Technol
work page 2025
-
[16]
Kiessling, D. (2025). Algorithmic Advances in Feasible-Iterate and SQP Methods for Direct Optimal Control . Ph.D. thesis, KU Leuven
work page 2025
-
[17]
Kiessling, D., Pas, P., Astudillo, A., Patrinos, P., and Swevers, J. (2023). Anderson accelerated feasible sequential linear programming. IFAC-PapersOnLine, 56(2). 22nd IFAC World Congress
work page 2023
-
[18]
Kiessling, D., Zanelli, A., Nurkanović, A., Gillis, J., Diehl, M., Zeilinger, M., Pipeleers, G., and Swevers, J. (2022). A feasible sequential linear programming algorithm with application to time-optimal path planning problems. In 2022 IEEE 61st Conf. Decis. Control (CDC)
work page 2022
-
[19]
Lahr, A., Zanelli, A., Carron, A., and Zeilinger, M.N. (2023). Zero-order optimization for G aussian process-based model predictive control. Eur. J. Control, 74
work page 2023
-
[20]
Li, W. and Todorov, E. (2004). Iterative linear quadratic regulator design for nonlinear biological movement systems. In Proc. 1st Int. Conf. Inform. Control, Automat. Robot
work page 2004
-
[21]
Messerer, F., Baumg \"a rtner, K., and Diehl, M. (2021). Survey of sequential convex programming and generalized G auss- N ewton methods. ESAIM: Proc. Surveys, 71
work page 2021
-
[22]
Nocedal, J. and Wright, S.J. (2006). Numerical Optimization. Springer Ser. Operations Res. Financial Eng. Springer, 2 edition
work page 2006
-
[23]
Numerow, L., Zanelli, A., Carron, A., and Zeilinger, M.N. (2024). Inherently robust suboptimal MPC for autonomous racing with anytime feasible SQP . IEEE Robot. Automat. Lett., 9(7)
work page 2024
-
[24]
Nurkanovi\'c, A., Zanelli, A., Albrecht, S., and Diehl, M. (2019). T he A dvanced S tep R eal T ime I teration for N M P C . In Proc. IEEE Conf. Decis. Control (CDC)
work page 2019
-
[25]
Ostrowski, A.M. (1973). Solutions of Equations in Euclidean and Banach Spaces. Academic Press, New York and London
work page 1973
-
[26]
Krylov Subspace Acceleration for First-Order Splitting Methods in Convex Quadratic Programming
Pereira, G.B. and Goulart, P.J. (2025). Krylov subspace acceleration for first-order splitting methods in convex quadratic programming. arXiv preprint arXiv:2511.06323
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[27]
Pollock, S. and Rebholz, L.G. (2021). Anderson acceleration for contractive and noncontractive operators. IMA J. Numer. Anal., 41(4)
work page 2021
-
[28]
Tassa, Y., Mansard, N., and Todorov, E. (2014). Control-limited differential dynamic programming. In IEEE Int. Conf. Robot. Automat
work page 2014
- [29]
-
[30]
Verschueren, R., Frison, G., Kouzoupis, D., Frey, J., van Duijkeren, N., Zanelli, A., Novoselnik, B., Albin, T., Quirynen, R., and Diehl, M. (2021). acados -- a modular open-source framework for fast embedded optimal control. Math. Program. Computation
work page 2021
-
[31]
Verschueren, R., van Duijkeren, N., Quirynen, R., and Diehl, M. (2016). Exploiting convexity in direct optimal control: a sequential convex quadratic programming method. In Proc. IEEE Conf. Decis. Control (CDC)
work page 2016
-
[32]
Verschueren, R., Zanon, M., Quirynen, R., and Diehl, M. (2017). A sparsity preserving convexification procedure for indefinite quadratic programs arising in direct optimal control. SIAM J. Optim., 27(3)
work page 2017
- [33]
-
[34]
Zanelli, A., Quirynen, R., and Diehl, M. (2016). An efficient inexact NMPC scheme with stability and feasibility guarantees. In Proc. 10th IFAC Symp. Nonlinear Control Syst. Monterey, CA, USA
work page 2016
-
[35]
Zanelli, A., Frey, J., Messerer, F., and Diehl, M. (2021). Zero-order robust nonlinear model predictive control with ellipsoidal uncertainty sets. Proc. IFAC Conf. Nonlinear Model Predictive Control (NMPC)
work page 2021
-
[36]
Zhang, J., O'Donoghue, B., and Boyd, S. (2020). Globally convergent type- I A nderson acceleration for nonsmooth fixed-point iterations. SIAM J. Optim., 30(4)
work page 2020
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.