arxiv: 2604.05566 · v2 · submitted 2026-04-07 · 🧮 math.OC

Accelerating Full-Scale Nonlinear Model Predictive Control via Surrogate Dynamics Optimization

Pith reviewed 2026-05-10 19:19 UTC · model grok-4.3

classification 🧮 math.OC
keywords nonlinear model predictive control · surrogate model · warm start · optimization · machine learning · pressurized water reactor · load following

The pith

A machine learning surrogate solves a lightweight auxiliary problem to warm-start full-scale nonlinear model predictive control and accelerate its convergence.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Surrogate Dynamics Optimization as a way to reduce the computational burden of nonlinear model predictive control when the embedded process model is expensive to simulate. It trains a surrogate model on limited data to approximate the original optimization and supply initial guesses that let the full-scale solver reach solutions faster. In a 24-hour load-following control task for a pressurized water reactor, the method delivers quicker convergence inside a fixed budget and requires two orders of magnitude less data to train than behavior cloning. A reader would care because this combination of speed and data efficiency could make NMPC viable for more industrial systems that currently cannot afford repeated full-model simulations.

Core claim

Surrogate Dynamics Optimization uses a machine learning surrogate to solve an auxiliary, lower-cost optimization problem whose solution serves as a warm start for the full-scale NMPC solver. When applied to 24-hour optimal load-following control of a pressurized water reactor, this produces consistent gains in convergence speed within a fixed computational budget and cuts the cost of generating training data by two orders of magnitude relative to behavior cloning, all while remaining compatible with existing simulation and optimization tools.
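
The relationship between the two problems can be written out in generic notation. The symbols below are assumptions made for illustration, not the paper's own: $f$ stands for the full process model, $\hat f_\theta$ for the learned surrogate, $\ell$ for a stage cost, and $N$ for the horizon length. Path and input constraints are omitted for brevity.

```latex
\begin{aligned}
\text{(full NMPC)}\quad & u^\star \in \arg\min_{u_0,\dots,u_{N-1}} \sum_{k=0}^{N-1} \ell(x_k, u_k)
  \quad \text{s.t.}\quad x_{k+1} = f(x_k, u_k),\; x_0 \text{ given},\\
\text{(auxiliary)}\quad & \hat u \in \arg\min_{u_0,\dots,u_{N-1}} \sum_{k=0}^{N-1} \ell(\hat x_k, u_k)
  \quad \text{s.t.}\quad \hat x_{k+1} = \hat f_\theta(\hat x_k, u_k),\; \hat x_0 = x_0,\\
\text{(warm start)}\quad & u^{(0)} := \hat u \quad \text{is passed as the initial iterate of the full NMPC solver.}
\end{aligned}
```

Under this reading, the surrogate only changes where the full-scale solver starts; the problem the solver finally converges on is still the one defined by $f$.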

What carries the argument

Surrogate Dynamics Optimization (SDO), the framework that replaces the original dynamics with a learned surrogate to create a lightweight auxiliary problem whose solution initializes the full-scale NMPC solver.
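
The warm-start mechanism can be sketched on a toy problem. Everything below is an assumption for illustration and not the paper's implementation: the scalar dynamics in `full_step` stand in for the expensive process model, `surrogate_step` is a hand-written linear approximation standing in for a trained ML surrogate, and finite-difference gradient descent with backtracking stands in for a real NMPC solver. All names and constants are hypothetical.

```python
"""Toy sketch of surrogate-based warm-starting (illustrative assumptions only)."""

DT, N, X0, REF, RHO = 0.1, 20, 0.0, 1.0, 1e-3  # hypothetical problem constants


def full_step(x, u):
    # Stand-in for the expensive full-scale model (assumed cubic nonlinearity).
    return x + DT * (u - x - 0.3 * x * x * x)


def surrogate_step(x, u):
    # Stand-in for the learned surrogate: same model without the cubic term.
    return x + DT * (u - x)


def cost(u_seq, step):
    # Tracking cost plus a small control penalty, from a rollout of `step`.
    x, total = X0, 0.0
    for u in u_seq:
        x = step(x, u)
        total += (x - REF) ** 2 + RHO * u * u
    return total


def gradient(u_seq, step, eps=1e-6):
    # Forward finite differences: one extra rollout per control variable.
    base = cost(u_seq, step)
    grad = []
    for i in range(len(u_seq)):
        bumped = list(u_seq)
        bumped[i] += eps
        grad.append((cost(bumped, step) - base) / eps)
    return grad


def descend(u_seq, step, iterations):
    # Gradient descent with Armijo backtracking; the cost never increases.
    u_seq = list(u_seq)
    for _ in range(iterations):
        g = gradient(u_seq, step)
        g_sq = sum(gi * gi for gi in g)
        current, t = cost(u_seq, step), 1.0
        while t > 1e-8:
            trial = [ui - t * gi for ui, gi in zip(u_seq, g)]
            if cost(trial, step) <= current - 1e-4 * t * g_sq:
                u_seq = trial
                break
            t *= 0.5
    return u_seq


BUDGET = 5  # fixed number of full-model solver iterations
cold_guess = [0.0] * N
# Auxiliary problem: optimize against the cheap surrogate only.
warm_guess = descend(cold_guess, surrogate_step, iterations=100)

cold_initial = cost(cold_guess, full_step)
warm_initial = cost(warm_guess, full_step)
cold_final = cost(descend(cold_guess, full_step, BUDGET), full_step)
warm_final = cost(descend(warm_guess, full_step, BUDGET), full_step)

print(f"cold start: {cold_initial:.4f} -> {cold_final:.4f}")
print(f"warm start: {warm_initial:.4f} -> {warm_final:.4f}")
```

In this sketch the surrogate solve never calls `full_step`, so its cost is independent of how expensive the full model is. Whether the warm start still leads after the fixed budget depends on how far the surrogate optimum sits from the full-model optimum, which is the premise examined below.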

Load-bearing premise

The surrogate must generate initial guesses close enough to the true optimum that the full-scale solver converges more quickly without losing closed-loop performance or safety.

What would settle it

If running SDO in the pressurized water reactor simulation produced no reduction in the number of iterations required for NMPC convergence, or increased the tracking error, the warm-start benefit would not hold.
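
One way to make this check concrete is a paired comparison of cold-start and warm-start runs over the same scenarios. The sketch below is a hypothetical helper, not taken from the paper: the dictionary keys, the definition of `relative_gain`, and the tolerance parameter are all assumptions, and the numbers in the usage lines are placeholders rather than reported results.

```python
from statistics import mean


def relative_gain(cold, warm):
    # Fractional reduction of a "lower is better" quantity (assumed definition).
    return (cold - warm) / cold


def warm_start_holds(runs, max_error_increase=0.0):
    # `runs` is a list of dicts with hypothetical keys:
    #   cold_iters, warm_iters : solver iterations to convergence
    #   cold_error, warm_error : closed-loop tracking error
    iter_gain = mean(relative_gain(r["cold_iters"], r["warm_iters"]) for r in runs)
    error_increase = mean(r["warm_error"] - r["cold_error"] for r in runs)
    holds = iter_gain > 0.0 and error_increase <= max_error_increase
    return holds, iter_gain, error_increase


# Placeholder values for illustration only; not data from the paper.
example_runs = [
    {"cold_iters": 40, "warm_iters": 30, "cold_error": 0.10, "warm_error": 0.10},
    {"cold_iters": 50, "warm_iters": 45, "cold_error": 0.20, "warm_error": 0.20},
]
print(warm_start_holds(example_runs))
```

A result of `False` from such a check, on real runs, would correspond to the failure case described above.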

Figures

Figures reproduced from arXiv: 2604.05566 by Alain Grossetête, Guillaume Dupré, Nicolas Vayatis (CB), Perceval Beja-Battais (CB).

Figure 1: Comparison of the objective cost achieved after a
Figure 2: Training process using reference sequences
Figure 3: Comparison between surrogate accuracy (MSE on validation set) and the relative gain
Figure 4: Comparison of relevant state and algebraic variables during a reference transient using the control input

Original abstract

Driven by advances in hardware and software technologies, nonlinear model predictive control (NMPC) has gained increasing adoption in both industry and academia over the past decades. However, its practical deployment is often limited by the computational cost of simulating the embedded process model, especially for high-dimensional, multi-time-scale, or nonlinear systems commonly found in real-world applications. Thus, this paper introduces Surrogate Dynamics Optimization (SDO), a warm-start framework for full-scale NMPC to address the limitation of standard initialization strategies. The approach relies on a machine learning surrogate model to solve a lightweight auxiliary problem that approximates the original one. The methodology is reproducible and compatible with in-house simulation and optimization tools, a key consideration in industrial contexts. Data efficiency of SDO, as well as the impact of surrogate design on the overall performance, are evaluated through a non-trivial simulation case study: 24-hour optimal load-following control of a pressurized water reactor. The results show consistent improvements in NMPC convergence speed within a fixed computational budget, while reducing training data generation costs by two orders of magnitude compared to behavior cloning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes Surrogate Dynamics Optimization (SDO), a warm-start framework for full-scale nonlinear model predictive control (NMPC). It employs a machine learning surrogate to solve a lightweight auxiliary optimization problem that approximates the original NMPC, generating initial guesses to accelerate the full-scale solver. The method is evaluated on a 24-hour optimal load-following control case study for a pressurized water reactor, claiming consistent convergence speed improvements within a fixed computational budget and a two-order-of-magnitude reduction in training data generation costs relative to behavior cloning. The approach is presented as reproducible and compatible with in-house industrial simulation and optimization tools.

Significance. If the surrogate-generated initial guesses remain sufficiently close to the true optima, SDO could meaningfully lower the computational barrier to deploying NMPC on high-dimensional nonlinear systems while preserving closed-loop performance and safety. The explicit evaluation of surrogate design choices, the reported data-efficiency gain over behavior cloning, and the emphasis on reproducibility with in-house tools are concrete strengths that would support practical adoption if the central approximation-quality assumption is verified.

major comments (2)
  1. Abstract: the claim of 'consistent improvements in NMPC convergence speed within a fixed computational budget' is stated without quantitative metrics (e.g., iteration counts, wall-clock times, or closed-loop performance indicators with error bars), ablation results on surrogate accuracy, or explicit comparisons of safety metrics, leaving the magnitude and reliability of the reported gains unassessable from the provided description.
  2. Abstract: no explicit error bounds on the surrogate dynamics approximation, no worst-case analysis of error propagation to full-scale solver convergence, and no demonstration that closed-loop trajectories remain unchanged when the surrogate is degraded are supplied. This leaves the load-bearing assumption—that surrogate initial guesses are close enough to guarantee faster convergence without degrading performance or safety—unverified for the PWR case study.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. We address each major comment point by point below, indicating where revisions will be made to improve clarity and assessability while remaining faithful to the empirical scope of the work.

Point-by-point responses
  1. Referee: Abstract: the claim of 'consistent improvements in NMPC convergence speed within a fixed computational budget' is stated without quantitative metrics (e.g., iteration counts, wall-clock times, or closed-loop performance indicators with error bars), ablation results on surrogate accuracy, or explicit comparisons of safety metrics, leaving the magnitude and reliability of the reported gains unassessable from the provided description.

    Authors: We agree that the abstract would benefit from greater specificity to allow readers to assess the gains directly. The full manuscript already contains these quantitative details, ablation studies on surrogate design choices, and safety metric comparisons in the 24-hour PWR case study (Sections 4 and 5). In the revised manuscript we will update the abstract to include explicit references to key metrics such as the two-order-of-magnitude reduction in training data generation costs and the observed improvements in solver iterations and wall-clock time within the fixed budget, while noting the ablation and safety results. revision: yes

  2. Referee: Abstract: no explicit error bounds on the surrogate dynamics approximation, no worst-case analysis of error propagation to full-scale solver convergence, and no demonstration that closed-loop trajectories remain unchanged when the surrogate is degraded are supplied. This leaves the load-bearing assumption—that surrogate initial guesses are close enough to guarantee faster convergence without degrading performance or safety—unverified for the PWR case study.

    Authors: The manuscript empirically supports the assumption via the PWR load-following experiments, where surrogate warm starts produce faster convergence while preserving closed-loop trajectories, constraint satisfaction, and safety metrics comparable to the full-scale baseline. We acknowledge that the work does not supply theoretical error bounds, worst-case propagation analysis, or explicit tests with deliberately degraded surrogates, as the contribution centers on practical data-efficient warm-starting rather than formal guarantees. We will revise the abstract to more clearly reference the empirical verification in the case study and add a short limitations paragraph in the conclusions noting the absence of theoretical bounds as an avenue for future work. revision: partial

Circularity Check

0 steps flagged

No significant circularity detected in SDO framework or empirical claims

Full rationale

The paper proposes Surrogate Dynamics Optimization (SDO) as a warm-start technique that trains an ML surrogate to solve a lightweight auxiliary approximation of the full NMPC problem, then uses its solution to initialize the full-scale solver. All reported gains in convergence speed within a fixed budget and two-order-of-magnitude reduction in training-data cost versus behavior cloning are obtained from direct simulation on the 24-hour PWR load-following case study. No equations, uniqueness theorems, or fitted parameters are presented that would make the performance metrics equivalent to the inputs by construction; the surrogate is trained separately on generated data, the auxiliary problem is distinct from the original NMPC objective, and the evaluation metrics are externally measured quantities rather than self-referential definitions or renamings of known results.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the existence of a sufficiently accurate yet cheap surrogate model and on the assumption that warm-start quality directly translates into faster convergence of the full-scale solver; no explicit free parameters, axioms, or invented entities are named in the abstract.
