pith. machine review for the scientific record.

arXiv: 2604.21030 · v1 · submitted 2026-04-22 · 📡 eess.SY · cs.AI · cs.RO · cs.SY · math.OC

Recognition: unknown

A Systematic Review and Taxonomy of Reinforcement Learning-Model Predictive Control Integration for Linear Systems

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 23:19 UTC · model grok-4.3

classification 📡 eess.SY · cs.AI · cs.RO · cs.SY · math.OC
keywords reinforcement learning · model predictive control · linear systems · systematic literature review · taxonomy · integration strategies · hybrid control · adaptive control

The pith

This review organizes RL-MPC integrations for linear systems into a five-dimensional taxonomy and synthesizes recurring design patterns from the literature.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper conducts a systematic literature review of how reinforcement learning is combined with model predictive control in linear or linearized systems. It classifies published studies across RL functional roles, algorithm classes, MPC formulations, cost-function structures, and application domains. A cross-dimensional synthesis identifies common pairings and trends while noting persistent issues such as computational cost, sample efficiency, robustness, and stability guarantees. The resulting structure supplies a reference that researchers can use to position new work or select components for hybrid controllers. By mapping the fragmented literature, the review aims to support more systematic development of adaptive constrained control.
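One recurring pattern such reviews catalogue, RL used to tune the design parameters of a linear MPC loop, can be sketched in a few lines. Everything below is illustrative, not taken from the paper: a double-integrator plant, a finite-horizon quadratic controller, and a derivative-free sweep standing in for the RL component.

```python
import numpy as np

# Illustrative double-integrator plant (assumed example, not from the paper).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q = np.eye(2)     # state weight, used for both design and evaluation
R_TRUE = 0.1      # "true" input cost used to score closed-loop runs

def mpc_gain(r_design, N=10):
    """First-step feedback gain from a finite-horizon Riccati recursion."""
    P = Q.copy()
    K = np.zeros((1, 2))
    for _ in range(N):
        s = r_design + (B.T @ P @ B).item()
        K = (B.T @ P @ A) / s
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
    return K

def closed_loop_cost(r_design, x0, steps=30, u_max=1.0):
    """Simulate the clipped (crudely input-constrained) loop, scored with R_TRUE."""
    K, x, cost = mpc_gain(r_design), x0.astype(float).copy(), 0.0
    for _ in range(steps):
        u = float(np.clip((-K @ x)[0], -u_max, u_max))
        cost += x @ Q @ x + R_TRUE * u * u
        x = A @ x + B[:, 0] * u
    return cost + x @ Q @ x

# A bandit-style sweep stands in for the RL tuner: pick the design weight
# whose closed-loop episode scores best under the fixed true cost.
candidates = [0.01, 0.1, 1.0, 10.0]
x0 = np.array([5.0, 0.0])
best_r = min(candidates, key=lambda r: closed_loop_cost(r, x0))
```

In the review's vocabulary this is RL in a "parameter tuning" functional role over a linear MPC formulation with a quadratic cost; published variants replace the sweep with an actual RL algorithm (Q-learning, DDPG, TD3, and so on).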

Core claim

The literature on RL-MPC integrations for linear systems can be organized through a taxonomy based on RL functional roles, RL algorithm classes, MPC formulations, cost-function structures, and application domains, with cross-synthesis revealing recurring design patterns, methodological trends, and challenges including computational burden, sample efficiency, robustness, and closed-loop guarantees.

What carries the argument

A multi-dimensional taxonomy covering RL functional roles, RL algorithm classes, MPC formulations, cost-function structures, and application domains that structures the reviewed studies and enables identification of design patterns through cross-dimensional synthesis.
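Mechanically, the cross-dimensional synthesis amounts to co-occurrence counting over study records labelled along the taxonomy's dimensions. A minimal sketch, with hypothetical records whose labels are invented for illustration rather than drawn from the paper's corpus:

```python
from collections import Counter

# Hypothetical study records labelled along the review's five dimensions;
# every value here is invented for illustration.
studies = [
    {"rl_role": "parameter tuning", "rl_algo": "DDPG", "mpc_form": "linear MPC",
     "cost": "quadratic", "domain": "vehicles"},
    {"rl_role": "cost shaping", "rl_algo": "Q-learning", "mpc_form": "robust MPC",
     "cost": "quadratic", "domain": "energy"},
    {"rl_role": "parameter tuning", "rl_algo": "TD3", "mpc_form": "linear MPC",
     "cost": "economic", "domain": "energy"},
]

def cross_tab(records, dim_a, dim_b):
    """Co-occurrence counts between two taxonomy dimensions."""
    return Counter((r[dim_a], r[dim_b]) for r in records)

role_vs_form = cross_tab(studies, "rl_role", "mpc_form")
```

A matrix built from counts like `role_vs_form` is exactly the kind of pairing the review's cross-dimensional figures visualize.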

If this is right

  • New RL-MPC proposals can be located in the taxonomy to show their relation to existing combinations.
  • Observed associations between RL algorithm classes and MPC formulations can guide component selection for specific domains.
  • Documented challenges such as computational burden direct attention to efficiency improvements in future designs.
  • The synthesis supports evaluation of closed-loop stability and robustness when RL augments linear MPC.
  • Patterns in cost-function structures and application domains can inform tailored controller development.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same taxonomy approach could later be applied to nonlinear systems to compare how integration strategies differ.
  • Controlled experiments could test whether the most frequent pattern combinations actually deliver the reported performance benefits.
  • The reference structure may help avoid redundant exploration of already-covered RL-MPC pairings.
  • The organization highlights opportunities to incorporate additional guarantees from safe RL or distributed control into the hybrid setting.

Load-bearing premise

The five chosen taxonomy dimensions are sufficient to capture all major integration strategies without omissions, and the database search through 2025 captured a representative set of relevant peer-reviewed studies.

What would settle it

Discovery of multiple significant peer-reviewed papers on RL-MPC for linear systems that cannot be placed along any of the five taxonomy dimensions, or that were missed by the systematic search, would show the review is incomplete.

Figures

Figures reproduced from arXiv: 2604.21030 by Davoud Nikkhouy, Mahshad Rastegarmoghaddam, Malihe Abdolbaghi, Mohsen Jalaeian Farimani, Roya Khalili Amirabadi, Shima Samadzadeh.

Figure 1. The systematic review protocol employed in this study.
Figure 2. Taxonomy of Reinforcement Learning roles in MPC–RL integrations.
Figure 3. The co-occurrence between categories based on RL roles in MPC–RL integrations.
Figure 4. Temporal distribution of RL roles applied in MPC–RL integrations.
Figure 5. Taxonomy of Reinforcement Learning algorithms implemented in MPC–RL frameworks.
Figure 6. Temporal evolution of RL algorithm types adopted in MPC–RL literature.
Figure 7. Taxonomy of cost functions in MPC–RL linear control.
Figure 8. Taxonomy of MPC formulations.
Figure 9. Taxonomy of MPC–RL application fields in linear control.
Figure 10. Application fields of MPC–RL integration over the years.
Figure 11. MPC type vs. RL algorithms in the integration of MPC–RL for linear control.
Figure 12. MPC type vs. RL roles in the integration of MPC–RL for linear control.
Figure 13. RL algorithms vs. RL roles in the integration of MPC–RL for linear control.
Figure 14. Application fields vs. RL roles in the integration of MPC–RL for linear control.
Figure 15. MPC type vs. application fields in the integration of MPC–RL for linear control.
Figure 16. MPC cost functions vs. application fields in the integration of MPC–RL for linear control.
Figure 17. MPC cost functions vs. MPC type in the integration of MPC–RL for linear control.
Figure 18. MPC cost functions vs. RL algorithms in the integration of MPC–RL for linear control.
Figure 19. MPC cost functions vs. RL roles in the integration of MPC–RL for linear control.
Figure 20. RL algorithms vs. application fields in the integration of MPC–RL for linear control.
original abstract

The integration of Model Predictive Control (MPC) and Reinforcement Learning (RL) has emerged as a promising paradigm for constrained decision-making and adaptive control. MPC offers structured optimization, explicit constraint handling, and established stability tools, whereas RL provides data-driven adaptation and performance improvement in the presence of uncertainty and model mismatch. Despite the rapid growth of research on RL–MPC integration, the literature remains fragmented, particularly for control architectures built on linear or linearized predictive models. This paper presents a comprehensive Systematic Literature Review (SLR) of RL–MPC integrations for linear and linearized systems, covering peer-reviewed and formally indexed studies published until 2025. The reviewed studies are organized through a multi-dimensional taxonomy covering RL functional roles, RL algorithm classes, MPC formulations, cost-function structures, and application domains. In addition, a cross-dimensional synthesis is conducted to identify recurring design patterns and reported associations among these dimensions within the reviewed corpus. The review highlights methodological trends, commonly adopted integration strategies, and recurring practical challenges, including computational burden, sample efficiency, robustness, and closed-loop guarantees. The resulting synthesis provides a structured reference for researchers and practitioners seeking to design or analyze RL–MPC architectures based on linear or linearized predictive control formulations.
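For orientation, the linear MPC problem the abstract has in mind is typically the finite-horizon quadratic program below. This is the standard textbook form, not reproduced from the paper; the reviewed works vary the ingredients:

```latex
\begin{aligned}
\min_{u_0,\dots,u_{N-1}}\quad
  & \sum_{k=0}^{N-1}\bigl(x_k^\top Q x_k + u_k^\top R u_k\bigr) + x_N^\top P x_N \\
\text{s.t.}\quad
  & x_{k+1} = A x_k + B u_k, \qquad x_0 = x(t), \\
  & x_k \in \mathcal{X}, \quad u_k \in \mathcal{U}.
\end{aligned}
```

In the taxonomy's terms, RL can enter through the weights $(Q, R, P)$, the model $(A, B)$, the constraint sets $\mathcal{X}, \mathcal{U}$, or by wrapping the whole optimizer as a differentiable policy.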

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper presents a comprehensive systematic literature review (SLR) of RL-MPC integrations for linear and linearized systems. It organizes the reviewed studies via a multi-dimensional taxonomy covering RL functional roles, RL algorithm classes, MPC formulations, cost-function structures, and application domains; conducts a cross-dimensional synthesis to identify recurring design patterns and associations; and highlights methodological trends, integration strategies, and practical challenges including computational burden, sample efficiency, robustness, and closed-loop guarantees. The result is positioned as a structured reference for researchers and practitioners.

Significance. If the underlying SLR methodology proves complete and the taxonomy exhaustive, the work would provide a useful organizing framework for the fragmented literature on RL-MPC hybrids in linear systems, potentially guiding design choices and identifying open challenges. The cross-dimensional synthesis is a constructive element that could help the community move beyond ad-hoc integrations.

major comments (1)
  1. [Abstract / Methods] The manuscript asserts a 'comprehensive' SLR covering peer-reviewed studies until 2025 but supplies no concrete details on search strings, databases queried, inclusion/exclusion criteria, number of papers screened or included, quality assessment, or PRISMA flow. Without these elements, the representativeness of the corpus and the validity of the taxonomy and cross-synthesis cannot be verified, directly undercutting the central claim of comprehensiveness.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive and detailed review. We agree that the systematic literature review methodology must be documented with greater transparency to support the claim of comprehensiveness. We address the single major comment below and will revise the manuscript accordingly.

point-by-point responses
  1. Referee: [Abstract / Methods] The manuscript asserts a 'comprehensive' SLR covering peer-reviewed studies until 2025 but supplies no concrete details on search strings, databases queried, inclusion/exclusion criteria, number of papers screened or included, quality assessment, or PRISMA flow. Without these elements, the representativeness of the corpus and the validity of the taxonomy and cross-synthesis cannot be verified, directly undercutting the central claim of comprehensiveness.

    Authors: We acknowledge the validity of this observation. The submitted manuscript does not contain a dedicated Methods section or PRISMA flow diagram describing the SLR protocol. In the revised version we will insert a new Methods section that explicitly reports: the databases searched (IEEE Xplore, Scopus, Web of Science, ScienceDirect, and SpringerLink), the precise search strings and Boolean combinations employed, the inclusion and exclusion criteria (peer-reviewed English-language studies on RL-MPC for linear or linearized systems published through 2025, with clear focus on integration architectures), the screening process, the number of records identified, screened, and finally included, any quality-assessment steps applied, and a PRISMA flow diagram. These additions will enable independent verification of corpus representativeness and will not alter the taxonomy or cross-dimensional synthesis already presented.

    revision: yes
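The screening protocol the rebuttal promises is mechanical enough to sketch. Both the Boolean query and the record fields below are illustrative stand-ins, not the authors' actual search strings or criteria:

```python
import re

# Illustrative Boolean query of the kind the rebuttal says will be reported.
QUERY = ('("model predictive control" OR MPC) AND '
         '("reinforcement learning" OR RL) AND (linear OR linearized)')

def include(record):
    """Inclusion screen mirroring the rebuttal's stated criteria (hypothetical)."""
    return (record["peer_reviewed"]
            and record["language"] == "en"
            and record["year"] <= 2025
            and re.search(r"\b(linear|linearized)\b", record["abstract"], re.I) is not None
            and re.search(r"model predictive|\bMPC\b", record["abstract"], re.I) is not None)

# Two invented records: one in scope, one excluded ("Nonlinear" does not
# match the word-bounded "linear" pattern).
records = [
    {"peer_reviewed": True, "language": "en", "year": 2024,
     "abstract": "An RL-tuned linear MPC scheme for microgrid frequency control."},
    {"peer_reviewed": True, "language": "en", "year": 2023,
     "abstract": "Nonlinear MPC with learned neural dynamics."},
]
kept = [r for r in records if include(r)]  # only the first record survives
```

Reporting the counts at each such filtering stage is exactly what a PRISMA flow diagram formalizes.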

Circularity Check

0 steps flagged

No circularity in literature review synthesis

full rationale

This paper is a systematic literature review synthesizing external peer-reviewed studies on RL-MPC integrations. It contains no original derivations, equations, predictions, fitted parameters, or self-referential definitions that could reduce to inputs by construction. The taxonomy dimensions and cross-dimensional synthesis draw directly from the reviewed corpus without any load-bearing self-citation chains or ansatz smuggling. The central claims rest on the external literature rather than internal tautologies, making the work self-contained against the circularity criteria.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the assumption that the literature corpus is comprehensively and representatively captured by standard academic indexing up to 2025 and that the selected taxonomy dimensions adequately partition the design space.

axioms (1)
  • domain assumption The body of peer-reviewed and formally indexed studies on RL-MPC for linear systems published until 2025 can be exhaustively identified through standard academic databases and search procedures.
    The review's claim of comprehensive coverage depends on this premise.

pith-pipeline@v0.9.0 · 5565 in / 1423 out tokens · 55615 ms · 2026-05-09T23:19:06.068217+00:00 · methodology

