pith. machine review for the scientific record.

arXiv: 2604.21030 · v1 · submitted 2026-04-22 · 📡 eess.SY · cs.AI · cs.RO · cs.SY · math.OC

Recognition: unknown

A Systematic Review and Taxonomy of Reinforcement Learning-Model Predictive Control Integration for Linear Systems

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 23:19 UTC · model grok-4.3

classification 📡 eess.SY · cs.AI · cs.RO · cs.SY · math.OC
keywords reinforcement learning · model predictive control · linear systems · systematic literature review · taxonomy · integration strategies · hybrid control · adaptive control

The pith

This review organizes RL-MPC integrations for linear systems into a five-dimensional taxonomy and synthesizes recurring design patterns from the literature.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper conducts a systematic literature review of how reinforcement learning is combined with model predictive control in linear or linearized systems. It classifies published studies across RL functional roles, algorithm classes, MPC formulations, cost-function structures, and application domains. A cross-dimensional synthesis identifies common pairings and trends while noting persistent issues such as computational cost, sample efficiency, robustness, and stability guarantees. The resulting structure supplies a reference that researchers can use to position new work or select components for hybrid controllers. By mapping the fragmented literature, the review aims to support more systematic development of adaptive constrained control.
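One recurring pattern such reviews catalogue, RL used to tune the design parameters of a linear MPC loop, can be sketched in a few lines. Everything below is illustrative, not taken from the paper: a double-integrator plant, a finite-horizon quadratic controller, and a derivative-free sweep standing in for the RL component.

```python
import numpy as np

# Illustrative double-integrator plant (assumed example, not from the paper).
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q = np.eye(2)     # state weight, used for both design and evaluation
R_TRUE = 0.1      # "true" input cost used to score closed-loop runs

def mpc_gain(r_design, N=10):
    """First-step feedback gain from a finite-horizon Riccati recursion."""
    P = Q.copy()
    K = np.zeros((1, 2))
    for _ in range(N):
        s = r_design + (B.T @ P @ B).item()
        K = (B.T @ P @ A) / s
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
    return K

def closed_loop_cost(r_design, x0, steps=30, u_max=1.0):
    """Simulate the clipped (crudely input-constrained) loop, scored with R_TRUE."""
    K, x, cost = mpc_gain(r_design), x0.astype(float).copy(), 0.0
    for _ in range(steps):
        u = float(np.clip((-K @ x)[0], -u_max, u_max))
        cost += x @ Q @ x + R_TRUE * u * u
        x = A @ x + B[:, 0] * u
    return cost + x @ Q @ x

# A bandit-style sweep stands in for the RL tuner: pick the design weight
# whose closed-loop episode scores best under the fixed true cost.
candidates = [0.01, 0.1, 1.0, 10.0]
x0 = np.array([5.0, 0.0])
best_r = min(candidates, key=lambda r: closed_loop_cost(r, x0))
```

In the review's vocabulary this is RL in a "parameter tuning" functional role over a linear MPC formulation with a quadratic cost; published variants replace the sweep with an actual RL algorithm (Q-learning, DDPG, TD3, and so on).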

Core claim

The literature on RL-MPC integrations for linear systems can be organized through a taxonomy based on RL functional roles, RL algorithm classes, MPC formulations, cost-function structures, and application domains, with cross-synthesis revealing recurring design patterns, methodological trends, and challenges including computational burden, sample efficiency, robustness, and closed-loop guarantees.

What carries the argument

A multi-dimensional taxonomy covering RL functional roles, RL algorithm classes, MPC formulations, cost-function structures, and application domains that structures the reviewed studies and enables identification of design patterns through cross-dimensional synthesis.
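Mechanically, the cross-dimensional synthesis amounts to co-occurrence counting over study records labelled along the taxonomy's dimensions. A minimal sketch, with hypothetical records whose labels are invented for illustration rather than drawn from the paper's corpus:

```python
from collections import Counter

# Hypothetical study records labelled along the review's five dimensions;
# every value here is invented for illustration.
studies = [
    {"rl_role": "parameter tuning", "rl_algo": "DDPG", "mpc_form": "linear MPC",
     "cost": "quadratic", "domain": "vehicles"},
    {"rl_role": "cost shaping", "rl_algo": "Q-learning", "mpc_form": "robust MPC",
     "cost": "quadratic", "domain": "energy"},
    {"rl_role": "parameter tuning", "rl_algo": "TD3", "mpc_form": "linear MPC",
     "cost": "economic", "domain": "energy"},
]

def cross_tab(records, dim_a, dim_b):
    """Co-occurrence counts between two taxonomy dimensions."""
    return Counter((r[dim_a], r[dim_b]) for r in records)

role_vs_form = cross_tab(studies, "rl_role", "mpc_form")
```

A matrix built from counts like `role_vs_form` is exactly the kind of pairing the review's cross-dimensional figures visualize.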

If this is right

  • New RL-MPC proposals can be located in the taxonomy to show their relation to existing combinations.
  • Observed associations between RL algorithm classes and MPC formulations can guide component selection for specific domains.
  • Documented challenges such as computational burden direct attention to efficiency improvements in future designs.
  • The synthesis supports evaluation of closed-loop stability and robustness when RL augments linear MPC.
  • Patterns in cost-function structures and application domains can inform tailored controller development.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same taxonomy approach could later be applied to nonlinear systems to compare how integration strategies differ.
  • Controlled experiments could test whether the most frequent pattern combinations actually deliver the reported performance benefits.
  • The reference structure may help avoid redundant exploration of already-covered RL-MPC pairings.
  • The organization highlights opportunities to incorporate additional guarantees from safe RL or distributed control into the hybrid setting.

Load-bearing premise

The five chosen taxonomy dimensions are sufficient to capture all major integration strategies without omissions, and the database search through 2025 captured a representative set of relevant peer-reviewed studies.

What would settle it

Discovery of multiple significant peer-reviewed papers on RL-MPC for linear systems that cannot be placed along any of the five taxonomy dimensions, or that were missed by the systematic search, would show the review is incomplete.

Figures

Figures reproduced from arXiv: 2604.21030 by Davoud Nikkhouy, Mahshad Rastegarmoghaddam, Malihe Abdolbaghi, Mohsen Jalaeian Farimani, Roya Khalili Amirabadi, Shima Samadzadeh.

Figure 1. The systematic review protocol employed in this study.
Figure 2. Taxonomy of Reinforcement Learning roles in MPC–RL integrations.
Figure 3. The co-occurrence between categories based on RL roles in MPC–RL integrations.
Figure 4. Temporal distribution of RL roles applied in MPC–RL integrations.
Figure 5. Taxonomy of Reinforcement Learning algorithms implemented in MPC–RL frameworks.
Figure 6. Temporal evolution of RL algorithm types adopted in MPC–RL literature.
Figure 7. Taxonomy of cost functions in MPC–RL linear control.
Figure 8. Taxonomy of MPC formulations.
Figure 9. Taxonomy of MPC–RL application fields in linear control.
Figure 10. Application fields of MPC–RL integration over the years.
Figure 11. MPC type vs. RL algorithms in the integration of MPC–RL for linear control.
Figure 12. MPC type vs. RL roles in the integration of MPC–RL for linear control.
Figure 13. RL algorithms vs. RL roles in the integration of MPC–RL for linear control.
Figure 14. Application fields vs. RL roles in the integration of MPC–RL for linear control.
Figure 15. MPC type vs. application fields in the integration of MPC–RL for linear control.
Figure 16. MPC cost functions vs. application fields in the integration of MPC–RL for linear control.
Figure 17. MPC cost functions vs. MPC type in the integration of MPC–RL for linear control.
Figure 18. MPC cost functions vs. RL algorithms in the integration of MPC–RL for linear control.
Figure 19. MPC cost functions vs. RL roles in the integration of MPC–RL for linear control.
Figure 20. RL algorithms vs. application fields in the integration of MPC–RL for linear control.
original abstract

The integration of Model Predictive Control (MPC) and Reinforcement Learning (RL) has emerged as a promising paradigm for constrained decision-making and adaptive control. MPC offers structured optimization, explicit constraint handling, and established stability tools, whereas RL provides data-driven adaptation and performance improvement in the presence of uncertainty and model mismatch. Despite the rapid growth of research on RL–MPC integration, the literature remains fragmented, particularly for control architectures built on linear or linearized predictive models. This paper presents a comprehensive Systematic Literature Review (SLR) of RL–MPC integrations for linear and linearized systems, covering peer-reviewed and formally indexed studies published until 2025. The reviewed studies are organized through a multi-dimensional taxonomy covering RL functional roles, RL algorithm classes, MPC formulations, cost-function structures, and application domains. In addition, a cross-dimensional synthesis is conducted to identify recurring design patterns and reported associations among these dimensions within the reviewed corpus. The review highlights methodological trends, commonly adopted integration strategies, and recurring practical challenges, including computational burden, sample efficiency, robustness, and closed-loop guarantees. The resulting synthesis provides a structured reference for researchers and practitioners seeking to design or analyze RL–MPC architectures based on linear or linearized predictive control formulations.
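For orientation, the linear MPC problem the abstract has in mind is typically the finite-horizon quadratic program below. This is the standard textbook form, not reproduced from the paper; the reviewed works vary the ingredients:

```latex
\begin{aligned}
\min_{u_0,\dots,u_{N-1}}\quad
  & \sum_{k=0}^{N-1}\bigl(x_k^\top Q x_k + u_k^\top R u_k\bigr) + x_N^\top P x_N \\
\text{s.t.}\quad
  & x_{k+1} = A x_k + B u_k, \qquad x_0 = x(t), \\
  & x_k \in \mathcal{X}, \quad u_k \in \mathcal{U}.
\end{aligned}
```

In the taxonomy's terms, RL can enter through the weights $(Q, R, P)$, the model $(A, B)$, the constraint sets $\mathcal{X}, \mathcal{U}$, or by wrapping the whole optimizer as a differentiable policy.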

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper presents a comprehensive systematic literature review (SLR) of RL-MPC integrations for linear and linearized systems. It organizes the reviewed studies via a multi-dimensional taxonomy covering RL functional roles, RL algorithm classes, MPC formulations, cost-function structures, and application domains; conducts a cross-dimensional synthesis to identify recurring design patterns and associations; and highlights methodological trends, integration strategies, and practical challenges including computational burden, sample efficiency, robustness, and closed-loop guarantees. The result is positioned as a structured reference for researchers and practitioners.

Significance. If the underlying SLR methodology proves complete and the taxonomy exhaustive, the work would provide a useful organizing framework for the fragmented literature on RL-MPC hybrids in linear systems, potentially guiding design choices and identifying open challenges. The cross-dimensional synthesis is a constructive element that could help the community move beyond ad-hoc integrations.

major comments (1)
  1. [Abstract / Methods] The manuscript asserts a 'comprehensive' SLR covering peer-reviewed studies until 2025 but supplies no concrete details on search strings, databases queried, inclusion/exclusion criteria, number of papers screened or included, quality assessment, or PRISMA flow. Without these elements, the representativeness of the corpus and the validity of the taxonomy and cross-synthesis cannot be verified, directly undercutting the central claim of comprehensiveness.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive and detailed review. We agree that the systematic literature review methodology must be documented with greater transparency to support the claim of comprehensiveness. We address the single major comment below and will revise the manuscript accordingly.

point-by-point responses
  1. Referee: [Abstract / Methods] The manuscript asserts a 'comprehensive' SLR covering peer-reviewed studies until 2025 but supplies no concrete details on search strings, databases queried, inclusion/exclusion criteria, number of papers screened or included, quality assessment, or PRISMA flow. Without these elements, the representativeness of the corpus and the validity of the taxonomy and cross-synthesis cannot be verified, directly undercutting the central claim of comprehensiveness.

    Authors: We acknowledge the validity of this observation. The submitted manuscript does not contain a dedicated Methods section or PRISMA flow diagram describing the SLR protocol. In the revised version we will insert a new Methods section that explicitly reports: the databases searched (IEEE Xplore, Scopus, Web of Science, ScienceDirect, and SpringerLink), the precise search strings and Boolean combinations employed, the inclusion and exclusion criteria (peer-reviewed English-language studies on RL-MPC for linear or linearized systems published through 2025, with clear focus on integration architectures), the screening process, the number of records identified, screened, and finally included, any quality-assessment steps applied, and a PRISMA flow diagram. These additions will enable independent verification of corpus representativeness and will not alter the taxonomy or cross-dimensional synthesis already presented.

    revision: yes
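The screening protocol the rebuttal promises is mechanical enough to sketch. Both the Boolean query and the record fields below are illustrative stand-ins, not the authors' actual search strings or criteria:

```python
import re

# Illustrative Boolean query of the kind the rebuttal says will be reported.
QUERY = ('("model predictive control" OR MPC) AND '
         '("reinforcement learning" OR RL) AND (linear OR linearized)')

def include(record):
    """Inclusion screen mirroring the rebuttal's stated criteria (hypothetical)."""
    return (record["peer_reviewed"]
            and record["language"] == "en"
            and record["year"] <= 2025
            and re.search(r"\b(linear|linearized)\b", record["abstract"], re.I) is not None
            and re.search(r"model predictive|\bMPC\b", record["abstract"], re.I) is not None)

# Two invented records: one in scope, one excluded ("Nonlinear" does not
# match the word-bounded "linear" pattern).
records = [
    {"peer_reviewed": True, "language": "en", "year": 2024,
     "abstract": "An RL-tuned linear MPC scheme for microgrid frequency control."},
    {"peer_reviewed": True, "language": "en", "year": 2023,
     "abstract": "Nonlinear MPC with learned neural dynamics."},
]
kept = [r for r in records if include(r)]  # only the first record survives
```

Reporting the counts at each such filtering stage is exactly what a PRISMA flow diagram formalizes.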

Circularity Check

0 steps flagged

No circularity in literature review synthesis

full rationale

This paper is a systematic literature review synthesizing external peer-reviewed studies on RL-MPC integrations. It contains no original derivations, equations, predictions, fitted parameters, or self-referential definitions that could reduce to inputs by construction. The taxonomy dimensions and cross-dimensional synthesis draw directly from the reviewed corpus without any load-bearing self-citation chains or ansatz smuggling. The central claims rest on the external literature rather than internal tautologies, making the work self-contained against the circularity criteria.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the assumption that the literature corpus is comprehensively and representatively captured by standard academic indexing up to 2025 and that the selected taxonomy dimensions adequately partition the design space.

axioms (1)
  • domain assumption The body of peer-reviewed and formally indexed studies on RL-MPC for linear systems published until 2025 can be exhaustively identified through standard academic databases and search procedures.
    The review's claim of comprehensive coverage depends on this premise.

pith-pipeline@v0.9.0 · 5565 in / 1423 out tokens · 55615 ms · 2026-05-09T23:19:06.068217+00:00 · methodology

