Approximate Dynamic Programming For Linear Systems with State and Input Constraints

Ankush Chakrabarty; Claus Danielson; Rien Quirynen; Weinan Gao

arxiv: 1906.11369 · v1 · pith:OZNZHZRTnew · submitted 2019-06-26 · 📡 eess.SY · cs.LG· cs.SY· math.DS· math.OC

Approximate Dynamic Programming For Linear Systems with State and Input Constraints

Ankush Chakrabarty , Rien Quirynen , Claus Danielson , Weinan Gao This is my paper

Pith reviewed 2026-05-25 15:05 UTC · model grok-4.3

classification 📡 eess.SY cs.LGcs.SYmath.DSmath.OC

keywords approximate dynamic programminginvariant setsconstraint satisfactionlinear systemsreinforcement learningoptimal controllinear quadratic regulatorpolicy iteration

0 comments

The pith

Invariant sets let approximate dynamic programming update policies for linear systems without ever violating state or input constraints.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that policy updates inside an approximate dynamic programming loop can be made to preserve a precomputed invariant set, so every subsequent state and input stays inside the allowed limits for all future time. At the same time the sequence of policies is shown to approach the optimal linear quadratic regulator solution as the approximation error goes to zero. A data-driven algorithm is given that implements these updates from measured trajectories. A sympathetic reader would care because reinforcement learning in continuous spaces has lacked any general guarantee against unsafe behavior during learning, which has blocked its use wherever constraint violations carry real costs.

Core claim

By embedding invariant sets into the policy-update step of approximate dynamic programming, the closed-loop trajectory of a linear system is forced to remain inside the given state and input constraint set at every instant, while the value function and policy still converge asymptotically to those of the unconstrained linear quadratic regulator.

What carries the argument

Invariant sets, regions of the state space that remain feasible forever under the closed-loop dynamics and that are preserved by each ADP policy update.

If this is right

Every updated policy satisfies the original constraints at every future time.
The data-driven algorithm produces a sequence of feasible controllers without requiring an explicit model at runtime.
As the number of updates grows, the performance approaches that of the optimal unconstrained LQR controller.
The same invariance argument applies whenever the sets can be identified offline.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same invariance argument could be checked on any system where polytopic or ellipsoidal invariant sets are already known.
One could test whether the data-driven version still works when the invariant set is only approximated rather than computed exactly.
The approach supplies a concrete way to blend offline set computation with online learning, which may connect to other safe-control architectures that pre-compute feasible regions.

Load-bearing premise

Invariant sets for the constrained linear system exist and can be computed so that every ADP policy update keeps the system inside them indefinitely.

What would settle it

A numerical run on a double-integrator plant with box constraints in which any state or input exceeds its limit at any time step during repeated policy updates, or in which the closed-loop cost stops approaching the known LQR optimum.

Figures

Figures reproduced from arXiv: 1906.11369 by Ankush Chakrabarty, Claus Danielson, Rien Quirynen, Weinan Gao.

**Figure 2.** Figure 2: Results of constrained ADP for 5-state, 2-input dynamic system: [A] State evolution [PITH_FULL_IMAGE:figures/full_fig_p014_2.png] view at source ↗

read the original abstract

Enforcing state and input constraints during reinforcement learning (RL) in continuous state spaces is an open but crucial problem which remains a roadblock to using RL in safety-critical applications. This paper leverages invariant sets to update control policies within an approximate dynamic programming (ADP) framework that guarantees constraint satisfaction for all time and converges to the optimal policy (in a linear quadratic regulator sense) asymptotically. An algorithm for implementing the proposed constrained ADP approach in a data-driven manner is provided. The potential of this formalism is demonstrated via numerical examples.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper folds invariant sets into ADP to keep linear-system constraints satisfied forever while converging to LQR, but the data-driven updates are not forced to preserve invariance.

read the letter

The main thing to know is that the authors embed invariant sets inside an ADP loop for constrained linear systems. The policy updates are supposed to keep the state inside the set at every step, so constraints hold for all time, and the policy still converges to the unconstrained LQR solution. They also give a data-driven algorithm and run it on numerical examples. That combination is what is actually new here. Earlier ADP work and invariant-set MPC exist separately, but the specific construction that updates the policy while preserving the set indefinitely is the fresh piece. The examples are basic but they show the constraints respected in the tested cases, which is useful to see. The soft spot is the one in the stress-test note. The policy improvement step uses sampled trajectories and what appears to be a standard least-squares or temporal-difference update. For the closed-loop system to map the polytope into itself, the gain must satisfy a specific invariance condition. Nothing in the update rule constrains the new gain to that set. Finite-sample error can therefore produce a gain that sends the set outside itself, so the perpetual-satisfaction guarantee does not follow automatically. If the paper projects the update or proves the particular step preserves invariance, that step needs to be explicit. This is for control researchers who already know ADP and invariant sets and want a concrete way to combine them for safety. A reader looking for an algorithm they can try on linear plants with hard constraints will get something usable from the examples. The problem is real and the high-level idea is clear, so the paper deserves a serious referee to check the proofs and the data-driven invariance claim. I would send it to peer review.

Referee Report

2 major / 2 minor

Summary. The paper proposes a constrained approximate dynamic programming (ADP) framework for linear systems that incorporates positively invariant sets to enforce state and input constraints at every time step during policy iteration. The approach is claimed to guarantee constraint satisfaction for all time while ensuring asymptotic convergence of the policy to the optimal infinite-horizon LQR solution; a data-driven implementation via sampled trajectories is provided along with numerical examples.

Significance. If the invariance preservation property holds under the data-driven updates, the result would meaningfully advance safe RL for constrained linear systems by combining set-theoretic invariance with ADP, offering a route to constraint satisfaction without relying on post-hoc projection or barrier functions. The data-driven algorithm and explicit convergence claim to LQR are positive features.

major comments (2)

[§3 and Algorithm 1] §3 (Policy Update via Invariant Sets) and Algorithm 1: the data-driven least-squares / temporal-difference update for the feedback gain is performed without any explicit constraint or projection onto the (non-convex) set of matrices that preserve positive invariance of the chosen set X. Consequently the closed-loop map (A+BK)X may exit X after the first improvement step, undermining the “constraint satisfaction for all time” guarantee.
[Theorem 2] Theorem 2 (Asymptotic Convergence): the proof sketch assumes that every iterate remains inside the invariance-preserving policy class, but the data-driven step described in §4 does not enforce this; the convergence claim therefore rests on an unverified invariance preservation property under approximation error.

minor comments (2)

[§2] Notation for the invariant set X and the associated polyhedral representation is introduced without a dedicated preliminary subsection; readers must infer the exact invariance condition from later equations.
[§5] Numerical examples section reports trajectories but omits quantitative metrics (e.g., maximum constraint violation over 100 Monte-Carlo runs or distance to LQR gain) that would allow direct verification of the claimed guarantees.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on the manuscript. We address each major comment below and agree that revisions are needed to strengthen the invariance guarantees in the data-driven setting.

read point-by-point responses

Referee: [§3 and Algorithm 1] §3 (Policy Update via Invariant Sets) and Algorithm 1: the data-driven least-squares / temporal-difference update for the feedback gain is performed without any explicit constraint or projection onto the (non-convex) set of matrices that preserve positive invariance of the chosen set X. Consequently the closed-loop map (A+BK)X may exit X after the first improvement step, undermining the “constraint satisfaction for all time” guarantee.

Authors: We agree that the data-driven least-squares update in §4 and Algorithm 1 lacks an explicit projection or constraint to ensure the updated gain K preserves positive invariance of X. The current formulation does not guarantee that (A+BK)X remains inside X after each improvement step. We will revise the manuscript to incorporate a projection step onto the set of invariance-preserving gains (or provide a proof that the update preserves invariance under the stated assumptions) and update Algorithm 1 and the text in §3 and §4 accordingly. revision: yes
Referee: [Theorem 2] Theorem 2 (Asymptotic Convergence): the proof sketch assumes that every iterate remains inside the invariance-preserving policy class, but the data-driven step described in §4 does not enforce this; the convergence claim therefore rests on an unverified invariance preservation property under approximation error.

Authors: The referee is correct that the proof of Theorem 2 assumes all iterates remain in the invariance-preserving policy class, yet the data-driven implementation does not enforce this. We will revise the manuscript to either modify the update rule (via projection or additional constraints) to ensure the assumption holds or adjust the theorem and proof to rigorously account for cases where invariance is preserved only approximately. The revised version will include these changes. revision: yes

Circularity Check

0 steps flagged

No circularity; derivation chain is self-contained against external LQR benchmark

full rationale

The provided abstract and context describe an ADP procedure that augments policy updates with invariant-set constraints to enforce safety while targeting the standard infinite-horizon LQR optimum asymptotically. No equations, fitted-parameter predictions, or self-citation chains are exhibited that would reduce any claimed result to its own inputs by construction. The convergence claim is to an externally defined LQR solution rather than a quantity defined inside the algorithm, and the data-driven implementation is presented as a practical realization rather than a tautological renaming. This is the normal case of a non-circular paper whose central guarantee rests on independent set-invariance theory and classical LQR optimality.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Ledger constructed from abstract only; full paper may contain additional fitted parameters or assumptions in the policy update rules or invariant set computations.

axioms (2)

domain assumption The system is linear (time-invariant).
Title and abstract explicitly restrict the setting to linear systems.
domain assumption Invariant sets for the state and input constraints can be found or approximated.
The method relies on leveraging such sets to enforce perpetual constraint satisfaction.

pith-pipeline@v0.9.0 · 5626 in / 1268 out tokens · 34530 ms · 2026-05-25T15:05:25.586132+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages

[1]

Reinforcement learning and adaptive dynamic programming for feedback control,

F. L. Lewis and D. Vrabie, “Reinforcement learning and adaptive dynamic programming for feedback control,” IEEE Circuits and Systems Magazine , vol. 9, no. 3, pp. 32–50, 2009

work page 2009
[2]

Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers,

F. L. Lewis, D. Vrabie, and K. G. Vamvoudakis, “Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers,”IEEE Control Systems, vol. 32, no. 6, pp. 76–105, 2012

work page 2012
[3]

D. P. Bertsekas, Dynamic Programming and Optimal Control , 2nd ed. Athena Scientiﬁc, 2000

work page 2000
[4]

Optimal and Autonomous Control Using Reinforcement Learning: A Survey,

B. Kiumarsi, K. G. Vamvoudakis, H. Modares, and F. L. Lewis, “Optimal and Autonomous Control Using Reinforcement Learning: A Survey,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 6, pp. 2042–2062, 2018

work page 2042
[5]

An iterative technique for the computation of the steady state gains for the discrete optimal regulator,

G. Hewer, “An iterative technique for the computation of the steady state gains for the discrete optimal regulator,” IEEE Transactions on Automatic Control , vol. 16, no. 4, pp. 382–384, aug 1971. [Online]. Available: http://ieeexplore.ieee.org/document/1099755/

work page arXiv 1971
[6]

On an iterative technique for riccati equation computations,

D. L. Kleinman, “On an iterative technique for riccati equation computations,” IEEE Trans. on Automatic Control, vol. 13, no. 1, pp. 114–115, 1968

work page 1968
[7]

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT Press, 2011

work page 2011
[8]

Adaptive linear quadratic control using policy iteration,

S. J. Bradtke, B. E. Ydstie, and A. G. Barto, “Adaptive linear quadratic control using policy iteration,” in Proc. of the American Control Conference, vol. 3. Citeseer, 1994, pp. 3475–3475

work page 1994
[9]

Q-learning,

C. J. Watkins and P. Dayan, “Q-learning,” Machine learning , vol. 8, no. 3-4, pp. 279–292, 1992

work page 1992
[10]

Neural networks for control and system identiﬁcation,

P. J. Werbos, “Neural networks for control and system identiﬁcation,” in Proc. of the 28th IEEE Conf. Dec. and Control . IEEE, 1989, pp. 260–265. 15

work page 1989
[11]

Adaptive optimal control for continuous-time linear systems based on policy iteration,

D. Vrabie, O. Pastravanu, M. Abu-Khalaf, and F. L. Lewis, “Adaptive optimal control for continuous-time linear systems based on policy iteration,” Automatica, vol. 45, no. 2, pp. 477–484, 2009

work page 2009
[12]

Reinforcement learning and distributed local model synthesis,

T. Landelius, “Reinforcement learning and distributed local model synthesis,” Ph.D. disserta- tion, Link¨ oping University Electronic Press, 1997

work page 1997
[13]

Model-free Q-learning designs for linear discrete-time zero-sum games with application to H∞ control,

A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, “Model-free Q-learning designs for linear discrete-time zero-sum games with application to H∞ control,” Automatica, vol. 43, no. 3, pp. 473–481, 2007

work page 2007
[14]

Approximate optimal trajectory tracking for continuous-time nonlinear systems,

R. Kamalapurkar, H. Dinh, S. Bhasin, and W. E. Dixon, “Approximate optimal trajectory tracking for continuous-time nonlinear systems,” Automatica, vol. 51, pp. 40–48, 2015

work page 2015
[15]

Adaptive dynamic programming and adaptive optimal output reg- ulation of linear systems,

W. Gao and Z.-P. Jiang, “Adaptive dynamic programming and adaptive optimal output reg- ulation of linear systems,” IEEE Transactions on Automatic Control , vol. 61, no. 12, pp. 4164–4169, 2016

work page 2016
[16]

Data-driven adaptive dynamic programming for continuous- time fully cooperative games with partially constrained inputs,

Q. Zhang, D. Zhao, and Y. Zhu, “Data-driven adaptive dynamic programming for continuous- time fully cooperative games with partially constrained inputs,” Neurocomputing, vol. 238, pp. 377–386, 2017

work page 2017
[17]

Discrete-time nonzero-sum games for multiplayer using policy-iteration-based adaptive dynamic programming algorithms,

H. Zhang, H. Jiang, C. Luo, and G. Xiao, “Discrete-time nonzero-sum games for multiplayer using policy-iteration-based adaptive dynamic programming algorithms,” IEEE transactions on cybernetics, vol. 47, no. 10, pp. 3331–3340, 2017

work page 2017
[18]

Discrete-time nonlinear hjb solution using approximate dynamic programming: Convergence proof,

A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, “Discrete-time nonlinear hjb solution using approximate dynamic programming: Convergence proof,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) , vol. 38, no. 4, pp. 943–949, 2008

work page 2008
[19]

Revisiting Approximate Dynamic Programming and its Convergence,

A. Heydari, “Revisiting Approximate Dynamic Programming and its Convergence,” IEEE Transactions on Cybernetics, vol. 44, no. 12, pp. 2733–2743, 2014

work page 2014
[20]

Robust constrained model predictive control using linear matrix inequalities,

M. V. Kothare, V. Balakrishnan, and M. Morari, “Robust constrained model predictive control using linear matrix inequalities,” Automatica, vol. 32, no. 10, pp. 1361–1379, 1996

work page 1996
[21]

Sup- port vector machine informed explicit nonlinear model predictive control using low-discrepancy sequences

A. Chakrabarty, V. C. Dinh, M. J. Corless, A. E. Rundell, S. H. Zak, G. T. Buzzardet al., “Sup- port vector machine informed explicit nonlinear model predictive control using low-discrepancy sequences.” IEEE Trans. Automat. Contr. , vol. 62, no. 1, pp. 135–148, 2017

work page 2017
[22]

Automated driving: Safe motion planning using positively invariant sets,

K. Berntorp, A. Weiss, C. Danielson, I. V. Kolmanovsky, and S. Di Cairano, “Automated driving: Safe motion planning using positively invariant sets,” in Proc. of the 20th Int. Conf. on Intelligent Transportation Sys. (ITSC) . IEEE, 2017, pp. 1–6

work page 2017
[23]

Safe learning of regions of attraction for uncertain, nonlinear systems with gaussian processes,

F. Berkenkamp, R. Moriconi, A. P. Schoellig, and A. Krause, “Safe learning of regions of attraction for uncertain, nonlinear systems with gaussian processes,” Proc. of the IEEE Conf. Decision and Control, pp. 4661–4666, 2016

work page 2016
[24]

Safe reinforcement learning: Learning with supervision using a constraint-admissible set,

Z. Li, U. Kalabi´ c, and T. Chu, “Safe reinforcement learning: Learning with supervision using a constraint-admissible set,” in Proc. of the American Control Conference (ACC) . IEEE, 2018, pp. 6390–6395. 16

work page 2018
[25]

Direct data-driven control of constrained systems,

D. Piga, S. Formentin, and A. Bemporad, “Direct data-driven control of constrained systems,” IEEE Transactions on Control Systems Technology , vol. 26, no. 4, pp. 1422–1429, 2018

work page 2018
[26]

Ljung, System identiﬁcation: Theory for the User

L. Ljung, System identiﬁcation: Theory for the User . Upper Saddle River, N.J.: Prentice Hall, 1999

work page 1999
[27]

Semideﬁnite optimization,

M. J. Todd, “Semideﬁnite optimization,” Acta Numerica, vol. 10, p. 515560, 2001

work page 2001
[28]

Semideﬁnite programming,

L. Vandenberghe and S. Boyd, “Semideﬁnite programming,” SIAM Review, vol. 38, no. 1, pp. 49–95, 1996

work page 1996
[29]

Alternating direction augmented lagrangian methods for semideﬁnite programming,

Z. Wen, D. Goldfarb, and W. Yin, “Alternating direction augmented lagrangian methods for semideﬁnite programming,” Mathematical Programming Computation , vol. 2, no. 3, pp. 203–230, Dec 2010

work page 2010
[30]

Fast ADMM for semideﬁnite programs with chordal sparsity,

Y. Zheng, G. Fantuzzi, A. Papachristodoulou, P. Goulart, and A. Wynn, “Fast ADMM for semideﬁnite programs with chordal sparsity,” in 2017 American Control Conference (ACC) , May 2017, pp. 3335–3340. 17

work page 2017

[1] [1]

Reinforcement learning and adaptive dynamic programming for feedback control,

F. L. Lewis and D. Vrabie, “Reinforcement learning and adaptive dynamic programming for feedback control,” IEEE Circuits and Systems Magazine , vol. 9, no. 3, pp. 32–50, 2009

work page 2009

[2] [2]

Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers,

F. L. Lewis, D. Vrabie, and K. G. Vamvoudakis, “Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers,”IEEE Control Systems, vol. 32, no. 6, pp. 76–105, 2012

work page 2012

[3] [3]

D. P. Bertsekas, Dynamic Programming and Optimal Control , 2nd ed. Athena Scientiﬁc, 2000

work page 2000

[4] [4]

Optimal and Autonomous Control Using Reinforcement Learning: A Survey,

B. Kiumarsi, K. G. Vamvoudakis, H. Modares, and F. L. Lewis, “Optimal and Autonomous Control Using Reinforcement Learning: A Survey,” IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 6, pp. 2042–2062, 2018

work page 2042

[5] [5]

An iterative technique for the computation of the steady state gains for the discrete optimal regulator,

G. Hewer, “An iterative technique for the computation of the steady state gains for the discrete optimal regulator,” IEEE Transactions on Automatic Control , vol. 16, no. 4, pp. 382–384, aug 1971. [Online]. Available: http://ieeexplore.ieee.org/document/1099755/

work page arXiv 1971

[6] [6]

On an iterative technique for riccati equation computations,

D. L. Kleinman, “On an iterative technique for riccati equation computations,” IEEE Trans. on Automatic Control, vol. 13, no. 1, pp. 114–115, 1968

work page 1968

[7] [7]

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT Press, 2011

work page 2011

[8] [8]

Adaptive linear quadratic control using policy iteration,

S. J. Bradtke, B. E. Ydstie, and A. G. Barto, “Adaptive linear quadratic control using policy iteration,” in Proc. of the American Control Conference, vol. 3. Citeseer, 1994, pp. 3475–3475

work page 1994

[9] [9]

Q-learning,

C. J. Watkins and P. Dayan, “Q-learning,” Machine learning , vol. 8, no. 3-4, pp. 279–292, 1992

work page 1992

[10] [10]

Neural networks for control and system identiﬁcation,

P. J. Werbos, “Neural networks for control and system identiﬁcation,” in Proc. of the 28th IEEE Conf. Dec. and Control . IEEE, 1989, pp. 260–265. 15

work page 1989

[11] [11]

Adaptive optimal control for continuous-time linear systems based on policy iteration,

D. Vrabie, O. Pastravanu, M. Abu-Khalaf, and F. L. Lewis, “Adaptive optimal control for continuous-time linear systems based on policy iteration,” Automatica, vol. 45, no. 2, pp. 477–484, 2009

work page 2009

[12] [12]

Reinforcement learning and distributed local model synthesis,

T. Landelius, “Reinforcement learning and distributed local model synthesis,” Ph.D. disserta- tion, Link¨ oping University Electronic Press, 1997

work page 1997

[13] [13]

Model-free Q-learning designs for linear discrete-time zero-sum games with application to H∞ control,

A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, “Model-free Q-learning designs for linear discrete-time zero-sum games with application to H∞ control,” Automatica, vol. 43, no. 3, pp. 473–481, 2007

work page 2007

[14] [14]

Approximate optimal trajectory tracking for continuous-time nonlinear systems,

R. Kamalapurkar, H. Dinh, S. Bhasin, and W. E. Dixon, “Approximate optimal trajectory tracking for continuous-time nonlinear systems,” Automatica, vol. 51, pp. 40–48, 2015

work page 2015

[15] [15]

Adaptive dynamic programming and adaptive optimal output reg- ulation of linear systems,

W. Gao and Z.-P. Jiang, “Adaptive dynamic programming and adaptive optimal output reg- ulation of linear systems,” IEEE Transactions on Automatic Control , vol. 61, no. 12, pp. 4164–4169, 2016

work page 2016

[16] [16]

Data-driven adaptive dynamic programming for continuous- time fully cooperative games with partially constrained inputs,

Q. Zhang, D. Zhao, and Y. Zhu, “Data-driven adaptive dynamic programming for continuous- time fully cooperative games with partially constrained inputs,” Neurocomputing, vol. 238, pp. 377–386, 2017

work page 2017

[17] [17]

Discrete-time nonzero-sum games for multiplayer using policy-iteration-based adaptive dynamic programming algorithms,

H. Zhang, H. Jiang, C. Luo, and G. Xiao, “Discrete-time nonzero-sum games for multiplayer using policy-iteration-based adaptive dynamic programming algorithms,” IEEE transactions on cybernetics, vol. 47, no. 10, pp. 3331–3340, 2017

work page 2017

[18] [18]

Discrete-time nonlinear hjb solution using approximate dynamic programming: Convergence proof,

A. Al-Tamimi, F. L. Lewis, and M. Abu-Khalaf, “Discrete-time nonlinear hjb solution using approximate dynamic programming: Convergence proof,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) , vol. 38, no. 4, pp. 943–949, 2008

work page 2008

[19] [19]

Revisiting Approximate Dynamic Programming and its Convergence,

A. Heydari, “Revisiting Approximate Dynamic Programming and its Convergence,” IEEE Transactions on Cybernetics, vol. 44, no. 12, pp. 2733–2743, 2014

work page 2014

[20] [20]

Robust constrained model predictive control using linear matrix inequalities,

M. V. Kothare, V. Balakrishnan, and M. Morari, “Robust constrained model predictive control using linear matrix inequalities,” Automatica, vol. 32, no. 10, pp. 1361–1379, 1996

work page 1996

[21] [21]

Sup- port vector machine informed explicit nonlinear model predictive control using low-discrepancy sequences

A. Chakrabarty, V. C. Dinh, M. J. Corless, A. E. Rundell, S. H. Zak, G. T. Buzzardet al., “Sup- port vector machine informed explicit nonlinear model predictive control using low-discrepancy sequences.” IEEE Trans. Automat. Contr. , vol. 62, no. 1, pp. 135–148, 2017

work page 2017

[22] [22]

Automated driving: Safe motion planning using positively invariant sets,

K. Berntorp, A. Weiss, C. Danielson, I. V. Kolmanovsky, and S. Di Cairano, “Automated driving: Safe motion planning using positively invariant sets,” in Proc. of the 20th Int. Conf. on Intelligent Transportation Sys. (ITSC) . IEEE, 2017, pp. 1–6

work page 2017

[23] [23]

Safe learning of regions of attraction for uncertain, nonlinear systems with gaussian processes,

F. Berkenkamp, R. Moriconi, A. P. Schoellig, and A. Krause, “Safe learning of regions of attraction for uncertain, nonlinear systems with gaussian processes,” Proc. of the IEEE Conf. Decision and Control, pp. 4661–4666, 2016

work page 2016

[24] [24]

Safe reinforcement learning: Learning with supervision using a constraint-admissible set,

Z. Li, U. Kalabi´ c, and T. Chu, “Safe reinforcement learning: Learning with supervision using a constraint-admissible set,” in Proc. of the American Control Conference (ACC) . IEEE, 2018, pp. 6390–6395. 16

work page 2018

[25] [25]

Direct data-driven control of constrained systems,

D. Piga, S. Formentin, and A. Bemporad, “Direct data-driven control of constrained systems,” IEEE Transactions on Control Systems Technology , vol. 26, no. 4, pp. 1422–1429, 2018

work page 2018

[26] [26]

Ljung, System identiﬁcation: Theory for the User

L. Ljung, System identiﬁcation: Theory for the User . Upper Saddle River, N.J.: Prentice Hall, 1999

work page 1999

[27] [27]

Semideﬁnite optimization,

M. J. Todd, “Semideﬁnite optimization,” Acta Numerica, vol. 10, p. 515560, 2001

work page 2001

[28] [28]

Semideﬁnite programming,

L. Vandenberghe and S. Boyd, “Semideﬁnite programming,” SIAM Review, vol. 38, no. 1, pp. 49–95, 1996

work page 1996

[29] [29]

Alternating direction augmented lagrangian methods for semideﬁnite programming,

Z. Wen, D. Goldfarb, and W. Yin, “Alternating direction augmented lagrangian methods for semideﬁnite programming,” Mathematical Programming Computation , vol. 2, no. 3, pp. 203–230, Dec 2010

work page 2010

[30] [30]

Fast ADMM for semideﬁnite programs with chordal sparsity,

Y. Zheng, G. Fantuzzi, A. Papachristodoulou, P. Goulart, and A. Wynn, “Fast ADMM for semideﬁnite programs with chordal sparsity,” in 2017 American Control Conference (ACC) , May 2017, pp. 3335–3340. 17

work page 2017