pith. machine review for the scientific record.

arxiv: 2604.01372 · v2 · submitted 2026-04-01 · 📡 eess.SY · cs.SY

Recognition: 2 theorem links · Lean Theorem

Temporal Logic Control of Nonlinear Stochastic Systems with Online Performance Optimization

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 21:44 UTC · model grok-4.3

classification 📡 eess.SY · cs.SY
keywords IMDP abstraction · temporal logic control · stochastic systems · model predictive control · policy synthesis · nonlinear dynamics · online optimization · safety guarantees

The pith

An IMDP abstraction turns a nonlinear stochastic system into a verified set of control policies, letting online optimizers such as MPC tune performance without losing the minimum satisfaction probability for temporal logic specifications.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a new abstraction method that turns a nonlinear discrete-time stochastic system into an interval Markov decision process and returns not one fixed policy but an entire collection of policies. Each policy in the collection meets the given temporal logic specification with at least a stated minimum probability, so any online algorithm can pick among them while the collective guarantee remains intact. Model predictive control is then used inside this collection to minimize an independent cost such as energy use. Experiments indicate that the resulting closed-loop behavior outperforms single-policy abstractions while the degradation in the probabilistic bound stays small. The approach therefore separates the safety verification task from the performance tuning task.

Core claim

We construct an interval MDP abstraction of a nonlinear stochastic system that returns a set of policies, each satisfying a temporal logic specification with a minimum probability. Any policy chosen from this set at runtime preserves the same minimum probability, so an arbitrary online optimizer (for example MPC) can be applied to minimize a separate cost function while the original probabilistic guarantee is retained.
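Stated compactly, under notation that is ours rather than the paper's (Π for the verified policy set, φ for the specification, p_min for the certified bound, ω for a closed-loop trajectory, J for the performance cost):

```latex
% Notation is illustrative, not the paper's: \Pi is the verified policy set
% produced by the IMDP abstraction, \varphi the specification, p_{\min} the
% certified bound, \omega a closed-loop trajectory, and J a performance cost.
\[
  \forall \pi \in \Pi:\;
    \Pr^{\pi}\!\bigl[\omega \models \varphi\bigr] \ge p_{\min},
  \qquad\text{hence}\qquad
  \pi^{\star} \in \operatorname*{arg\,min}_{\pi \in \Pi} J(\pi)
  \;\Longrightarrow\;
  \Pr^{\pi^{\star}}\!\bigl[\omega \models \varphi\bigr] \ge p_{\min}.
\]
```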

What carries the argument

The verified policy set produced by the IMDP abstraction, which any online selector can draw from without dropping below the collective satisfaction probability.
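A minimal sketch of what such a verified policy set could look like as a data structure, assuming it is stored as a map from abstract states to admissible controls; all names here are illustrative, not the paper's code.

```python
# Illustrative only: one way a verified policy set could be represented,
# as a map from abstract (partition) states to admissible control inputs.
# Any online selector that only ever draws from this map stays inside the
# certified set by construction.

from typing import Callable, Dict, FrozenSet, Hashable

AbstractState = Hashable
Control = Hashable

class VerifiedPolicySet:
    def __init__(self, admissible: Dict[AbstractState, FrozenSet[Control]], p_min: float):
        self.admissible = admissible  # abstract state -> admissible controls
        self.p_min = p_min            # certified minimum satisfaction probability

    def controls(self, s: AbstractState) -> FrozenSet[Control]:
        return self.admissible[s]

    def permits(self, s: AbstractState, u: Control) -> bool:
        return u in self.admissible.get(s, frozenset())

def greedy_select(policy_set: VerifiedPolicySet, s: AbstractState,
                  cost: Callable[[AbstractState, Control], float]) -> Control:
    # a trivial online selector: cheapest admissible control at the current state
    return min(policy_set.controls(s), key=lambda u: cost(s, u))
```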

If this is right

  • Any online algorithm such as MPC can now optimize a cost independent of the temporal logic specification (a minimal sketch follows this list).
  • The closed-loop performance improves relative to single-policy abstractions while the satisfaction probability bound is preserved.
  • The method applies directly to safety-critical autonomous systems modeled as nonlinear stochastic dynamics.
  • The abstraction step and the online optimization step can be performed separately.
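The sketch referenced in the first bullet: a brute-force receding-horizon selection restricted to admissible controls. The paper uses a proper MPC formulation; this only illustrates, under assumed names, that the search space is the verified set and that the cost is unrelated to the specification.

```python
# Illustrative only: brute-force receding-horizon selection restricted to the
# admissible controls of a verified set. A real controller would solve an MPC
# problem instead of enumerating sequences; the point is that every candidate
# stays inside the set, while the cost is independent of the specification.

from typing import Callable, Dict, FrozenSet, Hashable, Optional

State = Hashable
Control = Hashable

def mpc_step(
    admissible: Dict[State, FrozenSet[Control]],    # hypothetical encoding of the verified set
    step: Callable[[State, Control], State],        # nominal (abstract) dynamics
    stage_cost: Callable[[State, Control], float],  # performance cost, independent of the spec
    s0: State,
    horizon: int,
) -> Optional[Control]:
    def sequences(s: State, h: int):
        # yield (control sequence, accumulated cost) over admissible controls only
        if h == 0:
            yield (), 0.0
            return
        for u in admissible[s]:
            s_next = step(s, u)
            for tail, c in sequences(s_next, h - 1):
                yield (u,) + tail, stage_cost(s, u) + c

    best_cost, best_first = float("inf"), None
    for seq, c in sequences(s0, horizon):
        if c < best_cost:
            best_cost, best_first = c, seq[0]
    return best_first  # apply only the first control, then re-plan at the next step
```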

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same set-construction idea could be tested on hybrid or continuous-time stochastic models to see whether the policy collection remains tractable.
  • Reducing the size of the policy set while keeping the probability guarantee might further improve online computation speed.
  • Comparing the method against robust-MDP or chance-constraint approaches on the same benchmarks would quantify how much extra flexibility the set provides.

Load-bearing premise

The online optimizer is restricted to choosing only policies inside the abstracted set and never selects actions outside it.
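Read operationally, the premise amounts to a filter between the optimizer and the plant. A minimal sketch, assuming a finite list of admissible controls per abstract state and a Euclidean projection, neither of which is taken from the paper:

```python
# Illustrative safety-filter-style enforcement of the premise: if the online
# optimizer proposes a control outside the admissible set for the current
# abstract state, replace it with the closest admissible control. The finite
# candidate lists and the Euclidean projection are assumptions of this sketch.

from typing import Dict, Hashable, Sequence
import numpy as np

def enforce_in_set(
    proposed: np.ndarray,
    abstract_state: Hashable,
    admissible: Dict[Hashable, Sequence[np.ndarray]],
) -> np.ndarray:
    candidates = admissible[abstract_state]
    for u in candidates:
        if np.allclose(u, proposed):
            return proposed  # already admissible
    # project onto the nearest admissible control (Euclidean distance)
    return min(candidates, key=lambda u: float(np.linalg.norm(u - proposed)))
```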

What would settle it

Run the closed-loop system with the online MPC and record whether the observed frequency of specification satisfaction falls below the minimum probability promised for the entire policy set.
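A minimal Monte Carlo harness for that check; the closed-loop simulator, specification monitor, and certified bound are all user-supplied, and nothing here comes from the paper's code.

```python
# Illustrative empirical check: run the closed loop with the online MPC many
# times and compare the observed satisfaction frequency against the certified
# bound. `simulate_episode` must return True iff the specification was met.

import math
from typing import Callable

def empirical_satisfaction(
    simulate_episode: Callable[[], bool],  # one closed-loop rollout
    p_min: float,                          # certified minimum satisfaction probability
    n_runs: int = 10_000,
) -> float:
    hits = sum(simulate_episode() for _ in range(n_runs))
    p_hat = hits / n_runs
    # crude 95% normal-approximation half-width, just to flag clear violations
    half_width = 1.96 * math.sqrt(max(p_hat * (1.0 - p_hat), 1e-12) / n_runs)
    print(f"empirical satisfaction {p_hat:.4f} ± {half_width:.4f} vs certified bound {p_min:.4f}")
    if p_hat + half_width < p_min:
        print("WARNING: observed frequency is significantly below the certified bound")
    return p_hat
```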

Figures

Figures reproduced from arXiv: 2604.01372 by Alessandro Abate, Alessandro Riccardi, Bart De Schutter, Luca Laurenti, Thom Badings.

Figure 1. Our framework leverages formal abstractions to com…
Figure 2. Our abstraction is based on a partition of the state…
Figure 3. Change in the satisfaction probability of the con…
Figure 7. Simulated state trajectories for the double integrator, …
Figure 8. Mountain car simulations with different ϵ for the baseline (policy; red) and our MPC controller (blue). (a) ϵ = [0.1; 0.2] (b) ϵ = [0.15; 0.3]
Figure 9. Dubins simulations with different ϵ for the baseline without MPC (policy; red) and our MPC controller (blue). ϵ = 0.5, which reduces the cost by 11.6% at a loss in λ of around 10%. Furthermore, the cost reduction is slightly higher for N = 5 than N = 3, showing that a longer horizon indeed allows for better online optimization, at the cost of a higher time (T_MPC,step) to solve each MPC instance.
read the original abstract

The deployment of autonomous systems in safety-critical environments requires control policies that guarantee satisfaction of complex control specifications. These systems are commonly modeled as nonlinear discrete-time stochastic systems. A popular approach to computing a policy that provably satisfies a complex control specification is to construct a finite-state abstraction, often represented as a Markov decision process (MDP) with intervals of transition probabilities, i.e., an interval MDP (IMDP). However, existing abstraction techniques compute a single policy, thus leaving no room for online cost or performance optimization, e.g., of energy consumption. To overcome this limitation, we propose a novel IMDP abstraction technique that yields a set of policies, each of which satisfies the control specification with a certain minimum probability. We can thus use any online control algorithm to search through this set of verified policies while retaining the guaranteed satisfaction probability of the entire policy set. In particular, we employ model predictive control (MPC) to minimize a desired cost function that is independent of the control specification considered in the abstraction. Our experiments demonstrate that our approach yields better control performance than state-of-the-art single-policy abstraction techniques, with a small degradation of the guarantees.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper proposes a novel IMDP abstraction for nonlinear discrete-time stochastic systems that computes a set of policies, each satisfying a temporal logic specification with a minimum probability. This enables online algorithms such as MPC to optimize a performance cost (independent of the specification) by searching within the verified set while preserving the overall satisfaction probability guarantee. Experiments report improved performance over single-policy abstraction baselines with only small degradation in guarantees.

Significance. If the central claim on probability preservation holds, the result is significant: it extends existing IMDP techniques to support online performance optimization without sacrificing formal probabilistic guarantees, addressing a practical limitation of single-policy abstractions in safety-critical stochastic control.

major comments (2)
  1. [Abstract and online optimization description] The central claim (abstract) that any online control algorithm can search the verified policy set while retaining the minimum satisfaction probability requires an explicit mechanism to restrict the optimizer (e.g., MPC) to policies inside the set. Standard MPC optimizes a cost over the original nonlinear dynamics without such constraints; the manuscript provides no indication that the policy set is encoded as enforceable constraints (linear, convex, or otherwise) that survive insertion into the online quadratic program (one hypothetical encoding is sketched after this list).
  2. [IMDP abstraction and policy set construction] The abstraction must provably ensure that every policy selectable by the online optimizer induces a trajectory whose satisfaction probability is at least the certified minimum for the entire set. No details are given on how the set construction (IMDP abstraction) encodes admissible policies in a form that prevents selection of out-of-set controls, which is load-bearing for the guarantee.
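One hypothetical way such an encoding could stay compatible with a quadratic program, sketched purely for illustration: if the admissible controls at an abstract state form a union of axis-aligned boxes, a quadratic stage cost is minimized over each box in closed form by clipping, and the cheapest box wins. This is not the paper's formulation.

```python
# Hypothetical constraint handling for a quadratic-cost MPC step when the
# admissible controls at an abstract state form a union of axis-aligned boxes:
# the quadratic ||u - u_ref||^2 is minimized over each box by clipping u_ref
# into it, and the cheapest box is kept. A sketch of one compatible encoding,
# not the paper's method.

from typing import List, Optional, Tuple
import numpy as np

Box = Tuple[np.ndarray, np.ndarray]  # (lower corner, upper corner)

def quadratic_step_over_union(u_ref: np.ndarray, boxes: List[Box]) -> Optional[np.ndarray]:
    best_u, best_cost = None, float("inf")
    for lo, hi in boxes:
        u = np.clip(u_ref, lo, hi)              # box-constrained minimizer of ||u - u_ref||^2
        cost = float(np.sum((u - u_ref) ** 2))
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u
```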

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important aspects of how the verified policy set interfaces with online optimizers, and we address each point below with clarifications and commitments to revision.

read point-by-point responses
  1. Referee: [Abstract and online optimization description] The central claim (abstract) that any online control algorithm can search the verified policy set while retaining the minimum satisfaction probability requires an explicit mechanism to restrict the optimizer (e.g., MPC) to policies inside the set. Standard MPC optimizes a cost over the original nonlinear dynamics without such constraints; the manuscript provides no indication that the policy set is encoded as enforceable constraints (linear, convex, or otherwise) that survive insertion into the online quadratic program.

    Authors: We agree that an explicit restriction mechanism is required for the guarantee to hold in practice. The manuscript constructs the policy set as all control mappings that respect the IMDP interval transitions and achieve the minimum satisfaction probability under the abstraction. For online use, this set is intended to be enforced by restricting the MPC decision variables to the admissible controls at each abstract state (e.g., via state-dependent lookup or parametric constraints derived from the abstraction). While the current text describes the set construction and states that MPC searches within it, we acknowledge insufficient detail on the encoding. We will revise by adding a dedicated subsection on constraint formulation, including how the admissible set is represented to remain compatible with the quadratic program (linear or convex where possible) and a brief example of insertion into the MPC optimization. revision: yes

  2. Referee: [IMDP abstraction and policy set construction] The abstraction must provably ensure that every policy selectable by the online optimizer induces a trajectory whose satisfaction probability is at least the certified minimum for the entire set. No details are given on how the set construction (IMDP abstraction) encodes admissible policies in a form that prevents selection of out-of-set controls, which is load-bearing for the guarantee.

    Authors: The IMDP abstraction computes the set by solving a reachability problem over interval probabilities, yielding all policies whose induced abstract trajectories satisfy the specification with probability at least the certified minimum; any concrete policy chosen from this set therefore inherits the guarantee by construction of the abstraction. The admissible controls are encoded as the control choices at each abstract state that keep the interval transition probabilities within the satisfying bounds. To prevent out-of-set selections, the online optimizer is restricted to these controls. We will revise the manuscript to include an expanded description of the set representation (e.g., as a state-to-control-set mapping) together with a short argument showing that every selectable policy remains inside the certified set, thereby closing the gap noted by the referee. revision: yes
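For orientation, the construction being described resembles textbook robust value iteration on an IMDP: compute the pessimistic satisfaction probability at every abstract state and keep every action whose lower-bound value clears the threshold. The sketch below is generic, not the paper's algorithm, and all names are illustrative.

```python
# Schematic robust (pessimistic) value iteration on an IMDP for a reachability
# specification, followed by extraction of every action whose lower-bound value
# clears a threshold. Generic textbook-style construction, not the paper's
# algorithm; assumes valid intervals (lower bounds sum to at most one, upper
# bounds to at least one), distinct successors per transition list, and that
# every successor appears as a key of `intervals`.

from typing import Dict, List, Set, Tuple

State = int
Action = int
# intervals[s][a] = list of (successor, p_lower, p_upper)
Intervals = Dict[State, Dict[Action, List[Tuple[State, float, float]]]]

def worst_case_expectation(trans: List[Tuple[State, float, float]],
                           V: Dict[State, float]) -> float:
    # the adversary pushes probability mass toward low-value successors,
    # respecting the interval bounds and summing to one
    probs = {s: lo for s, lo, _ in trans}
    budget = 1.0 - sum(probs.values())
    for s, lo, hi in sorted(trans, key=lambda t: V[t[0]]):
        add = min(hi - lo, budget)
        probs[s] += add
        budget -= add
    return sum(p * V[s] for s, p in probs.items())

def robust_reachability(intervals: Intervals, goal: Set[State],
                        sweeps: int = 200) -> Dict[State, Dict[Action, float]]:
    V = {s: (1.0 if s in goal else 0.0) for s in intervals}
    Q: Dict[State, Dict[Action, float]] = {}
    for _ in range(sweeps):  # fixed number of sweeps as a simplification
        for s in intervals:
            if s in goal:
                continue
            Q[s] = {a: worst_case_expectation(tr, V) for a, tr in intervals[s].items()}
            V[s] = max(Q[s].values()) if Q[s] else 0.0
    return Q

def admissible_actions(Q: Dict[State, Dict[Action, float]],
                       threshold: float) -> Dict[State, Set[Action]]:
    # every action whose pessimistic value clears the threshold stays in the set
    return {s: {a for a, v in qa.items() if v >= threshold} for s, qa in Q.items()}
```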

Circularity Check

0 steps flagged

Minor self-citation present but derivation remains independent and non-circular

full rationale

The paper extends standard IMDP abstraction methods to produce a set of policies rather than a single policy, allowing online optimization (e.g., MPC) while preserving a minimum satisfaction probability. No derivation step reduces to its own inputs by construction, relies on fitted parameters renamed as predictions, or rests on self-referential definitions. Existing IMDP techniques are cited as background; the novelty lies in the set-based output and its integration with online control, which is presented as a direct construction without load-bearing self-citation chains or uniqueness theorems imported from the authors' prior work. The central claim is self-contained with respect to the abstraction procedure described and does not collapse to renaming known results or smuggling in ansatzes via citation. This yields a low circularity score consistent with a normal extension of prior methods.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard assumptions in stochastic abstraction and temporal logic control; no new free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)
  • domain assumption: Nonlinear discrete-time stochastic systems can be soundly abstracted to interval MDPs preserving probabilistic satisfaction of temporal logic specifications
    This is the foundational premise of the abstraction technique described in the abstract.

pith-pipeline@v0.9.0 · 5517 in / 1190 out tokens · 30864 ms · 2026-05-13T21:44:29.578733+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

64 extracted references · 64 canonical work pages · 1 internal anchor
