pith. sign in

A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it
abstract

Reinforcement learning (RL) has proven to be particularly effective in solving complex decision-making problems for a wide range of applications. Safe reinforcement learning refers to a class of constrained problems where the constraint violations lead to partial or complete system failure. The goal of this review is to provide an overview of safe RL techniques using Lyapunov and barrier functions to guarantee this notion of safety (stability of the system in terms of a computed policy and constraint satisfaction during training and deployment). Three concrete takeaways emerge from our analysis: (i) the field has shifted decisively from model-based to model-free formulations since 2017, with combined CLF-CBF approaches becoming the most active sub-area post-2022; (ii) per-class open problems are now well-defined, certificate validity under function approximation and distribution shift for Lyapunov methods, feasibility and deadlock under hard CBF-QP shielding for barrier methods, and joint CLF--CBF feasibility under model uncertainty for combined methods; and (iii) deployment to high-dimensional and partially observable settings remains the dominant scalability barrier across all three classes. The different approaches employed are discussed in detail along with their shortcomings and benefits to provide critique and possible future research directions. The review demonstrates promising scope for providing safety guarantees for complex dynamical systems with operational constraints using model-based and model-free RL.

fields

cs.RO 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

citing papers explorer

Showing 1 of 1 citing paper.

  • Robust Koopman Control Barrier Filters for Safe Actor-Critic Reinforcement Learning cs.RO · 2026-05-26 · unverdicted · none · ref 2 · internal anchor

    Robust Koopman-CBF SAC learns Koopman predictors from data, tightens lifted CBF constraints with a data-estimated residual margin, and applies a QP safety filter inside SAC, reporting zero constraint violations on CartPole while matching unconstrained returns.