Safe multi-agent reinforcement learning via shielding
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role and citation-polarity summary
verdicts: UNVERDICTED 2
roles: background 1
polarities: background 1
representative citing papers
A literature review of safe RL using Lyapunov and barrier functions that identifies a shift to model-free methods since 2017, well-defined open problems per approach class, and high-dimensional scalability as the main barrier.
citing papers explorer
- Generating Local Shields for Decentralised Partially Observable Markov Decision Processes
A process algebra with guarded choice and recursion is compiled first to global Mealy machines and then projected to local ones, which filter safe joint actions for each agent in Dec-POMDPs using belief-style state subsets.
- A Review On Safe Reinforcement Learning Using Lyapunov and Barrier Functions
A literature review of safe RL using Lyapunov and barrier functions that identifies a shift to model-free methods since 2017, well-defined open problems per approach class, and high-dimensional scalability as the main barrier.