SAMBA: Safe Model-Based & Active Reinforcement Learning

Aivar Sootla; Alexander I. Cowen-Rivers; Daniel Palenicek; Haitham Ammar; Jun Wang; Mohammed Abdullah; Vincent Moens

arxiv: 2006.09436 · v1 · pith:JG5JEOLJnew · submitted 2020-06-12 · cs.LG · cs.AI · cs.RO · stat.ML

Alexander I. Cowen-Rivers, Daniel Palenicek, Vincent Moens, Mohammed Abdullah, Aivar Sootla, Jun Wang, Haitham Ammar

classification cs.LG · cs.AI · cs.RO · stat.ML

keywords active · safe · constraints · framework · learning · metrics · novel · reinforcement

In this paper, we propose SAMBA, a novel framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics. Our method builds upon PILCO to enable active exploration using novel (semi-)metrics for out-of-sample Gaussian process evaluation, optimised through a multi-objective problem that supports conditional-value-at-risk constraints. We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low- and high-dimensional state representations. Our results show orders-of-magnitude reductions in samples and violations compared to state-of-the-art methods. Lastly, we provide intuition as to the effectiveness of the framework through a detailed analysis of our active metrics and safety constraints.
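
As a rough illustration of what a conditional-value-at-risk (CVaR) constraint measures, the sketch below estimates CVaR from sampled trajectory costs and checks it against a bound. This is not the paper's implementation: the function names, the sample-based estimator, and the threshold are all assumptions made for this example.

```python
import numpy as np


def empirical_cvar(costs, alpha=0.9):
    """Return (VaR, CVaR) of sampled costs at confidence level alpha.

    Hypothetical helper: VaR is the alpha-quantile of the costs and
    CVaR is the mean of the costs at or above that quantile.
    """
    costs = np.asarray(costs, dtype=float)
    var = np.quantile(costs, alpha)   # value-at-risk (alpha-quantile)
    tail = costs[costs >= var]        # worst-case tail of the samples
    return float(var), float(tail.mean())


def satisfies_cvar_constraint(costs, alpha=0.9, threshold=1.0):
    """Check the (assumed) constraint CVaR_alpha(cost) <= threshold."""
    _, cvar = empirical_cvar(costs, alpha)
    return bool(cvar <= threshold)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sampled_costs = rng.exponential(scale=0.2, size=1000)  # synthetic costs
    var, cvar = empirical_cvar(sampled_costs, alpha=0.9)
    print(f"VaR={var:.3f}, CVaR={cvar:.3f}")
    print("constraint satisfied:", satisfies_cvar_constraint(sampled_costs))
```

Because CVaR averages over the tail rather than reporting a single quantile, constraining it limits how bad the worst outcomes can be on average, which is why it is a common choice for safety constraints.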
