Data-Efficient Reinforcement Learning with Probabilistic Model Predictive Control

2017 · cs.SY · arXiv 1706.06491

3 Pith papers cite this work. Polarity classification is still indexing.

abstract

Trial-and-error based reinforcement learning (RL) has seen rapid advancements in recent times, especially with the advent of deep neural networks. However, the majority of autonomous RL algorithms require a large number of interactions with the environment. A large number of interactions may be impractical in many real-world applications, such as robotics, and many practical systems have to obey limitations in the form of state space or control constraints. To reduce the number of system interactions while simultaneously handling constraints, we propose a model-based RL framework based on probabilistic Model Predictive Control (MPC). In particular, we propose to learn a probabilistic transition model using Gaussian Processes (GPs) to incorporate model uncertainty into long-term predictions, thereby reducing the impact of model errors. We then use MPC to find a control sequence that minimises the expected long-term cost. We provide theoretical guarantees for first-order optimality in the GP-based transition models with deterministic approximate inference for long-term planning. We demonstrate that our approach not only achieves state-of-the-art data efficiency, but is also a principled way for RL in constrained environments.

representative citing papers

Benchmarking Model-Based Reinforcement Learning

cs.LG · 2019-07-03 · accept · novelty 7.0

Introduces a benchmark suite of over 18 MBRL environments, evaluates multiple algorithms under consistent settings, and identifies three core challenges: dynamics bottleneck, planning horizon dilemma, and early-termination dilemma.

Learning-based Model Predictive Control for Safe Exploration and Reinforcement Learning

eess.SY · 2019-06-27 · unverdicted · novelty 6.0

Develops a learning-based MPC algorithm that uses confidence intervals on trajectories and terminal set constraints to guarantee safety throughout RL exploration and training.

Uncertainty-aware Model-based Policy Optimization

cs.LG · 2019-06-25 · unverdicted · novelty 5.0

Introduces a framework that learns an uncertainty-aware dynamics model and optimizes the policy via automatic differentiation through the model, reporting competitive asymptotic performance with significantly lower sample complexity than baselines on continuous control benchmarks.

citing papers explorer

Showing 3 of 3 citing papers.

Benchmarking Model-Based Reinforcement Learning · cs.LG · 2019-07-03 · accept · none · ref 25 · internal anchor
Learning-based Model Predictive Control for Safe Exploration and Reinforcement Learning · eess.SY · 2019-06-27 · unverdicted · none · ref 25 · internal anchor
Uncertainty-aware Model-based Policy Optimization · cs.LG · 2019-06-25 · unverdicted · none · ref 12 · internal anchor
