Model-Based Reinforcement Learning via Meta-Policy Optimization

2018 · cs.LG · arXiv 1809.05214

3 Pith papers cite this work. Polarity classification is still indexing.


abstract

Model-based reinforcement learning approaches carry the promise of being data efficient. However, due to challenges in learning dynamics models that sufficiently match the real-world dynamics, they struggle to achieve the same asymptotic performance as model-free methods. We propose Model-Based Meta-Policy-Optimization (MB-MPO), an approach that forgoes the strong reliance on accurate learned dynamics models. Using an ensemble of learned dynamics models, MB-MPO meta-learns a policy that can quickly adapt to any model in the ensemble with one policy gradient step. This steers the meta-policy towards internalizing consistent dynamics predictions among the ensemble while shifting the burden of behaving optimally w.r.t. the model discrepancies towards the adaptation step. Our experiments show that MB-MPO is more robust to model imperfections than previous model-based approaches. Finally, we demonstrate that our approach is able to match the asymptotic performance of model-free methods while requiring significantly less experience.
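
The adapt-then-meta-update structure described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: each "model" in the ensemble is reduced to a hypothetical scalar target, a quadratic surrogate stands in for the return estimated from imagined rollouts, and its exact gradient stands in for a policy-gradient estimate. The names `surrogate_return`, `adapt`, and `meta_train`, and the step sizes `alpha` and `beta`, are assumptions made for illustration only.

```python
import numpy as np


def surrogate_return(theta, target):
    # Toy stand-in for the return of policy parameter `theta` under one
    # ensemble member, whose preferred parameter is `target` (hypothetical).
    return -0.5 * (theta - target) ** 2


def surrogate_grad(theta, target):
    # Exact gradient of surrogate_return with respect to theta.
    return target - theta


def adapt(theta, target, alpha):
    # Inner step: one gradient-ascent step toward a single model's optimum.
    return theta + alpha * surrogate_grad(theta, target)


def meta_train(targets, alpha=0.1, beta=0.5, steps=200, theta0=0.0):
    # Outer loop: improve the post-adaptation return averaged over the ensemble.
    theta = theta0
    for _ in range(steps):
        adapted = adapt(theta, targets, alpha)  # one adapted parameter per model
        # Chain rule through the inner step: d(adapted)/d(theta) = 1 - alpha.
        meta_grad = np.mean(surrogate_grad(adapted, targets) * (1.0 - alpha))
        theta = theta + beta * meta_grad
    return theta


# Three ensemble members that disagree slightly about the best parameter.
targets = np.array([0.8, 1.0, 1.3])
theta = meta_train(targets)
adapted = adapt(theta, targets, alpha=0.1)
```

In this toy setting the meta-parameter settles at the point the ensemble members agree on on average, and each adapted parameter moves from there toward its own model's optimum, which mirrors the division of labor the abstract describes between the meta-policy and the adaptation step.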

representative citing papers

Benchmarking Model-Based Reinforcement Learning

cs.LG · 2019-07-03 · accept · novelty 7.0

Introduces a benchmark suite of over 18 MBRL environments, evaluates multiple algorithms under consistent settings, and identifies three core challenges: dynamics bottleneck, planning horizon dilemma, and early-termination dilemma.

Uncertainty-aware Model-based Policy Optimization

cs.LG · 2019-06-25 · unverdicted · novelty 5.0

Introduces a framework that learns an uncertainty-aware dynamics model and optimizes the policy via automatic differentiation through the model, reporting competitive asymptotic performance with significantly lower sample complexity than baselines on continuous control benchmarks.

Calibrated Model-Based Deep Reinforcement Learning

cs.LG · 2019-06-19 · unverdicted · novelty 5.0

Augmenting model-based RL agents with calibrated predictive uncertainties improves planning, sample efficiency, and exploration on continuous control tasks.

citing papers explorer

Showing 3 of 3 citing papers.

Benchmarking Model-Based Reinforcement Learning cs.LG · 2019-07-03 · accept · none · ref 8 · internal anchor
Uncertainty-aware Model-based Policy Optimization cs.LG · 2019-06-25 · unverdicted · none · ref 5 · internal anchor
Calibrated Model-Based Deep Reinforcement Learning cs.LG · 2019-06-19 · unverdicted · none · ref 7 · internal anchor
