Meta Reinforcement Learning with Latent Variable Gaussian Processes

Katja Hofmann; Marc Peter Deisenroth; Steindór Sæmundsson

arXiv: 1803.07551 · v2 · pith:SSJPV4CFnew · submitted 2018-03-20 · stat.ML · cs.LG

keywords: tasks, learning, data, meta, latent, model, reinforcement, relationship

Abstract

Learning from small data sets is critical in many practical applications where data collection is time-consuming or expensive, e.g., robotics, animal experiments, or drug design. Meta-learning is one way to increase the data efficiency of learning algorithms by generalizing learned concepts from a set of training tasks to unseen, but related, tasks. Often, this relationship between tasks is hard-coded or relies in some other way on human expertise. In this paper, we frame meta-learning as a hierarchical latent variable model and infer the relationship between tasks automatically from data. We apply our framework in a model-based reinforcement learning setting and show that our meta-learning model effectively generalizes to novel tasks by identifying how new tasks relate to prior ones from minimal data. This results in up to a 60% reduction in the average interaction time needed to solve tasks compared to strong baselines.
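The idea of inferring how a new task relates to earlier ones can be illustrated with a small NumPy sketch. It is not the paper's implementation, and the abstract does not give the model details. The sketch assumes one possible construction: a single Gaussian process shared across tasks, with a scalar per-task latent appended to the GP input. Two simplifications are assumptions made here for brevity: the latents of the training tasks are treated as known and fixed, and the latent of the new task is found by a grid search over the GP predictive log-likelihood rather than by approximate Bayesian inference. The toy function `toy_task`, the helper names, and all hyperparameter values are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel between the rows of A and B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def augment(x, h):
    """Append the scalar task latent h to every 1-D input in x."""
    return np.column_stack([x, np.full(len(x), h)])

def log_predictive(X_train, y_train, X_new, y_new, noise=1e-2):
    """Log density of y_new under the GP posterior predictive."""
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf_kernel(X_new, X_train)
    Kss = rbf_kernel(X_new, X_new) + noise * np.eye(len(X_new))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))  # K^{-1} y
    mean = Ks @ alpha
    V = np.linalg.solve(L, Ks.T)
    cov = Kss - V.T @ V  # predictive covariance, observation noise included
    Lc = np.linalg.cholesky(cov)
    r = np.linalg.solve(Lc, y_new - mean)
    return (-0.5 * r @ r
            - np.sum(np.log(np.diag(Lc)))
            - 0.5 * len(y_new) * np.log(2.0 * np.pi))

def toy_task(x, h):
    """Hypothetical task family: tasks differ only by a phase shift h."""
    return np.sin(x + h)

# Training tasks. ASSUMPTION: their latents are known and fixed.
train_latents = [-1.0, 0.0, 1.0]
x_train = np.linspace(-3.0, 3.0, 20)
X_train = np.vstack([augment(x_train, h) for h in train_latents])
y_train = np.concatenate([toy_task(x_train, h) for h in train_latents])

# New task: only three observations are available.
true_h = 0.5
x_new = np.array([-2.0, 0.0, 2.0])
y_new = toy_task(x_new, true_h)

# Score each candidate latent by how well the shared GP explains the new data.
grid = np.linspace(-1.5, 1.5, 31)
scores = [log_predictive(X_train, y_train, augment(x_new, h), y_new) for h in grid]
h_hat = float(grid[int(np.argmax(scores))])
print(f"inferred latent: {h_hat:.2f} (true value {true_h})")
```

Once a latent has been picked, predictions for the new task come from the same GP evaluated at `augment(x, h_hat)`, so the data from all training tasks is reused. A fuller treatment would keep a distribution over the latent instead of a point estimate.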

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Environment Probing Interaction Policies
cs.RO 2019-07 unverdicted novelty 6.0

EPI policies use a transition-predictability reward to probe environments and condition task policies, outperforming standard generalization methods on novel test environments.