Learning a Multi-Modal Policy via Imitating Demonstrations with Mixed Behaviors

Fang-I Hsiao; Jui-Hsuan Kuo; Min Sun

arxiv: 1903.10304 · v1 · pith:GXJP3PKHnew · submitted 2019-03-25 · 💻 cs.LG · stat.ML




keywords: policy, demonstrations, behavior, behaviors, latent, method, multi-modal, approach



We propose a novel approach to train a multi-modal policy from mixed demonstrations without their behavior labels. We develop a method to discover the latent factors of variation in the demonstrations. Specifically, our method is based on the variational autoencoder with a categorical latent variable. The encoder infers discrete latent factors corresponding to different behaviors from demonstrations. The decoder, as a policy, performs the behaviors accordingly. Once learned, the policy is able to reproduce a specific behavior by simply conditioning on a categorical vector. We evaluate our method on three different tasks, including a challenging task with high-dimensional visual inputs. Experimental results show that our approach outperforms various baseline methods and is competitive with a multi-modal policy trained with ground-truth behavior labels.
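The conditioning step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. The linear `ToyPolicy`, its random weights, the `infer_behavior` helper, and all dimensions are hypothetical stand-ins chosen for illustration. The sketch only shows how a categorical (one-hot) vector appended to the state can select among behaviors in a single policy, and how encoder logits could be turned into a discrete behavior index.

```python
import math
import random


def one_hot(index, num_behaviors):
    """Return a categorical (one-hot) vector selecting one behavior."""
    return [1.0 if i == index else 0.0 for i in range(num_behaviors)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def infer_behavior(logits):
    """Hypothetical encoder head: pick the most probable behavior index."""
    probs = softmax(logits)
    return max(range(len(probs)), key=lambda i: probs[i])


class ToyPolicy:
    """Hypothetical decoder: a linear map from [state, behavior code] to an action.

    Assumption: a real policy would be a trained neural network; the random
    weights here only demonstrate the conditioning interface.
    """

    def __init__(self, state_dim, num_behaviors, action_dim, seed=0):
        rng = random.Random(seed)
        in_dim = state_dim + num_behaviors
        self.weights = [
            [rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
            for _ in range(action_dim)
        ]

    def act(self, state, code):
        # Concatenate the state with the categorical code, then apply the map.
        inputs = list(state) + list(code)
        return [sum(w * x for w, x in zip(row, inputs)) for row in self.weights]


policy = ToyPolicy(state_dim=3, num_behaviors=2, action_dim=2)
state = [0.5, -0.2, 0.1]

# Same state, different categorical vectors: the policy produces different actions.
action_a = policy.act(state, one_hot(0, 2))
action_b = policy.act(state, one_hot(1, 2))
```

In this sketch, switching behavior at test time only requires changing the one-hot vector passed to `act`; the state and the policy parameters stay the same.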


