On the Global Convergence of Imitation Learning: A Case for Linear Quadratic Regulator

Mingyi Hong; Qi Cai; Yongxin Chen; Zhaoran Wang

arxiv: 1901.03674 · v1 · pith:EIV4A6S3new · submitted 2019-01-11 · 💻 cs.LG · cs.AI · math.OC · stat.ML


classification 💻 cs.LG cs.AI math.OC stat.ML

keywords learning, convergence, imitation, adversarial, alternating, generative, global, linear


We study the global convergence of generative adversarial imitation learning for linear quadratic regulators, which is posed as minimax optimization. To address the challenges arising from non-convex-concave geometry, we analyze the alternating gradient algorithm and establish its Q-linear rate of convergence to a unique saddle point, which simultaneously recovers the globally optimal policy and reward function. We hope our results may serve as a small step towards understanding and taming the instability in imitation learning as well as in more general non-convex-concave alternating minimax optimization that arises from reinforcement learning and generative adversarial learning.
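To make the idea of an alternating gradient algorithm concrete, here is a minimal sketch on a toy problem. It is not the paper's linear quadratic regulator setting or its algorithm: the objective f(x, y) = 0.5*x^2 + x*y - 0.5*y^2, the step size `eta`, and the function name `alternating_gda` are all assumptions chosen for illustration. This toy objective is convex-concave with a unique saddle point at (0, 0), so it only shows the mechanics of alternating updates and what a constant-factor (Q-linear) contraction looks like.

```python
def alternating_gda(x0, y0, eta=0.1, steps=200):
    """Alternating gradient descent-ascent on the toy objective
    f(x, y) = 0.5*x**2 + x*y - 0.5*y**2 (illustrative assumption,
    not the objective studied in the paper)."""
    x, y = x0, y0
    history = []  # distance to the saddle point (0, 0) after each iteration
    for _ in range(steps):
        # Descent step for the minimizing player: df/dx = x + y
        x = x - eta * (x + y)
        # Ascent step for the maximizing player, using the updated x:
        # df/dy = x - y
        y = y + eta * (x - y)
        history.append((x * x + y * y) ** 0.5)
    return x, y, history


x, y, dist = alternating_gda(1.0, -1.0)
print(x, y, dist[-1])
```

With these assumed settings the distance to the saddle point shrinks by a constant factor at every iteration, which is the behavior a Q-linear rate describes. The key design choice is that the second player's update uses the first player's already-updated iterate, which is what distinguishes alternating from simultaneous updates.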
