Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning

Balazs Szorenyi; Gal Dalal; Gugan Thoppe; Shie Mannor

arxiv: 1703.05376 · v5 · pith:7MVT5V3Unew · submitted 2017-03-15 · 💻 cs.AI

Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning

Gal Dalal , Balazs Szorenyi , Gugan Thoppe , Shie Mannor This is my paper

classification 💻 cs.AI

keywords two-timescaleresultalgorithmsanalysisapproximationboundconvergencefinite

0 comments

read the original abstract

Two-timescale Stochastic Approximation (SA) algorithms are widely used in Reinforcement Learning (RL). Their iterates have two parts that are updated using distinct stepsizes. In this work, we develop a novel recipe for their finite sample analysis. Using this, we provide a concentration bound, which is the first such result for a two-timescale SA. The type of bound we obtain is known as `lock-in probability'. We also introduce a new projection scheme, in which the time between successive projections increases exponentially. This scheme allows one to elegantly transform a lock-in probability into a convergence rate result for projected two-timescale SA. From this latter result, we then extract key insights on stepsize selection. As an application, we finally obtain convergence rates for the projected two-timescale RL algorithms GTD(0), GTD2, and TDC.

This paper has not been read by Pith yet.

Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning

discussion (0)