An Analysis of Categorical Distributional Reinforcement Learning

Marc G. Bellemare; Mark Rowland; Rémi Munos; Will Dabney; Yee Whye Teh

arXiv: 1802.08163 · v1 · submitted 2018-02-22 · stat.ML

keywords: distributional, CDRL, learning, reinforcement, algorithms, categorical, recently, algorithm

Distributional approaches to value-based reinforcement learning model the entire distribution of returns, rather than just their expected values, and have recently been shown to yield state-of-the-art empirical performance. This was demonstrated by the recently proposed C51 algorithm, based on categorical distributional reinforcement learning (CDRL) [Bellemare et al., 2017]. However, the theoretical properties of CDRL algorithms are not yet well understood. In this paper, we introduce a framework to analyse CDRL algorithms, establish the importance of the projected distributional Bellman operator in distributional RL, draw fundamental connections between CDRL and the Cramér distance, and give a proof of convergence for sample-based categorical distributional reinforcement learning algorithms.
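
To make the two central objects of the abstract concrete, the following is a minimal sketch, not code from the paper. It assumes the usual CDRL setting of a fixed, evenly spaced support of atoms, and shows (1) a projected Bellman target, where a distribution is shifted and scaled by a reward and discount and then projected back onto the support by splitting each point's mass between its two neighbouring atoms, and (2) the Cramér distance between two categorical distributions on that support, taken here as the square root of the integrated squared difference of their CDFs. All function and parameter names (`make_support`, `project_onto_support`, `bellman_target`, `cramer_distance`, `v_min`, `v_max`) are hypothetical names chosen for illustration.

```python
import math


def make_support(v_min, v_max, num_atoms):
    """Evenly spaced atoms z_1 < ... < z_K (assumed fixed support)."""
    delta = (v_max - v_min) / (num_atoms - 1)
    return [v_min + i * delta for i in range(num_atoms)]


def project_onto_support(points, probs, support):
    """Project a discrete distribution onto the fixed support.

    Each point is clipped to [z_1, z_K]; its mass is then split linearly
    between the two nearest atoms, which preserves the mean of any mass
    that was not clipped.
    """
    k = len(support)
    v_min, v_max = support[0], support[-1]
    delta = (v_max - v_min) / (k - 1)
    out = [0.0] * k
    for x, p in zip(points, probs):
        x = min(max(x, v_min), v_max)
        # Fractional atom index, clamped to guard against float rounding.
        b = min(max((x - v_min) / delta, 0.0), float(k - 1))
        lo, hi = math.floor(b), math.ceil(b)
        if lo == hi:
            out[lo] += p
        else:
            out[lo] += p * (hi - b)
            out[hi] += p * (b - lo)
    return out


def bellman_target(probs, support, reward, gamma):
    """Shift and scale the atoms by (reward, gamma), then project back."""
    shifted = [reward + gamma * z for z in support]
    return project_onto_support(shifted, probs, support)


def cramer_distance(p, q, support):
    """Cramer distance between two categorical distributions on `support`.

    Both CDFs are step functions that are constant between adjacent atoms,
    so the integral reduces to a finite sum.
    """
    delta = (support[-1] - support[0]) / (len(support) - 1)
    cdf_p = cdf_q = 0.0
    total = 0.0
    for i in range(len(support) - 1):
        cdf_p += p[i]
        cdf_q += q[i]
        total += (cdf_p - cdf_q) ** 2 * delta
    return math.sqrt(total)
```

As a usage example under the same assumptions, `project_onto_support([1.5], [1.0], make_support(0.0, 4.0, 5))` splits a point mass at 1.5 equally between the atoms at 1 and 2, and `bellman_target` applied to any probability vector returns another probability vector on the same support.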
