The association problem in wireless networks: a Policy Gradient Reinforcement Learning approach

Ilham El Bouloumi; Richard Combes; Stephane Senecal; Zwi Altman

arxiv: 1306.2554 · v1 · pith:WE6MVYPOnew · submitted 2013-06-11 · cs.NI · cs.IT · cs.LG · math.IT

keywords learning · pgrl · algorithm · association · policy · robust · gradient · opposed

The purpose of this paper is to develop a self-optimized association algorithm based on PGRL (Policy Gradient Reinforcement Learning) that is scalable, stable and robust. Here, robust means that performance degradation during the learning phase is either ruled out or kept within predefined thresholds. The algorithm is model-free (as opposed to Value Iteration) and robust (as opposed to Q-Learning). The association problem is modeled as a Markov Decision Process (MDP). The policy space is parameterized, and the parameterized family of policies is then used as expert knowledge for the PGRL. The PGRL converges towards a local optimum, and the average cost decreases monotonically during the learning process. These properties make the solution a good candidate for practical implementation. Furthermore, the robustness property allows the PGRL algorithm to be used in an "always-on" learning mode.
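To make the idea of a parameterized policy trained by policy gradient concrete, here is a minimal sketch on a toy two-station association problem. Everything in it is an assumption made for illustration and is not taken from the paper: the single-parameter softmax policy over station loads, the queue dynamics, the cost (number of users in the system), and the plain score-function (REINFORCE-style) update with a moving-average baseline. In particular, this simple estimator does not by itself provide the monotonic cost decrease or the robustness guarantees described in the abstract.

```python
import math
import random


def association_probs(theta, loads):
    """Softmax policy over stations (hypothetical parameterization).

    Each station's score is -theta * load, so theta > 0 favours lightly
    loaded stations and theta = 0 gives a uniform random association.
    """
    scores = [-theta * load for load in loads]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def grad_log_prob(theta, loads, action):
    """Derivative of log pi_theta(action | loads) with respect to theta."""
    probs = association_probs(theta, loads)
    expected_load = sum(p * load for p, load in zip(probs, loads))
    return -(loads[action] - expected_load)


def run_episode(theta, rng, steps=50, arrival_prob=0.7, service_probs=(0.6, 0.3)):
    """Simulate a toy two-station system (assumed dynamics).

    Returns the average cost per step and the accumulated score function.
    """
    loads = [0, 0]
    total_cost = 0.0
    score = 0.0
    for _ in range(steps):
        if rng.random() < arrival_prob:  # a new user arrives
            probs = association_probs(theta, loads)
            action = 0 if rng.random() < probs[0] else 1
            score += grad_log_prob(theta, loads, action)
            loads[action] += 1
        total_cost += sum(loads)  # cost: users currently in the system
        for i, mu in enumerate(service_probs):  # each station may finish one user
            if loads[i] > 0 and rng.random() < mu:
                loads[i] -= 1
    return total_cost / steps, score


def train(theta=0.0, episodes=200, step_size=0.01, seed=0):
    """Gradient descent on the average cost using the score-function estimator."""
    rng = random.Random(seed)
    baseline = None
    for _ in range(episodes):
        cost, score = run_episode(theta, rng)
        # Moving-average baseline to reduce the variance of the estimate.
        baseline = cost if baseline is None else 0.9 * baseline + 0.1 * cost
        theta -= step_size * (cost - baseline) * score
    return theta
```

Calling `train()` returns the learned value of `theta`; `association_probs(theta, [3, 1])` then gives the association probabilities for an arriving user when the two stations hold 3 and 1 users. A single scalar parameter keeps the sketch short; a richer parameterized family would use one weight per feature.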
