An Asymptotically Optimal Index Policy for Finite-Horizon Restless Bandits

Peter Frazier; Weici Hu

arxiv: 1707.00205 · v1 · pith:Y57HPO6Jnew · submitted 2017-07-01 · 🧮 math.OC

An Asymptotically Optimal Index Policy for Finite-Horizon Restless Bandits

Weici Hu , Peter Frazier This is my paper

classification 🧮 math.OC

keywords optimalpolicyarmsasymptoticallyindexindex-basedproblemproblems

0 comments

read the original abstract

We consider restless multi-armed bandit (RMAB) with a finite horizon and multiple pulls per period. Leveraging the Lagrangian relaxation, we approximate the problem with a collection of single arm problems. We then propose an index-based policy that uses optimal solutions of the single arm problems to index individual arms, and offer a proof that it is asymptotically optimal as the number of arms tends to infinity. We also use simulation to show that this index-based policy performs better than the state-of-art heuristics in various problem settings.

This paper has not been read by Pith yet.

An Asymptotically Optimal Index Policy for Finite-Horizon Restless Bandits

discussion (0)