Correlated Multiarmed Bandit Problem: Bayesian Algorithms and Regret Analysis

Naomi Ehrich Leonard; Paul Reverdy; Vaibhav Srivastava

arXiv: 1507.01160 · v2 · pith:ALZGHX2Gnew · submitted 2015-07-05 · math.OC · cs.LG · stat.ML


Vaibhav Srivastava, Paul Reverdy, Naomi Ehrich Leonard


keywords: correlated, performance, algorithm, algorithms, bandit, bayesian, correlation, influence


We consider the correlated multiarmed bandit (MAB) problem in which the rewards associated with each arm are modeled by a multivariate Gaussian random variable, and we investigate the influence of the assumptions in the Bayesian prior on the performance of the upper credible limit (UCL) algorithm and a new correlated UCL algorithm. We rigorously characterize the influence of accuracy, confidence, and correlation scale in the prior on the decision-making performance of the algorithms. Our results show how priors and correlation structure can be leveraged to improve performance.
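The setting in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a known sampling-noise variance, a standard Gaussian conditioning step for the correlated prior, and a credible-limit schedule alpha_t = 1/(K t), all of which are assumptions made here for illustration. The names `posterior_update`, `ucl_index`, `noise_var`, and `K` are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def posterior_update(mean, cov, arm, reward, noise_var):
    """Condition a Gaussian belief N(mean, cov) over all arm means
    on one noisy reward observed from `arm` (noise variance assumed known)."""
    col = cov[:, arm]                          # covariance of every arm with the played arm
    gain = col / (cov[arm, arm] + noise_var)   # Kalman-style gain vector
    new_mean = mean + gain * (reward - mean[arm])
    new_cov = cov - np.outer(gain, col)
    return new_mean, new_cov


def ucl_index(mean, cov, t, K=np.sqrt(2 * np.pi * np.e)):
    """Upper credible limit per arm at step t >= 1.
    The schedule alpha_t = 1 / (K * t) is an assumption of this sketch."""
    alpha = 1.0 / (K * t)
    z = NormalDist().inv_cdf(1.0 - alpha)      # standard normal quantile
    std = np.sqrt(np.maximum(np.diag(cov), 0.0))
    return mean + z * std


# Two correlated arms: one pull of arm 0 also shifts the belief about arm 1.
mean0 = np.zeros(2)
cov0 = np.array([[1.0, 0.5], [0.5, 1.0]])
mean1, cov1 = posterior_update(mean0, cov0, arm=0, reward=2.0, noise_var=1.0)
next_arm = int(np.argmax(ucl_index(mean1, cov1, t=2)))
```

With a correlated prior, a single observation updates the posterior mean and variance of every arm, not only the one played; with a diagonal prior covariance the same code reduces to independent per-arm updates.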
