A KL-LUCB Bandit Algorithm for Large-Scale Crowdsourcing

Bob Mankoff; Ervin Tanczos; Robert Nowak

arxiv: 1709.03570 · v1 · pith:XFYMCAAVnew · submitted 2017-09-11 · 🧮 math.ST · stat.TH


classification 🧮 math.ST stat.TH

keywords algorithm, bounded, distributions, kl-lucb, achieved, algorithms, analogue, anytime


abstract

This paper focuses on best-arm identification in multi-armed bandits with bounded rewards. We develop an algorithm that is a fusion of lil-UCB and KL-LUCB, offering the best qualities of the two algorithms in one method. This is achieved by proving a novel anytime confidence bound for the mean of bounded distributions, which is the analogue of the LIL-type bounds recently developed for sub-Gaussian distributions. We corroborate our theoretical results with numerical experiments based on the New Yorker Cartoon Caption Contest.
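To make the idea of a KL-based confidence bound for bounded rewards concrete, here is a minimal Python sketch. It computes upper and lower confidence bounds for a mean in [0, 1] by inverting the Bernoulli KL divergence with bisection. The function names and the exploration level in `exploration_level` are assumptions chosen for illustration (a simple rate with an iterated-logarithm flavour); they are not the bound or the constants proved in the paper.

```python
import math


def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def exploration_level(n, delta):
    """Hypothetical confidence level; an illustrative assumption, not the paper's rate."""
    return math.log(1 / delta) + 2 * math.log(1 + math.log(n))


def kl_upper_bound(mean, n, level, iters=60):
    """Largest q in [mean, 1] with n * KL(mean, q) <= level, found by bisection."""
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if n * kl_bernoulli(mean, mid) <= level:
            lo = mid  # mid is still inside the confidence region
        else:
            hi = mid
    return lo


def kl_lower_bound(mean, n, level, iters=60):
    """Smallest q in [0, mean] with n * KL(mean, q) <= level, found by bisection."""
    lo, hi = 0.0, mean
    for _ in range(iters):
        mid = (lo + hi) / 2
        if n * kl_bernoulli(mean, mid) <= level:
            hi = mid  # mid is still inside the confidence region
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    n, delta, mean = 100, 0.05, 0.9
    level = exploration_level(n, delta)
    print(kl_lower_bound(mean, n, level), kl_upper_bound(mean, n, level))
```

By Pinsker's inequality, KL(p, q) >= 2(p - q)^2, so an interval built this way is never wider than the Hoeffding-style interval at the same level, and it tightens noticeably when the empirical mean is close to 0 or 1.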

