arxiv: 1709.03570 · v1 · pith:XFYMCAAVnew · submitted 2017-09-11 · 🧮 math.ST · stat.TH

A KL-LUCB Bandit Algorithm for Large-Scale Crowdsourcing

keywords: algorithm, bounded, distributions, kl-lucb, achieved, algorithms, analogue, anytime

This paper focuses on best-arm identification in multi-armed bandits with bounded rewards. We develop an algorithm that is a fusion of lil-UCB and KL-LUCB, offering the best qualities of the two algorithms in one method. This is achieved by proving a novel anytime confidence bound for the mean of bounded distributions, which is the analogue of the LIL-type bounds recently developed for sub-Gaussian distributions. We corroborate our theoretical results with numerical experiments based on the New Yorker Cartoon Caption Contest.
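To make the idea of a KL-based confidence bound for bounded rewards concrete, here is a minimal sketch in Python. It is not the paper's algorithm and does not reproduce its anytime bound: the function names and the `threshold` argument are hypothetical, and the caller is assumed to supply whatever exploration threshold applies (the paper's LIL-type threshold would go there). The sketch only shows the standard mechanics of inverting the Bernoulli KL divergence by bisection for an empirical mean in [0, 1].

```python
import math


def bernoulli_kl(p, q):
    """KL(Ber(p) || Ber(q)), with inputs clipped away from 0 and 1 to avoid log(0)."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_upper_bound(mean, n, threshold, iters=60):
    """Approximate the largest q in [mean, 1] with n * KL(mean, q) <= threshold.

    `threshold` is an assumed input; this sketch does not choose it.
    """
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if n * bernoulli_kl(mean, mid) <= threshold:
            lo = mid  # mid is still inside the confidence region
        else:
            hi = mid
    return lo


def kl_lower_bound(mean, n, threshold, iters=60):
    """Approximate the smallest q in [0, mean] with n * KL(mean, q) <= threshold."""
    lo, hi = 0.0, mean
    for _ in range(iters):
        mid = (lo + hi) / 2
        if n * bernoulli_kl(mean, mid) <= threshold:
            hi = mid  # mid is still inside the confidence region
        else:
            lo = mid
    return hi
```

By Pinsker's inequality, KL(p, q) >= 2(p - q)^2, so an interval built this way is never wider than the Hoeffding-style interval mean ± sqrt(threshold / (2n)) for the same threshold, and it is noticeably tighter when the empirical mean is close to 0 or 1.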
