A Survey of Online Experiment Design with the Stochastic Multi-Armed Bandit

Giuseppe Burtini; Jason Loeppky; Ramon Lawrence

arxiv: 1510.00757 · v4 · pith:JQTORTCMnew · submitted 2015-10-02 · 📊 stat.ML · cs.LG

classification 📊 stat.ML cs.LG

keywords design, experiment, multi-armed, online, bandit, explore, model, stochastic

Adaptive and sequential experiment design is a well-studied area in numerous domains. We survey and synthesize work on the online statistical learning paradigm known as the multi-armed bandit, integrating the existing research as a resource for a certain class of online experiments. We first explore the traditional stochastic model of a multi-armed bandit, then explore a taxonomic scheme of complications to that model, relating each complication to a specific requirement or consideration of the experiment design context. Finally, at the end of the paper, we present a table of known upper bounds on regret for all studied algorithms, providing both perspectives for future theoretical work and a decision-making tool for practitioners looking for theoretical guarantees.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Productization Challenges of Contextual Multi-Armed Bandits
cs.IR 2019-07 accept novelty 3.0

The authors enumerate and address six productization challenges encountered while running contextual multi-armed bandits for two large-scale web use cases.