pith. sign in

arxiv: 1507.04910 · v1 · pith:T6JFXOSEnew · submitted 2015-07-17 · 💻 cs.LG

Lower Bounds for Multi-armed Bandit with Non-equivalent Multiple Plays

classification 💻 cs.LG
keywords boundsbanditlowermulti-armedmultiplenon-equivalentplaysproblem
0
0 comments X
read the original abstract

We study the stochastic multi-armed bandit problem with non-equivalent multiple plays where, at each step, an agent chooses not only a set of arms, but also their order, which influences reward distribution. In several problem formulations with different assumptions, we provide lower bounds for regret with standard asymptotics $O(\log{t})$ but novel coefficients and provide optimal algorithms, thus proving that these bounds cannot be improved.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.