Multiple Identifications in Multi-Armed Bandits

2012 · cs.LG · arXiv:1205.3181

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

We study the problem of identifying the top $m$ arms in a multi-armed bandit game. Our proposed solution relies on a new algorithm based on successive rejects of the seemingly bad arms and successive accepts of the good ones. This algorithmic contribution allows us to tackle other multiple identifications settings that were previously out of reach. In particular, we show that this idea of successive accepts and rejects applies to the multi-bandit best arm identification problem.
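
The accept/reject idea in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's reference implementation: the function name `sar_top_m`, the `pull` callback, and the phase schedule (borrowed from the usual Successive Rejects construction) are assumptions that the abstract does not spell out.

```python
import math
import random


def sar_top_m(pull, num_arms, m, budget):
    """Pick m arms by successively accepting good arms and rejecting bad ones.

    pull(i) is a hypothetical callback returning one stochastic reward of arm i.
    """
    if not 0 < m < num_arms:
        raise ValueError("need 0 < m < num_arms")
    if budget <= num_arms:
        raise ValueError("budget must exceed num_arms")
    # Assumed normalizer for the phase lengths: 1/2 + sum_{i=2}^{K} 1/i.
    log_bar = 0.5 + sum(1.0 / i for i in range(2, num_arms + 1))
    active = list(range(num_arms))
    accepted = set()
    sums = [0.0] * num_arms
    counts = [0] * num_arms
    slots = m  # number of arms that still have to be accepted

    # Each pass deactivates exactly one arm (accepted or rejected).
    while 0 < slots < len(active):
        # Assumed schedule: every active arm reaches `target` pulls in total.
        target = math.ceil((budget - num_arms) / (log_bar * len(active)))
        for arm in active:
            while counts[arm] < target:
                sums[arm] += pull(arm)
                counts[arm] += 1
        means = {arm: sums[arm] / counts[arm] for arm in active}
        ranked = sorted(active, key=means.get, reverse=True)

        def gap(pos):
            # Empirical distance of an arm to the accept/reject boundary.
            if pos < slots:
                return means[ranked[pos]] - means[ranked[slots]]
            return means[ranked[slots - 1]] - means[ranked[pos]]

        # The arm we are most sure about leaves the active set.
        pos = max(range(len(ranked)), key=gap)
        arm = ranked[pos]
        active.remove(arm)
        if pos < slots:  # it sat among the current top arms: accept it
            accepted.add(arm)
            slots -= 1
        # otherwise it is rejected and simply dropped

    if slots > 0:  # every remaining arm is needed to fill the open slots
        accepted.update(active)
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    true_means = [0.9, 0.8, 0.2, 0.1, 0.5]  # made-up Bernoulli arms
    chosen = sar_top_m(lambda i: float(rng.random() < true_means[i]),
                       len(true_means), 2, 2000)
    print(sorted(chosen))
```

The loop stops early once no slots remain or once every active arm must be accepted; that shortcut is a simplification made for this sketch. Under the assumed schedule the total number of pulls stays within `budget`.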

representative citing papers

Anytime-valid Optimal Policy Identification

stat.ME · 2026-06-16 · unverdicted · novelty 6.0

Constructs a time-indexed set S_t retaining the true optimal policy uniformly over time with high probability, enabling early stopping with sample complexity O((log |Π| + log log(1/Δ_min))/Δ_min²) when the optimum is unique.

citing papers explorer


Anytime-valid Optimal Policy Identification

stat.ME · 2026-06-16 · unverdicted · none · ref 33 · internal anchor
