Optimal Best Arm Identification with Fixed Confidence

2016 · math.ST · arXiv 1602.04589

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

We give a complete characterization of the complexity of best-arm identification in one-parameter bandit problems. We prove a new, tight lower bound on the sample complexity. We propose the 'Track-and-Stop' strategy, which we prove to be asymptotically optimal. It consists of a new sampling rule (which tracks the optimal proportions of arm draws highlighted by the lower bound) and a stopping rule named after Chernoff, for which we give a new analysis.
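
As an illustration of the stopping side of the strategy described in the abstract, the following is a minimal sketch of a generalized-likelihood-ratio (Chernoff-style) stopping check. It rests on several assumptions that the abstract does not state: arms are Gaussian with known standard deviation `sigma`, every arm has been drawn at least once, and the threshold is taken to be log(2t(K-1)/δ). The function names `glr_statistic` and `should_stop` are hypothetical, and the tracking sampling rule is not implemented here.

```python
import math


def glr_statistic(counts, means, sigma=1.0):
    """Return (best_arm, Z) for Gaussian arms with known sigma (an assumption).

    Z is the minimum, over challengers b, of the pairwise statistic
    Z_{a,b} = N_a * N_b / (N_a + N_b) * (mean_a - mean_b)^2 / (2 * sigma^2),
    where a is the empirically best arm. Every count must be at least 1.
    """
    best = max(range(len(means)), key=lambda a: means[a])
    z = math.inf
    for b in range(len(means)):
        if b == best:
            continue
        n_a, n_b = counts[best], counts[b]
        gap = means[best] - means[b]
        z_ab = (n_a * n_b) / (n_a + n_b) * gap ** 2 / (2 * sigma ** 2)
        z = min(z, z_ab)
    return best, z


def should_stop(counts, means, delta, sigma=1.0):
    """Return (stop, best_arm): stop once Z exceeds the threshold.

    The threshold log(2 * t * (K - 1) / delta) is an assumed form,
    chosen for illustration only.
    """
    t = sum(counts)
    k = len(counts)
    threshold = math.log(2 * t * (k - 1) / delta)
    best, z = glr_statistic(counts, means, sigma)
    return z > threshold, best
```

With two arms drawn 100 times each and empirical means 1.0 and 0.0, the statistic is large and the check recommends stopping with arm 0; with 10 draws each and a gap of 0.1, it does not.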

representative citing papers

Anytime-valid Optimal Policy Identification

stat.ME · 2026-06-16 · unverdicted · novelty 6.0

Constructs a time-indexed set S_t retaining the true optimal policy uniformly over time with high probability, enabling early stopping with sample complexity O((log |Π| + log log(1/Δ_min))/Δ_min²) when the optimum is unique.
