Bandits with heavy tail

2012 · stat.ML · arXiv 1209.1727

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

The stochastic multi-armed bandit problem is well understood when the reward distributions are sub-Gaussian. In this paper we examine the bandit problem under the weaker assumption that the distributions have moments of order $1+\epsilon$, for some $\epsilon \in (0,1]$. Surprisingly, moments of order 2 (i.e., finite variance) are sufficient to obtain regret bounds of the same order as under sub-Gaussian reward distributions. To achieve such regret, we define sampling strategies based on refined estimators of the mean such as the truncated empirical mean, Catoni's M-estimator, and the median-of-means estimator. We also derive matching lower bounds, which show that the best achievable regret deteriorates when $\epsilon < 1$.
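As an illustration of one estimator named in the abstract, here is a minimal sketch of the median-of-means idea: split the samples into blocks, average each block, and take the median of the block averages. The function name `median_of_means` and the parameter `num_blocks` are hypothetical, and the block count is left to the caller; the paper's own choice of block count and its confidence bounds are not reproduced here.

```python
import statistics


def median_of_means(samples, num_blocks):
    """Median of block averages (illustrative sketch, not the paper's exact estimator).

    Assumption: the samples are split into `num_blocks` consecutive blocks of
    equal size; any leftover samples at the end are ignored.
    """
    n = len(samples)
    if num_blocks < 1 or num_blocks > n:
        raise ValueError("num_blocks must be between 1 and len(samples)")
    block_size = n // num_blocks
    # Average each consecutive block of block_size samples.
    block_means = [
        sum(samples[i * block_size:(i + 1) * block_size]) / block_size
        for i in range(num_blocks)
    ]
    # The median of the block averages is far less sensitive to a single
    # extreme sample than the plain empirical mean is.
    return statistics.median(block_means)


# One extreme value pulls the plain mean up but leaves the median of means alone.
rewards = [1, 1, 1, 1, 1, 1000]
plain_mean = sum(rewards) / len(rewards)       # 167.5
robust_mean = median_of_means(rewards, 3)      # block means 1, 1, 500.5
```

With a single block the function reduces to the ordinary empirical mean, so the block count controls the trade-off between robustness to outliers and the number of samples averaged per block.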

representative citing papers

Scheduling jobs with unknown size distribution in a M/G/1 queue: the shifted empirical Gittins

math.PR · 2026-06-23 · unverdicted · novelty 6.0

Shifted empirical Gittins indices derived from discretized and right-shifted samples of a bounded job-size distribution yield an index policy that is asymptotically optimal for response-time minimization in M/G/1 queues.
