A covariance-adapting algorithm for semi-bandits achieves asymptotically tight regret bounds under a new sub-exponential distribution family, with direct application to sparse rewards.
Figure 3: Confidence regions built by ESCB-C (the pseudo-ellipse) and CUCB-KL (the rectangle), for ∥·∥₁-constrained outcomes. Labels in the figure: e_1, e_2, e_{1,2}, µ_{t−1}.
Covariance-adapting algorithm for semi-bandits with application to sparse rewards