Nonparametric bandits with single-index rewards: Optimality and adaptivity. arXiv preprint arXiv:2512.24669

Wanteng Ma, T Tony Cai · arXiv 2512.24669

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

stat.ML · 2026-05-10 · conditional · novelty 7.0

Single-index bandits have optimal regret of order T^{2/3}, achieved by a two-phase algorithm that first estimates the index direction with a Stein estimator and then applies UCB on a grid.
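The two-phase idea in this summary can be illustrated with a minimal sketch. It is not the citing paper's algorithm. It assumes a toy setting in which the reward for action x is f(<theta, x>) plus Gaussian noise, exploration actions are standard Gaussian, and the noise level sigma is known. The function names (`stein_direction`, `ucb_on_grid`, `run_demo`), the link function, and all constants are hypothetical choices made for illustration.

```python
import numpy as np

def stein_direction(X, y):
    """Phase 1: estimate the index direction via the first-order Stein identity.

    For x ~ N(0, I), E[y x] = E[f'(<theta, x>)] * theta, so the sample mean
    of y_i * x_i is proportional to theta whenever E[f'] is nonzero.
    """
    v = (X * y[:, None]).mean(axis=0)
    return v / np.linalg.norm(v)

def ucb_on_grid(pull, grid, horizon, sigma):
    """Phase 2: UCB over a finite grid of scalar index values.

    `pull(u)` returns a noisy reward for grid value u. The bonus is scaled
    by sigma, an assumed known noise level.
    """
    K = len(grid)
    counts = np.zeros(K)
    sums = np.zeros(K)
    for t in range(horizon):
        if t < K:
            k = t  # pull every grid point once before using the index rule
        else:
            means = sums / counts
            bonus = sigma * np.sqrt(2.0 * np.log(t + 1) / counts)
            k = int(np.argmax(means + bonus))
        sums[k] += pull(grid[k])
        counts[k] += 1
    # Report the most-pulled grid point as the empirical best index value.
    return grid[int(np.argmax(counts))], counts

def run_demo(seed=0, d=5, n_explore=5000, n_ucb=3000, sigma=0.1):
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=d)
    theta /= np.linalg.norm(theta)          # unknown unit index direction
    f = lambda z: -(z - 0.4) ** 2           # hypothetical link, peak at 0.4

    # Phase 1: Gaussian exploration, then the Stein-type direction estimate.
    X = rng.normal(size=(n_explore, d))
    y = f(X @ theta) + sigma * rng.normal(size=n_explore)
    theta_hat = stein_direction(X, y)

    # Phase 2: play actions u * theta_hat for u on a grid in [-1, 1].
    grid = np.linspace(-1.0, 1.0, 11)
    pull = lambda u: f(theta @ (u * theta_hat)) + sigma * rng.normal()
    best_u, _ = ucb_on_grid(pull, grid, n_ucb, sigma)
    return float(theta @ theta_hat), float(best_u)
```

Calling `run_demo()` returns the cosine between the true and estimated directions and the grid value pulled most often. In this toy setup the estimated direction should align closely with the true one, and the most-pulled grid value should sit near the peak of the assumed link function.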

citing papers explorer

Showing 1 of 1 citing paper.

Optimal Regret for Single Index Bandits stat.ML · 2026-05-10 · conditional · none · ref 8