Adaptivity in linear bandits for ε-best arm identification yields only logarithmic improvements on the hypercube, the ℓ2 ball, m-sets, and multi-task settings, but polynomial-factor gains on a specially constructed action set, enabled by an adaptive O(d log(1/δ)/ε²) ℓ2-norm estimator.
arXiv preprint arXiv:2504.00461
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
An algorithm achieves Õ(√(A ln(S) T)/ε) regret for extensive-form bandits under ε-local differential privacy, claimed as the first such result.
citing papers explorer
- On the Power of Adaptivity for $\varepsilon$-Best Arm Identification in Linear Bandits
Adaptivity in linear bandits for ε-best arm identification yields only logarithmic improvements on the hypercube, the ℓ2 ball, m-sets, and multi-task settings, but polynomial-factor gains on a specially constructed action set, enabled by an adaptive O(d log(1/δ)/ε²) ℓ2-norm estimator.
- Differential Privacy in the Extensive-Form Bandit Problem
An algorithm achieves Õ(√(A ln(S) T)/ε) regret for extensive-form bandits under ε-local differential privacy, claimed as the first such result.