On Regret Bounds of Thompson Sampling for Bayesian Optimization
read the original abstract
We study a widely used Bayesian optimization method, Gaussian process Thompson sampling (GP-TS), under the assumption that the objective function is a sample path from a GP. Compared with the GP upper confidence bound (GP-UCB) with established high-probability and expected regret bounds, most analyses of GP-TS have been limited to expected regret. Moreover, whether the recent analyses of GP-UCB for the lenient regret and the improved cumulative regret upper bound can be applied to GP-TS remains unclear. To fill these gaps, this paper shows several regret bounds: (i) a regret lower bound for GP-TS, which implies that GP-TS suffers from a polynomial dependence on $1/\delta$ with probability $\delta$, (ii) an upper bound of the second moment of cumulative regret, which directly suggests an improved regret upper bound on $\delta$, (iii) expected lenient regret upper bounds, and (iv) an improved cumulative regret upper bound on the time horizon $T$. Along the way, we provide several useful lemmas, including a relaxation of the necessary condition from recent analysis to obtain improved regret upper bounds on $T$.
This paper has not been read by Pith yet.
Forward citations
Cited by 2 Pith papers
-
Nearly-Optimal Algorithm for Adversarial Kernelized Bandits
Exponential-weight algorithm achieves Õ(√(T γ_T)) adversarial regret for kernelized bandits and is optimal up to logs for squared exponential and Matérn kernels, with an efficient Nyström variant.
-
Nearly-Optimal Algorithm for Adversarial Kernelized Bandits
Exponential-weight algorithm attains Õ(√(T γ_T)) adversarial regret for kernelized bandits, with matching lower bounds for SE and Matérn kernels plus a Nyström-efficient variant.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.