Bernstein-type dimension-free concentration for self-normalised martingales. arXiv preprint arXiv:2507.20982
2 Pith papers cite this work.
abstract
We introduce a dimension-free Bernstein-type tail inequality for self-normalised martingales, where the normalisation uses the predictable quadratic variation and the radius depends on the information gain of the observed covariance. As applications, we provide ellipsoidal confidence sequences for logistic regression with adaptively chosen Hilbert-valued covariates, and give instance-adaptive regret bounds for Hilbert-armed logistic bandits.
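The quantities named in the abstract can be illustrated numerically. The sketch below is an assumption-laden toy, not the paper's construction: it uses finite-dimensional covariates as a stand-in for Hilbert-valued ones, draws them i.i.d. rather than adaptively, and the function name `self_normalised_stats` and the regulariser `lam` are hypothetical. It computes the self-normalised norm of the noise martingale, normalised by the regularised predictable quadratic variation, together with a log-determinant information gain of the kind such a confidence radius could depend on.

```python
import numpy as np

def self_normalised_stats(X, p, y, lam=1.0):
    """Return (self-normalised norm, information gain) for logistic noise.

    X   : (T, d) covariates, one row per round (assumed finite-dimensional)
    p   : (T,) conditional success probabilities of the Bernoulli responses
    y   : (T,) observed 0/1 responses
    lam : regularisation parameter (hypothetical choice, not from the source)
    """
    X = np.asarray(X, dtype=float)
    p = np.asarray(p, dtype=float)
    y = np.asarray(y, dtype=float)

    eps = y - p                      # martingale difference noise
    var = p * (1.0 - p)              # conditional variance of each response
    S = X.T @ eps                    # vector-valued martingale S_T
    V = (X * var[:, None]).T @ X     # predictable quadratic variation
    H = V + lam * np.eye(X.shape[1])

    # ||S_T|| measured in the inverse of the regularised quadratic variation
    norm = float(np.sqrt(S @ np.linalg.solve(H, S)))

    # information gain: 0.5 * log det(I + V / lam)
    _, logdet = np.linalg.slogdet(H / lam)
    gain = 0.5 * float(logdet)
    return norm, gain

# Toy simulation: i.i.d. unit-norm covariates and a fixed parameter vector.
rng = np.random.default_rng(0)
T, d = 500, 5
X = rng.normal(size=(T, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
theta = rng.normal(size=d)
p = 1.0 / (1.0 + np.exp(-X @ theta))
y = (rng.random(T) < p).astype(float)

norm, gain = self_normalised_stats(X, p, y, lam=1.0)
print(f"self-normalised norm: {norm:.3f}, information gain: {gain:.3f}")
```

In this toy setting the information gain grows with the spectrum of the observed covariance rather than with the ambient dimension directly, which is the sense in which a bound stated in terms of it can be called dimension-free.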
citing papers explorer
- Logistic Bandits with $\tilde{O}(\sqrt{dT})$ Regret without Context Diversity Assumptions
  SupSplitLog achieves Õ(√(dT)) regret for logistic bandits without context diversity assumptions by splitting samples for an initial estimator and Newton correction, and can adapt to data-dependent bounds.
- Vector-valued self-normalized concentration inequalities beyond sub-Gaussianity
  Derives vector-valued self-normalized concentration bounds for light-tailed processes beyond sub-Gaussianity, with applications to online linear regression and linear bandits.