Algorithmic Chaining and the Role of Partial Feedback in Online Nonparametric Learning
read the original abstract
We investigate contextual online learning with nonparametric (Lipschitz) comparison classes under different assumptions on losses and feedback information. For full information feedback and Lipschitz losses, we design the first explicit algorithm achieving the minimax regret rate (up to log factors). In a partial feedback model motivated by second-price auctions, we obtain algorithms for Lipschitz and semi-Lipschitz losses with regret bounds improving on the known bounds for standard bandit feedback. Our analysis combines novel results for contextual second-price auctions with a novel algorithmic approach based on chaining. When the context space is Euclidean, our chaining approach is efficient and delivers an even better regret bound.
This paper has not been read by Pith yet.
Forward citations
Cited by 1 Pith paper
-
Chaining Meets Chain Rule: Multilevel Entropic Regularization and Training of Neural Nets
Derives algorithm-dependent generalization bounds for neural nets using multilevel entropic regularization and proposes a Metropolis-simulated multi-scale Gibbs training procedure tested on a two-layer net for MNIST.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.