arxiv: 1403.5341 · v2 · pith:7RS33G7Lnew · submitted 2014-03-21 · 💻 cs.LG

An Information-Theoretic Analysis of Thompson Sampling

classification 💻 cs.LG
keywords: analysis, information, information-theoretic, sampling, thompson, across, applies, bounds

We provide an information-theoretic analysis of Thompson sampling that applies across a broad range of online optimization problems in which a decision-maker must learn from partial feedback. This analysis inherits the simplicity and elegance of information theory and leads to regret bounds that scale with the entropy of the optimal-action distribution. This strengthens preexisting results and yields new insight into how information improves performance.
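
For readers unfamiliar with the algorithm, the following is a minimal sketch of Thompson sampling on a Bernoulli multi-armed bandit, one of the simplest partial-feedback problems of the kind the abstract describes. It is not code from the paper. The Beta(1, 1) priors, the function names, and the arm means are assumptions chosen for illustration. The helper `entropy_regret_bound` evaluates sqrt(|A| · H(A*) · T / 2), which is our recollection of the paper's bandit-case bound on Bayesian (prior-averaged) regret for rewards in [0, 1], with entropy measured in nats; it should be checked against the original before being relied on.

```python
import math
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Beta-Bernoulli Thompson sampling (illustrative sketch).

    Returns (cumulative expected regret, number of pulls per arm).
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [1] * k  # assumed Beta(1, 1) uniform prior on each arm's mean
    failures = [1] * k
    pulls = [0] * k
    best = max(true_means)
    regret = 0.0
    for _ in range(horizon):
        # Draw one plausible mean per arm from its posterior,
        # then play the arm whose sampled mean is largest.
        samples = [rng.betavariate(successes[a], failures[a]) for a in range(k)]
        arm = max(range(k), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        # Only the played arm's posterior is updated (partial feedback).
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        regret += best - true_means[arm]
    return regret, pulls


def entropy(probs):
    """Shannon entropy, in nats, of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_regret_bound(num_actions, optimal_action_probs, horizon):
    """Evaluate sqrt(|A| * H(A*) * T / 2); assumed form of the bandit-case bound."""
    h = entropy(optimal_action_probs)
    return math.sqrt(0.5 * num_actions * h * horizon)


if __name__ == "__main__":
    regret, pulls = thompson_sampling([0.2, 0.5, 0.8], horizon=2000)
    print("regret:", regret, "pulls:", pulls)
    # A uniform prior over which arm is optimal gives the largest entropy, log(3).
    print("bound:", entropy_regret_bound(3, [1 / 3] * 3, 2000))
```

The bound shrinks as the prior over the optimal action becomes more concentrated: a prior that already identifies the best arm has zero entropy, so the bound is zero, which matches the abstract's remark about how information improves performance. Note that the bound concerns regret averaged over the prior, so it is not directly comparable with the regret of a single fixed instance such as the one simulated here.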

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Bayesian policy gradient and actor-critic algorithms

    cs.LG 2026-04 unverdicted novelty 6.0

    Modeling policy gradients as Gaussian processes, along with Bayesian actor-critic variants, reduces the number of samples needed relative to Monte Carlo methods and provides uncertainty estimates.