A Tutorial on Thompson Sampling

Russo, D. et al. · 2020 · arXiv:1707.02038

4 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: background 1

representative citing papers

Graph Dimensionality Reduction for Contextual Bandits: Structure-Specific Regret Bounds under Approximate Smoothness and Noisy Eigenspaces

cs.LG · 2026-06-26 · unverdicted · novelty 7.0

GraphDR-LinUCB projects contextual bandit arms onto a graph's low-frequency eigenspace to obtain the first Õ(k√T) regret bound under approximate smoothness, with a spectral predictor Γ_k that matches outcomes on five of six real datasets.

Budgeted Online Influence Maximization

cs.LG · 2026-04-21 · unverdicted · novelty 7.0

A new algorithm for online influence maximization under a total budget constraint using the independent cascade model and edge-level semi-bandit feedback, with improved regret bounds for both budgeted and cardinality settings.

Contextual Scalarisation Thompson Sampling for multi-objective decisions in public media

cs.IR · 2026-05-29 · unverdicted · novelty 4.0

CSTS learns context-dependent weights for multiple objectives in a multi-objective contextual bandit and outperforms fixed-weight and standard contextual bandit baselines on Swiss public broadcaster programming data.

Efficient and Principled Scientific Discovery through Bayesian Optimization: A Tutorial

cs.LG · 2026-04-01 · accept · novelty 2.0

Bayesian optimization automates the scientific discovery cycle by modeling observations with surrogate models and using acquisition functions to select experiments that balance known information with new exploration.

citing papers explorer

Graph Dimensionality Reduction for Contextual Bandits: Structure-Specific Regret Bounds under Approximate Smoothness and Noisy Eigenspaces · cs.LG · 2026-06-26 · unverdicted · none · ref 36
Budgeted Online Influence Maximization · cs.LG · 2026-04-21 · unverdicted · none · ref 179
Contextual Scalarisation Thompson Sampling for multi-objective decisions in public media · cs.IR · 2026-05-29 · unverdicted · none · ref 21
Efficient and Principled Scientific Discovery through Bayesian Optimization: A Tutorial · cs.LG · 2026-04-01 · accept · none · ref 81
