Spectral bandits achieve scalable regret in graph-structured recommendation by using an effective dimension to learn good policies from few node evaluations.
https://arxiv.org/pdf/1605.08722.pdf An empirical evaluation of Thompson sampling
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
A new algorithm for online influence maximization under a total budget constraint using the independent cascade model and edge-level semi-bandit feedback, with improved regret bounds for both budgeted and cardinality settings.
No algorithm can be optimal in both stochastic and adversarial best-arm identification; a new parameter-free algorithm matches the derived lower bound up to log factors in stochastic cases while handling adversarial rewards.
citing papers explorer
-
Spectral bandits
Spectral bandits achieve scalable regret in graph-structured recommendation by using an effective dimension to learn good policies from few node evaluations.
-
Budgeted Online Influence Maximization
A new algorithm for online influence maximization under a total budget constraint using the independent cascade model and edge-level semi-bandit feedback, with improved regret bounds for both budgeted and cardinality settings.
-
Best of both worlds: Stochastic & adversarial best-arm identification
No algorithm can be optimal in both stochastic and adversarial best-arm identification; a new parameter-free algorithm matches the derived lower bound up to log factors in stochastic cases while handling adversarial rewards.