A Survey on Contextual Multi-armed Bandits
classification
💻 cs.LG
keywords
contextual, survey, adversarial, algorithm, algorithms, analyze, assumption, bandit
abstract
In this survey we cover a few stochastic and adversarial contextual bandit algorithms. We analyze each algorithm's assumption and regret bound.
Forward citations
Cited by 2 Pith papers
- Latent Order Bandits
Latent order bandits require only a known partial order on actions within each latent state rather than full reward distributions, enabling UCB and posterior-sampling algorithms with regret bounds that match or exceed...
- Identifiable Latent Bandits: Leveraging observational data for personalized decision-making
Identifiable latent bandits apply nonlinear ICA to observational data to recover representations sufficient for inferring optimal actions in new instances, shortening exploration time.