arxiv: 1108.4961 · v1 · pith:3Q47I2FTnew · submitted 2011-08-24 · 💻 cs.LG

Non-trivial two-armed partial-monitoring games are bandits

classification 💻 cs.LG
keywords game · games · partial-monitoring · actions · adversary · available · bandit-like · bandits

We consider online learning in partial-monitoring games against an oblivious adversary. We show that when the learner has two actions available and the game is nontrivial, the game is reducible to a bandit-like game, and thus its minimax regret is $\Theta(\sqrt{T})$.
