Online learning with graph-structured feedback against adaptive adversaries

Po-Ling Loh; Zhili Feng

arxiv: 1804.00335 · v1 · pith:5NLATAN5new · submitted 2018-04-01 · 💻 cs.LG · cs.IT· math.IT· stat.ML

Online learning with graph-structured feedback against adaptive adversaries

Zhili Feng , Po-Ling Loh This is my paper

classification 💻 cs.LG cs.ITmath.ITstat.ML

keywords feedbackwidetildeadversarylowerboundboundedboundsgraph-structured

0 comments

read the original abstract

We derive upper and lower bounds for the policy regret of $T$-round online learning problems with graph-structured feedback, where the adversary is nonoblivious but assumed to have a bounded memory. We obtain upper bounds of $\widetilde O(T^{2/3})$ and $\widetilde O(T^{3/4})$ for strongly-observable and weakly-observable graphs, respectively, based on analyzing a variant of the Exp3 algorithm. When the adversary is allowed a bounded memory of size 1, we show that a matching lower bound of $\widetilde\Omega(T^{2/3})$ is achieved in the case of full-information feedback. We also study the particular loss structure of an oblivious adversary with switching costs, and show that in such a setting, non-revealing strongly-observable feedback graphs achieve a lower bound of $\widetilde\Omega(T^{2/3})$, as well.

This paper has not been read by Pith yet.

Online learning with graph-structured feedback against adaptive adversaries

discussion (0)