Algorithms achieve O(T^{1/2}) regret in contextual Stackelberg games via reduction to linear contextual bandits, improving on prior O(T^{2/3}) rates.
Improved algorithms for linear stochastic bandits
2 Pith papers cite this work.