On the Impossibility of Convergence of Mixed Strategies with Optimal No-Regret Learning

Muthukumar, V · 2020 · arXiv 2022.0016

1 Pith paper cites this work.


representative citing papers

cs.LG · 2026-04-17 · unverdicted · novelty 7.0

In bandit-feedback zero-sum games, uncoupled algorithms achieve last-iterate Nash convergence at the optimal rate of O(T^{-1/4}).


The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback cs.LG · 2026-04-17 · unverdicted · none · ref 29