With opponent-action feedback in zero-sum games, an efficient algorithm achieves near-optimal t^{-1/2} last-iterate convergence in duality gap with high probability.
Conference On Learning Theory , pages=
3 Pith papers cite this work. Polarity classification is still indexing.
years
2026 3verdicts
UNVERDICTED 3representative citing papers
A robust variant of binary search achieves regret O(C + log T) for dynamic pricing with known corruption C and O(C + log² T) when unknown.
An algorithm for online resource allocation with budget and general constraints achieves O(sqrt(T)) regret in stochastic and alpha-regret in adversarial regimes with bounded constraint violations.
citing papers explorer
-
Near-Optimal Last-Iterate Convergence for Zero-Sum Games with Bandit Feedback and Opponent Actions
With opponent-action feedback in zero-sum games, an efficient algorithm achieves near-optimal t^{-1/2} last-iterate convergence in duality gap with high probability.
-
Toward Optimal Regret in Robust Pricing: Decoupling Corruption and Time
A robust variant of binary search achieves regret O(C + log T) for dynamic pricing with known corruption C and O(C + log² T) when unknown.
-
Online Resource Allocation With General Constraints
An algorithm for online resource allocation with budget and general constraints achieves O(sqrt(T)) regret in stochastic and alpha-regret in adversarial regimes with bounded constraint violations.