Recognition: unknown
Solving Combinatorial Games using Products, Projections and Lexicographically Optimal Bases
read the original abstract
In order to find Nash-equilibria for two-player zero-sum games where each player plays combinatorial objects like spanning trees, matchings etc, we consider two online learning algorithms: the online mirror descent (OMD) algorithm and the multiplicative weights update (MWU) algorithm. The OMD algorithm requires the computation of a certain Bregman projection, that has closed form solutions for simple convex sets like the Euclidean ball or the simplex. However, for general polyhedra one often needs to exploit the general machinery of convex optimization. We give a novel primal-style algorithm for computing Bregman projections on the base polytopes of polymatroids. Next, in the case of the MWU algorithm, although it scales logarithmically in the number of pure strategies or experts $N$ in terms of regret, the algorithm takes time polynomial in $N$; this especially becomes a problem when learning combinatorial objects. We give a general recipe to simulate the multiplicative weights update algorithm in time polynomial in their natural dimension. This is useful whenever there exists a polynomial time generalized counting oracle (even if approximate) over these objects. Finally, using the combinatorial structure of symmetric Nash-equilibria (SNE) when both players play bases of matroids, we show that these can be found with a single projection or convex minimization (without using online learning).
This paper has not been read by Pith yet.
Forward citations
Cited by 2 Pith papers
-
Local LMO: Constrained Gradient Optimization via a Local Linear Minimization Oracle
Local LMO is a new projection-free method that achieves the convergence rates of projected gradient descent for constrained optimization by using local linear minimization oracles over small balls.
-
Why Global LLM Leaderboards Are Misleading: Small Portfolios for Heterogeneous Supervised ML
Global Bradley-Terry rankings of LLMs are misleading due to structured heterogeneity in user preferences, and small (λ, ν)-portfolios recover coherent subpopulations that cover over 96% of votes with just five rankings.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.