Neural Population Learning beyond Symmetric Zero-sum Games
We study computationally efficient methods for finding equilibria in n-player general-sum games, specifically ones that afford complex visuomotor skills. We show how existing methods would struggle in this setting, either computationally or in theory. We then introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated Equilibrium (CCE) of the game. We show empirical convergence in a suite of OpenSpiel games, validated rigorously by exact game solvers. We then deploy NeuPL-JPSRO to complex domains, where our approach enables adaptive coordination in a MuJoCo control domain and skill transfer in capture-the-flag. Our work shows that equilibrium convergent population learning can be implemented at scale and in generality, paving the way towards solving real-world games between heterogeneous players with mixed motives.
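For readers unfamiliar with the solution concept named in the abstract, the following is a minimal sketch of how one might measure how far a joint action distribution is from a Coarse Correlated Equilibrium in a small normal-form game. It is not taken from the paper and is not NeuPL-JPSRO itself: the function name `cce_gap`, the payoff-tensor layout, and the matching-pennies example are all assumptions made for illustration. A distribution is a CCE when no player can gain in expectation by committing to a single fixed action before the joint action is drawn, so the gap below is zero exactly at a CCE.

```python
import numpy as np


def cce_gap(payoffs: np.ndarray, sigma: np.ndarray) -> float:
    """Largest expected gain any player gets from a fixed unilateral deviation.

    Hypothetical helper, for illustration only.
    payoffs: shape (n, A_1, ..., A_n); payoffs[i] is player i's payoff tensor.
    sigma:   shape (A_1, ..., A_n); a joint distribution over joint actions.
    """
    n = payoffs.shape[0]
    gap = 0.0
    for i in range(n):
        u = payoffs[i]
        # Expected payoff of player i when everyone follows sigma.
        on_path = float(np.sum(sigma * u))
        # Distribution over the opponents' actions (player i's axis summed out).
        opponents = sigma.sum(axis=i, keepdims=True)
        # Expected payoff of each fixed action of player i against that distribution.
        other_axes = tuple(j for j in range(n) if j != i)
        deviation_values = np.sum(opponents * u, axis=other_axes)
        gap = max(gap, float(deviation_values.max() - on_path))
    return max(gap, 0.0)


# Matching pennies as a toy two-player game (an assumed example).
u1 = np.array([[1.0, -1.0], [-1.0, 1.0]])
game = np.stack([u1, -u1])

uniform = np.full((2, 2), 0.25)                # uniform joint distribution
pure = np.array([[1.0, 0.0], [0.0, 0.0]])      # both players always play action 0

gap_uniform = cce_gap(game, uniform)
gap_pure = cce_gap(game, pure)
```

Under these assumptions, the uniform distribution has zero gap, while the pure joint action leaves the second player with a profitable deviation. A quantity of this kind is only tractable for games small enough to enumerate, which is why it suits exact validation on small benchmark games rather than the large control domains the abstract mentions.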
Forward citations
Cited by 1 Pith paper
- Towards Learning Representations of Policies in Two-Player Zero-Sum Imperfect-Information Games
Basic dataset creation, embedding learning, and evaluation tasks on Kuhn and Leduc Poker demonstrate that useful behavioral representations appear in the learned embeddings.