
arxiv: 1606.08718 · v4 · pith:SDD4E4EJnew · submitted 2016-06-28 · 💻 cs.GT

Learning Nash Equilibrium for General-Sum Markov Games from Batch Data

keywords: equilibrium, general-sum, nash, games, learning, multiplayer, batch, bellman-like

This paper addresses the problem of learning a Nash equilibrium in $\gamma$-discounted multiplayer general-sum Markov Games (MGs). A key feature of this model is that players may either collaborate or compete to increase their rewards. Building an artificial player for general-sum MGs requires learning more complex strategies, which cannot be obtained with techniques developed for two-player zero-sum MGs. In this paper, we introduce a new definition of $\epsilon$-Nash equilibrium in MGs that captures the quality of a strategy in multiplayer games. We prove that minimizing the norm of two Bellman-like residuals implies convergence to such an $\epsilon$-Nash equilibrium. We then show that minimizing an empirical estimate of the $L_p$ norm of these Bellman-like residuals allows learning in general-sum games in the batch setting. Finally, we introduce a neural network architecture named NashNetwork that successfully learns a Nash equilibrium in a generic multiplayer general-sum turn-based MG.
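To make the idea of "two Bellman-like residuals" concrete, the following is a minimal sketch under stated assumptions, not the paper's algorithm. It assumes a tabular two-player game with a known model, whereas the abstract describes an empirical estimate built from batch data. It also assumes one natural reading of the two residuals: an evaluation residual (how far a candidate value `v` is from the value of the joint policy) and a best-response residual (how much a player could gain by deviating unilaterally). All names (`P`, `R`, `pi1`, `pi2`, `v`, `bellman_residuals`, `lp_loss`) are hypothetical.

```python
import numpy as np

def bellman_residuals(P, R, pi1, pi2, v, gamma):
    """Return (evaluation residual, best-response residual), each of shape (2, S).

    P:   (S, A1, A2, S) transition probabilities P(s' | s, a1, a2)
    R:   (2, S, A1, A2) reward of each player
    pi1: (S, A1) stationary policy of player 0
    pi2: (S, A2) stationary policy of player 1
    v:   (2, S) candidate value function of each player
    """
    # q[i, s, a1, a2] = r^i(s, a1, a2) + gamma * sum_s' P(s' | s, a1, a2) v^i(s')
    q = R + gamma * np.einsum("sabt,it->isab", P, v)
    # Evaluation operator: both players follow the joint policy.
    t_pi = np.einsum("isab,sa,sb->is", q, pi1, pi2)
    # Best-response operator: player i maximizes, the opponent keeps its policy.
    br0 = np.einsum("sab,sb->sa", q[0], pi2).max(axis=1)
    br1 = np.einsum("sab,sa->sb", q[1], pi1).max(axis=1)
    t_star = np.stack([br0, br1])
    return t_pi - v, t_star - v

def lp_loss(res_eval, res_best, p=2):
    """Sum of the mean p-th powers of both residuals (uniform weights assumed)."""
    return np.mean(np.abs(res_eval) ** p) + np.mean(np.abs(res_best) ** p)

# Toy example: a single-state repeated prisoner's dilemma (0 = cooperate, 1 = defect).
gamma = 0.9
P = np.ones((1, 2, 2, 1))
R = np.array([[[[3.0, 0.0], [5.0, 1.0]]],    # player 0
              [[[3.0, 5.0], [0.0, 1.0]]]])   # player 1
defect = np.array([[0.0, 1.0]])
cooperate = np.array([[1.0, 0.0]])

# Both defect, with v = 1 / (1 - gamma) = 10 for each player.
res_eval_dd, res_best_dd = bellman_residuals(
    P, R, defect, defect, np.full((2, 1), 10.0), gamma)
# Both cooperate, with v = 3 / (1 - gamma) = 30 for each player.
res_eval_cc, res_best_cc = bellman_residuals(
    P, R, cooperate, cooperate, np.full((2, 1), 30.0), gamma)
```

In this toy game, mutual defection drives both residuals to zero, while mutual cooperation leaves a positive best-response residual because each player gains by defecting. That is the sense in which a small residual norm can certify closeness to a Nash equilibrium under these assumptions.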
