pith. machine review for the scientific record. sign in

arxiv: 1901.08106 · v2 · submitted 2019-01-23 · 💻 cs.LG · cs.GT· cs.MA· stat.ML

Recognition: unknown

Open-ended Learning in Symmetric Zero-sum Games

Authors on Pith no claims yet
classification 💻 cs.LG cs.GTcs.MAstat.ML
keywords gamesagentsnontransitivepsrozero-sumconstructexistingframework
0
0 comments X
read the original abstract

Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them `winner' and `loser'. If the game is approximately transitive, then self-play generates sequences of agents of increasing strength. However, nontransitive games, such as rock-paper-scissors, can exhibit strategic cycles, and there is no longer a clear objective -- we want agents to increase in strength, but against whom is unclear. In this paper, we introduce a geometric framework for formulating agent objectives in zero-sum games, in order to construct adaptive sequences of objectives that yield open-ended learning. The framework allows us to reason about population performance in nontransitive games, and enables the development of a new algorithm (rectified Nash response, PSRO_rN) that uses game-theoretic niching to construct diverse populations of effective agents, producing a stronger set of agents than existing algorithms. We apply PSRO_rN to two highly nontransitive resource allocation games and find that PSRO_rN consistently outperforms the existing alternatives.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Dota 2 with Large Scale Deep Reinforcement Learning

    cs.LG 2019-12 accept novelty 7.0

    OpenAI Five achieved superhuman performance in Dota 2 by defeating the world champions using scaled self-play reinforcement learning.