pith · machine review for the scientific record

arXiv: 1711.00832 · v2 · submitted 2017-11-02 · 💻 cs.AI · cs.GT · cs.LG · cs.MA

Recognition: unknown

A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

Authors on Pith: no claims yet
classification: 💻 cs.AI · cs.GT · cs.LG · cs.MA
keywords: learning · policies · reinforcement · inrl · agents · algorithm · best · during
original abstract

To achieve general intelligence, agents must learn how to interact with others in a shared environment: this is the challenge of multiagent reinforcement learning (MARL). The simplest form is independent reinforcement learning (InRL), where each agent treats its experience as part of its (non-stationary) environment. In this paper, we first observe that policies learned using InRL can overfit to the other agents' policies during training, failing to sufficiently generalize during execution. We introduce a new metric, joint-policy correlation, to quantify this effect. We describe an algorithm for general MARL, based on approximate best responses to mixtures of policies generated using deep reinforcement learning, and empirical game-theoretic analysis to compute meta-strategies for policy selection. The algorithm generalizes previous ones such as InRL, iterated best response, double oracle, and fictitious play. Then, we present a scalable implementation which reduces the memory requirement using decoupled meta-solvers. Finally, we demonstrate the generality of the resulting policies in two partially observable settings: gridworld coordination games and poker.
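
The abstract names joint-policy correlation but gives no formula. One natural reading: train several agent pairs independently, evaluate every cross-pairing, and measure the proportional drop in return when agents are matched with partners from a different training run. Below is a minimal Python sketch under that reading; the function name, the cross-play-matrix input, and the exact loss form (D - O) / D are our assumptions, not notation taken from this page.

import numpy as np

def avg_proportional_loss(cross_play):
    # cross_play[i, j]: mean return when player 1 from training run i is
    # paired with player 2 from training run j. Diagonal entries are the
    # "matched" pairings seen during training; off-diagonal entries are
    # mismatched pairings never trained together.
    cross_play = np.asarray(cross_play, dtype=float)
    n = cross_play.shape[0]
    matched = np.trace(cross_play) / n
    mismatched = (cross_play.sum() - np.trace(cross_play)) / (n * (n - 1))
    # Proportional return lost by swapping in a partner from another run.
    return (matched - mismatched) / matched

# Hypothetical cross-play returns from three independent runs of a
# coordination game (numbers invented for illustration):
returns = [[30.0, 12.0, 10.0],
           [11.0, 28.0,  9.0],
           [13.0, 10.0, 31.0]]
print(avg_proportional_loss(returns))  # ~0.63

A value near zero means the policies generalize across partners; a large value is exactly the InRL overfitting effect the abstract describes.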
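
The algorithm itself is also only outlined: grow a population of policies per player, compute meta-strategies (mixtures over the population) by empirical game-theoretic analysis, and train approximate best responses against those mixtures. The toy loop below runs that cycle on rock-paper-scissors, with an exact best-response oracle and a uniform meta-solver standing in for the paper's deep RL oracle and empirical meta-solvers; all names and structure here are our illustrative assumptions.

import numpy as np

# Row player's payoffs for rock-paper-scissors; the game is zero-sum,
# so the column player receives the negation.
RPS = np.array([[ 0., -1.,  1.],
                [ 1.,  0., -1.],
                [-1.,  1.,  0.]])

def uniform_meta_solver(population_size):
    # Simplest meta-solver: a uniform mixture over the population.
    return np.ones(population_size) / population_size

def exact_best_response(payoff, opponent_population, opponent_mixture):
    # Expected payoff of each pure action against the opponent's mixture
    # over its population; the full algorithm replaces this exact oracle
    # with an approximate best response trained by deep RL.
    expected = payoff[:, opponent_population] @ opponent_mixture
    return int(np.argmax(expected))

def oracle_loop(payoff, iterations=10):
    row_pop, col_pop = [0], [0]  # each population starts with one arbitrary pure policy
    for _ in range(iterations):
        sigma_row = uniform_meta_solver(len(row_pop))
        sigma_col = uniform_meta_solver(len(col_pop))
        # Each player best-responds to the opponent's current meta-strategy,
        # then both populations grow by one policy.
        new_row = exact_best_response(payoff, col_pop, sigma_col)
        new_col = exact_best_response(-payoff.T, row_pop, sigma_row)
        row_pop.append(new_row)
        col_pop.append(new_col)
    return row_pop, col_pop

print(oracle_loop(RPS))

With the uniform meta-solver the loop behaves like fictitious play over the population; plugging a Nash solver for the empirical payoff table into the same slot gives double-oracle behaviour, which is the sense in which a single algorithm generalizes both, as the abstract claims.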

This paper has not been read by Pith yet.

discussion (0)
