A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning

Angeliki Lazaridou; Audrunas Gruslys; David Silver; Julien Perolat; Karl Tuyls; Marc Lanctot; Thore Graepel; Vinicius Zambaldi

arxiv: 1711.00832 · v2 · pith:XG2LM6VVnew · submitted 2017-11-02 · 💻 cs.AI · cs.GT · cs.LG · cs.MA

Marc Lanctot, Vinicius Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Perolat, David Silver, Thore Graepel

classification 💻 cs.AI · cs.GT · cs.LG · cs.MA

keywords learning · policies · reinforcement · InRL · agents · algorithm · best · during

To achieve general intelligence, agents must learn how to interact with others in a shared environment: this is the challenge of multiagent reinforcement learning (MARL). The simplest form is independent reinforcement learning (InRL), where each agent treats its experience as part of its (non-stationary) environment. In this paper, we first observe that policies learned using InRL can overfit to the other agents' policies during training, failing to sufficiently generalize during execution. We introduce a new metric, joint-policy correlation, to quantify this effect. We describe an algorithm for general MARL, based on approximate best responses to mixtures of policies generated using deep reinforcement learning, and empirical game-theoretic analysis to compute meta-strategies for policy selection. The algorithm generalizes previous ones such as InRL, iterated best response, double oracle, and fictitious play. Then, we present a scalable implementation which reduces the memory requirement using decoupled meta-solvers. Finally, we demonstrate the generality of the resulting policies in two partially observable settings: gridworld coordination games and poker.
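The abstract's claim that one algorithm generalizes iterated best response and fictitious play can be illustrated with a small tabular sketch. This is not the paper's implementation: it assumes rock-paper-scissors as the game, replaces the deep reinforcement learning oracle with an exact best response, and shares one population between both players because the game is symmetric. All function names (`best_response`, `uniform_meta`, `last_only_meta`, `run`, `empirical_mix`) are hypothetical. The only thing that changes between the two special cases is the meta-solver, the rule that assigns weights to the policies found so far.

```python
# Row player's payoff in rock-paper-scissors (zero-sum, symmetric).
# Action indices: 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def best_response(opponent_mix):
    """Exact best response (an action index) to a mixture over actions.

    Stands in for the approximate, learned best response; ties go to
    the lowest action index.
    """
    values = [
        sum(p * PAYOFF[a][b] for b, p in enumerate(opponent_mix))
        for a in range(3)
    ]
    return max(range(3), key=lambda a: values[a])


def uniform_meta(population):
    """Equal weight on every policy so far (fictitious-play-like)."""
    n = len(population)
    return [1.0 / n] * n


def last_only_meta(population):
    """All weight on the newest policy (iterated best response)."""
    return [0.0] * (len(population) - 1) + [1.0]


def run(meta_solver, iterations, first_action=0):
    """Grow a population of pure policies by repeated best responses."""
    population = [first_action]
    for _ in range(iterations):
        weights = meta_solver(population)
        # Turn the meta-strategy over policies into a mixture over actions.
        mix = [0.0, 0.0, 0.0]
        for weight, action in zip(weights, population):
            mix[action] += weight
        population.append(best_response(mix))
    return population


def empirical_mix(population):
    """Fraction of the population playing each action."""
    return [population.count(a) / len(population) for a in range(3)]
```

Under these assumptions, `run(last_only_meta, 6)` cycles rock, paper, scissors indefinitely, while `empirical_mix(run(uniform_meta, 2000))` drifts toward the uniform mixture, which is the equilibrium of this game. Swapping the meta-solver is the design point; the oracle and the loop stay the same.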

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

How Much Due Diligence Before You Bid? Learning in Intractable Takeover Auctions
cs.AI · 2026-06 · unverdicted · novelty 6.0

Self-play RL in a takeover auction model shows optimal due diligence is modest and finite, decreasing with cost and competition, while simple general methods outperform specialized ones in large intractable games.