pith. machine review for the scientific record.

arxiv: 2604.25972 · v1 · submitted 2026-04-28 · 💻 cs.LG · cs.AI · cs.MA

Recognition: unknown

A Survey of Multi-Agent Deep Reinforcement Learning with Graph Neural Network-Based Communication

Authors on Pith: no claims yet

Pith reviewed 2026-05-07 16:48 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.MA
keywords multi-agent reinforcement learning · graph neural networks · communication · survey · cooperative agents · deep reinforcement learning · interaction graphs

The pith

A survey organizes GNN-based communication methods in multi-agent reinforcement learning and proposes a unified process to classify them.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper observes that research on multi-agent reinforcement learning with communication via graph neural networks has grown without a clear way to tell the approaches apart. It reviews recent works in the area and introduces a generalized GNN-based communication process. This structure aims to reveal how agents exchange information through interaction graphs to improve their internal representations. A reader would care because the framework turns scattered methods into something easier to compare and understand. The survey therefore serves as both a map of the field and a tool for seeing its common ideas.

Core claim

The authors claim that existing MARL methods with GNN communication can be described by a single generalized process in which agents build an interaction graph, use GNN layers to propagate messages, and update their policies with the enriched representations. By laying out this process the survey makes the shared mechanics across different papers visible and supplies a way to distinguish the methods from one another.

What carries the argument

The generalized GNN-based communication process, which defines the steps of graph construction, message passing through GNN layers, and integration of received information into each agent's decision making.
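
To make these stages concrete, here is a minimal editorial sketch in PyTorch of what such a generalized process could look like. The names (GNNCommunicationLayer, communication_step, build_graph) are illustrative assumptions, not definitions from the paper; the surveyed methods differ mainly in how each stage is realized.

    import torch
    import torch.nn as nn

    class GNNCommunicationLayer(nn.Module):
        """One round of message passing: each agent aggregates its neighbours' embeddings."""

        def __init__(self, dim):
            super().__init__()
            self.message = nn.Linear(dim, dim)  # encode outgoing messages
            self.update = nn.GRUCell(dim, dim)  # fold aggregated messages into the agent state

        def forward(self, h, adj):
            # h: (n_agents, dim) agent embeddings; adj: (n_agents, n_agents) interaction graph
            msgs = self.message(h)
            agg = (adj @ msgs) / adj.sum(-1, keepdim=True).clamp(min=1)  # mean over neighbours
            return self.update(agg, h)  # enriched representations

    def communication_step(obs, encoder, comm_layers, policy, build_graph):
        # Generalized process: encode observations, build the graph, propagate, then act.
        h = encoder(obs)              # per-agent observation encodings
        adj = build_graph(h)          # 1. construct the interaction graph
        for layer in comm_layers:     # 2. several rounds of GNN message passing
            h = layer(h, adj)
        return policy(h)              # 3. decisions from the enriched representations

Concrete methods then differ in whether build_graph is fixed (for example distance-based), learned, or attention-weighted, and in how many rounds of message passing are applied before acting.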

If this is right

  • Methods can now be grouped by how they build the interaction graph and how they apply the GNN layers.
  • Readers can trace the same sequence of steps across papers that previously looked unrelated.
  • Future designs can be described directly in terms of where they deviate from the generalized process.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The framework could be used to spot patterns that current methods share and therefore to suggest hybrid approaches that combine strengths from several papers.
  • It might also highlight which stages of the communication process have received the least attention so far.
  • Researchers outside the immediate subfield could use the same steps to translate ideas from single-agent GNN work into the multi-agent setting.

Load-bearing premise

The current literature on GNN communication in multi-agent reinforcement learning has no explicit structure that lets readers classify and compare the different methods.

What would settle it

If every recent paper on the topic already fits cleanly into an existing taxonomy without needing the new generalized process, or if applying the process leaves the key ideas as unclear as before, the proposed framework would not achieve its stated goal.

Figures

Figures reproduced from arXiv: 2604.25972 by Laetitia Matignon (LIRIS), Maxime Morge (LIRIS, UCBL), and Valentin Cuzin-Rambaud (LIRIS).

Figure 1
Figure 1: An example of MPNN: the green node aggregates … · view at source ↗
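
For context, the figure refers to the message passing neural network (MPNN) scheme of Gilmer et al. [7], in which each node aggregates messages from its neighbours and then updates its own state. One round of the scheme, written here in our own notation rather than necessarily the paper's, reads:

    m_v^{t+1} = \sum_{w \in \mathcal{N}(v)} M_t\left(h_v^t, h_w^t, e_{vw}\right), \qquad
    h_v^{t+1} = U_t\left(h_v^t, m_v^{t+1}\right)

where h_v^t is the state of node v after round t, \mathcal{N}(v) its neighbourhood, e_{vw} an edge feature, and M_t and U_t are learned message and update functions.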
read the original abstract

In multi-agent reinforcement learning (MARL), the integration of a communication mechanism allows agents to better learn to coordinate their actions and converge on their objectives by sharing information. Based on an interaction graph, a subclass of methods employs graph neural networks (GNNs) to learn the communication, enabling agents to improve their internal representations by enriching them with the information exchanged. With growing research, we note a lack of explicit structure and framework to distinguish and classify MARL approaches with communication based on GNNs. Thus, this paper surveys recent works in this field. We propose a generalized GNN-based communication process with the goal of making the underlying concepts behind the methods more obvious and accessible.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper surveys recent literature on multi-agent deep reinforcement learning (MARL) methods that integrate graph neural networks (GNNs) for agent communication. It identifies a gap in explicit classification frameworks for these GNN-based approaches and proposes a generalized GNN-based communication process as an organizing conceptual tool to clarify underlying mechanisms across surveyed works.

Significance. If the proposed generalization functions as an effective taxonomic lens, the survey could help consolidate a rapidly expanding subfield by making conceptual commonalities more accessible, potentially aiding new researchers and highlighting directions for future MARL communication designs. As a purely expository contribution without new theorems, experiments, or code, its value rests on the breadth of coverage and the framework's clarity rather than deductive or empirical novelty.

minor comments (2)
  1. The abstract states that the generalized process aims to make concepts 'more obvious and accessible,' but the manuscript would benefit from an explicit comparison table or diagram in the framework section showing how at least three concrete surveyed methods map onto the generalized process steps.
  2. The claim of a 'lack of explicit structure' in the introduction would be strengthened by citing at least two prior surveys or taxonomies on MARL communication (even if they do not focus on GNNs) to demonstrate the specific gap being addressed.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of our survey and for recommending minor revision. The report correctly identifies our contribution as proposing a generalized GNN-based communication process to provide structure to the growing literature on MARL methods that use GNNs for agent communication. No specific major comments were raised in the report, so we have no individual points requiring detailed rebuttal or revision at this stage. We will use the opportunity of minor revision to further emphasize the clarity of the proposed framework and the breadth of coverage to ensure it serves as an effective taxonomic lens for the subfield.

Circularity Check

0 steps flagged

No significant circularity: survey proposes taxonomic framework without derivations or self-referential reductions

full rationale

This is a literature survey paper that reviews external works on MARL with GNN-based communication and proposes a generalized process purely as an organizational and expository tool to clarify concepts. No equations, predictions, fitted parameters, theorems, or first-principles derivations are advanced that could reduce to inputs by construction. The central contribution is taxonomic rather than deductive; it cites external literature without load-bearing self-citations or uniqueness claims imported from prior author work. The generalized communication process is presented as a conceptual abstraction, not a fitted or self-defined result. This is a standard honest non-finding for survey papers.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

Survey paper with no mathematical derivations, empirical fits, or new physical entities; the generalized process is a conceptual organizing tool rather than a postulated entity with independent evidence.

invented entities (1)
  • Generalized GNN-based communication process · no independent evidence
    purpose: To provide an explicit structure for classifying and understanding MARL methods that use GNNs for communication
    Proposed in the abstract to address the noted lack of framework; no falsifiable predictions or external validation supplied.

pith-pipeline@v0.9.0 · 5435 in / 1083 out tokens · 53135 ms · 2026-05-07T16:48:57.437642+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

32 extracted references · 3 canonical work pages · 2 internal anchors

  1. [1]

    Multi-Agent Reinforcement Learning: Foundations and Modern Approaches

    Stefano V. Albrecht, Filippos Christianos, and Lukas Schäfer. "Multi-Agent Reinforcement Learning: Foundations and Modern Approaches". MIT Press, 2024

  2. [2]

    Heterogeneous Multi-Robot Reinforcement Learning

    Matteo Bettini, Ajay Shankar, and Amanda Prorok. "Heterogeneous Multi-Robot Reinforcement Learning". In AAMAS, pages 1485–1494, 2023

  3. [3]

    Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?

    Christian Schroeder De Witt, Tarun Gupta, Denys Makoviichuk, Viktor Makoviychuk, Philip HS Torr, Mingfei Sun, and Shimon Whiteson. "Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?". arXiv preprint arXiv:2011.09533, 2020

  4. [4]

    Group-Aware Coordination Graph for Multi-Agent Reinforcement Learning

    Wei Duan, Jie Lu, and Junyu Xuan. "Group-Aware Coordination Graph for Multi-Agent Reinforcement Learning". In IJCAI, 2024

  5. [5]

    Inferring Latent Temporal Sparse Coordination Graph for Multiagent Reinforcement Learning

    Wei Duan, Jie Lu, and Junyu Xuan. "Inferring Latent Temporal Sparse Coordination Graph for Multiagent Reinforcement Learning". IEEE Transactions on Neural Networks and Learning Systems, 36(8):14358–14370, 2025

  6. [6]

    Bandwidth-Constrained Variational Message Encoding for Cooperative Multi-Agent Reinforcement Learning

    Wei Duan, Jie Lu, En Yu, and Junyu Xuan. "Bandwidth-Constrained Variational Message Encoding for Cooperative Multi-Agent Reinforcement Learning". In AAMAS, page 13, 2026

  7. [7]

    Neural Message Passing for Quantum Chemistry

    Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. "Neural Message Passing for Quantum Chemistry". In ICML, pages 1263–1272, 2017

  8. [8]

    Graph Neural Network-Based Multi-Agent Reinforcement Learning for Resilient Distributed Coordination of Multi-Robot Systems

    Anthony Goeckner, Yueyuan Sui, Nicolas Martinet, Xinliang Li, and Qi Zhu. "Graph Neural Network-Based Multi-Agent Reinforcement Learning for Resilient Distributed Coordination of Multi-Robot Systems". In IROS, pages 5732–5739, 2024

  9. [9]

    Inductive Representation Learning on Large Graphs

    Will Hamilton, Zhitao Ying, and Jure Leskovec. "Inductive Representation Learning on Large Graphs". In NeurIPS, volume 30, pages 1025–1035, 2017

  10. [10]

    Graph Convolutional Reinforcement Learning

    Jiechuan Jiang, Chen Dun, Tiejun Huang, and Zongqing Lu. "Graph Convolutional Reinforcement Learning". In ICLR, 2020

  11. [11]

    Semi-Supervised Classification with Graph Convolutional Networks

    Thomas N. Kipf and Max Welling. "Semi-Supervised Classification with Graph Convolutional Networks". In ICLR, 2017

  12. [12]

    Deep Implicit Coordination Graphs for Multi-Agent Reinforcement Learning

    Sheng Li, Jayesh K. Gupta, Peter Morales, Ross Allen, and Mykel J. Kochenderfer. "Deep Implicit Coordination Graphs for Multi-Agent Reinforcement Learning". In AAMAS, pages 764–772, 2021

  13. [13]

    Multi-Agent Game Abstraction via Graph Attention Neural Network

    Yong Liu, Weixun Wang, Yujing Hu, Jianye Hao, Xingguo Chen, and Yang Gao. "Multi-Agent Game Abstraction via Graph Attention Neural Network". In AAAI, volume 34, pages 7211–7218, 2020

  14. [14]

    Deep Hierarchical Communication Graph in Multi-Agent Reinforcement Learning

    Zeyang Liu, Lipeng Wan, Xue Sui, Zhuoran Chen, Kewu Sun, and Xuguang Lan. "Deep Hierarchical Communication Graph in Multi-Agent Reinforcement Learning". In IJCAI, pages 208–216, 2023

  15. [15]

    Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

    Ryan Lowe, Yi I Wu, Aviv Tamar, Jean Harb, OpenAI Pieter Abbeel, and Igor Mordatch. "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments". In NeurIPS, volume 30, pages 6382–6393, 2017

  16. [16]

    Asynchronous Methods for Deep Reinforcement Learning

    Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. "Asynchronous Methods for Deep Reinforcement Learning". In ICML, pages 1928–1937, 2016

  17. [17]

    Playing Atari with Deep Reinforcement Learning

    Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. "Playing Atari with Deep Reinforcement Learning". arXiv preprint arXiv:1312.5602, 2013

  18. [18]

    Multi-Agent Graph-Attention Communication and Teaming

    Yaru Niu, Rohan Paleja, and Matthew Gombolay. "Multi-Agent Graph-Attention Communication and Teaming". In AAMAS, pages 964–973, 2021

  19. [19]

    Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

    Tabish Rashid, Mikayel Samvelyan, Christian Schroeder De Witt, Gregory Farquhar, Jakob Foerster, and Shimon Whiteson. "Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning". JMLR, 21(178):1–51, 2020

  20. [20]

    The Graph Neural Network Model

    Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. "The Graph Neural Network Model". IEEE Transactions on Neural Networks, 20(1):61–80, 2008

  21. [21]

    Proximal Policy Optimization Algorithms

    John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, and Oleg Klimov. "Proximal Policy Optimization Algorithms". arXiv preprint arXiv:1707.06347, 2017

  22. [22]

    Learning Efficient Diverse Communication for Cooperative Heterogeneous Teaming

    Esmaeil Seraj, Zheyuan Wang, Rohan Paleja, Daniel Martin, Matthew Sklar, Anirudh Patel, and Matthew Gombolay. "Learning Efficient Diverse Communication for Cooperative Heterogeneous Teaming". In AAMAS, pages 1173–1182, 2022

  23. [23]

    Learning Structured Communication for Multi-Agent Reinforcement Learning

    Junjie Sheng, Xiangfeng Wang, Bo Jin, Junchi Yan, Wenhao Li, Tsung-Hui Chang, Jun Wang, and Hongyuan Zha. "Learning Structured Communication for Multi-Agent Reinforcement Learning". JAAMAS, 36(2):50, 2022

  24. [24]

    Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward

    Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinicius Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z. Leibo, Karl Tuyls, and Thore Graepel. "Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward". In AAMAS, pages 2085–2087, 2018

  25. [25]

    Multiagent Cooperation and Competition With Deep Reinforcement Learning

    Ardi Tampuu, Tambet Matiisen, Dorian Kodelja, Ilya Kuzovkin, Kristjan Korjus, Juhan Aru, Jaan Aru, and Raul Vicente. "Multiagent Cooperation and Competition With Deep Reinforcement Learning". PLoS ONE, 12(4):e0172395, 2017

  26. [26]

    Graph Attention Networks

    Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. "Graph Attention Networks". In ICLR, 2018

  27. [27]

    Q-learning

    Christopher JCH Watkins and Peter Dayan. "Q-learning". Machine Learning, 8(3):279–292, 1992

  28. [28]

    Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning

    Ronald J Williams. "Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning". Machine Learning, 8(3):229–256, 1992

  29. [29]

    Learning Graph-Enhanced Commander-Executor for Multi-Agent Navigation

    Xinyi Yang, Shiyu Huang, Yiwen Sun, Yuxiang Yang, Chao Yu, Wei-Wei Tu, Huazhong Yang, and Yu Wang. "Learning Graph-Enhanced Commander-Executor for Multi-Agent Navigation". In AAMAS, pages 1652–1660, 2023

  30. [30]

    The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games

    Chao Yu, Akash Velu, Eugene Vinitsky, Jiaxuan Gao, Yu Wang, Alexandre Bayen, and Yi Wu. "The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games". NeurIPS, 35:24611–24624, 2022

  31. [31]

    Graph Neural Networks: A Review of Methods and Applications

    Jie Zhou, Ganqu Cui, Shengding Hu, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Lifeng Wang, Changcheng Li, and Maosong Sun. "Graph Neural Networks: A Review of Methods and Applications". AI Open, 1:57–81, 2020

  32. [32]

    A Survey of Multi-Agent Deep Reinforcement Learning With Communication

    Changxi Zhu, Mehdi Dastani, and Shihan Wang. "A Survey of Multi-Agent Deep Reinforcement Learning With Communication". JAAMAS, 38(1), 2024