PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration

Fazl Barez; Hongyao Tang; Jianye Hao; Matthew E. Taylor; Pengyi Li; Tianpei Yang; Tong Sang; Wenyuan Tao; Xiaotian Hao; Yan Zheng

arxiv: 2203.08553 · v4 · pith:26DURGNNnew · submitted 2022-03-16 · 💻 cs.MA · cs.AI

Pengyi Li, Hongyao Tang, Tianpei Yang, Xiaotian Hao, Tong Sang, Yan Zheng, Jianye Hao, Matthew E. Taylor, Wenyuan Tao, Zhen Wang, Fazl Barez

keywords: collaboration, PMIC, learning, behaviors, information, MARL, maximizing, mutual

Learning to collaborate is critical in Multi-Agent Reinforcement Learning (MARL). Previous works promote collaboration by maximizing the correlation of agents' behaviors, which is typically characterized by Mutual Information (MI) in different forms. However, we reveal sub-optimal collaborative behaviors also emerge with strong correlations, and simply maximizing the MI can, surprisingly, hinder the learning towards better collaboration. To address this issue, we propose a novel MARL framework, called Progressive Mutual Information Collaboration (PMIC), for more effective MI-driven collaboration. PMIC uses a new collaboration criterion measured by the MI between global states and joint actions. Based on this criterion, the key idea of PMIC is maximizing the MI associated with superior collaborative behaviors and minimizing the MI associated with inferior ones. The two MI objectives play complementary roles by facilitating better collaborations while avoiding falling into sub-optimal ones. Experiments on a wide range of MARL benchmarks show the superior performance of PMIC compared with other algorithms.
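
The collaboration criterion in the abstract, the MI between global states and joint actions, can be illustrated with a minimal sketch. It assumes small discrete state and action spaces and uses a simple count-based (plug-in) estimator; the estimators PMIC actually uses are not described on this page. The function name `mutual_information` and the toy datasets are hypothetical.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of I(S; U) in nats from (state, joint_action) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    states = Counter(s for s, _ in pairs)
    actions = Counter(u for _, u in pairs)
    # p(s,u) / (p(s) p(u)) simplifies to count(s,u) * n / (count(s) * count(u)).
    return sum(
        (c / n) * math.log(c * n / (states[s] * actions[u]))
        for (s, u), c in joint.items()
    )


# Hypothetical toy data: two global states, two agents with binary actions.
superior = [(0, (0, 0)), (1, (1, 1))] * 50  # assumed to earn high return
inferior = [(0, (1, 1)), (1, (0, 0))] * 50  # assumed to earn low return
uncorrelated = [(s, (a, b)) for s in (0, 1) for a in (0, 1) for b in (0, 1)] * 10

mi_superior = mutual_information(superior)
mi_inferior = mutual_information(inferior)
mi_uncorrelated = mutual_information(uncorrelated)
print(mi_superior, mi_inferior, mi_uncorrelated)
```

In this toy setting the two state-dependent behaviors receive the same MI estimate while the uncorrelated one receives zero, so MI alone cannot tell a superior behavior from an inferior one. That is consistent with the abstract's motivation for maximizing MI only on superior behaviors and minimizing it on inferior ones.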

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Interaction-Breaking Adversarial Learning Framework for Robust Multi-Agent Reinforcement Learning
cs.LG 2026-05 unverdicted novelty 7.0

IBAL framework constructs information-theoretic adversarial attacks on agent observations and actions to train MARL agents that remain robust to interaction disruptions and agent-missing scenarios.

Interaction-Breaking Adversarial Learning Framework for Robust Multi-Agent Reinforcement Learning
cs.LG 2026-05 unverdicted novelty 6.0

The IBAL framework builds information-theoretic attacks that break agent interactions in MARL and trains policies to stay robust under observation and action perturbations.

Centralized Adaptive Sampling for Reliable Co-Training of Independent Multi-Agent Policies
cs.LG 2025-08 unverdicted novelty 5.0

CoSER adaptively samples joint actions in CTDE MARL to reduce sampling error relative to the joint on-policy distribution, empirically improving reliability of independent policy gradient convergence.