Deep Reinforcement Learning for Multi-Agent Systems: A Review of Challenges, Solutions and Applications

Ngoc Duy Nguyen; Saeid Nahavandi; Thanh Thi Nguyen

arxiv: 1812.11794 · v2 · pith:VJEY4CXLnew · submitted 2018-12-31 · 💻 cs.LG · cs.AI · cs.MA · stat.ML

Thanh Thi Nguyen, Ngoc Duy Nguyen, Saeid Nahavandi

classification 💻 cs.LG cs.AI cs.MA stat.ML

keywords learning, multi-agent, deep, methods, problems, agents, algorithms, applications

Reinforcement learning (RL) algorithms have been around for decades and have been employed to solve various sequential decision-making problems. These algorithms, however, have faced great challenges when dealing with high-dimensional environments. The recent development of deep learning has enabled RL methods to derive optimal policies for sophisticated and capable agents, which can perform efficiently in these challenging environments. This paper addresses an important aspect of deep RL related to situations that require multiple agents to communicate and cooperate to solve complex tasks. A survey of different approaches to problems related to multi-agent deep RL (MADRL) is presented, including non-stationarity, partial observability, continuous state and action spaces, multi-agent training schemes, and multi-agent transfer learning. The merits and demerits of the reviewed methods are analyzed and discussed, and their corresponding applications are explored. It is envisaged that this review provides insights into various MADRL methods and can lead to the future development of more robust and highly useful multi-agent learning methods for solving real-world problems.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

ARMS: Automatic Reward Shaping for Sparse-Reward Multi-Agent Reinforcement Learning
cs.MA 2026-05 unverdicted novelty 7.0

ARMS is an automatic reward-shaping framework for sparse-reward MARL that uses trajectory ranking and conditional best-response reasoning to preserve Nash equilibria while improving sampling efficiency in pathfinding tasks.

An Actor-Critic-Attention Mechanism for Deep Reinforcement Learning in Multi-view Environments
cs.LG 2019-07 unverdicted novelty 4.0

An attention-augmented actor-critic agent learns to dynamically weight multiple environment views by importance and outperforms baselines on TORCS and three other 3D simulators under noise and partial observability.

A Deep Reinforcement Learning Approach for Global Routing
cs.LG 2019-06 unverdicted novelty 4.0

A deep RL agent trained on generated global routing instances outperforms sequential A* search.