Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches

Sanyam Kapoor

arxiv: 1807.09427 · v1 · pith:RNJP5ILZnew · submitted 2018-07-25 · 💻 cs.AI · cs.LG · stat.ML


classification 💻 cs.AI cs.LG stat.ML

keywords learning multi-agent report systems textit called challenges decentralized


Reinforcement Learning (RL) is a learning paradigm concerned with learning to control a system so as to maximize an objective over the long term. This approach has received immense interest in recent years, with successes such as human-level performance on games like Go. While RL is emerging as a practical component in real-life systems, most successes have been in single-agent domains. This report instead focuses on challenges that are unique to Multi-Agent Systems interacting in mixed cooperative and competitive environments. It concludes with advances in a paradigm for training Multi-Agent Systems called Decentralized Actor, Centralized Critic, based on an extension of MDPs called Decentralized Partially Observable MDPs, which has seen renewed interest lately.


Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Adaptive Estimation and Optimal Control in Offline Contextual MDPs without Stationarity
stat.ML 2026-05 unverdicted novelty 7.0

A T-estimation-based procedure for adaptive density estimation and optimal control in offline contextual MDPs without stationarity, providing oracle risk bounds under two loss functions and finite-sample cost guarantees.

Nested Reinforcement Learning Based Control for Protective Relays in Power Distribution Systems
eess.SY 2019-06 unverdicted novelty 6.0

A nested reinforcement learning approach is introduced for setting protective relays to distinguish faults from heavy loads in distribution systems with high distributed energy resource penetration.