arxiv: 1809.09332 · v2 · pith:PPULHOP2new · submitted 2018-09-25 · 💻 cs.LG · cs.AI · cs.MA

Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction

classification 💻 cs.LG cs.AI cs.MA
keywords multiagent hierarchical marl sparse abstraction deep learning challenging
Multiagent reinforcement learning (MARL) is commonly considered to suffer from non-stationary environments and an exponentially growing policy space. The problem becomes even more challenging when rewards are sparse and delayed over long trajectories. In this paper, we study hierarchical deep MARL in cooperative multiagent problems with sparse and delayed rewards. With temporal abstraction, we decompose the problem into a hierarchy of different time scales and investigate how agents can learn high-level coordination based on the independent skills learned at the low level. We propose three hierarchical deep MARL architectures to learn hierarchical policies under different MARL paradigms. We also propose a new experience replay mechanism to alleviate the sparsity of transitions at the high level of abstraction and the non-stationarity of multiagent learning. We empirically demonstrate the effectiveness of our approaches in two domains with extremely sparse feedback: (1) a variety of Multiagent Trash Collection tasks, and (2) a challenging online mobile game, Fever Basketball Defense.
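
To illustrate what temporal abstraction means for the high level, here is a minimal single-agent sketch, not the paper's architecture. It assumes a hypothetical toy corridor environment (`corridor_step`, `GOAL`) and a hypothetical helper `run_skill` that executes one low-level skill for several primitive steps and collapses them into a single high-level transition with a discounted reward sum. All names and the environment are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

GOAL = 5  # hypothetical goal cell; the only place a reward is given (sparse)


@dataclass
class HighLevelTransition:
    state: int       # state when the skill was selected
    skill: int       # index of the skill chosen by the high level
    reward: float    # discounted sum of rewards collected while the skill ran
    next_state: int  # state when the skill stopped
    steps: int       # number of primitive steps the skill lasted


def corridor_step(state: int, skill: int) -> Tuple[int, float, bool]:
    """Toy dynamics: skill 1 moves right, any other skill moves left."""
    nxt = max(0, state + (1 if skill == 1 else -1))
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def run_skill(
    state: int,
    skill: int,
    env_step: Callable[[int, int], Tuple[int, float, bool]],
    max_steps: int,
    gamma: float,
) -> Tuple[HighLevelTransition, bool]:
    """Run one skill for up to max_steps primitive steps.

    Returns a single high-level transition plus the episode-done flag.
    """
    start, total, discount, steps, done = state, 0.0, 1.0, 0, False
    while steps < max_steps and not done:
        state, reward, done = env_step(state, skill)
        total += discount * reward
        discount *= gamma
        steps += 1
    return HighLevelTransition(start, skill, total, state, steps), done


if __name__ == "__main__":
    t1, done1 = run_skill(0, 1, corridor_step, max_steps=3, gamma=0.9)
    t2, done2 = run_skill(t1.next_state, 1, corridor_step, max_steps=3, gamma=0.9)
    print(t1, done1)
    print(t2, done2)
```

Because each skill spans several primitive steps, the high level sees far fewer transitions than the low level, which is the sparsity the abstract's replay mechanism is said to address. In a multiagent setting, each agent would presumably produce such transitions on its own schedule; how the paper handles that is not covered by this sketch.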

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Focusing Influence Mechanism for Multi-Agent Reinforcement Learning

    cs.LG 2025-06 unverdicted novelty 5.0

    The Focusing Influence Mechanism (FIM) uses an entropy-based criterion and eligibility traces to help multiple agents in reinforcement learning focus and maintain their influence on under-explored parts of the state s...