Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction

Changjie Fan; Chunxu Ren; Hangtian Jia; Hongyao Tang; Jianye Hao; Li Wang; Tangjie Lv; Yan Zheng; Yingfeng Chen; Zhaopeng Meng

arxiv: 1809.09332 · v2 · pith:PPULHOP2new · submitted 2018-09-25 · 💻 cs.LG · cs.AI · cs.MA

Hongyao Tang, Jianye Hao, Tangjie Lv, Yingfeng Chen, Zongzhang Zhang, Hangtian Jia, Chunxu Ren, Yan Zheng, Zhaopeng Meng, Changjie Fan, Li Wang

classification 💻 cs.LG cs.AI cs.MA

keywords multiagent hierarchical marl sparse abstraction deep learning challenging

Multiagent reinforcement learning (MARL) is commonly considered to suffer from non-stationary environments and an exponentially growing policy space. The problem becomes even more challenging when rewards are sparse and delayed over long trajectories. In this paper, we study hierarchical deep MARL in cooperative multiagent problems with sparse and delayed rewards. With temporal abstraction, we decompose the problem into a hierarchy of different time scales and investigate how agents can learn high-level coordination based on the independent skills learned at the low level. We propose three hierarchical deep MARL architectures to learn hierarchical policies under different MARL paradigms. In addition, we propose a new experience replay mechanism to alleviate the sparsity of transitions at the high level of abstraction and the non-stationarity of multiagent learning. We empirically demonstrate the effectiveness of our approaches in two domains with extremely sparse feedback: (1) a variety of Multiagent Trash Collection tasks, and (2) a challenging online mobile game, Fever Basketball Defense.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Focusing Influence Mechanism for Multi-Agent Reinforcement Learning
cs.LG 2025-06 unverdicted novelty 5.0

The Focusing Influence Mechanism (FIM) uses an entropy-based criterion and eligibility traces to help multiple agents in reinforcement learning focus and maintain their influence on under-explored parts of the state s...