Episodic Memory Deep Q-Networks
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
Reinforcement learning (RL) algorithms have made huge progress in recent years by leveraging the power of deep neural networks (DNNs). Despite this success, deep RL algorithms are known to be sample inefficient, often requiring many rounds of interaction with the environment to obtain satisfactory performance. Recently, episodic memory based RL has attracted attention due to its ability to latch onto good actions quickly. In this paper, we present a simple yet effective biologically inspired RL algorithm called Episodic Memory Deep Q-Networks (EMDQN), which leverages episodic memory to supervise an agent during training. Experiments show that our proposed method can lead to better sample efficiency and is more likely to find good policies. It requires only 1/5 of the interactions of DQN to reach state-of-the-art performance on many Atari games, significantly outperforming regular DQN and other episodic memory based RL algorithms.
fields
cs.LG 1
years
2026 1
verdicts
UNVERDICTED 1
representative citing papers
Episodic Memory Temporal Consistency for Cooperative Multi-Agent Reinforcement Learning
EMTC adds temporal consistency to episodic memory in MARL via contrastive time-conditioned embeddings and dynamic gating, backed by an error bound and yielding up to 24% win-rate gains on hard SMAC and 28% on GRF.