Actor-Attention-Critic for Multi-Agent Reinforcement Learning

2018 · cs.LG · arXiv 1810.02912

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

Reinforcement learning in multi-agent scenarios is important for real-world applications but presents challenges beyond those seen in single-agent settings. We present an actor-critic algorithm that trains decentralized policies in multi-agent settings, using centrally computed critics that share an attention mechanism which selects relevant information for each agent at every timestep. This attention mechanism enables more effective and scalable learning in complex multi-agent environments, when compared to recent approaches. Our approach is applicable not only to cooperative settings with shared rewards, but also individualized reward settings, including adversarial settings, as well as settings that do not provide global states, and it makes no assumptions about the action spaces of the agents. As such, it is flexible enough to be applied to most multi-agent learning problems.
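The attention step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the attention, so this example assumes plain scaled dot-product attention with a softmax and no learned projections; the function names `softmax` and `attend` and the per-agent `embeddings` are hypothetical and do not come from the paper. It shows only how one agent's critic could weight information from the other agents at a single timestep.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(embeddings, i):
    """Attention of agent i over all OTHER agents (illustrative only).

    embeddings: one equal-length float vector per agent, standing in for
    an encoding of that agent's observation and action (an assumption).
    Returns (weights, context): weights maps each other agent's index to
    its attention weight; context is the weighted sum of their vectors.
    """
    query = embeddings[i]
    others = [j for j in range(len(embeddings)) if j != i]
    scale = math.sqrt(len(query))
    # Scaled dot-product score between agent i and each other agent.
    scores = [
        sum(q * k for q, k in zip(query, embeddings[j])) / scale
        for j in others
    ]
    weights = softmax(scores)
    # Weighted sum of the other agents' vectors, one entry per dimension.
    context = [
        sum(w * embeddings[j][d] for w, j in zip(weights, others))
        for d in range(len(query))
    ]
    return dict(zip(others, weights)), context


# Three hypothetical agents; agent 1 resembles agent 0 more than agent 2 does.
embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights, context = attend(embeddings, 0)
```

In this sketch, agent 0 places more weight on agent 1 than on agent 2 because their vectors align. A critic for agent 0 would then take its own encoding together with `context` as input; a learned version would add trainable query, key, and value projections.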

representative citing papers

LLM-Enhanced Multi-Agent Reinforcement Learning with Expert Workflow for Real-Time P2P Energy Trading

cs.MA · 2025-07-20 · unverdicted · novelty 6.0

An LLM-enhanced MARL system with differential attention critic produces lower economic costs and voltage violations than baselines in simulated real-time P2P electricity trading.

Asynchronous Cooperative Multi-Agent Reinforcement Learning with Limited Communication

cs.MA · 2025-02-01 · unverdicted · novelty 6.0

AsynCoMARL is a new asynchronous MARL algorithm that matches leading baselines on success and collision rates while using 26% fewer messages via graph transformers on dynamic communication graphs.

citing papers explorer

LLM-Enhanced Multi-Agent Reinforcement Learning with Expert Workflow for Real-Time P2P Energy Trading

cs.MA · 2025-07-20 · unverdicted · none · ref 32 · internal anchor

Asynchronous Cooperative Multi-Agent Reinforcement Learning with Limited Communication

cs.MA · 2025-02-01 · unverdicted · none · ref 6 · internal anchor
