What is the Solution for State-Adversarial Multi-Agent Reinforcement Learning?
Various methods for Multi-Agent Reinforcement Learning (MARL) have been developed with the assumption that agents' policies are based on accurate state information. However, policies learned through Deep Reinforcement Learning (DRL) are susceptible to adversarial state perturbation attacks. In this work, we propose a State-Adversarial Markov Game (SAMG) and make the first attempt to investigate different solution concepts of MARL under state uncertainties. Our analysis shows that the commonly used solution concepts of optimal agent policy and robust Nash equilibrium do not always exist in SAMGs. To circumvent this difficulty, we consider a new solution concept called robust agent policy, where agents aim to maximize the worst-case expected state value. We prove the existence of robust agent policy for finite-state, finite-action SAMGs. Additionally, we propose a Robust Multi-Agent Adversarial Actor-Critic (RMA3C) algorithm to learn robust policies for MARL agents under state uncertainties. Our experiments demonstrate that our algorithm outperforms existing methods when faced with state perturbations and greatly improves the robustness of MARL policies. Our code is publicly available at https://songyanghan.github.io/what_is_solution/.
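The idea of maximizing the worst-case expected state value can be illustrated with a toy maximin selection. The sketch below is not the paper's SAMG formalism or the RMA3C algorithm. It assumes a hypothetical table of expected values in which each row is a candidate policy and each column is one adversarial state perturbation, and it picks the row whose worst entry is largest. The names `robust_choice` and `value_table` and all numbers are made up for illustration.

```python
from typing import List, Sequence, Tuple


def robust_choice(value_table: Sequence[Sequence[float]]) -> Tuple[int, float]:
    """Return (index, worst-case value) of the row with the best worst case.

    value_table[i][j] is a hypothetical expected value of candidate
    policy i when the adversary applies state perturbation j.
    """
    # Worst-case value of each candidate policy over all perturbations.
    worst: List[float] = [min(row) for row in value_table]
    # Pick the candidate whose worst-case value is largest (maximin).
    best = max(range(len(worst)), key=lambda i: worst[i])
    return best, worst[best]


# Illustrative numbers only (assumption, not results from the paper).
value_table = [
    [10.0, 1.0, 9.0],  # strong on average, fragile under perturbation 1
    [6.0, 5.0, 7.0],   # lower peak, but stable across perturbations
    [8.0, 2.0, 8.0],
]

best_index, best_worst_value = robust_choice(value_table)
print(best_index, best_worst_value)
```

In this toy table the second candidate is selected even though the first has the highest unperturbed value, which is the trade-off a worst-case objective makes.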
Forward citations
Cited by 5 Pith papers
- Interaction-Breaking Adversarial Learning Framework for Robust Multi-Agent Reinforcement Learning
  The IBAL framework constructs information-theoretic adversarial attacks on agent observations and actions to train MARL agents that remain robust to interaction disruptions and agent-missing scenarios.
- Interaction-Breaking Adversarial Learning Framework for Robust Multi-Agent Reinforcement Learning
  The IBAL framework builds information-theoretic attacks that break agent interactions in MARL and trains policies to stay robust under observation and action perturbations.
- Wolfpack Adversarial Attack for Robust Multi-Agent Reinforcement Learning
  The Wolfpack attack framework disrupts MARL cooperation by targeting initial and assisting agents; WALL trains robust policies against it with reported experimental gains.
- Robust and Safe Multi-Agent Reinforcement Learning with Communication for Autonomous Vehicles: From Simulation to Hardware
  RSR-RSMARL is a robust, safe MARL framework with V2V communication and CBF safety shields that supports zero-shot sim-to-real transfer and improves coordination on 1/10-scale vehicle hardware.
- PIMbot: A Self-Adaptive Attack Framework for Adversarial Manipulation of Multi-Robot Reinforcement Learning
  PIMbot introduces an adaptive attack that uses reward-channel and policy manipulation to disrupt cooperation in multi-robot social-dilemma RL, shown to be effective in Gazebo simulation and on NVIDIA Jetson hardware.