Pareto actor-critic for equilibrium selection in multi-agent reinforcement learning
2 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: UNVERDICTED 2
Representative citing papers
Localized affinity regularization improves multi-agent performance on both competitive and cooperative objectives in a Fog of Love environment compared to standard MADDPG.
citing papers explorer
- Centralized Adaptive Sampling for Reliable Co-Training of Independent Multi-Agent Policies
CoSER adaptively samples joint actions in CTDE MARL to reduce sampling error relative to the joint on-policy distribution, empirically improving reliability of independent policy gradient convergence.
- Fog of Love: Engineering Virtuous Agent Behavior with Affinity-based Reinforcement Learning in a Game Environment
Localized affinity regularization improves multi-agent performance on both competitive and cooperative objectives in a Fog of Love environment compared to standard MADDPG.