Counterfactual multi-agent policy gradients

2017 · arXiv 1705.08926

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Plasticity-Enhanced Multi-Agent Mixture of Experts for Dynamic Objective Adaptation in UAVs-Assisted Emergency Communication Networks

cs.MA · 2026-04-10 · unverdicted · novelty 7.0

PE-MAMoE combines sparsely gated mixture-of-experts actors with a non-parametric phase controller in MAPPO to maintain plasticity under dynamic user mobility and traffic, yielding 26.3% higher normalized IQM return in simulations.

Scalable Neighborhood-Based Multi-Agent Actor-Critic

cs.LG · 2026-04-20 · unverdicted · novelty 6.0

MADDPG-K scales centralized critics in multi-agent RL by limiting each critic to k-nearest neighbors under Euclidean distance, yielding constant input size and competitive performance.

Learning Safe Unlabeled Multi-Robot Planning with Motion Constraints

cs.RO · 2019-07-11 · unverdicted · novelty 5.0

A multi-agent RL framework for unlabeled multi-robot planning that uses velocity obstacle projections to guarantee collision-free trajectories applicable to arbitrary robot models.

citing papers explorer

Showing 3 of 3 citing papers.

Plasticity-Enhanced Multi-Agent Mixture of Experts for Dynamic Objective Adaptation in UAVs-Assisted Emergency Communication Networks

cs.MA · 2026-04-10 · unverdicted · none · ref 4

Scalable Neighborhood-Based Multi-Agent Actor-Critic

cs.LG · 2026-04-20 · unverdicted · none · ref 2

Learning Safe Unlabeled Multi-Robot Planning with Motion Constraints

cs.RO · 2019-07-11 · unverdicted · none · ref 11
