Opponent-aware peer-learning corrections in finite-unroll Meta-MAPG increase entry probability into target stable-Nash basins relative to standard policy gradient, with annealing to recover local convergence.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Equilibrium Selection in Multi-Agent Policy Gradients via Opponent-Aware Basin Entry