Provably Convergent Actor-Critic for MARL through Risk-aversion

Eric Mazumdar; Yizhou Zhang

arxiv: 2602.12386 · v2 · pith:4NRC2BQWnew · submitted 2026-02-12 · 💻 cs.MA · cs.GT · cs.LG

classification 💻 cs.MA cs.GT cs.LG

keywords learning stationary actor-critic algorithm convergence demonstrate equilibria games

Learning stationary policies in infinite-horizon general-sum Markov games (MGs) remains a fundamental open problem in Multi-Agent Reinforcement Learning (MARL). While stationary strategies are preferred for their practicality, computing stationary forms of classic game-theoretic equilibria is computationally intractable -- a stark contrast to the comparative ease of solving single-agent RL or zero-sum games. To bridge this gap, we study Risk-averse Quantal response Equilibria (RQE), a solution concept rooted in behavioral game theory that incorporates risk aversion and bounded rationality. We demonstrate that RQE possesses strong regularity conditions that make it uniquely amenable to learning in MGs. We propose a novel two-timescale Actor-Critic algorithm characterized by a fast-timescale actor and a slow-timescale critic. Leveraging the regularity of RQE, we prove that this approach achieves global convergence with finite-sample guarantees. We empirically validate our algorithm in several environments to demonstrate superior convergence properties compared to risk-neutral baselines.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Strategically Robust Linear Quadratic Dynamic Games
math.OC 2026-04 unverdicted novelty 7.0

Strategically robust LQ dynamic games reduce to standard LQ games with penalized fictitious adversaries, admitting unique Markovian linear equilibria computable by coupled Riccati equations, with simulations revealing...