Convergence of actor-critic learning for mean field games and mean field control in continuous spaces
1 Pith paper cites this work. Polarity classification is still indexing.
Citation-role summary: background 1
Fields: cs.GT 1
Years: 2026 1
Verdicts: UNVERDICTED 1
Roles: background 1
Polarities: background 1
Representative citing papers:
Towards Model-Free Learning in Dynamic Population Games: An Application to Karma Economies
Model-free DQN learning achieves suboptimality bounds of O(1/sqrt(Ns)) + O(1/N) in Karma DPGs at equilibrium, and deep RL combined with fictitious play empirically reaches near-Stationary Nash Equilibrium from scratch.