A Friend-or-Foe framework for multi-agent reinforcement learning policy generation in mixing cooperative competitive scenarios
Cited by 1 Pith paper. Polarity classification is still indexing.
Fields: cs.GT (1)
Years: 2023 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Learning Strategic Value and Cooperation in Multi-Player Stochastic Games through Side Payments
Introduces HS-S (aggregating dynamic threat powers) and Coco-S (fixed points of statewise HS Bellman operator) for stochastic games, proves they coincide for two players but disagree for three, shows uniqueness via extended axioms and topological degree theory, and gives sampling estimators.