Robustness and Approximation of Discrete-time Mean-field Games under Discounted Cost Criterion

Naci Saldi; U\u{g}ur Ayd{\i}n

arxiv: 2310.10828 · v1 · submitted 2023-10-16 · 📡 eess.SY · cs.SY· math.OC

Robustness and Approximation of Discrete-time Mean-field Games under Discounted Cost Criterion

U\u{g}ur Ayd{\i}n , Naci Saldi This is my paper

Pith reviewed 2026-05-24 06:17 UTC · model grok-4.3

classification 📡 eess.SY cs.SYmath.OC

keywords mean-field gamesrobustnessvalue iterationdiscounted costfinite approximationmodel uncertaintydiscrete-timestationary equilibria

0 comments

The pith

Stationary mean-field equilibria from value iteration remain robust to dynamics misspecifications and admit finite-model approximations under fine state quantization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper first derives conditions guaranteeing convergence of value iteration to stationary mean-field equilibria in discrete-time discounted games. It then proves that these equilibria stay close when the transition dynamics are perturbed by small misspecifications. The same robustness property is used to show that quantizing the state space finely enough produces a finite-state game whose equilibrium approximates the original equilibrium arbitrarily closely.

Core claim

The mean-field equilibrium obtained through this value iteration algorithm remains robust even in the face of system dynamics misspecifications. We then apply these robustness findings to the finite model approximation problem in mean-field games, showing that if the state space quantization is fine enough, the mean-field equilibrium for the finite model closely approximates the nominal one.

What carries the argument

Value iteration algorithm for stationary mean-field equilibria, whose convergence supplies the robustness bound that transfers to finite quantized models.

If this is right

Small errors in the transition kernel produce only small changes in the equilibrium when value iteration has converged.
Arbitrarily accurate approximation of the nominal equilibrium is possible by refining the state quantization.
The robustness and approximation results apply specifically to stationary equilibria under discounted infinite-horizon costs.
Finite-model equilibria can be computed directly and then transferred back to the original model with controlled error.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Controllers designed from the finite approximation can be deployed on the original system while retaining performance guarantees under modest model error.
Similar robustness arguments might apply to other iterative solution methods provided their convergence can be established first.
The results suggest a practical workflow: solve the quantized game, then verify robustness margins before implementation.

Load-bearing premise

Value iteration converges to a stationary mean-field equilibrium under the stated conditions.

What would settle it

A dynamics perturbation of size epsilon for which the equilibrium strategies or costs differ by more than a fixed delta even though value iteration converged, or a quantization level that fails to make the finite-model equilibrium close.

read the original abstract

In this paper, we investigate the robustness of stationary mean-field equilibria in the presence of model uncertainties, specifically focusing on infinite-horizon discounted cost functions. To achieve this, we initially establish convergence conditions for value iteration-based algorithms in mean-field games. Subsequently, utilizing these results, we demonstrate that the mean-field equilibrium obtained through this value iteration algorithm remains robust even in the face of system dynamics misspecifications. We then apply these robustness findings to the finite model approximation problem in mean-field games, showing that if the state space quantization is fine enough, the mean-field equilibrium for the finite model closely approximates the nominal one.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper sets up convergence conditions for value iteration in discounted mean-field games then derives robustness to dynamics misspecification and finite quantization approximations from them, but the conditions' behavior under perturbation is the unexamined link.

read the letter

The main takeaway is that the authors establish convergence conditions for value iteration in infinite-horizon discounted mean-field games and then build robustness results for dynamics misspecifications and approximation guarantees for quantized finite models on top of that. This extends earlier work on mean-field games by adding explicit robustness and discretization analysis in the discounted case. The structure is straightforward: convergence first, then the two applications. If the derivations are solid, it provides quantitative tools that could be useful for numerical implementations in this area. The weaker part is the reliance on those convergence conditions. The robustness claims depend on the conditions holding even when the dynamics are altered, and the abstract gives no indication of what the conditions actually are or how sensitive they are to perturbations. Without seeing the specific assumptions or error bounds, it's difficult to tell how far the results reach. This is targeted at specialists in mean-field games and control theory who deal with approximation algorithms and model uncertainty. Someone in that community might find the robustness and quantization parts worth looking at. I would recommend sending it for peer review. The topic fits an established subfield, and checking the proofs would clarify whether the chain from convergence to the other results is reliable.

Referee Report

2 major / 1 minor

Summary. The manuscript claims to first establish convergence conditions for value iteration algorithms applied to infinite-horizon discounted-cost discrete-time mean-field games. These conditions are then invoked to prove that the resulting stationary mean-field equilibria remain robust under misspecifications of the system dynamics. The same robustness results are applied to derive approximation guarantees showing that, for sufficiently fine state-space quantization, the mean-field equilibrium of the finite model is close to that of the nominal (continuous-state) model.

Significance. If the convergence conditions are stated with explicit, verifiable assumptions that remain compatible with the subsequent perturbations, and if the derivations are free of circularity, the work would supply a useful theoretical bridge between algorithmic computation of MFG equilibria and their practical use under model uncertainty or discretization. Such results are relevant to control applications where exact dynamics or infinite state spaces are unavailable.

major comments (2)

[Convergence results (likely §3–4)] The convergence conditions for value iteration constitute the load-bearing step from which both the robustness and finite-model approximation claims are derived. The abstract provides no quantitative statement of these conditions (e.g., required bounds on the discount factor, Lipschitz constants of the mean-field interaction, or contraction modulus), preventing verification that the same conditions continue to hold once the dynamics are misspecified.
[Robustness analysis] § on robustness: the argument that equilibria obtained from the value-iteration algorithm remain robust to dynamics misspecification must explicitly confirm that the misspecification does not violate the contraction or continuity hypotheses used to establish convergence; otherwise the subsequent claims do not follow from the earlier results.

minor comments (1)

[Abstract] Abstract: a single sentence summarizing the form of the convergence conditions (e.g., “under a uniform contraction with modulus <1 for discount factor γ<γ0”) would help readers assess scope without reading the full technical sections.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. The points raised concern the presentation of quantitative conditions and the explicit linkage between convergence and robustness. We respond point-by-point below and will revise the manuscript to improve clarity.

read point-by-point responses

Referee: [Convergence results (likely §3–4)] The convergence conditions for value iteration constitute the load-bearing step from which both the robustness and finite-model approximation claims are derived. The abstract provides no quantitative statement of these conditions (e.g., required bounds on the discount factor, Lipschitz constants of the mean-field interaction, or contraction modulus), preventing verification that the same conditions continue to hold once the dynamics are misspecified.

Authors: We agree that the abstract should contain a quantitative statement of the convergence conditions. In the revised manuscript we will update the abstract to state that value iteration converges whenever the discount factor satisfies γ < 1/(1+L), where L is the Lipschitz constant of the mean-field interaction, yielding a contraction modulus strictly less than one. These explicit bounds appear in Theorem 3.1 and are used throughout the subsequent sections. The robustness analysis then shows that sufficiently small dynamics perturbations preserve the same inequality, so the conditions remain valid under misspecification. revision: yes
Referee: [Robustness analysis] § on robustness: the argument that equilibria obtained from the value-iteration algorithm remain robust to dynamics misspecification must explicitly confirm that the misspecification does not violate the contraction or continuity hypotheses used to establish convergence; otherwise the subsequent claims do not follow from the earlier results.

Authors: We agree that an explicit confirmation is required. The proof of robustness (Theorem 4.1) already establishes that if the dynamics perturbation is bounded by ε in the appropriate norm, the contraction modulus remains strictly below one for all ε smaller than a positive threshold depending on γ and L. We will add a short lemma immediately after the statement of the convergence theorem that isolates this preservation argument, making the logical dependence transparent and removing any appearance of circularity. revision: yes

Circularity Check

0 steps flagged

No circularity; derivation is self-contained theoretical argument

full rationale

The paper first states convergence conditions for value iteration algorithms in mean-field games, then derives robustness to dynamics misspecification and finite-model approximation results from those conditions. No quoted step reduces by construction to a fitted parameter, self-definition, or self-citation chain; the structure is a standard implication chain from convergence assumptions to robustness and approximation bounds. The claims remain independent of the paper's own outputs and do not rename or smuggle in prior results as new predictions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities can be identified from the abstract alone.

pith-pipeline@v0.9.0 · 5639 in / 1083 out tokens · 26930 ms · 2026-05-24T06:17:45.152139+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Theorem 1 … k < 1 and k1 < k2 … Banach fixed point theorem … unique fixed point of Hγ
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Lemma 3 … Lipschitz bounds on Hγ1, Hγ2 … W1 distances

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Physics-Constrained Adaptive Flow Matching for Climate Downscaling
physics.ao-ph 2026-04 unverdicted novelty 6.0

Physics-Constrained Adaptive Flow Matching reduces conservation errors and improves out-of-distribution performance in climate downscaling by enforcing large-scale consistency on precipitation and humidity without tar...
Approximation of Discrete-Time Infinite-Horizon Mean-Field Equilibria via Finite-Horizon Mean-Field Equilibria
math.OC 2025-09 unverdicted novelty 5.0

Finite-horizon mean-field equilibria accumulate to non-stationary infinite-horizon mean-field equilibria and converge to stationary ones under stated conditions, with explicit error bounds and a new uniqueness criterion.

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages · cited by 2 Pith papers · 1 internal anchor

[1]

Achdou, Yves, Fabio Camilli, Italo Capuzzo-Dolcetta. 2 012. Mean ﬁeld games: Numerical methods for the planning problem. SIAM Journal on Control and Optimization 50(1) 77–109

work page
[2]

Achdou, Yves, Italo Capuzzo-Dolcetta. 2010. Mean ﬁeld g ames: Numerical methods. SIAM Journal on Numerical Analysis 48(3) 1136–1162

work page 2010
[3]

Johari, G.Y

Adlakha, S., R. Johari, G.Y. Weintraub. 2015. Equilibri a of dynamic games with many players: Existence, approximation, and market structure. Journal of Economic Theory 156 269–316

work page 2015
[4]

Almulla, Noha, Rita Ferreira, Diogo Aguiar Gomes. 2017. Two numerical approaches to stationary mean-ﬁeld games. Dynamic Games and Applications 7 657–682

work page 2017
[5]

Anahtarci, Berkay, Can Deha Kariksiz, Naci Saldi. 2020. Value iteration algorithm for mean-ﬁeld games. Systems and Control Letters 143

work page 2020
[6]

Anahtarci, Berkay, Can Deha Kariksiz, Naci Saldi. 2023. Learning mean-ﬁeld games with discounted and average costs. Journal of Machine Learning Research 24(17) 1–59

work page 2023
[7]

Anahtarci, Berkay, Can Deha Kariksiz, Naci Saldi. 2023. Q-learning in regularized mean-ﬁeld games. Dynamic Games and Applications 13(1) 89–117

work page 2023
[8]

Baker, Graeme, Serdar Yuksel. 2016. Continuity and robu stness to incorrect priors in estimation and control. 2016 IEEE International Symposium on Information Theory (I SIT). 1999–2003

work page 2016
[9]

Bauso, Dario, Hamidou Tembine, Tamer Basar. 2016. Robus t mean ﬁeld games. Dynamic games and applications 6(3) 277–303. 33

work page 2016
[10]

Frehse, P

Bensoussan, A., J. Frehse, P. Yam. 2013. Mean Field Games and Mean Field Type Control Theory . Springer, New York

work page 2013
[11]

Bertsekas, Dimitri, Steven E Shreve. 1996. Stochastic optimal control: the discrete-time case , vol. 5. Athena Scientiﬁc

work page 1996
[12]

Billingsley, P. 1999. Convergence of Probability Measures . 2nd ed. New York: Wiley

work page 1999
[13]

Biswas, A. 2015. Mean ﬁeld games with ergodic cost for di screte time Markov processes. arXiv:1510.08968

work page internal anchor Pith review Pith/arXiv arXiv 2015
[14]

Bogachev, V.I. 2007. Measure Theory: Volume II . Springer

work page 2007
[15]

Cardaliaguet, P. 2011. Notes on Mean-ﬁeld Games

work page 2011
[16]

Cardaliaguet, Pierre, Fran¸ cois Delarue, Jean-Miche l Lasry, Pierre-Louis Lions. 2019. The master equation and the convergence problem in mean ﬁeld games . Princeton University Press

work page 2019
[17]

Carmona, R., F. Delarue. 2013. Probabilistic analysis of mean-ﬁeld games. SIAM J. Control Optim. 51(4) 2705–2734

work page 2013
[18]

Cui, Kai, Heinz Koeppl. 2021. Approximately solving me an ﬁeld games via entropy-regularized deep reinforcement learning. International Conference on Artiﬁcial Intelligence and St atistics. 1909–1917

work page 2021
[19]

Elliot, R., X. Li, Y. Ni. 2013. Discrete time mean-ﬁeld s tochastic linear-quadratic optimal control problems. Automatica 49 3222–3233

work page 2013
[20]

Mohr, R.R

Gomes, D.A., J. Mohr, R.R. Souza. 2010. Discrete time, ﬁ nite state space mean ﬁeld games. J. Math. Pures Appl. 93 308–328

work page 2010
[21]

Gomes, D.A., J. Sa´ ude. 2014. Mean ﬁeld games models - a b rief survey. Dyn. Games Appl. 4(2) 110–154

work page 2014
[22]

Lasserre

Hern´ andez-Lerma, O., J.B. Lasserre. 1996. Discrete-Time Markov Control Processes: Basic Optimal- ity Criteria . Springer

work page 1996
[23]

Huang, M. 2010. Large-population LQG games involving m ajor player: The Nash certainty equiva- lence principle. SIAM J. Control Optim. 48(5) 3318–3353

work page 2010
[24]

Caines, R.P

Huang, M., P.E. Caines, R.P. Malham´ e. 2007. Large-pop ulation cost coupled LQG problems with nonuniform agents: Individual-mass behavior and decentra lized ǫ-Nash equilibria. IEEE. Trans. Autom. Control 52(9) 1560–1571

work page 2007
[25]

Malham´ e, P.E

Huang, M., R.P. Malham´ e, P.E. Caines. 2006. Large popu lation stochastic dynamic games: Closed loop McKean-Vlasov systems and the Nash certainty equivale nce principle. Communications in Information Systems 6 221–252

work page 2006
[26]

Jusup, Matej, Barna Pasztor, Tadeusz Janik, Kenan Zhan g, Francesco Corman, Andreas Krause, Ilija Bogunovic. 2023. Safe model-based multi-agent mean-ﬁeld r einforcement learning. arXiv preprint arXiv:2306.17052

work page arXiv 2023
[27]

Kara, Ali Devran, Serdar Yuuksel. 2019. Robustness to i ncorrect priors in partially observed stochastic control. SIAM Journal on Control and Optimization 57(3) 1929–1964

work page 2019
[28]

Langen, H.J. 1981. Convergence of dynamic programming models. Math. Oper. Res. 6(4) 493–512

work page 1981
[29]

Lasry, J., P.Lions. 2007. Mean ﬁeld games. Japan. J. Math. 2 229–260

work page 2007
[30]

Lasry, Jean-Michel, Pierre-Louis Lions. 2007. Mean ﬁe ld games. Japanese journal of mathematics 2(1) 229–260

work page 2007
[31]

Lauriere, Mathieu. 2021. Numerical methods for mean ﬁe ld games and mean ﬁeld type control. Mean ﬁeld games 78 221–282

work page 2021
[32]

Moon, J., T. Ba¸ sar. 2015. Discrete-time decentralize d control using the risk-sensitive performance criterion in the large population regime: a mean ﬁeld approa ch. ACC 2015 . Chicago

work page 2015
[33]

Moon, J., T. Ba¸ sar. 2016. Discrete-time mean ﬁeld Stac kelberg games with a large number of followers. CDC 2016 . Las Vegas

work page 2016
[34]

Moon, J., T. Ba¸ sar. 2016. Robust mean ﬁeld games for cou pled Markov jump linear systems. Inter- national Journal of Control 89(7) 1367–1381

work page 2016
[35]

Moon, J., T. Ba¸ sar. 2017. Linear quadratic risk-sensi tive and robust mean ﬁeld games. IEEE. Trans. Autom. Control 62(3) 1062–1077

work page 2017
[36]

Nourian, M., G.N. Nair. 2013. Linear-quadratic-Gauss ian mean ﬁeld games under high rate quanti- zation. CDC 2013 . Florence. 34

work page 2013
[37]

Pasztor, Barna, Andreas Krause, Ilija Bogunovic. 2023 . Eﬃcient model-based multi-agent mean-ﬁeld reinforcement learning. Transactions on Machine Learning Research

work page 2023
[38]

Saldi, N. 2020. Discrete-time average-cost mean-ﬁeld games on Polish spaces. Turkish Journal of Mathematics 44(2) 463–480

work page 2020
[39]

Ba¸ sar, M

Saldi, N., T. Ba¸ sar, M. Raginsky. 2018. Markov–Nash eq uilibria in mean-ﬁeld games with discounted cost. SIAM Journal on Control and Optimization 56(6) 4256–4287

work page 2018
[40]

Ba¸ sar, M

Saldi, N., T. Ba¸ sar, M. Raginsky. 2019. Approximate Na sh equilibria in partially observed stochastic games with mean-ﬁeld interactions. Mathematics of Operations Research 44(3) 1006–1033

work page 2019
[41]

Ba¸ sar, M

Saldi, N., T. Ba¸ sar, M. Raginsky. 2020. Approximate Ma rkov-Nash equilibria for discrete-time risk- sensitive mean-ﬁeld games. Mathematics of Operations Research 45(4) 1596–1620

work page 2020
[42]

Saldi, Naci, Tamer Ba¸ sar, Maxim Raginsky. 2023. Partially observed discrete-time risk-sensitive mean ﬁeld games. Dynamic Games and Applications 13 929–960

work page 2023
[43]

Saldi, Naci, Serdar Y¨ uksel, Tam´ as Linder. 2017. On th e asymptotic optimality of ﬁnite approxima- tions to Markov decision processes with borel spaces. Mathematics of Operations Research 42(4) 945–978

work page 2017
[44]

Serfozo, Richard. 1982. Convergence of Lebesgue integ rals with varying measures. Sankhy¯ a Ser. A 44(3) 380–402

work page 1982
[45]

Shreve, Steven E, Dimitri P Bertsekas. 1979. Universal ly measurable policies in dynamic program- ming. Mathematics of Operations Research 4(1) 15–30

work page 1979
[46]

Tembine, H., Q. Zhu, T. Ba¸ sar. 2014. Risk-sensitive me an ﬁeld games. IEEE. Trans. Autom. Control 59(4) 835–850

work page 2014
[47]

Villani, C. 2009. Optimal transport: Old and New . Springer

work page 2009
[48]

Wiecek, P., E. Altman. 2015. Stationary anonymous sequ ential games with undiscounted rewards. Journal of Optimization Theory and Applications 166(2) 686–710

work page 2015
[49]

Wiecek, Piotr. 2019. Discrete-time ergodic mean-ﬁeld games with average reward on compact spaces. Dynamic Games and Applications 1–35

work page 2019
[50]

Zaman, Muhammad Aneeq Uz, Alec Koppel, Sujay Bhatt, Tam er Basar. 2023. Oracle-free reinforce- ment learning in mean-ﬁeld games along a single sample path. International Conference on Artiﬁcial Intelligence and Statistics . 10178–10206. 35

work page 2023

[1] [1]

Achdou, Yves, Fabio Camilli, Italo Capuzzo-Dolcetta. 2 012. Mean ﬁeld games: Numerical methods for the planning problem. SIAM Journal on Control and Optimization 50(1) 77–109

work page

[2] [2]

Achdou, Yves, Italo Capuzzo-Dolcetta. 2010. Mean ﬁeld g ames: Numerical methods. SIAM Journal on Numerical Analysis 48(3) 1136–1162

work page 2010

[3] [3]

Johari, G.Y

Adlakha, S., R. Johari, G.Y. Weintraub. 2015. Equilibri a of dynamic games with many players: Existence, approximation, and market structure. Journal of Economic Theory 156 269–316

work page 2015

[4] [4]

Almulla, Noha, Rita Ferreira, Diogo Aguiar Gomes. 2017. Two numerical approaches to stationary mean-ﬁeld games. Dynamic Games and Applications 7 657–682

work page 2017

[5] [5]

Anahtarci, Berkay, Can Deha Kariksiz, Naci Saldi. 2020. Value iteration algorithm for mean-ﬁeld games. Systems and Control Letters 143

work page 2020

[6] [6]

Anahtarci, Berkay, Can Deha Kariksiz, Naci Saldi. 2023. Learning mean-ﬁeld games with discounted and average costs. Journal of Machine Learning Research 24(17) 1–59

work page 2023

[7] [7]

Anahtarci, Berkay, Can Deha Kariksiz, Naci Saldi. 2023. Q-learning in regularized mean-ﬁeld games. Dynamic Games and Applications 13(1) 89–117

work page 2023

[8] [8]

Baker, Graeme, Serdar Yuksel. 2016. Continuity and robu stness to incorrect priors in estimation and control. 2016 IEEE International Symposium on Information Theory (I SIT). 1999–2003

work page 2016

[9] [9]

Bauso, Dario, Hamidou Tembine, Tamer Basar. 2016. Robus t mean ﬁeld games. Dynamic games and applications 6(3) 277–303. 33

work page 2016

[10] [10]

Frehse, P

Bensoussan, A., J. Frehse, P. Yam. 2013. Mean Field Games and Mean Field Type Control Theory . Springer, New York

work page 2013

[11] [11]

Bertsekas, Dimitri, Steven E Shreve. 1996. Stochastic optimal control: the discrete-time case , vol. 5. Athena Scientiﬁc

work page 1996

[12] [12]

Billingsley, P. 1999. Convergence of Probability Measures . 2nd ed. New York: Wiley

work page 1999

[13] [13]

Biswas, A. 2015. Mean ﬁeld games with ergodic cost for di screte time Markov processes. arXiv:1510.08968

work page internal anchor Pith review Pith/arXiv arXiv 2015

[14] [14]

Bogachev, V.I. 2007. Measure Theory: Volume II . Springer

work page 2007

[15] [15]

Cardaliaguet, P. 2011. Notes on Mean-ﬁeld Games

work page 2011

[16] [16]

Cardaliaguet, Pierre, Fran¸ cois Delarue, Jean-Miche l Lasry, Pierre-Louis Lions. 2019. The master equation and the convergence problem in mean ﬁeld games . Princeton University Press

work page 2019

[17] [17]

Carmona, R., F. Delarue. 2013. Probabilistic analysis of mean-ﬁeld games. SIAM J. Control Optim. 51(4) 2705–2734

work page 2013

[18] [18]

Cui, Kai, Heinz Koeppl. 2021. Approximately solving me an ﬁeld games via entropy-regularized deep reinforcement learning. International Conference on Artiﬁcial Intelligence and St atistics. 1909–1917

work page 2021

[19] [19]

Elliot, R., X. Li, Y. Ni. 2013. Discrete time mean-ﬁeld s tochastic linear-quadratic optimal control problems. Automatica 49 3222–3233

work page 2013

[20] [20]

Mohr, R.R

Gomes, D.A., J. Mohr, R.R. Souza. 2010. Discrete time, ﬁ nite state space mean ﬁeld games. J. Math. Pures Appl. 93 308–328

work page 2010

[21] [21]

Gomes, D.A., J. Sa´ ude. 2014. Mean ﬁeld games models - a b rief survey. Dyn. Games Appl. 4(2) 110–154

work page 2014

[22] [22]

Lasserre

Hern´ andez-Lerma, O., J.B. Lasserre. 1996. Discrete-Time Markov Control Processes: Basic Optimal- ity Criteria . Springer

work page 1996

[23] [23]

Huang, M. 2010. Large-population LQG games involving m ajor player: The Nash certainty equiva- lence principle. SIAM J. Control Optim. 48(5) 3318–3353

work page 2010

[24] [24]

Caines, R.P

Huang, M., P.E. Caines, R.P. Malham´ e. 2007. Large-pop ulation cost coupled LQG problems with nonuniform agents: Individual-mass behavior and decentra lized ǫ-Nash equilibria. IEEE. Trans. Autom. Control 52(9) 1560–1571

work page 2007

[25] [25]

Malham´ e, P.E

Huang, M., R.P. Malham´ e, P.E. Caines. 2006. Large popu lation stochastic dynamic games: Closed loop McKean-Vlasov systems and the Nash certainty equivale nce principle. Communications in Information Systems 6 221–252

work page 2006

[26] [26]

Jusup, Matej, Barna Pasztor, Tadeusz Janik, Kenan Zhan g, Francesco Corman, Andreas Krause, Ilija Bogunovic. 2023. Safe model-based multi-agent mean-ﬁeld r einforcement learning. arXiv preprint arXiv:2306.17052

work page arXiv 2023

[27] [27]

Kara, Ali Devran, Serdar Yuuksel. 2019. Robustness to i ncorrect priors in partially observed stochastic control. SIAM Journal on Control and Optimization 57(3) 1929–1964

work page 2019

[28] [28]

Langen, H.J. 1981. Convergence of dynamic programming models. Math. Oper. Res. 6(4) 493–512

work page 1981

[29] [29]

Lasry, J., P.Lions. 2007. Mean ﬁeld games. Japan. J. Math. 2 229–260

work page 2007

[30] [30]

Lasry, Jean-Michel, Pierre-Louis Lions. 2007. Mean ﬁe ld games. Japanese journal of mathematics 2(1) 229–260

work page 2007

[31] [31]

Lauriere, Mathieu. 2021. Numerical methods for mean ﬁe ld games and mean ﬁeld type control. Mean ﬁeld games 78 221–282

work page 2021

[32] [32]

Moon, J., T. Ba¸ sar. 2015. Discrete-time decentralize d control using the risk-sensitive performance criterion in the large population regime: a mean ﬁeld approa ch. ACC 2015 . Chicago

work page 2015

[33] [33]

Moon, J., T. Ba¸ sar. 2016. Discrete-time mean ﬁeld Stac kelberg games with a large number of followers. CDC 2016 . Las Vegas

work page 2016

[34] [34]

Moon, J., T. Ba¸ sar. 2016. Robust mean ﬁeld games for cou pled Markov jump linear systems. Inter- national Journal of Control 89(7) 1367–1381

work page 2016

[35] [35]

Moon, J., T. Ba¸ sar. 2017. Linear quadratic risk-sensi tive and robust mean ﬁeld games. IEEE. Trans. Autom. Control 62(3) 1062–1077

work page 2017

[36] [36]

Nourian, M., G.N. Nair. 2013. Linear-quadratic-Gauss ian mean ﬁeld games under high rate quanti- zation. CDC 2013 . Florence. 34

work page 2013

[37] [37]

Pasztor, Barna, Andreas Krause, Ilija Bogunovic. 2023 . Eﬃcient model-based multi-agent mean-ﬁeld reinforcement learning. Transactions on Machine Learning Research

work page 2023

[38] [38]

Saldi, N. 2020. Discrete-time average-cost mean-ﬁeld games on Polish spaces. Turkish Journal of Mathematics 44(2) 463–480

work page 2020

[39] [39]

Ba¸ sar, M

Saldi, N., T. Ba¸ sar, M. Raginsky. 2018. Markov–Nash eq uilibria in mean-ﬁeld games with discounted cost. SIAM Journal on Control and Optimization 56(6) 4256–4287

work page 2018

[40] [40]

Ba¸ sar, M

Saldi, N., T. Ba¸ sar, M. Raginsky. 2019. Approximate Na sh equilibria in partially observed stochastic games with mean-ﬁeld interactions. Mathematics of Operations Research 44(3) 1006–1033

work page 2019

[41] [41]

Ba¸ sar, M

Saldi, N., T. Ba¸ sar, M. Raginsky. 2020. Approximate Ma rkov-Nash equilibria for discrete-time risk- sensitive mean-ﬁeld games. Mathematics of Operations Research 45(4) 1596–1620

work page 2020

[42] [42]

Saldi, Naci, Tamer Ba¸ sar, Maxim Raginsky. 2023. Partially observed discrete-time risk-sensitive mean ﬁeld games. Dynamic Games and Applications 13 929–960

work page 2023

[43] [43]

Saldi, Naci, Serdar Y¨ uksel, Tam´ as Linder. 2017. On th e asymptotic optimality of ﬁnite approxima- tions to Markov decision processes with borel spaces. Mathematics of Operations Research 42(4) 945–978

work page 2017

[44] [44]

Serfozo, Richard. 1982. Convergence of Lebesgue integ rals with varying measures. Sankhy¯ a Ser. A 44(3) 380–402

work page 1982

[45] [45]

Shreve, Steven E, Dimitri P Bertsekas. 1979. Universal ly measurable policies in dynamic program- ming. Mathematics of Operations Research 4(1) 15–30

work page 1979

[46] [46]

Tembine, H., Q. Zhu, T. Ba¸ sar. 2014. Risk-sensitive me an ﬁeld games. IEEE. Trans. Autom. Control 59(4) 835–850

work page 2014

[47] [47]

Villani, C. 2009. Optimal transport: Old and New . Springer

work page 2009

[48] [48]

Wiecek, P., E. Altman. 2015. Stationary anonymous sequ ential games with undiscounted rewards. Journal of Optimization Theory and Applications 166(2) 686–710

work page 2015

[49] [49]

Wiecek, Piotr. 2019. Discrete-time ergodic mean-ﬁeld games with average reward on compact spaces. Dynamic Games and Applications 1–35

work page 2019

[50] [50]

Zaman, Muhammad Aneeq Uz, Alec Koppel, Sujay Bhatt, Tam er Basar. 2023. Oracle-free reinforce- ment learning in mean-ﬁeld games along a single sample path. International Conference on Artiﬁcial Intelligence and Statistics . 10178–10206. 35

work page 2023