Approximation of Discrete-Time Infinite-Horizon Mean-Field Equilibria via Finite-Horizon Mean-Field Equilibria

Naci Saldi; Tamer Başar; Uğur Aydın

arxiv: 2509.01039 · v2 · submitted 2025-09-01 · 🧮 math.OC

Pith reviewed 2026-05-18 20:29 UTC · model grok-4.3

classification 🧮 math.OC

keywords mean-field games · finite-horizon approximation · infinite-horizon equilibria · weak convergence · stationary equilibria · non-stationary equilibria · discrete-time games · regularized equilibria

The pith

Finite-horizon mean-field equilibria accumulate to non-stationary infinite-horizon equilibria and converge to stationary ones under extra conditions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that mean-field equilibria computed for discounted finite-horizon mean-field games can be used to construct equilibria for the corresponding infinite-horizon games. Any accumulation point of the finite-horizon equilibria, taken in the weak sense as the horizon length tends to infinity, satisfies the equilibrium conditions for a non-stationary infinite-horizon game. When additional regularity holds, the same sequence of non-stationary solutions converges further to a stationary equilibrium, so finite-horizon computations serve as practical approximations for the stationary case as well. The analysis also supplies improved contraction rates for regularized finite-horizon solvers and finite-time error bounds that decay exponentially in the horizon length. These results directly support learning-based approximation schemes when the underlying dynamics and costs are unknown.

Core claim

Any accumulation point of mean-field equilibria from a discounted finite-horizon mean-field game constitutes, under weak convergence as the horizon tends to infinity, a non-stationary mean-field equilibrium of the infinite-horizon game; under further conditions these non-stationary equilibria converge to a stationary equilibrium, and finite-horizon closeness implies stationary closeness.
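
For readers who want the objects in the core claim written out, the following is a sketch in the standard discrete-time mean-field game notation. The symbols (state space X, action space A, transition kernel p, one-stage cost c, discount factor β) are assumptions made here for illustration, and the paper's own notation and standing hypotheses may differ.

```latex
% Assumed notation (not taken from the paper): state space X, action space A,
% transition kernel p, one-stage cost c, discount factor \beta \in (0,1).
\begin{align*}
&\text{Finite-horizon MFE } (\pi^N,\mu^N),\quad \pi^N=(\pi_t)_{t=0}^{N-1},\quad \mu^N=(\mu_t)_{t=0}^{N}: \\
&\quad \text{(optimality)}\quad
  \pi^N \in \arg\min_{\pi}\; \mathbb{E}^{\pi}\Big[\sum_{t=0}^{N-1}\beta^{t}\, c(x_t,a_t,\mu_t)\Big], \\
&\quad \text{(consistency)}\quad
  \mu_{t+1}(\,\cdot\,)=\int_{X}\int_{A} p(\,\cdot \mid x,a,\mu_t)\,\pi_t(da\mid x)\,\mu_t(dx),
  \qquad 0\le t\le N-1. \\[4pt]
&\text{Infinite-horizon non-stationary MFE: the same two conditions with } N=\infty. \\
&\text{Stationary MFE: additionally } \pi_t\equiv\pi \text{ and } \mu_t\equiv\mu \text{ for all } t. \\[4pt]
&\text{Core claim: if } (\pi^{N_k},\mu^{N_k})\to(\pi^{\infty},\mu^{\infty})
  \text{ weakly along some } N_k\to\infty, \\
&\quad \text{then } (\pi^{\infty},\mu^{\infty}) \text{ is an infinite-horizon non-stationary MFE.}
\end{align*}
```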

What carries the argument

Weak convergence of finite-horizon mean-field equilibria (measures and strategies) as the time horizon tends to infinity.
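
As a reminder of what weak convergence does and does not say, recall that a sequence of probability measures converges weakly when the integrals of every bounded continuous test function converge. The sketch below is a generic illustration, not taken from the paper: the point masses at 1/n converge weakly to the point mass at 0, yet their total variation distance to the limit stays equal to 1. The helper names and the single test function `math.atan` are hypothetical choices made for this example, and checking one test function illustrates the definition without proving convergence.

```python
import math


def integral_against_dirac(g, x):
    """Integral of g with respect to the point mass at x, which is just g(x)."""
    return g(x)


def tv_distance_dirac(x, y):
    """Total variation distance between the point masses at x and y (0 or 1)."""
    return 0.0 if x == y else 1.0


g = math.atan  # one bounded, continuous test function (illustrative choice)
ns = (1, 10, 100, 1000)

# Gap between the integrals of g under delta_{1/n} and under delta_0.
weak_gaps = [
    abs(integral_against_dirac(g, 1.0 / n) - integral_against_dirac(g, 0.0))
    for n in ns
]
# Total variation distance between delta_{1/n} and delta_0.
tv_gaps = [tv_distance_dirac(1.0 / n, 0.0) for n in ns]

for n, w, tv in zip(ns, weak_gaps, tv_gaps):
    print(f"n={n:5d}  integral gap={w:.6f}  total variation={tv:.1f}")
```

The integral gaps shrink with n while the total variation column does not, which is why results stated under weak convergence are weaker than, and not interchangeable with, results in total variation.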

If this is right

Finite-horizon equilibria supply non-stationary infinite-horizon equilibria via their accumulation points.
Under extra conditions the non-stationary equilibria converge to stationary equilibria, so finite-horizon solutions approximate stationary ones.
Improved contraction rates hold for iterative methods that compute regularized finite-horizon equilibria.
When two finite-horizon games have close equilibria, their corresponding stationary infinite-horizon equilibria are also close.
Finite-horizon games enable learning-based approximation of infinite-horizon equilibria when system components are unknown, with exponentially decaying error bounds under stronger Lipschitz assumptions.
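
The first and third consequences above can be made concrete with a minimal numerical sketch. Everything in it is an assumption made for illustration and is not the paper's model or algorithm: a hypothetical two-state, two-action game with congestion cost weight `KAPPA`, entropy regularization implemented as a soft-min with temperature `TAU`, and a plain fixed-point iteration on the measure flow. It computes regularized finite-horizon equilibria for increasing horizons and reports how much the time-0 policy changes from one horizon to the next.

```python
import numpy as np

# Hypothetical toy model (not from the paper): 2 states, 2 actions.
BETA, TAU, KAPPA, EFFORT = 0.8, 1.0, 0.2, 0.2
H = np.array([0.0, 0.5])                     # state cost h[x]
P = np.array([[[0.9, 0.1], [0.5, 0.5]],      # P[x, a, x']: action 0 mostly stays,
              [[0.1, 0.9], [0.5, 0.5]]])     # action 1 randomizes the next state
MU0 = np.array([0.9, 0.1])                   # initial state distribution


def best_response(flow):
    """Backward soft-min dynamic programming against a fixed measure flow.

    flow has shape (N, 2); returns policies pi with pi[t, x, a].
    """
    N = flow.shape[0]
    V = np.zeros(2)                          # terminal value V_N = 0
    pi = np.zeros((N, 2, 2))
    for t in reversed(range(N)):
        # Q[x, a] = h[x] + kappa * mu_t[x] + effort * a + beta * E[V(x') | x, a]
        Q = (H + KAPPA * flow[t])[:, None] + EFFORT * np.arange(2)[None, :] + BETA * (P @ V)
        Qmin = Q.min(axis=1, keepdims=True)
        w = np.exp(-(Q - Qmin) / TAU)        # numerically stabilized soft-min weights
        pi[t] = w / w.sum(axis=1, keepdims=True)
        V = Qmin[:, 0] - TAU * np.log(w.sum(axis=1))
    return pi


def induced_flow(pi):
    """Forward propagation of the state distribution under the policies pi."""
    N = pi.shape[0]
    flow = np.zeros((N, 2))
    flow[0] = MU0
    for t in range(N - 1):
        K = np.einsum("xa,xay->xy", pi[t], P)   # state-to-state kernel under pi_t
        flow[t + 1] = flow[t] @ K
    return flow


def finite_horizon_mfe(N, tol=1e-12, max_iter=500):
    """Fixed-point iteration: flow -> best response -> induced flow."""
    flow = np.tile(MU0, (N, 1))
    for _ in range(max_iter):
        pi = best_response(flow)
        new_flow = induced_flow(pi)
        if np.abs(new_flow - flow).max() < tol:
            return pi, new_flow
        flow = new_flow
    raise RuntimeError("fixed-point iteration did not converge")


horizons = [5, 10, 20, 40, 80]
first_policies = [finite_horizon_mfe(N)[0][0] for N in horizons]
gaps = [float(np.abs(a - b).max()) for a, b in zip(first_policies, first_policies[1:])]
for (N1, N2), gap in zip(zip(horizons, horizons[1:]), gaps):
    print(f"max |pi_0^{N1} - pi_0^{N2}| = {gap:.2e}")
```

In this toy setting the coupling is weak enough for the fixed-point iteration to contract, and the printed gaps should shrink as the horizon grows, which is the qualitative behaviour the summary describes. The sketch says nothing about the rates or conditions the paper actually proves.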

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The approximation result suggests that the finite-horizon truncations already used for computation can be treated as rigorous approximation tools rather than purely numerical devices.
The new uniqueness criterion for non-stationary infinite-horizon equilibria may simplify verification in settings where contraction mapping arguments are unavailable.
Because the error bounds decay exponentially in the horizon length, only moderately long finite-horizon problems need to be solved in practice to achieve high accuracy for the infinite-horizon limit.

Load-bearing premise

Accumulation points of the finite-horizon equilibria exist and the associated measures converge weakly as the horizon length grows without bound.

What would settle it

An explicit sequence of finite-horizon equilibria whose weak limit fails to satisfy the infinite-horizon equilibrium fixed-point condition for any admissible measure flow.

Figures

Figures reproduced from arXiv: 2509.01039 by Naci Saldi, Tamer Başar, Uğur Aydın.

Original abstract

We address in this paper a fundamental question that arises in mean-field games (MFGs), namely whether mean-field equilibria (MFE) for discrete-time finite-horizon MFGs can be used to obtain approximate stationary as well as non-stationary MFE for similarly structured infinite-horizon MFGs. We provide a rigorous analysis of this relationship, and show that any accumulation point of MFE of a discounted finite-horizon MFG constitutes, under weak convergence as the time horizon goes to infinity, a non-stationary MFE for the corresponding infinite-horizon MFG. Further, under certain conditions, these non-stationary MFE converge to a stationary MFE, establishing the appealing result that finite-horizon MFE can serve as approximations for stationary MFE. Additionally, we establish improved contraction rates for iterative methods used to compute regularized MFE in finite-horizon settings, extending existing results in the literature. As a byproduct, we obtain that when two MFGs have finite-horizon MFE that are close to each other, the corresponding stationary MFE are also close. As one application of the theoretical results, we show that finite-horizon MFGs can facilitate learning-based approaches to approximate infinite-horizon MFE when system components are unknown. Under further assumptions on the Lipschitz coefficients of the regularized system components (which are stronger than contractivity of finite-horizon MFGs), we obtain exponentially decaying finite-time error bounds -- in the time horizon -- between finite-horizon non-stationary, infinite-horizon non-stationary, and stationary MFE. As a byproduct of our error bounds, we present a new uniqueness criterion for infinite-horizon nonstationary MFE beyond the available contraction results in the literature.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Finite-horizon MFE accumulate to non-stationary infinite-horizon MFE under weak convergence, with improved contraction rates and a new uniqueness criterion as the main additions.

The main thing to know is that this paper shows that any accumulation point of discounted finite-horizon mean-field equilibria, taken under weak convergence as the horizon grows, is a non-stationary equilibrium of the corresponding infinite-horizon game, and that under extra conditions those limits converge further to a stationary equilibrium. The authors also improve the contraction rates for iterative computation of regularized finite-horizon MFE and derive exponentially decaying error bounds between the finite-horizon, infinite-horizon non-stationary, and stationary versions under assumptions on the Lipschitz coefficients that are stronger than contractivity.

Two byproducts follow. The first is a closeness result: nearby finite-horizon equilibria imply nearby stationary ones. The second is a new uniqueness criterion for non-stationary infinite-horizon MFE that goes beyond standard contraction arguments. The learning application, for the case where transition kernels or costs are unknown, is a direct practical payoff of the approximation bridge. The arguments rest on standard fixed-point and weak-convergence tools rather than anything circular, which keeps the claims clean.

The stress-test concern about relative compactness is worth checking in the proofs. The abstract states the result conditionally on weak convergence and on accumulation points existing. If the paper does not supply explicit tightness conditions such as moment bounds or compact state spaces, the result applies only in settings where those hold automatically. In many discrete-time MFG models the spaces are unbounded, so this is a real but contained limitation rather than a fatal gap. The stronger Lipschitz assumptions needed for the exponential error bounds are noted explicitly, so the scope is clear.

This paper is for researchers already working on discrete-time mean-field games who need approximation schemes or computational shortcuts for infinite-horizon problems. A reader comfortable with contraction mappings and weak convergence in stochastic control will extract the rates and the finite-to-infinite link without much trouble. It has enough concrete new technical content and reproducible claims to deserve a serious referee, though the referee should verify the compactness step and the precise assumptions on the kernels.

Referee Report

2 major / 2 minor

Summary. The paper claims that any accumulation point of mean-field equilibria (MFE) from a discounted discrete-time finite-horizon mean-field game (MFG), under weak convergence as the horizon N tends to infinity, constitutes a non-stationary MFE for the corresponding infinite-horizon MFG. Under additional conditions, these non-stationary MFE converge to stationary MFE. The work also establishes improved contraction rates for iterative methods computing regularized finite-horizon MFE, derives exponentially decaying finite-time error bounds between finite-horizon, infinite-horizon non-stationary, and stationary MFE (under stronger Lipschitz assumptions), and obtains a new uniqueness criterion for infinite-horizon non-stationary MFE. As a byproduct, closeness of finite-horizon MFE implies closeness of stationary MFE, with applications to learning-based approximation when dynamics are unknown.

Significance. If the central claims hold, the results provide a rigorous justification for using finite-horizon MFE as approximations to infinite-horizon problems, which is computationally attractive and supports learning methods with unknown components. The error bounds and uniqueness criterion extend the literature on contraction-based MFG analysis. The weak-convergence approach is standard but applied here to link finite- and infinite-horizon regimes in discrete time.

major comments (2)

[Main theorem and § on weak convergence argument] The main approximation result (stated in the abstract and proved in the central theorem) treats the existence of accumulation points of the finite-horizon MFE sequence and their weak convergence as given, without deriving relative compactness or tightness from explicit conditions on the state-action spaces, transition kernels, or cost functions. This is load-bearing for the claim that accumulation points constitute non-stationary infinite-horizon MFE, as subsequential limits may fail to exist without uniform integrability, moment bounds, or compactness assumptions (common in non-compact MFG settings). Please add a dedicated subsection or assumption list specifying these conditions or prove tightness under the paper's standing hypotheses.
[Error bounds section] The exponentially decaying error bounds between finite-horizon non-stationary, infinite-horizon non-stationary, and stationary MFE (under stronger Lipschitz coefficients) are presented as a byproduct, but the precise dependence on the horizon N and the contraction modulus should be stated explicitly in the theorem statement to allow verification of the decay rate.

minor comments (2)

[Notation and preliminaries] Clarify the precise topology and space in which weak convergence of the joint state-action trajectory measures is taken (e.g., space of probability measures on infinite sequences).
[Contraction rates subsection] The comparison of improved contraction rates to prior literature should include a direct numerical or symbolic comparison of the contraction constants.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below and indicate the planned revisions to strengthen the presentation of the weak-convergence argument and the error bounds.

Referee: [Main theorem and § on weak convergence argument] The main approximation result (stated in the abstract and proved in the central theorem) treats the existence of accumulation points of the finite-horizon MFE sequence and their weak convergence as given, without deriving relative compactness or tightness from explicit conditions on the state-action spaces, transition kernels, or cost functions. This is load-bearing for the claim that accumulation points constitute non-stationary infinite-horizon MFE, as subsequential limits may fail to exist without uniform integrability, moment bounds, or compactness assumptions (common in non-compact MFG settings). Please add a dedicated subsection or assumption list specifying these conditions or prove tightness under the paper's standing hypotheses.

Authors: We agree that the existence of accumulation points under weak convergence is central and benefits from an explicit treatment. Our standing hypotheses already include compact state-action spaces, continuous transition kernels, and bounded Lipschitz costs. Compactness makes the relevant families of measures tight, so Prokhorov's theorem gives relative compactness, and boundedness of the costs gives the required uniform integrability. To make the argument fully self-contained, we will add a dedicated subsection (new Section 2.4) that derives relative compactness directly from these hypotheses. This revision will be incorporated in the next version of the manuscript. revision: yes
Referee: [Error bounds section] The exponentially decaying error bounds between finite-horizon non-stationary, infinite-horizon non-stationary, and stationary MFE (under stronger Lipschitz coefficients) are presented as a byproduct, but the precise dependence on the horizon N and the contraction modulus should be stated explicitly in the theorem statement to allow verification of the decay rate.

Authors: We thank the referee for this helpful suggestion on clarity. The current error bounds are derived using the contraction modulus ρ of the regularized operator and the horizon N, yielding exponential decay of the form O(ρ^N). In the revised manuscript we will update the statement of the relevant theorem (Theorem 5.3) to display the explicit dependence, including the precise prefactor depending on the Lipschitz constants and the form C·ρ^N for the distance between the three classes of equilibria. This change will be made without altering the proof. revision: yes
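
To see what a bound of the form C·ρ^N means in practice, the arithmetic can be sketched as follows. The function names and the sample values of C, ρ, and the tolerance are hypothetical and chosen only for illustration. The paper's actual constants depend on its Lipschitz coefficients and are not reproduced here.

```python
import math


def error_bound(C: float, rho: float, N: int) -> float:
    """Value of the bound C * rho**N at horizon N."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie in (0, 1) for the bound to decay")
    return C * rho ** N


def horizon_for_tolerance(C: float, rho: float, eps: float) -> int:
    """Smallest integer N >= 0 with C * rho**N <= eps."""
    if not 0.0 < rho < 1.0 or C <= 0.0 or eps <= 0.0:
        raise ValueError("need 0 < rho < 1, C > 0 and eps > 0")
    if C <= eps:
        return 0
    # Solve C * rho**N <= eps for N, then correct for floating-point rounding.
    N = math.ceil(math.log(eps / C) / math.log(rho))
    while C * rho ** N > eps:
        N += 1
    while N > 0 and C * rho ** (N - 1) <= eps:
        N -= 1
    return N


# Illustrative values only: C = 1, rho = 0.5, tolerance 1/8.
print(horizon_for_tolerance(1.0, 0.5, 0.125))
print(error_bound(2.0, 0.5, 2))
```

Because the required horizon grows only logarithmically in 1/eps, halving the tolerance adds a fixed number of time steps rather than multiplying the horizon.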

Circularity Check

0 steps flagged

No circularity: central claims are conditional on weak convergence and use standard fixed-point arguments without reduction to inputs by construction.

The paper's main result states that any accumulation point of finite-horizon MFEs, under weak convergence as horizon tends to infinity, constitutes a non-stationary infinite-horizon MFE. This is explicitly conditional on the existence of such accumulation points and their weak convergence, rather than deriving or assuming those properties from the result itself. The derivation relies on standard arguments from fixed-point theory and weak convergence in measure spaces, without self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations that reduce the claim to prior unverified work by the same authors. No equations or steps in the provided abstract or description exhibit a reduction where the output is equivalent to the input by construction. The additional results on contraction rates, error bounds, and uniqueness criteria are presented as extensions under further Lipschitz assumptions, again without circular reduction. This is a self-contained theoretical analysis against external benchmarks in MFG literature.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The claims rest on standard mathematical background for mean-field games (existence of equilibria under suitable continuity and compactness assumptions) and on the technical condition that finite-horizon equilibria possess accumulation points under weak convergence; no free parameters or invented entities are introduced in the abstract.

axioms (1)

Domain assumption: Finite-horizon mean-field equilibria exist and the sequence indexed by horizon length admits accumulation points in an appropriate weak topology.
Invoked to guarantee that the limiting object is a non-stationary infinite-horizon MFE.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: any accumulation point of MFE of a discounted finite-horizon MFG constitutes, under weak convergence as the time horizon goes to infinity, a non-stationary MFE

IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: improved contraction rates for iterative methods... ρ(A_T) < K̄ + K_1 L̄/ρ(1-β)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages · 2 internal anchors

[1] Daniel Adelman, Cagla Keceli, and Alba V Olivares-Nadal. “Thompson Sampling for Infinite-Horizon Discounted Decision Processes”. In: arXiv preprint arXiv:2405.08253 (2024)
[2] Berkay Anahtarci, Can Deha Kariksiz, and Naci Saldi. “Learning mean-field games with discounted and average costs”. In: Journal of Machine Learning Research 24.17 (2023), pp. 1–59
[3] Berkay Anahtarci, Can Deha Kariksiz, and Naci Saldi. “Q-learning in regularized mean-field games”. In: Dynamic Games and Applications 13.1 (2023), pp. 89–117
[4] Berkay Anahtarcı, Can Deha Karıksız, and Naci Saldi. “Value iteration algorithm for mean-field games”. In: Systems & Control Letters 143 (2020), p. 104744
[5] Uğur Aydın and Naci Saldi. “Robustness and Approximation of Discrete-time Mean-field Games under Discounted Cost Criterion”. In: arXiv preprint arXiv:2310.10828 (2023)
[6] Graeme Baker and Serdar Yüksel. “Continuity and robustness to incorrect priors in estimation and control”. In: 2016 IEEE International Symposium on Information Theory (ISIT). IEEE. 2016, pp. 1999–2003
[7] Simon Barry. “Operator Theory. A Comprehensive Course in Analysis, Part 4”. In: American Mathematical Society, Providence (2015)
[8] Dario Bauso, Hamidou Tembine, and Tamer Başar. “Robust mean field games”. In: Dynamic Games and Applications 6.3 (2016), pp. 277–303
[9] Patrick Billingsley. Convergence of Probability Measures. John Wiley & Sons, 2013
[10] Albrecht Böttcher and Sergei M Grudsky. Spectral Properties of Banded Toeplitz Matrices. SIAM, 2005
[11] Keith Conrad. “Roots on a circle”. In: Expository note available at https://kconrad.math.uconn.edu/blurbs/ (2016)
[12] Kai Cui and Heinz Koeppl. “Approximately solving mean field games via entropy-regularized deep reinforcement learning”. In: International Conference on Artificial Intelligence and Statistics. PMLR. 2021, pp. 1909–1917
[13] Karl Eriksson. Perron-Frobenius’ Theory and Applications. 2023
[14] M. Goldberg and G. Zwas. “On matrices having equal spectral radius and spectral norm”. In: Linear Algebra and its Applications 8.5 (1974), pp. 427–434. issn: 0024-3795. doi: https://doi.org/10.1016/0024-3795(74)90076-7. url: https://www.sciencedirect.com/science/article/pii/0024379574900767
[15] Xin Guo, Anran Hu, and Jiacheng Zhang. “Optimization frameworks and sensitivity analysis of Stackelberg mean-field games”. In: arXiv preprint arXiv:2210.04110 (2022)
[16] Xin Guo et al. “Learning mean-field games”. In: Advances in Neural Information Processing Systems 32 (2019)
[17] Onésimo Hernández-Lerma. Adaptive Markov Control Processes. Vol. 79. Springer Science & Business Media, 2012
[18] Roger A Horn and Charles R Johnson. Matrix Analysis. Cambridge University Press, 2012
[19] Jiawei Huang, Batuhan Yardim, and Niao He. “On the statistical efficiency of mean-field reinforcement learning with general function approximation”. In: International Conference on Artificial Intelligence and Statistics. PMLR. 2024, pp. 289–297
[20] Shiro Ishikawa. “Fixed points and iteration of a nonexpansive mapping in a Banach space”. In: Proc. Amer. Math. Soc. 59.1 (1976), pp. 65–71. issn: 0002-9939,1088-6826. doi: 10.2307/2042038. url: https://doi.org/10.2307/2042038
[21] Ali Devran Kara and Serdar Yüksel. “Robustness to incorrect priors in partially observed stochastic control”. In: SIAM Journal on Control and Optimization 57.3 (2019), pp. 1929–1964
[22] Leonid (Aryeh) Kontorovich and Kavita Ramanan. “Concentration inequalities for dependent random variables via the martingale method”. In: The Annals of Probability 36.6 (2008), pp. 2126–2158. doi: 10.1214/07-AOP384. url: https://doi.org/10.1214/07-AOP384
[23] Gottfried Köthe. Topological Vector Spaces. II. Vol. 237. Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, New York-Berlin, 1979, pp. xii+331. isbn: 0-387-90400-X
[24] Hans-Joachim Langen. “Convergence of dynamic programming models”. In: Mathematics of Operations Research 6.4 (1981), pp. 493–512
[25] Bar Light. “Computing and Learning Mean Field Equilibria with Scalar Interactions: Algorithms and Applications”. In: arXiv preprint arXiv:2502.12024 (2025)
[26] Jun Moon and Tamer Başar. “Linear quadratic risk-sensitive and robust mean field games”. In: IEEE Transactions on Automatic Control 62.3 (2016), pp. 1062–1077
[27] N. Saldi, T. Başar, and M. Raginsky. “Markov–Nash equilibria in mean-field games with discounted cost”. In: SIAM Journal on Control and Optimization 56.6 (2018), pp. 4256–4287
[28] Barna Pásztor, Andreas Krause, and Ilija Bogunovic. “Efficient model-based multi-agent mean-field reinforcement learning”. In: Transactions on Machine Learning Research (2023)
[29] Giorgia Ramponi et al. “On imitation in mean-field games”. In: Advances in Neural Information Processing Systems 36 (2024)
[30] Walter Rudin. Real and Complex Analysis. McGraw-Hill, Inc., 1987
[31] Richard Serfozo. “Convergence of Lebesgue integrals with varying measures”. In: Sankhyā: The Indian Journal of Statistics, Series A (1982), pp. 380–402
[32] Jayakumar Subramanian and Aditya Mahajan. “Reinforcement learning in stationary mean-field games”. In: Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems. 2019, pp. 251–259
[33] Wen-Chyuan Yueh. “Eigenvalues of several tridiagonal matrices.” In: Applied Mathematics E-Notes [electronic only] 5 (2005), pp. 66–74
[34] Fengzhuo Zhang et al. “Learning regularized monotone graphon mean-field games”. In: Advances in Neural Information Processing Systems 36 (2023), pp. 67297–67308
