Approximation of Discrete-Time Infinite-Horizon Mean-Field Equilibria via Finite-Horizon Mean-Field Equilibria
Pith reviewed 2026-05-18 20:29 UTC · model grok-4.3
The pith
Finite-horizon mean-field equilibria accumulate to non-stationary infinite-horizon equilibria and converge to stationary ones under extra conditions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Any accumulation point of mean-field equilibria from a discounted finite-horizon mean-field game constitutes, under weak convergence as the horizon tends to infinity, a non-stationary mean-field equilibrium of the infinite-horizon game; under further conditions these non-stationary equilibria converge to a stationary equilibrium, and finite-horizon closeness implies stationary closeness.
What carries the argument
Weak convergence of finite-horizon mean-field equilibria (measures and strategies) as the time horizon tends to infinity.
If this is right
- Finite-horizon equilibria supply non-stationary infinite-horizon equilibria via their accumulation points.
- Under extra conditions the non-stationary equilibria converge to stationary equilibria, so finite-horizon solutions approximate stationary ones.
- Improved contraction rates hold for iterative methods that compute regularized finite-horizon equilibria.
- When two finite-horizon games have close equilibria, their corresponding stationary infinite-horizon equilibria are also close.
- Finite-horizon games enable learning-based approximation of infinite-horizon equilibria when system components are unknown, with exponentially decaying error bounds under stronger Lipschitz assumptions.
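The first two points can be illustrated numerically on a toy model. What follows is a minimal sketch, not the paper's method: it assumes a three-state, two-action game with entropy regularization, a transition kernel that does not depend on the mean field, and a small congestion cost. All names and numbers (`solve_finite_horizon_mfe`, `P`, `base_cost`, `coupling`, `tau`) are hypothetical, and plain Picard iteration stands in for whatever solver the paper analyzes. The script computes finite-horizon equilibria for growing horizons N and prints how far the time-0 policy is from the one obtained at a long reference horizon.

```python
import numpy as np

def solve_finite_horizon_mfe(N, P, base_cost, coupling, beta, tau, mu0,
                             max_iters=500, tol=1e-12):
    """Picard iteration for an entropy-regularized finite-horizon MFE (toy model).

    P[s, a, s2]   : transition probabilities (independent of the mean field here)
    stage cost    : base_cost[s, a] + coupling * mu_t[s]  (hypothetical congestion term)
    Returns (policy, flow, residual) with policy[t, s, a] and flow[t, s].
    """
    S, A, _ = P.shape
    flow = np.tile(mu0, (N, 1))                 # initial guess: mu_t = mu0 for all t
    policy = np.full((N, S, A), 1.0 / A)
    residual = np.inf
    for _ in range(max_iters):
        # Backward pass: soft (log-sum-exp) Bellman recursion with V_N = 0.
        V = np.zeros(S)
        for t in reversed(range(N)):
            Q = base_cost + coupling * flow[t][:, None] + beta * (P @ V)
            m = Q.min(axis=1, keepdims=True)    # shift for numerical stability
            w = np.exp(-(Q - m) / tau)
            policy[t] = w / w.sum(axis=1, keepdims=True)
            V = m[:, 0] - tau * np.log(w.sum(axis=1))
        # Forward pass: push mu0 through the policy just computed.
        new_flow = np.empty_like(flow)
        new_flow[0] = mu0
        for t in range(N - 1):
            new_flow[t + 1] = np.einsum("s,sa,sap->p", new_flow[t], policy[t], P)
        residual = np.abs(new_flow - flow).max()
        flow = new_flow
        if residual < tol:
            break
    return policy, flow, residual

# Hypothetical toy data: action 0 tends to stay put, action 1 resets the state.
P = np.array([
    [[0.6, 0.3, 0.1], [0.3, 0.4, 0.3]],
    [[0.2, 0.6, 0.2], [0.3, 0.4, 0.3]],
    [[0.1, 0.3, 0.6], [0.3, 0.4, 0.3]],
])
base_cost = np.array([[0.0, 0.5], [0.3, 0.2], [1.0, 0.4]])
params = dict(P=P, base_cost=base_cost, coupling=0.1, beta=0.5, tau=1.0,
              mu0=np.array([1.0, 0.0, 0.0]))

ref_policy, _, _ = solve_finite_horizon_mfe(60, **params)
for N in (5, 10, 20, 40):
    policy, flow, residual = solve_finite_horizon_mfe(N, **params)
    gap = np.abs(policy[0] - ref_policy[0]).max()
    print(f"N={N:3d}  time-0 policy gap to N=60: {gap:.2e}")
```

The small coupling and the mixing kernel are chosen so that the Picard map is expected to contract; with stronger coupling or weaker regularization the iteration may need damping or may fail to converge.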
Where Pith is reading between the lines
- The approximation result suggests that finite-horizon truncations already used for computation can be treated as rigorous approximation tools rather than purely numerical devices.
- The new uniqueness criterion for non-stationary infinite-horizon equilibria may simplify verification in settings where contraction mapping arguments are unavailable.
- Where the stronger Lipschitz assumptions hold, the error bounds decay exponentially in the horizon length, so moderately long finite-horizon problems may already give high accuracy for the infinite-horizon limit.
Load-bearing premise
Accumulation points of the finite-horizon equilibria exist and the associated measures converge weakly as the horizon length grows without bound.
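For reference, the two notions this premise rests on can be written out as follows. The notation is an assumption made here for illustration (a Polish state space $\mathsf{X}$, measures $\mu^{N}$ indexed by the horizon $N$) and need not match the paper's.

```latex
\begin{align*}
\mu^{N} \Rightarrow \mu
  \quad &\Longleftrightarrow \quad
  \int_{\mathsf{X}} f \, d\mu^{N} \longrightarrow \int_{\mathsf{X}} f \, d\mu
  \quad \text{for all bounded continuous } f : \mathsf{X} \to \mathbb{R}, \\
\{\mu^{N}\}_{N \ge 1} \text{ is tight}
  \quad &\Longleftrightarrow \quad
  \forall \varepsilon > 0 \;\; \exists\, K_{\varepsilon} \subset \mathsf{X} \text{ compact}: \;
  \inf_{N \ge 1} \mu^{N}(K_{\varepsilon}) \ge 1 - \varepsilon .
\end{align*}
```

By Prokhorov's theorem, a family of probability measures on a Polish space is relatively compact in the weak topology exactly when it is tight. If $\mathsf{X}$ itself is compact, every family is tight (take $K_{\varepsilon} = \mathsf{X}$), so accumulation points always exist in that case.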
What would settle it
An explicit sequence of finite-horizon equilibria whose weak limit fails to satisfy the infinite-horizon equilibrium fixed-point condition for any admissible measure flow.
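The fixed-point condition in question can be sketched as below. This follows the common discounted-cost formulation of a non-stationary mean-field equilibrium as a pair of a policy $\pi = (\pi_t)_{t \ge 0}$ and a measure flow $\mu = (\mu_t)_{t \ge 0}$. The symbols $c$, $p$, $\beta$, $\mathsf{X}$, $\mathsf{A}$ are assumed here, and the paper's own (possibly regularized) definition may differ in detail.

```latex
\begin{align*}
\text{(optimality)} \quad
  & \pi \in \operatorname*{arg\,min}_{\pi'} J_{\mu}(\pi'),
  \qquad
  J_{\mu}(\pi') = \mathbb{E}^{\pi'}\!\Big[ \sum_{t=0}^{\infty} \beta^{t}\, c(x_t, a_t, \mu_t) \Big], \\
\text{(consistency)} \quad
  & \mu_{t+1}(\,\cdot\,) = \int_{\mathsf{X}} \int_{\mathsf{A}}
    p(\,\cdot \mid x, a, \mu_t)\, \pi_t(da \mid x)\, \mu_t(dx),
  \qquad t \ge 0 .
\end{align*}
```

A counterexample of the kind described would be a weak limit $(\pi, \mu)$ of finite-horizon equilibria for which at least one of these two conditions fails.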
Original abstract
We address in this paper a fundamental question that arises in mean-field games (MFGs), namely whether mean-field equilibria (MFE) for discrete-time finite-horizon MFGs can be used to obtain approximate stationary as well as non-stationary MFE for similarly structured infinite-horizon MFGs. We provide a rigorous analysis of this relationship, and show that any accumulation point of MFE of a discounted finite-horizon MFG constitutes, under weak convergence as the time horizon goes to infinity, a non-stationary MFE for the corresponding infinite-horizon MFG. Further, under certain conditions, these non-stationary MFE converge to a stationary MFE, establishing the appealing result that finite-horizon MFE can serve as approximations for stationary MFE. Additionally, we establish improved contraction rates for iterative methods used to compute regularized MFE in finite-horizon settings, extending existing results in the literature. As a byproduct, we obtain that when two MFGs have finite-horizon MFE that are close to each other, the corresponding stationary MFE are also close. As one application of the theoretical results, we show that finite-horizon MFGs can facilitate learning-based approaches to approximate infinite-horizon MFE when system components are unknown. Under further assumptions on the Lipschitz coefficients of the regularized system components (which are stronger than contractivity of finite-horizon MFGs), we obtain exponentially decaying finite-time error bounds -- in the time horizon -- between finite-horizon non-stationary, infinite-horizon non-stationary, and stationary MFE. As a byproduct of our error bounds, we present a new uniqueness criterion for infinite-horizon nonstationary MFE beyond the available contraction results in the literature.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that any accumulation point of mean-field equilibria (MFE) from a discounted discrete-time finite-horizon mean-field game (MFG), under weak convergence as the horizon N tends to infinity, constitutes a non-stationary MFE for the corresponding infinite-horizon MFG. Under additional conditions, these non-stationary MFE converge to stationary MFE. The work also establishes improved contraction rates for iterative methods computing regularized finite-horizon MFE, derives exponentially decaying finite-time error bounds between finite-horizon, infinite-horizon non-stationary, and stationary MFE (under stronger Lipschitz assumptions), and obtains a new uniqueness criterion for infinite-horizon non-stationary MFE. As a byproduct, closeness of finite-horizon MFE implies closeness of stationary MFE, with applications to learning-based approximation when dynamics are unknown.
Significance. If the central claims hold, the results provide a rigorous justification for using finite-horizon MFE as approximations to infinite-horizon problems, which is computationally attractive and supports learning methods with unknown components. The error bounds and uniqueness criterion extend the literature on contraction-based MFG analysis. The weak-convergence approach is standard but applied here to link finite- and infinite-horizon regimes in discrete time.
major comments (2)
- [Main theorem and § on weak convergence argument] The main approximation result (stated in the abstract and proved in the central theorem) treats the existence of accumulation points of the finite-horizon MFE sequence and their weak convergence as given, without deriving relative compactness or tightness from explicit conditions on the state-action spaces, transition kernels, or cost functions. This is load-bearing for the claim that accumulation points constitute non-stationary infinite-horizon MFE, as subsequential limits may fail to exist without uniform integrability, moment bounds, or compactness assumptions (common in non-compact MFG settings). Please add a dedicated subsection or assumption list specifying these conditions or prove tightness under the paper's standing hypotheses.
- [Error bounds section] The exponentially decaying error bounds between finite-horizon non-stationary, infinite-horizon non-stationary, and stationary MFE (under stronger Lipschitz coefficients) are presented as a byproduct, but the precise dependence on the horizon N and the contraction modulus should be stated explicitly in the theorem statement to allow verification of the decay rate.
minor comments (2)
- [Notation and preliminaries] Clarify the precise topology and space in which weak convergence of the joint state-action trajectory measures is taken (e.g., space of probability measures on infinite sequences).
- [Contraction rates subsection] The comparison of improved contraction rates to prior literature should include a direct numerical or symbolic comparison of the contraction constants.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below and indicate the planned revisions to strengthen the presentation of the weak-convergence argument and the error bounds.
Point-by-point responses
- Referee: [Main theorem and § on weak convergence argument] The main approximation result (stated in the abstract and proved in the central theorem) treats the existence of accumulation points of the finite-horizon MFE sequence and their weak convergence as given, without deriving relative compactness or tightness from explicit conditions on the state-action spaces, transition kernels, or cost functions. This is load-bearing for the claim that accumulation points constitute non-stationary infinite-horizon MFE, as subsequential limits may fail to exist without uniform integrability, moment bounds, or compactness assumptions (common in non-compact MFG settings). Please add a dedicated subsection or assumption list specifying these conditions or prove tightness under the paper's standing hypotheses.
Authors: We agree that the existence of accumulation points under weak convergence is central and benefits from an explicit treatment. Our standing hypotheses already include compact state-action spaces, continuous transition kernels, and bounded Lipschitz costs. On compact spaces every family of probability measures is tight, so Prokhorov's theorem gives relative compactness in the weak topology, and boundedness of the costs supplies the uniform integrability needed to pass to the limit. To make the argument fully self-contained, we will add a dedicated subsection (new Section 2.4) that derives relative compactness directly from these hypotheses. This revision will be incorporated in the next version of the manuscript.
Revision: yes
- Referee: [Error bounds section] The exponentially decaying error bounds between finite-horizon non-stationary, infinite-horizon non-stationary, and stationary MFE (under stronger Lipschitz coefficients) are presented as a byproduct, but the precise dependence on the horizon N and the contraction modulus should be stated explicitly in the theorem statement to allow verification of the decay rate.
Authors: We thank the referee for this helpful suggestion on clarity. The current error bounds are derived using the contraction modulus ρ of the regularized operator and the horizon N, yielding exponential decay of the form O(ρ^N). In the revised manuscript we will update the statement of the relevant theorem (Theorem 5.3) to display the explicit dependence, including the precise prefactor depending on the Lipschitz constants and the form C·ρ^N for the distance between the three classes of equilibria. This change will be made without altering the proof.
Revision: yes
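The form C·ρ^N quoted in the response can be inverted to estimate how long a horizon is needed for a target accuracy. The following is a minimal sketch, assuming the bound really has the form C·ρ^N with 0 < ρ < 1; the function name `min_horizon` and the sample constants are hypothetical and do not come from the paper.

```python
import math

def min_horizon(C: float, rho: float, eps: float) -> int:
    """Smallest integer N >= 0 with C * rho**N <= eps (assumed bound form)."""
    if not (C > 0 and 0 < rho < 1 and eps > 0):
        raise ValueError("need C > 0, 0 < rho < 1, eps > 0")
    if C <= eps:
        return 0
    # Closed form: N >= log(eps / C) / log(rho); both logarithms are negative.
    n = max(0, math.ceil(math.log(eps / C) / math.log(rho)))
    # Guard against floating-point rounding at the boundary.
    while C * rho**n > eps:
        n += 1
    while n > 0 and C * rho**(n - 1) <= eps:
        n -= 1
    return n

# Hypothetical constants, for illustration only.
for rho in (0.5, 0.9, 0.99):
    print(rho, min_horizon(C=10.0, rho=rho, eps=1e-6))
```

The required horizon grows like 1/(1 − ρ) as ρ approaches 1, so how short a horizon suffices in practice depends heavily on how far the contraction modulus is from 1.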
Circularity Check
No circularity: central claims are conditional on weak convergence and use standard fixed-point arguments without reduction to inputs by construction.
Full rationale
The paper's main result states that any accumulation point of finite-horizon MFEs, under weak convergence as horizon tends to infinity, constitutes a non-stationary infinite-horizon MFE. This is explicitly conditional on the existence of such accumulation points and their weak convergence, rather than deriving or assuming those properties from the result itself. The derivation relies on standard arguments from fixed-point theory and weak convergence in measure spaces, without self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations that reduce the claim to prior unverified work by the same authors. No equations or steps in the provided abstract or description exhibit a reduction where the output is equivalent to the input by construction. The additional results on contraction rates, error bounds, and uniqueness criteria are presented as extensions under further Lipschitz assumptions, again without circular reduction. This is a self-contained theoretical analysis against external benchmarks in MFG literature.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Finite-horizon mean-field equilibria exist and the sequence indexed by horizon length admits accumulation points in an appropriate weak topology.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "any accumulation point of MFE of a discounted finite-horizon MFG constitutes, under weak convergence as the time horizon goes to infinity, a non-stationary MFE"
- IndisputableMonolith/Foundation/BranchSelection.lean: branch_selection
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "improved contraction rates for iterative methods... ρ(AT) < K̄ + K1 L̄/ρ(1-β)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.