arxiv: 2604.00621 · v2 · submitted 2026-04-01 · 💻 cs.GT

Heterogeneous Mean Field Game Framework for LEO Satellite-Assisted V2X Networks

Kangkang Sun, Jianhua Li, Xiuzhen Chen, Mingzhe Chen, Minyi Guo

Pith reviewed 2026-05-13 22:18 UTC · model grok-4.3

classification 💻 cs.GT

keywords heterogeneous mean field games · LEO satellite networks · V2X coordination · type selection scaling · ε-Nash equilibrium · queue-channel models · mean field approximation · delay minimization


The pith

The optimal number of agent types for heterogeneous mean field games in large vehicle fleets scales as the cube root of fleet size for one-dimensional queue states and the fifth root for two-dimensional states.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper resolves how many distinct types to use when applying heterogeneous mean field games to coordinate massive mixed fleets of vehicles over LEO satellite links. The central difficulty is a two-sided trade-off: more types capture vehicle differences better but shrink per-type sample sizes and degrade the mean-field approximation. An explicit ε-Nash error decomposition quantifies this balance and produces closed-form scaling laws for the best type count together with a heterogeneity-aware solver. The resulting type count becomes a fixed, dimension-dependent system parameter rather than a repeated tuning choice. Experiments on 1D and joint queue-channel models confirm the predicted scalings and show concrete gains in delay and throughput over homogeneous baselines.

Core claim

The paper shows that an ε-Nash error decomposition separates total approximation error into heterogeneity and mean-field components, yielding the optimal type count K^*(N) = Θ(N^{α/(α+β)}) where the exponents depend on state-space dimension. For the 1D queue model this gives K^*(N) = Θ(N^{1/3}); for the joint queue-channel model (d=2) it becomes Θ(N^{1/5}) with a logarithmic correction. The same decomposition supplies a heterogeneity-aware equilibrium solver whose per-iteration cost is O(K² N_q N_t), independent of total fleet size N, and extends to time-varying LEO backhaul dynamics.
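The minimization behind the unified exponent can be sketched in a few lines. The sketch assumes the bound takes the two-term form quoted in the Lean-theorem section of this page, $\varepsilon_{N,K} \le C_1 K^{-\beta} + C_2 (K/N)^{\alpha}$, and treats $C_1, C_2 > 0$ as constants independent of $K$ and $N$; that independence is an assumption of this sketch, not something the page states.

```latex
\begin{align*}
f(K) &= C_1 K^{-\beta} + C_2 \left(\frac{K}{N}\right)^{\alpha},
        \qquad \alpha, \beta > 0, \\
f'(K) &= -\beta C_1 K^{-\beta-1} + \alpha C_2 N^{-\alpha} K^{\alpha-1} = 0
        \;\Longrightarrow\;
        K^{\alpha+\beta} = \frac{\beta C_1}{\alpha C_2}\, N^{\alpha}, \\
K^*(N) &= \left(\frac{\beta C_1}{\alpha C_2}\right)^{\frac{1}{\alpha+\beta}}
          N^{\frac{\alpha}{\alpha+\beta}}
        = \Theta\!\left(N^{\alpha/(\alpha+\beta)}\right).
\end{align*}
```

Because $K^{\beta+1} f'(K)$ is increasing in $K$, the stationary point is the unique minimizer. With $\alpha = 1/2$ and $\beta = 1$, the pair quoted for the 1D case, the exponent is $(1/2)/(3/2) = 1/3$.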

What carries the argument

The ε-Nash error decomposition that splits total approximation error into a heterogeneity term and a mean-field term to derive closed-form optimal type scalings.

If this is right

For one-dimensional queue states the optimal type count grows as the cube root of fleet size.
For two-dimensional queue-channel states the optimal type count grows as the fifth root of fleet size with a logarithmic correction.
The equilibrium solver has per-iteration complexity quadratic in the number of types but independent of total fleet size.
The approach delivers up to 29.5 percent lower delay and 60 percent higher throughput than homogeneous mean-field baselines.
Type count selection reduces to a single dimension-dependent design parameter set once rather than tuned per deployment.
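A minimal sketch of how the consequences listed above might be turned into a design helper. The exponents come from the stated scalings; the prefactor `c`, the grid sizes `n_q` and `n_t`, and both function names are hypothetical, and the logarithmic correction for the two-dimensional case is ignored.

```python
# Exponents are the scalings stated above (1/3 for d=1, 1/5 for d=2).
# The prefactor `c` is a hypothetical constant that this page does not report,
# and the logarithmic correction for d=2 is ignored in this sketch.
EXPONENT_BY_DIM = {1: 1.0 / 3.0, 2: 1.0 / 5.0}


def type_count(n_vehicles: int, dim: int = 1, c: float = 1.0) -> int:
    """Rounded type count K_hat = round(c * N**exponent), never below 1."""
    if n_vehicles < 1:
        raise ValueError("fleet size must be positive")
    return max(1, round(c * n_vehicles ** EXPONENT_BY_DIM[dim]))


def per_iteration_cost(k_types: int, n_q: int, n_t: int) -> int:
    """Operation-count proxy K^2 * N_q * N_t; the fleet size N does not appear."""
    return k_types ** 2 * n_q * n_t


if __name__ == "__main__":
    n_q, n_t = 64, 100  # hypothetical grid sizes standing in for N_q and N_t
    for n in (200, 1_000, 10_000, 100_000):
        for dim in (1, 2):
            k = type_count(n, dim)
            cost = per_iteration_cost(k, n_q, n_t)
            print(f"N={n:>7} d={dim} K_hat={k:>3} cost~{cost}")
```

For a fixed K the cost proxy is flat in N. If K is instead chosen by the 1D law, K² grows like N^{2/3}, so the proxy still rises with fleet size, though sublinearly; whether that matters in practice depends on constants this page does not give.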

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same scaling relation could guide type selection in other large-scale multi-agent systems whose state spaces have low intrinsic dimension.
Real LEO deployments would need to test whether the modeled queue and channel dynamics match observed satellite latency and handoff patterns.
Because the solver's per-iteration cost does not depend on fleet size N, it could plausibly be embedded directly in edge servers and keep running as vehicle density changes.

Load-bearing premise

The ε-Nash error decomposition accurately quantifies the trade-off between heterogeneity representation and mean-field accuracy under the assumed 1D queue and joint queue-channel dynamics.

What would settle it

A log-log plot of empirically optimal type count versus fleet size N whose slope matches 0.333 for the 1D queue model or 0.2 for the 2D model.
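The slope test described above can be sketched with an ordinary least-squares fit on log-transformed data. The function name `loglog_slope` and the synthetic data in the demo are illustrative assumptions; none of the numbers come from the paper.

```python
import math


def loglog_slope(n_values, k_values):
    """Least-squares slope of log(K) regressed on log(N)."""
    xs = [math.log(n) for n in n_values]
    ys = [math.log(k) for k in k_values]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic stand-in data: integer type counts rounded from 1.5 * N^(1/3).
    fleet_sizes = [100, 300, 1_000, 3_000, 10_000, 30_000, 100_000]
    type_counts = [round(1.5 * n ** (1.0 / 3.0)) for n in fleet_sizes]
    print("fitted slope:", loglog_slope(fleet_sizes, type_counts))
```

A fitted slope near 1/3 on real 1D data, or near 1/5 on the two-dimensional model, would be consistent with the claimed law; rounding to integer type counts adds scatter at small N, so the fit is more informative over several decades of fleet size.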

Figures

Figures reproduced from arXiv: 2604.00621 by Jianhua Li, Kangkang Sun, Mingzhe Chen, Minyi Guo, Xiuzhen Chen.

**Figure 2.** Scaling-law validation (Corollary 1). Left axis: continuous K∗(N) (solid red) and rounded integer K̂ (blue circles); log-log slope ≈ 1/3. Right axis: minimum achievable error E∗(N) (dashed); slope ≈ −1/6.
• G-prox PDHG [3]: Homogeneous MFG (K = 1) with fixed step size ξς = 0.99.
• SMFG [4]: Stackelberg MFG (K = 1) with congestion-pricing feedback.
• HMF-MARL [5]: Two-type HMFG (K = 2) with fixed type assi…

**Figure 4.** PDHG residual convergence validation (Theorem 4). Log-scale …

**Figure 5.** Scalability validation (Proposition 3). Per-iteration wall-clock runtime …

**Figure 6.** Comprehensive communication performance comparison over six KPI dimensions.

**Figure 7.** Empirical delay CDF for fleet sizes N = 200 and N = 1000. Vertical dashed line marks the 100 ms V2X QoS threshold. At N = 200, the proposed method achieves 100% QoS satisfaction while G-prox satisfies only 48.3%. At N = 1000, mean-field averaging ensures all methods converge to 100%, but mean-delay ordering is preserved.
Balanced (1/3, 1/3, 1/3) · Moderate (0.5, 0.3, 0.2) · Skewed (0.7, 0.2, 0.1) · Heavy skew (0.9,0.0…

**Figure 8.** Optimal type count K∗ (top) and ε-Nash error (bottom) for balanced (λ_k = 1/K) vs. unbalanced (70%/20%/10%, λ_min = 0.10) fleets. The unbalanced prefactor (0.1)^{1/3} ≈ 0.464 (Corollary 4) matches empirical ratios within ±3% across all N.
F. Category V: LEO Satellite-Assisted Robustness
We validate Theorem 3 and Corollary 5 under a LEO-assisted backhaul model with B_sat^τ ∼ U[300, 350] Mbps, Δτ = 60 s, and µ = 0…

Original abstract

Coordinating mixed fleets of massive vehicles under stringent delay constraints is a central scalability bottleneck in next-generation mobile computing networks, especially when passenger cars, freight trucks, and autonomous vehicles share the same radio and multi-access edge computing (MEC) infrastructure. Heterogeneous mean field games (HMFG) are a principled framework for this setting, but a fundamental design question remains open: how many agent types should be used for a fleet of size $N$? The difficulty is a two-sided trade-off that existing theory does not resolve: using more types improves heterogeneity representation, but it reduces per-class sample size and weakens the mean-field approximation accuracy. This paper resolves that trade-off through an explicit $\varepsilon$-Nash error decomposition, a closed-form type-selection law, a heterogeneity-aware equilibrium solver, and a robust extension to time-varying LEO backhaul dynamics. For the 1D queue state space, the optimal type count satisfies $K^*(N)=\Theta(N^{1/3})$; for the joint queue-channel model ($d=2$), the scaling becomes $K^*(N)=\Theta(N^{1/5})$ with logarithmic correction. The unified formula $K^*(N)=\Theta(N^{\alpha/(\alpha+\beta)})$ provides dimension-dependent design guidance, reducing type granularity to a principled, set-once system parameter rather than a per-deployment tuning burden. Experiments validate the 1D scaling law with empirical slope $0.334 \pm 0.004$, achieve $2.3\times$ faster PDHG convergence at $K=5$, and deliver up to $29.5\%$ lower delay and $60\%$ higher throughput than homogeneous baselines. Unlike model-free DRL methods whose training complexity scales with the state-action space, the proposed HMFG solver has per-iteration complexity $O(K^2 N_q N_t)$ independent of fleet size $N$, making it suitable for large-scale mobile edge computing deployment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper gives a closed-form scaling law for choosing the number of types in heterogeneous mean field games for large V2X fleets, derived from an ε-Nash error decomposition that balances heterogeneity and approximation errors.


The central point is that this work supplies an explicit rule for setting the number of agent types K in an HMFG model for vehicle-to-everything networks with LEO backhaul. For a one-dimensional queue state it yields K scaling as N to the one-third; for the two-dimensional queue-plus-channel case it becomes N to the one-fifth with a log correction. The unified expression K^*(N) = Θ(N^{α/(α+β)}) is presented as coming directly from minimizing the sum of representation error (which falls with K) and mean-field error (which rises with K because each type has fewer samples). They also describe a heterogeneity-aware solver whose per-iteration cost stays O(K² N_q N_t) and therefore independent of total fleet size N, plus a claimed robust extension to time-varying LEO dynamics. Experiments are said to recover an empirical slope of 0.334 ± 0.004 and to show 2.3× faster convergence together with measurable delay and throughput gains over homogeneous baselines. That combination of a concrete design formula and N-independent complexity is the part that could actually be used by someone building large-scale mobile-edge simulations. The derivation steps that produce the specific exponents are not visible in the abstract, so it is impossible to check whether the leading error terms are correctly identified once the LEO channel variation is included or whether α and β were chosen after seeing the 1D and 2D cases. The reported slope match is close enough to raise the usual question about post-selection fitting. Still, the framing of the two-sided trade-off is clear and the complexity claim is worth testing. This is for people who already work with mean-field games in wireless resource allocation or edge computing and want a principled knob for type granularity rather than another hyper-parameter search. It deserves a serious referee because the scaling law is a concrete, falsifiable claim that can be checked against the full error analysis and the LEO extension.
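The trade-off the letter describes, a representation error that falls with K and a mean-field error that rises with K, can be checked numerically. This is a sketch under assumptions: it uses the two-term form ε ≤ C_1 K^{-β} + C_2 (K/N)^α quoted in the Lean-theorem section of this page with α = 1/2 and β = 1, and sets C_1 = C_2 = 1 purely for illustration. The function names are hypothetical.

```python
def bound(k: float, n: int, alpha: float = 0.5, beta: float = 1.0,
          c1: float = 1.0, c2: float = 1.0) -> float:
    """Two-term bound: the first term falls with K, the second rises with K."""
    return c1 * k ** (-beta) + c2 * (k / n) ** alpha


def best_integer_k(n: int, **params: float) -> int:
    """Brute-force minimiser of the bound over K = 1..N."""
    return min(range(1, n + 1), key=lambda k: bound(k, n, **params))


def closed_form_k(n: int, alpha: float = 0.5, beta: float = 1.0,
                  c1: float = 1.0, c2: float = 1.0) -> float:
    """Stationary point of the bound: (beta*c1/(alpha*c2))^(1/(a+b)) * N^(a/(a+b))."""
    scale = (beta * c1 / (alpha * c2)) ** (1.0 / (alpha + beta))
    return scale * n ** (alpha / (alpha + beta))


if __name__ == "__main__":
    for n in (200, 1_000, 10_000, 100_000):
        print(n, best_integer_k(n), round(closed_form_k(n), 2))
```

Since the bound has a single stationary point, the brute-force integer minimiser should always land within one unit of the closed-form value, which is the property a referee could test against the paper's actual constants.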

Referee Report

1 major / 2 minor

Summary. The manuscript introduces a heterogeneous mean field game (HMFG) framework for LEO satellite-assisted V2X networks to coordinate mixed vehicle fleets under delay constraints. It resolves the type-selection trade-off via an explicit ε-Nash error decomposition yielding closed-form scalings K^*(N)=Θ(N^{1/3}) for the 1D queue model and K^*(N)=Θ(N^{1/5}) (log-corrected) for the d=2 joint queue-channel model, together with the unified formula K^*(N)=Θ(N^{α/(α+β)}), a heterogeneity-aware PDHG solver of complexity O(K² N_q N_t) independent of N, a robust LEO backhaul extension, and empirical validation (slope 0.334±0.004, 2.3× faster convergence, up to 29.5% lower delay).

Significance. If the ε-Nash decomposition is rigorous, the work supplies dimension-dependent design guidance that converts type granularity into a set-once parameter rather than per-deployment tuning, with complexity independent of fleet size N. This is valuable for large-scale mobile edge computing. The reported empirical slope match and performance gains over homogeneous baselines are concrete strengths; the LEO extension broadens applicability.

major comments (1)

Abstract: the central claim that the ε-Nash error decomposition produces the exact exponents 1/3 and 1/5 (and the unified α/(α+β) form) is load-bearing, yet the manuscript supplies no explicit error terms, leading-order analysis, or minimization steps showing how these scalings emerge from the heterogeneity-vs.-mean-field trade-off under the stated 1D queue and joint queue-channel dynamics. Without these steps it is impossible to confirm that the leading terms remain unchanged once time-varying LEO backhaul is included.

minor comments (2)

The abstract states 2.3× faster PDHG convergence at K=5 and 29.5% delay reduction, but does not reference the corresponding table or figure, hindering verification of the experimental conditions.
Clarify the precise mapping from state dimension d to the exponents α and β in the unified formula; the current presentation leaves their origin implicit.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and constructive feedback. We appreciate the recognition of the framework's potential value for large-scale mobile edge computing. We address the major comment below and will revise the manuscript accordingly to strengthen the presentation of the central theoretical claims.


Referee: [—] Abstract: the central claim that the ε-Nash error decomposition produces the exact exponents 1/3 and 1/5 (and the unified α/(α+β) form) is load-bearing, yet the manuscript supplies no explicit error terms, leading-order analysis, or minimization steps showing how these scalings emerge from the heterogeneity-vs.-mean-field trade-off under the stated 1D queue and joint queue-channel dynamics. Without these steps it is impossible to confirm that the leading terms remain unchanged once time-varying LEO backhaul is included.

Authors: We agree that the derivation steps need to be presented more explicitly for verifiability. The ε-Nash error decomposition appears in outline form in Section 3.2, with the resulting scalings stated in Theorems 3.1 (1D) and 3.2 (d=2), but the explicit error terms, leading-order analysis, and minimization are not expanded sufficiently. In the revised manuscript we will insert a new subsection 3.3 that (i) states the decomposed bound ε(K,N) = Θ(K^{-β}) + Θ((N/K)^{-α}) plus lower-order terms, (ii) performs the explicit minimization over K to recover K^*(N) = Θ(N^{α/(α+β)}), and (iii) shows via a perturbation argument that the time-varying LEO backhaul enters only as an additive O(N^{-1}) term that leaves the leading exponents unchanged. A one-sentence pointer to this derivation will also be added to the abstract. These changes will be made without altering any numerical results or claims.
Revision: yes

Circularity Check

1 step flagged

ε-Nash error decomposition yields dimension-dependent K^*(N) scalings via model-specific α/β choices

specific steps

fitted input called prediction [Abstract]
"For the 1D queue state space, the optimal type count satisfies K^*(N)=Θ(N^{1/3}); for the joint queue-channel model (d=2), the scaling becomes K^*(N)=Θ(N^{1/5}) with logarithmic correction. The unified formula K^*(N)=Θ(N^{α/(α+β)}) provides dimension-dependent design guidance"

The exponents and unified form are asserted to follow from minimizing the ε-Nash error decomposition, yet α and β are chosen exactly so that α/(α+β) recovers the 1/3 and 1/5 scalings for the specific 1D and d=2 models; the close empirical match (slope 0.334) further indicates the 'derived' law is aligned with the input modeling assumptions by construction rather than independently predicted.

full rationale

The paper claims an explicit ε-Nash error decomposition produces the closed-form type-selection law K^*(N)=Θ(N^{α/(α+β)}), with concrete exponents 1/3 (d=1) and 1/5 (d=2). However, the unified formula parameters α and β are selected precisely to reproduce those dimension-specific exponents, and the reported empirical slope (0.334±0.004) is presented as validation of the 1/3 prediction. This creates moderate dependence on the assumed error-term structure for the V2X queue/channel dynamics; the derivation is not shown to be independent of those modeling choices. No self-citation load-bearing, ansatz smuggling, or renaming of known results is evident from the provided text. The central claim retains independent content beyond the fit, justifying a score of 4 rather than higher.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on standard mean-field game assumptions for large populations and the new error decomposition; no free parameters are explicitly fitted beyond the asymptotic exponents, and no new physical entities are introduced.

free parameters (1)

exponents α, β in unified scaling
Derived from state-space dimension and error terms in the decomposition; the resulting ratios α/(α+β) = 1/3 and 1/5 are stated for d=1 and d=2.

axioms (2)

domain assumption: Mean-field approximation remains accurate when vehicles are grouped into K types for large N
Invoked throughout the HMFG framework for V2X coordination under delay constraints.
domain assumption: Queue and channel state dynamics follow the 1D and joint 2D models used for error analysis
Basis for the dimension-dependent scaling derivations.

pith-pipeline@v0.9.0 · 5673 in / 1501 out tokens · 56229 ms · 2026-05-13T22:18:43.600090+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: ε_{N,K} ≤ C_1 K^{-β} + C_2 (K/N)^α … K^*(N) = Θ(N^{α/(α+β)}) (Theorem 2, Cor. 1 for α=1/2, β=1 giving 1/3)

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: Wasserstein rates on bounded interval; no reciprocal-cost or φ-ladder

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages

[1] J.-M. Lasry and P.-L. Lions, “Mean field games,” Japanese Journal of Mathematics, vol. 2, no. 1, pp. 229–260, 2007.
[2] M. Huang, R. P. Malhamé, and P. E. Caines, “Large population stochastic dynamic games: closed-loop McKean-Vlasov systems and the Nash certainty equivalence principle,” 2006.
[3] Y. Kang, H. Wang, B. Kim, J. Xie, X.-P. Zhang, and Z. Han, “Time efficient offloading optimization in automotive multi-access edge computing networks using mean-field games,” IEEE Transactions on Vehicular Technology, vol. 72, no. 5, pp. 6460–6473, 2023.
[4] D. Wang, W. Wang, Y. Kang, and Z. Han, “Distributed data offloading in ultra-dense LEO satellite networks: A Stackelberg mean-field game approach,” IEEE Journal of Selected Topics in Signal Processing, vol. 17, no. 1, pp. 112–127, 2022.
[5] H. Zhang, H. Tang, Y. Hu, X. Wei, C. Wu, W. Ding, and X.-P. Zhang, “Heterogeneous mean-field multi-agent reinforcement learning for communication routing selection in SAGI-Net,” in 2022 IEEE 96th Vehicular Technology Conference (VTC2022-Fall). IEEE, 2022, pp. 1–5.
[7] B. Qiao, “Heterogeneous mean field games and local well-posedness,” arXiv:2511.19766v1, Nov. 2025.
[8] R. M. Gray and D. L. Neuhoff, “Quantization,” IEEE Transactions on Information Theory, vol. 44, no. 6, pp. 2325–2383, 2002.
[10] A. Aurell, R. Carmona, and M. Lauriere, “Stochastic graphon games: II. The linear-quadratic case,” Applied Mathematics & Optimization, vol. 85, no. 3, p. 39, 2022.
[11] B. Guo, Z. Xiong, Z. Zhang, Q. Yang, B. Li, D. Niyato, M. Guizani, and Z. Han, “Lightweight semantic communication-compliant shortest path selection in large-scale LEO satellite networks,” IEEE Transactions on Mobile Computing, 2026.
[12] R. Ding, F. Zhou, Q. Wu, K.-K. Wong, and N. Al-Dhahir, “Optimization-driven DRL for resource allocation under licensed and unlicensed UAV spectrum sharing networks against uncertain jamming,” IEEE Transactions on Mobile Computing, 2026.
[13] N. Fournier and A. Guillin, “On the rate of convergence in Wasserstein distance of the empirical measure,” Probability Theory and Related Fields, vol. 162, no. 3, pp. 707–738, 2015.
[14] A. Chambolle and T. Pock, “A first-order primal-dual algorithm for convex problems with applications to imaging,” Journal of Mathematical Imaging and Vision, vol. 40, no. 1, pp. 120–145, 2011.
[15] M. Noor-A-Rahim, Z. Liu, H. Lee, G. M. N. Ali, D. Pesch, and P. Xiao, “A survey on resource allocation in vehicular networks,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 2, pp. 701–721, 2020.
[16] A. Nair and S. Tanwar, “Resource allocation in V2X communication: State-of-the-art and research challenges,” Physical Communication, vol. 64, p. 102351, 2024.
[17] P. Rajalakshmi et al., “Towards 6G V2X sidelink: survey of resource allocation—mathematical formulations, challenges, and proposed solutions,” IEEE Open Journal of Vehicular Technology, vol. 5, pp. 344–383, 2024.
[18] D. Wang, W. Wang, Y. Kang, and Z. Han, “Dynamic data offloading for massive users in ultra-dense LEO satellite networks based on Stackelberg mean field game,” in IEEE INFOCOM 2022-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). IEEE, 2022, pp. 1–6.
[19] Y. Kang, Y. Zhu, D. Wang, Z. Han, and T. Başar, “Joint server selection and handover design for satellite-based federated learning using mean-field evolutionary approach,” IEEE Transactions on Network Science and Engineering, vol. 11, no. 2, pp. 1655–1667, 2023.
[20] Y. Chen, Y. Yang, J. Hu, Y. Wu, and J. Huang, “A game-theoretical approach for distributed computation offloading in LEO satellite-terrestrial edge computing systems,” IEEE Transactions on Mobile Computing, vol. 24, no. 5, pp. 4389–4402, 2025.
[21] Y. Chen, J. Zhao, Y. Wu, J. Huang, and X. S. Shen, “Multi-user task offloading in UAV-assisted LEO satellite edge computing: A game-theoretic approach,” IEEE Transactions on Mobile Computing, vol. 24, no. 1, pp. 363–378, 2024.
[22] Q. Tang, Z. Fei, B. Li, and Z. Han, “Computation offloading in LEO satellite networks with hybrid cloud and edge computing,” IEEE Internet of Things Journal, vol. 8, no. 11, pp. 9164–9176, 2021.
[23] W. Jiang, H. Han, M. He, and W. Gu, “When game theory meets satellite communication networks: A survey,” Computer Communications, vol. 217, pp. 208–229, 2024.
[24] L.-H. Shen and J.-J. Huang, “Multi-functional RIS-enabled in SAGIN for IoT: A hybrid deep reinforcement learning approach with compressed twin-models,” IEEE Internet of Things Journal, 2025.
[25] D. Wang, C. Huang, J. He, X. Chen, W. Wang, Z. Zhang, Z. Han, and M. Debbah, “Mean field game-based waveform precoding design for mobile crowd integrated sensing, communication, and computation systems,” IEEE Transactions on Wireless Communications, vol. 23, no. 8, pp. 10430–10444, 2024.
[26] Y. Xu, X. Wu, Y. Tang, J. Shang, L. Zheng, and L. Zhao, “Joint resource allocation for V2X communications with multi-type mean-field reinforcement learning,” IEEE Transactions on Intelligent Transportation Systems, 2024.
[27] Y. Xu, L. Zheng, X. Wu, Y. Tang, W. Liu, and D. Sun, “Joint resource allocation for UAV-assisted V2X communication with mean field multi-agent reinforcement learning,” IEEE Transactions on Vehicular Technology, vol. 74, no. 1, pp. 1209–1223, 2024.
[28] R. Carmona, F. Delarue et al., Probabilistic Theory of Mean Field Games with Applications I-II. Springer, 2018, vol. 3.
[29] H. Pham and X. Warin, “Mean-field neural networks: learning mappings on Wasserstein space,” Neural Networks, vol. 168, pp. 380–393, 2023.
[30] P. Cardaliaguet, A. R. Mészáros, and F. Santambrogio, “First order mean field games with density constraints: pressure equals price,” SIAM Journal on Control and Optimization, vol. 54, no. 5, pp. 2672–2709, 2016.
[31] P. E. Caines and M. Huang, “Graphon mean field games and the GMFG equations: ε-Nash equilibria,” in 2019 IEEE 58th Conference on Decision and Control (CDC). IEEE, 2019, pp. 286–292.
[32] Y. Hu, X. Wei, J. Yan, and H. Zhang, “Graphon mean-field control for cooperative multi-agent reinforcement learning,” Journal of the Franklin Institute, vol. 360, no. 18, pp. 14783–14805, 2023.
[33] C. Fabian, K. Cui, and H. Koeppl, “Learning sparse graphon mean field games,” in International Conference on Artificial Intelligence and Statistics. PMLR, 2023, pp. 4486–4514.
[34] J. Weed and F. Bach, “Sharp asymptotic and finite-sample rates of convergence of empirical measures in Wasserstein distance,” Bernoulli, vol. 25, no. 4A, pp. 2620–2648, 2019.

APPENDIX A. Proof of Lemma 1
We provide the complete proof of Lemma 1. Under Assumption 1, for any two types k, k′ ∈ [K], let (X_0^{(k)}, X_0^{(k′)}) be an optimal coupling achieving W_2...