Learning to Coordinate over Networks with Bounded Rationality
Pith reviewed 2026-05-10 18:20 UTC · model grok-4.3
The pith
Networks with uniform degree maximize the stationary probability of perfect coordination for bounded-rational agents.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For a large class of networks the partition function of the Gibbs measure induced by logit-response dynamics is well approximated by the moment-generating function of a Gaussian random variable. This approximation is used to optimize degree distributions, proving that the network maximizing the stationary probability of coordinated action profiles is K-regular.
What carries the argument
The Gaussian moment-generating-function approximation to the partition function of the Gibbs measure on the network, which reduces the coordination probability to a function of degree distribution alone.
If this is right
- The stationary probability of perfect coordination increases monotonically with the rationality parameter β.
- For K-regular networks the stationary probability increases with the degree K.
- For irregular networks the stationary probability increases with the number of edges.
- An upper bound exists on the minimum rationality needed to reach a target coordination probability.
- The optimal network for coordination is always K-regular.
Where Pith is reading between the lines
- Degree heterogeneity reduces coordination reliability when rationality is bounded.
- The Gaussian approximation may be reusable for coordination problems in other potential games on graphs.
- Enforcing regular connectivity patterns would improve reliability in engineered systems such as sensor or robot networks.
Load-bearing premise
The Gaussian moment-generating-function approximation to the partition function holds with sufficient accuracy for the networks of interest.
What would settle it
For a small non-regular graph with the same number of edges as a regular graph, compute the exact partition function by enumeration and verify whether the regular graph yields a strictly higher value.
Figures
read the original abstract
Network coordination games are widely used to model collaboration among interconnected agents, with applications across diverse domains including economics, robotics, and cyber-security. We consider networks of bounded-rational agents who interact through binary stag hunt games, a canonical game theoretic model for distributed collaborative tasks. Herein, the agents update their actions using logit response functions, yielding the Log-Linear Learning (LLL) algorithm. While convergence of LLL to a risk-dominant Nash equilibrium requires unbounded rationality, we consider regimes in which rationality is strictly bounded. We first show that the stationary probability of states corresponding to perfect coordination is monotone increasing in the rationality parameter $\beta$. For $K$-regular networks, we prove that the stationary probability of a perfectly coordinated action profile is monotone in the connectivity degree $K$, and we provide an upper bound on the minimum rationality required to achieve a desired level of coordination. For irregular networks, we show that the stationary probability of perfectly coordinated action profiles increases with the number of edges in the graph. We show that, for a large class of networks, the partition function of the Gibbs measure is well approximated by the moment generating function of Gaussian random variable. This approximation allows us to optimize degree distributions and establishes that the optimal network - i.e., the one that maximizes the stationary probability of coordinated action profiles - is $K$-regular. Consequently, our results indicate that networks of uniformly bounded-rational agents achieve the most reliable coordination when connectivity is evenly distributed among agents.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper examines coordination in networks of bounded-rational agents playing binary stag-hunt games via log-linear learning (logit-response dynamics). It proves that the stationary probability of perfect coordination under the induced reversible Markov chain is monotone increasing in the rationality parameter β. For K-regular networks it shows monotonicity in K and supplies an upper bound on the minimal β needed for a target coordination level; for irregular networks it shows monotonicity in edge count. The central technical contribution is a Gaussian moment-generating-function approximation to the Gibbs partition function that is then used to compare degree distributions and conclude that the K-regular graph maximizes the stationary mass on coordinated profiles.
Significance. If the Gaussian approximation is accurate enough to preserve comparative rankings, the results supply a concrete network-design principle: uniform degree distribution improves coordination reliability for bounded-rational agents. The monotonicity statements rest on standard properties of reversible Markov chains and logit dynamics, which is a strength; the approximation step, however, is the load-bearing element for the optimality claim.
major comments (2)
- [Abstract / Gaussian-approximation section] Abstract and the section introducing the Gaussian approximation: the claim that the partition function Z of the Gibbs measure is 'well approximated' by the MGF of a Gaussian whose parameters depend on the degree sequence is used to establish that K-regular networks are optimal. No explicit remainder term, concentration inequality, or numerical validation of the approximation error for the binary stag-hunt potential is supplied. Because the cross-distribution comparison that yields global optimality rests on the approximation being both accurate and order-preserving, the absence of error bounds leaves open the possibility that approximation errors on irregular graphs reverse the claimed ranking.
- [K-regular networks results] The upper bound on the minimum rationality parameter β required for a desired coordination level in K-regular networks (stated in the abstract) is presented without derivation details or tightness analysis in the provided text. If this bound is obtained via the same Gaussian approximation, its validity inherits the same error-bound gap identified above.
minor comments (1)
- [Preliminaries] Notation for the stag-hunt payoff matrix and the logit-response update rule should be introduced with explicit parameter values (e.g., the risk-dominant threshold) to make the monotonicity proofs self-contained.
Simulated Author's Rebuttal
We thank the referee for the thorough review and constructive feedback on our manuscript. We appreciate the identification of areas where additional rigor and validation are needed, particularly concerning the Gaussian approximation and the derived bounds. We respond to each major comment below and outline the revisions we will make.
read point-by-point responses
-
Referee: [Abstract / Gaussian-approximation section] Abstract and the section introducing the Gaussian approximation: the claim that the partition function Z of the Gibbs measure is 'well approximated' by the MGF of a Gaussian whose parameters depend on the degree sequence is used to establish that K-regular networks are optimal. No explicit remainder term, concentration inequality, or numerical validation of the approximation error for the binary stag-hunt potential is supplied. Because the cross-distribution comparison that yields global optimality rests on the approximation being both accurate and order-preserving, the absence of error bounds leaves open the possibility that approximation errors on irregular graphs reverse the claimed ranking.
Authors: We acknowledge the validity of this concern. The manuscript presents the Gaussian approximation without providing a rigorous error bound or numerical evidence specific to the stag-hunt game. The approximation arises from applying the central limit theorem to the linear combination of actions weighted by degrees in the potential function. To strengthen this part, we will include numerical experiments validating the approximation accuracy for both regular and irregular networks of varying sizes. Furthermore, we will analyze whether the approximation error is sufficiently uniform across degree distributions to preserve the optimality ranking of regular graphs. If necessary, we will qualify the optimality result to hold under the approximation or for large networks where the CLT applies strongly. revision: yes
-
Referee: [K-regular networks results] The upper bound on the minimum rationality parameter β required for a desired coordination level in K-regular networks (stated in the abstract) is presented without derivation details or tightness analysis in the provided text. If this bound is obtained via the same Gaussian approximation, its validity inherits the same error-bound gap identified above.
Authors: The upper bound is derived by substituting the Gaussian approximation into the expression for the stationary probability and solving for β. We agree that without details, it is difficult to assess. In the revision, we will expand the relevant section to include the full derivation steps, starting from the approximated partition function to the explicit bound on β. We will also provide a tightness analysis, perhaps by showing the ratio of the bound to the true value in small cases, and discuss its conservatism. revision: yes
Circularity Check
No circularity: derivations rely on standard reversible Markov chain properties and an independent Gaussian approximation
full rationale
The paper establishes monotonicity of the stationary coordination probability in β and in K (for regular graphs) or edge count (for irregular graphs) directly from the reversibility of the logit-response Markov chain and the explicit form of the Gibbs measure; these steps invoke only textbook properties of detailed balance and do not reduce to any fitted quantity or self-citation. The Gaussian MGF approximation to the partition function is introduced as a separate technical tool whose validity is asserted for a large class of networks; it is not defined in terms of the target optimality statement, nor is any parameter of the approximation fitted to the coordination probability itself. Consequently the claim that K-regular graphs maximize the stationary mass follows from comparing the approximated expressions across degree sequences rather than from any definitional identity or self-referential loop. No load-bearing step collapses to its own input by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Agents update actions according to logit response functions with rationality parameter beta
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
partition function ZA(β) = 2^N E_S [e^{β Φ̃_A(S)}] approximated by MGF of Gaussian whose variance depends on degree sequence; regular graphs minimize σ²_A via Schur-convexity
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
no reference to recognition cost, golden-ratio identities, or 8-tick periodicity
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Skyrms,The Stag Hunt and the Evolution of Social Structure
B. Skyrms,The Stag Hunt and the Evolution of Social Structure. Cambridge: Cambridge University Press, 2004
work page 2004
-
[2]
A behavioral model of rational choice,
H. A. Simon, “A behavioral model of rational choice,”The Quarterly Journal of Economics, vol. 69, no. 1, pp. 99–118, 1955
work page 1955
-
[3]
M. O. Jackson,Social and Economic Networks: Models and Analysis. Princeton, NJ: Princeton University Press, 2008
work page 2008
-
[4]
An experimental study of prisoners’ dilemma and stag hunt games played by teams of players,
J. Kim and T. R. Palfrey, “An experimental study of prisoners’ dilemma and stag hunt games played by teams of players,”Games and Economic Behavior, 2025
work page 2025
-
[5]
Experience-weighted attraction learning in normal form games,
C. F. Camerer and T.-H. Ho, “Experience-weighted attraction learning in normal form games,”Econometrica, vol. 67, no. 4, pp. 827–874, 1999
work page 1999
-
[6]
C. F. Camerer,Behavioral Game Theory: Experiments in Strategic Interaction. Princeton, NJ: Princeton University Press, 2003
work page 2003
-
[7]
Game theory for autonomy: From min-max optimization to equilibrium and bounded rationality learning,
K. G. Vamvoudakis, F. Fotiadis, J. P. Hespanha, R. Chinchilla, G. Yang, M. Liu, J. S. Shamma, and L. Pavel, “Game theory for autonomy: From min-max optimization to equilibrium and bounded rationality learning,” in2023 American Control Conference (ACC), pp. 4363–4380, 2023
work page 2023
-
[8]
Bounded rationality in learning, perception, decision-making, and stochastic games,
P. Tsiotras and K. G. Vamvoudakis, “Bounded rationality in learning, perception, decision-making, and stochastic games,” inHandbook of Reinforcement Learning and Control(K. G. Vamvoudakis, Y . Wan, F. L. Lewis, and D. Cansever, eds.), vol. 325 ofStudies in Systems, Decision and Control, Cham: Springer, 2021
work page 2021
-
[9]
Graphical models for game theory,
M. Kearns, M. L. Littman, and S. Singh, “Graphical models for game theory,” inProceedings of the 17th Conference on Uncertainty in Artificial Intelligence (UAI 2001), (San Francisco, CA), pp. 253–260, Morgan Kaufmann Publishers Inc., 2001
work page 2001
-
[10]
S. M. Kakade, M. Kearns, and L. E. Ortiz, “Graphical economics,” in Proceedings of the 17th Annual Conference on Learning Theory (COLT 2004)(J. Shawe-Taylor and Y . Singer, eds.), vol. 3120 ofLecture Notes in Computer Science, (Berlin, Heidelberg), pp. 17–32, Springer, 2004
work page 2004
-
[11]
M. O. Jackson and Y . Zenou, “Games on networks,” inHandbook of Game Theory with Economic Applications, vol. 4, pp. 95–163, Elsevier, 2015
work page 2015
-
[12]
S. Morris, “Contagion,”The Review of Economic Studies, vol. 67, no. 1, pp. 57–78, 2000
work page 2000
-
[13]
The spread of innovations in social networks,
A. Montanari and A. Saberi, “The spread of innovations in social networks,”Proceedings of the National Academy of Sciences, vol. 107, no. 47, pp. 20196–20201, 2010
work page 2010
-
[14]
H. P. Young, “The evolution of conventions,”Econometrica: Journal of the Econometric Society, pp. 57–84, 1993
work page 1993
-
[15]
A risk-security tradeoff in graphical coordination games,
K. Paarporn, M. Alizadeh, and J. R. Marden, “A risk-security tradeoff in graphical coordination games,”IEEE Transactions on Automatic Control, vol. 66, no. 5, pp. 1973–1985, 2020
work page 1973
-
[16]
The impact of complex and informed adversarial behavior in graphical coordination games,
K. Paarporn, B. Canty, P. N. Brown, M. Alizadeh, and J. R. Marden, “The impact of complex and informed adversarial behavior in graphical coordination games,”IEEE Transactions on Control of Network Systems, vol. 8, no. 1, pp. 200–211, 2020
work page 2020
-
[17]
Robust coordination of linear threshold dynamics on directed weighted networks,
L. Arditti, G. Como, F. Fagnani, and M. Vanelli, “Robust coordination of linear threshold dynamics on directed weighted networks,”IEEE Transactions on Automatic Control, pp. 1–15, 2024
work page 2024
-
[18]
Prospect theory: An analysis of decision under risk,
D. Kahneman and A. Tversky, “Prospect theory: An analysis of decision under risk,”Econometrica, vol. 47, no. 2, pp. 263–292, 1979. 16
work page 1979
-
[19]
Prospect theory for continuous distribu- tions,
M. O. Rieger and M. Wang, “Prospect theory for continuous distribu- tions,”Journal of Risk and Uncertainty, vol. 36, no. 1, pp. 83–102, 2008
work page 2008
-
[20]
Effects of subjective biases on strategic information transmission,
V . S. S. Nadendla, C. Langbort, and T. Bas ¸ar, “Effects of subjective biases on strategic information transmission,”IEEE Transactions on Communications, vol. 66, no. 12, pp. 6040–6049, 2018
work page 2018
-
[21]
Unraveling in guessing games: An experimental study,
R. Nagel, “Unraveling in guessing games: An experimental study,”The American Economic Review, vol. 85, no. 5, pp. 1313–1326, 1995
work page 1995
-
[22]
A cognitive hierarchy model of games,
C. F. Camerer, T.-H. Ho, and J.-K. Chong, “A cognitive hierarchy model of games,”The Quarterly Journal of Economics, vol. 119, no. 3, pp. 861–898, 2004
work page 2004
-
[23]
Bounded rational dubins vehicle coordination for target tracking using reinforcement learning,
N.-M. T. Kokolakis and K. G. Vamvoudakis, “Bounded rational dubins vehicle coordination for target tracking using reinforcement learning,” Automatica, vol. 149, p. 110732, 2023
work page 2023
-
[24]
Risk sensitivity and theory of mind in human coordination,
P. L. Ferreira, F. C. Santos, and S. Pequito, “Risk sensitivity and theory of mind in human coordination,”PLOS Computational Biology, vol. 17, p. e1009167, July 2021
work page 2021
-
[25]
W. Yoshida, R. J. Dolan, and K. J. Friston, “Game theory of mind,” PLoS Computational Biology, vol. 4, no. 12, p. e1000254, 2008
work page 2008
-
[26]
Quantal response equilibria for normal form games,
R. D. McKelvey and T. R. Palfrey, “Quantal response equilibria for normal form games,”Games and Economic Behavior, vol. 10, no. 1, pp. 6–38, 1995
work page 1995
-
[27]
J. K. Goeree, C. A. Holt, and T. R. Palfrey,Quantal Response Equilib- rium: A Stochastic Theory of Games. Princeton University Press, 2016
work page 2016
-
[28]
On the uniqueness of quantal response equilibria and its application to network games,
E. Melo, “On the uniqueness of quantal response equilibria and its application to network games,”Economic Theory, vol. 74, no. 3, pp. 681–725, 2022
work page 2022
-
[29]
Conditional logit analysis of qualitative choice behav- ior,
D. McFadden, “Conditional logit analysis of qualitative choice behav- ior,”Frontiers in Econometrics, pp. 105–142, 1973
work page 1973
-
[30]
Revisiting log-linear learning: Asyn- chrony, completeness and payoff-based implementation,
J. R. Marden and J. S. Shamma, “Revisiting log-linear learning: Asyn- chrony, completeness and payoff-based implementation,”Games and Economic Behavior, vol. 75, no. 2, pp. 788–808, 2012
work page 2012
-
[31]
The statistical mechanics of strategic interaction,
L. E. Blume, “The statistical mechanics of strategic interaction,”Games and economic behavior, vol. 5, no. 3, pp. 387–424, 1993
work page 1993
-
[32]
The statistical mechanics of best-response strategy re- vision,
L. E. Blume, “The statistical mechanics of best-response strategy re- vision,”Games and Economic Behavior, vol. 11, no. 2, pp. 111–145, 1995
work page 1995
-
[33]
C. Al ´os-Ferrer and N. Netzer, “The logit-response dynamics,”Games and Economic Behavior, vol. 68, no. 2, pp. 413–427, 2010
work page 2010
-
[34]
Robustness of learning in games with heterogeneous players,
A. S. Akbar, H. Jaleel, W. Abbas, and J. S. Shamma, “Robustness of learning in games with heterogeneous players,”IEEE Transactions on Automatic Control, vol. 68, no. 3, pp. 1553–1567, 2022
work page 2022
-
[35]
Finite-time convergence to anϵ-efficient nash equilibrium in potential games,
A. Maddux, R. Ouhamma, H. Catic, and M. Kamgarpour, “Finite-time convergence to anϵ-efficient nash equilibrium in potential games,”IEEE Transactions on Control of Network Systems, pp. 1–12, 2026
work page 2026
-
[36]
Optimization by simulated annealing,
S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,”Science, vol. 220, pp. 671–680, May 1983
work page 1983
-
[37]
Rationality and connectivity in stochastic learning for networked coordination games,
Y . Zhang and M. M. Vasconcelos, “Rationality and connectivity in stochastic learning for networked coordination games,” in2024 Ameri- can Control Conference (ACC), pp. 1622–1627, IEEE, 2024
work page 2024
-
[38]
On the role of network structure in learning to coordinate with bounded rationality,
Y . Zhang and M. M. Vasconcelos, “On the role of network structure in learning to coordinate with bounded rationality,” in2024 IEEE 63rd Conference on Decision and Control (CDC), pp. 1684–1689, IEEE, 2024
work page 2024
-
[39]
D. Fudenberg and J. Tirole,Game Theory. Cambridge, MA: MIT Press, 1991
work page 1991
-
[40]
D. Monderer and L. S. Shapley, “Potential games,”Games and economic behavior, vol. 14, no. 1, pp. 124–143, 1996
work page 1996
-
[41]
Bullo,Lectures on Network Systems
F. Bullo,Lectures on Network Systems. Kindle Direct Publishing, 1.7 ed., 2024
work page 2024
-
[42]
The speed of innovation diffusion in social networks,
I. Arieli, Y . Babichenko, R. Peretz, and H. P. Young, “The speed of innovation diffusion in social networks,”Econometrica, vol. 88, no. 2, pp. 569–594, 2020
work page 2020
-
[43]
Econometric analysis of qualitative response models,
D. McFadden, “Econometric analysis of qualitative response models,” inHandbook of Econometrics(Z. Griliches and M. D. Intriligator, eds.), vol. 2, pp. 1395–1457, Elsevier, 1984
work page 1984
-
[44]
K. E. Train,Discrete choice methods with simulation. Cambridge university press, 2009
work page 2009
-
[45]
Inde- pendence of irrelevant decisions in stochastic choice,
F. Sandomirskiy, P. H. Sung, O. Tamuz, and B. Wincelberg, “Inde- pendence of irrelevant decisions in stochastic choice,”arXiv preprint arXiv:2312.04827, 2023
work page internal anchor Pith review arXiv 2023
-
[46]
On the origin of the Boltzmann distribution,
F. Sandomirskiy and O. Tamuz, “On the origin of the Boltzmann distribution,”Mathematische Annalen, vol. 392, no. 4, pp. 5617–5638, 2025
work page 2025
-
[47]
Rational inattention to discrete choices: A new foundation for the multinomial logit model,
F. Mat ˇejka and A. McKay, “Rational inattention to discrete choices: A new foundation for the multinomial logit model,”American Economic Review, vol. 105, no. 1, pp. 272–298, 2015
work page 2015
-
[48]
M. E. J. Newman,Networks. Oxford: Oxford University Press, 2nd ed., 2018
work page 2018
- [49]
-
[50]
A. Leonidov, A. Savvateev, and A. G. Semenov, “Ising game on graphs,” Chaos, Solitons & Fractals, vol. 180, p. 114497, 2024
work page 2024
-
[51]
M. M ´ezard and A. Montanari,Information, Physics, and Computation. Oxford Graduate Texts, Oxford: Oxford University Press, 2009
work page 2009
-
[52]
A. W. Marshall, I. Olkin, and B. C. Arnold,Inequalities: Theory of Majorization and Its Applications. New York: Springer, 2nd ed., 2011
work page 2011
-
[53]
R. A. Horn and C. R. Johnson,Matrix Analysis. Cambridge: Cambridge University Press, 2nd ed., 2012
work page 2012
-
[54]
P. Hall and C. C. Heyde,Martingale Limit Theory and Its Application. New York: Academic Press, 1980
work page 1980
-
[55]
C ¸ ınlar,Probability and Stochastics, vol
E. C ¸ ınlar,Probability and Stochastics, vol. 261 ofGraduate Texts in Mathematics. New York: Springer, 2011
work page 2011
-
[56]
Network dynamics of social influence in the wisdom of crowds,
J. Becker, D. Brackbill, and D. Centola, “Network dynamics of social influence in the wisdom of crowds,”Proceedings of the national academy of sciences, vol. 114, no. 26, pp. E5070–E5076, 2017
work page 2017
-
[57]
Analysis and interventions in large network games,
F. Parise and A. Ozdaglar, “Analysis and interventions in large network games,”Annual Review of Control, Robotics, and Autonomous Systems, vol. 4, pp. 455–486, 2021
work page 2021
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.