NetworkGames: Simulating Cooperation in Network Games with Personality-driven LLM Agents
Pith reviewed 2026-05-17 04:52 UTC · model grok-4.3
The pith
Cooperative outcomes in network games depend on connectivity and personality placement, not dyadic interactions alone.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In a population of LLM agents with MBTI personalities playing iterated prisoner's dilemma on networks, macro-level cooperative outcomes are co-determined by the network's connectivity and the spatial distribution of personalities rather than being predictable from the dyadic interaction matrix alone.
What carries the argument
Message-passing process on graphs where LLM policies govern actions between heterogeneous agents situated in structures such as small-world or scale-free networks.
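The message-passing framing can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `policy` function is a hypothetical stand-in for an LLM call, the `BASE_COOP` probabilities are invented placeholders, and the choice of one action per neighbor per round is an assumption.

```python
import random

# Hypothetical stand-in for an LLM policy: each personality label maps to a
# base probability of cooperating. These numbers are placeholders, not results.
BASE_COOP = {"ENFJ": 0.8, "INTJ": 0.4}

def policy(personality, messages, rng):
    """Return 'C' or 'D' given the neighbor's past actions (the 'messages')."""
    p = BASE_COOP[personality]
    if messages:
        # Simple reciprocity: shift toward the neighbor's most recent move.
        p = min(1.0, p + 0.1) if messages[-1] == "C" else max(0.0, p - 0.2)
    return "C" if rng.random() < p else "D"

def play_round(adjacency, personalities, history, rng):
    """One message-passing step: every node picks an action for each neighbor."""
    actions = {}
    for node, neighbors in adjacency.items():
        for nb in neighbors:
            # Message from nb to node: what nb played against node so far.
            incoming = history.get((nb, node), [])
            actions[(node, nb)] = policy(personalities[node], incoming, rng)
    # Update history only after all nodes have acted (simultaneous moves).
    for edge, act in actions.items():
        history.setdefault(edge, []).append(act)
    return actions

def cooperation_rate(actions):
    """Fraction of directed interactions in which the actor cooperated."""
    return sum(a == "C" for a in actions.values()) / len(actions)
```

Calling `play_round` repeatedly with a shared `history` dict and a seeded `random.Random` gives a reproducible iterated game on any adjacency structure, which is the property the referee's request for seed averaging depends on.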
If this is right
- Small-world networks reduce collective cooperation relative to other topologies.
- Placing pro-social personalities at hub positions in scale-free networks substantially raises overall cooperation.
- Baseline dyadic interaction matrices between all 16 personality pairs fail to predict group outcomes.
- The co-determination pattern holds across multiple LLM architectures and scaled network sizes.
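The hub-placement condition can be made concrete with a short sketch. It assumes a Barabási-Albert style preferential-attachment graph as the scale-free topology and uses hypothetical labels for the pro-social and other personality; the source does not say which MBTI types count as pro-social or how the paper generates its graphs.

```python
import random

def barabasi_albert(n, m, rng):
    """Grow a graph by preferential attachment; returns node -> set of neighbors."""
    adjacency = {i: set() for i in range(n)}
    # Start from a fully connected core of m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adjacency[i].add(j)
            adjacency[j].add(i)
    # One list entry per edge endpoint, so uniform sampling is degree-proportional.
    endpoints = [i for i in range(m + 1) for _ in adjacency[i]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            adjacency[new].add(t)
            adjacency[t].add(new)
            endpoints.extend([new, t])
    return adjacency

def place_personalities(adjacency, prosocial, other, n_hubs, hubs_prosocial=True):
    """Give `prosocial` to the n_hubs highest-degree nodes (or lowest, if False)."""
    ranked = sorted(adjacency, key=lambda v: len(adjacency[v]), reverse=hubs_prosocial)
    chosen = set(ranked[:n_hubs])
    return {v: (prosocial if v in chosen else other) for v in adjacency}
```

Running the same simulation with `hubs_prosocial=True` and `hubs_prosocial=False` holds the personality mix fixed and varies only placement, which is the comparison the hub claim rests on.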
Where Pith is reading between the lines
- Platform designers could raise group cooperation by engineering connection patterns or seeding certain user types in central positions.
- Similar topology-plus-placement effects might appear in human networks if personality distributions align with network roles.
- The approach could be tested on dynamic networks that evolve over time or on other repeated games.
Load-bearing premise
LLM agents given MBTI labels produce cooperation behavior representative enough of humans to support claims about real social networks.
What would settle it
The claim would be falsified if human subjects playing the same iterated prisoner's dilemma on identical small-world and scale-free networks, with personality labels assigned, showed collective cooperation levels that match dyadic predictions regardless of topology or placement.
Abstract
While Large Language Models (LLMs) have been extensively tested in dyadic game-theoretic scenarios, their collective behavior within complex network games remains surprisingly unexplored. To bridge this gap, we present NetworkGames, a framework connecting Generative Agents and Geometric Deep Learning. By formalizing social simulation as a message-passing process governed by LLM policies, we investigate how node heterogeneity (MBTI personalities) and network topology co-determine collective welfare. We instantiate a population of LLM agents, each endowed with a distinct personality from the MBTI taxonomy, and situate them in various network structures (e.g., small-world and scale-free). Through extensive simulations of the Iterated Prisoner's Dilemma, we first establish a baseline dyadic interaction matrix, revealing nuanced cooperative preferences between all 16 personality pairs. We then demonstrate that macro-level cooperative outcomes are not predictable from dyadic interactions alone; they are co-determined by the network's connectivity and the spatial distribution of personalities. For instance, we find that small-world networks are detrimental to cooperation, while strategically placing pro-social personalities in hub positions within scale-free networks can significantly promote cooperative behavior. We validate the robustness of these findings through extensive stress tests across multiple LLM architectures, scaled network sizes, varying random seeds, and comprehensive ablation studies. Our findings offer significant implications for designing healthier online social environments and forecasting collective behavior. We open-source our framework to facilitate research into the social physics of AI societies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces NetworkGames, a framework for simulating Iterated Prisoner's Dilemma games on networks using LLM agents assigned MBTI personality types. It first constructs a 16x16 dyadic cooperation matrix from pairwise LLM interactions, then runs network simulations on small-world and scale-free topologies to show that macro-level cooperation rates are co-determined by network connectivity and the spatial placement of personality types rather than being fully predictable from the dyadic matrix alone. Examples include detrimental effects of small-world structure and benefits from placing pro-social types at hubs in scale-free networks. The work reports stress tests across LLM architectures, network sizes, random seeds, and ablations, and releases the code.
Significance. If the simulation results prove robust, the paper would advance the study of heterogeneous agent behavior in networked social dilemmas by demonstrating the joint role of topology and individual traits. The open-sourced framework and multi-LLM validation are concrete strengths that support reproducibility. However, the broader significance for real social networks and collective welfare forecasting is limited by the absence of calibration to human data, reducing direct applicability to the claimed implications for online environments.
major comments (2)
- [Abstract and stress tests section] Abstract and the section describing stress tests and ablation studies: the claim of 'extensive stress tests' and robustness is load-bearing for the central assertion that macro outcomes are not predictable from the dyadic matrix alone, yet no quantitative effect sizes, cooperation-rate differences with standard deviations, error bars, or explicit controls for LLM stochasticity (e.g., temperature, number of independent runs, or seed averaging) are reported. This prevents assessment of whether observed topology and placement effects exceed simulation noise.
- [Implications section] The section on implications and discussion: the extension to 'designing healthier online social environments' and 'forecasting collective behavior' rests on the assumption that MBTI-labeled LLM agents replicate human personality-driven cooperation patterns in iterated PD. No calibration against human experimental data on personality and network IPD is provided, which is load-bearing for any claim beyond pure simulation.
minor comments (2)
- [Dyadic matrix construction] The description of how the 16x16 dyadic matrix entries are computed (e.g., exact prompt template, number of iterations per pair, and cooperation metric) should be expanded for reproducibility, as this baseline underpins all network comparisons.
- [Figures] Figure captions for network visualizations and cooperation heatmaps could include explicit axis labels and legend definitions to improve clarity.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive comments on our manuscript. We address each of the major comments point by point below, and we are committed to improving the paper accordingly.
-
Referee: [Abstract and stress tests section] Abstract and the section describing stress tests and ablation studies: the claim of 'extensive stress tests' and robustness is load-bearing for the central assertion that macro outcomes are not predictable from the dyadic matrix alone, yet no quantitative effect sizes, cooperation-rate differences with standard deviations, error bars, or explicit controls for LLM stochasticity (e.g., temperature, number of independent runs, or seed averaging) are reported. This prevents assessment of whether observed topology and placement effects exceed simulation noise.
Authors: We acknowledge the validity of this observation. Although the manuscript describes performing stress tests across multiple LLM architectures, network sizes, random seeds, and ablation studies, it does not provide the quantitative details such as effect sizes, standard deviations, or error bars for the reported cooperation rates. We will revise the relevant sections to include these statistical measures, explicitly report the number of independent runs, temperature settings used for the LLMs, and any seed averaging procedures. This will strengthen the evidence that the topology and placement effects are robust and exceed simulation noise.
Revision: yes
-
Referee: [Implications section] The section on implications and discussion: the extension to 'designing healthier online social environments' and 'forecasting collective behavior' rests on the assumption that MBTI-labeled LLM agents replicate human personality-driven cooperation patterns in iterated PD. No calibration against human experimental data on personality and network IPD is provided, which is load-bearing for any claim beyond pure simulation.
Authors: We agree that direct applicability to human social networks requires calibration with human data, which is not provided in this work. Our study focuses on the behavior of LLM agents with assigned personality types in network games as a means to explore emergent collective behaviors in artificial agent systems. We will revise the implications and discussion section to more clearly delineate the scope as a simulation study of AI agents, note the absence of human calibration, and suggest that future work could involve such validation to extend implications to real-world online environments. This maintains the contribution while avoiding overstatement.
Revision: partial
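The statistics requested in the first major comment are straightforward to compute once per-seed cooperation rates are available. The sketch below is one possible reporting routine, assuming each condition yields a list of cooperation rates from independent runs. The function names are hypothetical, and Cohen's d with a pooled standard deviation is an assumed choice of effect size, not one the authors have committed to.

```python
import statistics

def summarize(rates):
    """Mean and sample standard deviation of per-seed cooperation rates."""
    return statistics.mean(rates), statistics.stdev(rates)

def cohens_d(a, b):
    """Standardized mean difference between two conditions (pooled SD)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_sd = (((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd

# Placeholder inputs for illustration only; these are NOT results from the paper.
hub_placement_runs = [0.6, 0.7, 0.8]
random_placement_runs = [0.4, 0.5, 0.6]
effect = cohens_d(hub_placement_runs, random_placement_runs)
```

Reporting the pair from `summarize` alongside `cohens_d` for each topology or placement contrast would let a reader judge whether a difference exceeds the run-to-run variation introduced by LLM sampling.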
Circularity Check
No circularity: macro outcomes shown via direct network simulations, not reductions to fitted inputs or self-citations
The paper constructs a dyadic interaction matrix directly from LLM queries on MBTI personality pairs, then runs explicit network simulations of the Iterated Prisoner's Dilemma on small-world and scale-free graphs with varying personality placements. The central demonstration—that collective cooperation is co-determined by topology and spatial distribution rather than dyadic matrix alone—follows from comparing simulation outputs across network conditions, with no equations, parameter fitting, or predictions that collapse back to the inputs by construction. No self-citation chains or uniqueness theorems are invoked as load-bearing; results are presented as empirical simulation outputs validated by ablation and robustness checks. The derivation is therefore self-contained as a comparative simulation study.
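One way to make "predictable from the dyadic matrix alone" concrete is to compute a dyadic-only baseline and compare it with the simulated outcome. The sketch below is an assumption about how such a comparison could be set up, not the paper's procedure: `dyadic[p][q]` is taken to be the cooperation rate of type `p` toward type `q` in isolated pairwise play, and the baseline is the average of those entries over the directed edges of the network.

```python
def dyadic_prediction(adjacency, personalities, dyadic):
    """Average pairwise cooperation rate over all directed edges of the network."""
    total, count = 0.0, 0
    for node, neighbors in adjacency.items():
        for nb in neighbors:
            total += dyadic[personalities[node]][personalities[nb]]
            count += 1
    return total / count

def emergent_gap(simulated_rate, adjacency, personalities, dyadic):
    """Simulated cooperation minus the dyadic-only baseline for the same network."""
    return simulated_rate - dyadic_prediction(adjacency, personalities, dyadic)
```

A gap near zero across topologies would mean the network adds nothing beyond pairwise behavior. A gap that changes sign or size with topology and placement is the pattern the paper reports.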
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: MBTI personality types induce consistent and distinguishable cooperative preferences in LLM agents across game iterations.
- domain assumption: Network topology and agent placement can be varied independently of the underlying LLM policy.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "macro-level cooperative outcomes are not predictable from dyadic interactions alone; they are co-determined by the network's connectivity and the spatial distribution of personalities"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
From Pre-trained Models to Large Language Models: A Comprehensive Survey of AI-Driven Psychological Computing
The paper introduces a new taxonomy that groups AI-driven psychological computing tasks by their underlying computational patterns into four categories and reviews over 300 works from the pre-trained model to LLM eras.