S³: Social-network Simulation System with Large Language Model-Empowered Agents
Pith reviewed 2026-05-17 11:24 UTC · model grok-4.3
The pith
LLM agents in the S3 system emulate human perception and actions to produce emergent social network phenomena like information and emotion propagation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The S3 system constructs an agent-based social network simulator in which each agent, powered by a large language model, perceives the informational environment and emulates genuine human actions through engineered and tuned prompts. By modeling emotion, attitude, and interaction behaviors together, the agents produce emergent population-level dynamics, including the spread of information, attitudes, and emotions across the network. In an evaluation against real-world social network data at two simulation levels, the authors report that these dynamics match observed patterns with promising accuracy.
What carries the argument
Prompt engineering and prompt tuning applied to LLM agents, which lets each agent perceive the informational environment and emulate human emotion, attitude, and interaction behaviors.
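As a rough illustration of what "perceiving the informational environment" could look like in code, the sketch below assembles a prompt from an agent's state and the posts in its feed, then parses a structured reply. Everything in it is a hypothetical stand-in: the `AgentState` fields, the prompt wording, the `key=value` reply format, and the `stub_llm` function are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class AgentState:
    """Hypothetical per-agent state; field names are illustrative only."""
    name: str
    emotion: str   # e.g. "calm", "angry"
    attitude: str  # e.g. "support", "oppose", "neutral"


def build_prompt(agent: AgentState, observed_posts: List[str]) -> str:
    """Assemble a prompt from the agent's state and the posts it perceives."""
    feed = "\n".join(f"- {post}" for post in observed_posts) or "- (no new posts)"
    return (
        f"You are {agent.name}, a social network user.\n"
        f"Current emotion: {agent.emotion}\n"
        f"Current attitude: {agent.attitude}\n"
        f"Posts in your feed:\n{feed}\n"
        "Reply with three lines: emotion=<...>, attitude=<...>, action=<...>"
    )


def parse_reply(reply: str) -> Dict[str, str]:
    """Parse 'key=value' lines; lines without '=' are ignored."""
    fields: Dict[str, str] = {}
    for line in reply.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def step(agent: AgentState, observed_posts: List[str],
         llm: Callable[[str], str]) -> str:
    """One perceive-then-act step; updates the agent and returns its action."""
    reply = parse_reply(llm(build_prompt(agent, observed_posts)))
    agent.emotion = reply.get("emotion", agent.emotion)
    agent.attitude = reply.get("attitude", agent.attitude)
    return reply.get("action", "ignore")


def stub_llm(prompt: str) -> str:
    """Placeholder for a real model call; always returns the same reply."""
    return "emotion=angry\nattitude=oppose\naction=repost"


agent = AgentState(name="user_1", emotion="calm", attitude="neutral")
action = step(agent, ["Breaking: new policy announced"], stub_llm)
print(action, agent.emotion, agent.attitude)
```

Passing the model in as a plain callable keeps the step logic testable without any network access; a real deployment would swap `stub_llm` for an actual model client.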
If this is right
- Social scientists gain a tool for state prediction and phenomena explanation in networks without large-scale surveys.
- Policy makers can test interventions by observing how the simulated spread of attitudes or emotions responds to changes in the informational environment.
- The same agent framework extends to simulation systems outside social science, such as economic or political networks.
- Two-level evaluation on real data provides a template for validating future LLM-based simulators against ground-truth traces.
Where Pith is reading between the lines
- Traditional rule-based agent models may be replaceable in many cases once LLM perception and action emulation reach this fidelity.
- The approach could be extended to test how changes in network structure, such as adding or removing ties, alter propagation speed.
- If the same prompt techniques work across different LLMs, the simulation cost could drop rapidly with newer model releases.
- Hybrid systems might combine S3-style agents with classical diffusion equations to handle very large networks.
Load-bearing premise
Prompt engineering and prompt tuning are enough to make LLM agents emulate real human behavior in social networks closely enough that the resulting population-level patterns are meaningful.
What would settle it
Run the S3 system on the same real-world social network dataset but disable the agents' ability to perceive the informational environment; if the propagation of information, attitudes, and emotions no longer emerges or matches real patterns, the central claim is falsified.
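The logic of this ablation can be sketched with a toy cascade in place of LLM agents. The code below is not S3: it is a minimal independent-cascade-style stand-in in which a `perception` flag controls whether agents can see their feed at all. The graph, the `share_prob` parameter, and the function name are hypothetical choices made for illustration.

```python
import random
from typing import Dict, List, Set


def simulate_spread(graph: Dict[int, List[int]], seeds: Set[int],
                    share_prob: float, perception: bool,
                    rounds: int, rng: random.Random) -> List[int]:
    """Return the number of informed agents after each round.

    graph maps each agent to the agents who see its posts (its followers).
    With perception disabled, agents never see their feed, so nothing spreads.
    """
    informed = set(seeds)
    newly_informed = set(seeds)
    history = [len(informed)]
    for _ in range(rounds):
        next_wave: Set[int] = set()
        for sender in sorted(newly_informed):
            for follower in graph.get(sender, []):
                if follower in informed or follower in next_wave:
                    continue
                # An agent can only react to posts it perceives.
                if perception and rng.random() < share_prob:
                    next_wave.add(follower)
        informed |= next_wave
        newly_informed = next_wave
        history.append(len(informed))
    return history


# A small hypothetical follower graph with six agents.
graph = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []}

with_perception = simulate_spread(graph, {0}, 1.0, True, 4, random.Random(0))
without_perception = simulate_spread(graph, {0}, 1.0, False, 4, random.Random(0))
print("perception on: ", with_perception)
print("perception off:", without_perception)
```

In this toy setting the outcome is trivial by construction: with perception off, the informed count never leaves the seed set. The interesting version of the test is the one described above, where the comparison is against real propagation traces rather than against zero spread.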
Original abstract
Social network simulation plays a crucial role in addressing various challenges within social science. It offers extensive applications such as state prediction, phenomena explanation, and policy-making support, among others. In this work, we harness the formidable human-like capabilities exhibited by large language models (LLMs) in sensing, reasoning, and behaving, and utilize these qualities to construct the S³ system (short for Social network Simulation System). Adhering to the widely employed agent-based simulation paradigm, we employ prompt engineering and prompt tuning techniques to ensure that the agent's behavior closely emulates that of a genuine human within the social network. Specifically, we simulate three pivotal aspects: emotion, attitude, and interaction behaviors. By endowing the agent in the system with the ability to perceive the informational environment and emulate human actions, we observe the emergence of population-level phenomena, including the propagation of information, attitudes, and emotions. We conduct an evaluation encompassing two levels of simulation, employing real-world social network data. Encouragingly, the results demonstrate promising accuracy. This work represents an initial step in the realm of social network simulation empowered by LLM-based agents. We anticipate that our endeavors will serve as a source of inspiration for the development of simulation systems within, but not limited to, social science.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces S³ (Social network Simulation System), which employs large language model-empowered agents to simulate social networks. Using prompt engineering and prompt tuning, agents are designed to perceive informational environments and emulate human behaviors across emotion, attitude, and interaction. The authors report observing emergent population-level phenomena such as the propagation of information, attitudes, and emotions, and claim promising accuracy when evaluated against real-world social network data at two simulation levels. This is positioned as an initial exploration of LLM-based agent simulations in social science.
Significance. If the micro-level agent behaviors can be shown to match human distributions rather than LLM artifacts, this approach could enable more flexible and interpretable simulations of complex social dynamics than traditional agent-based models that rely on hand-crafted rules or fitted parameters. The evaluation on real-world data is a positive step toward falsifiability, but the current lack of detailed validation limits the immediate impact.
major comments (2)
- [Evaluation] Evaluation section: the abstract states 'promising accuracy' on real-world data at two simulation levels, but provides no specific metrics, baselines, error bars, statistical tests, or details on how human-likeness was validated beyond aggregate match. This is load-bearing for the central claim that observed propagations reflect genuine social dynamics.
- [Agent Design] Agent design and prompt engineering sections: the reliance on prompt engineering to emulate human perception, reasoning, and action is not accompanied by robustness checks (e.g., ablation across prompt variants or alternative LLMs) or micro-level comparisons (e.g., response time distributions, emotional valence shifts, or interaction selectivity against human data). Without these, population emergence could arise from generic LLM tendencies rather than modeled social processes.
minor comments (2)
- [Abstract] The abstract would be clearer if it named the two simulation levels and the specific real-world datasets used.
- Notation for the three simulated aspects (emotion, attitude, interaction) should be introduced consistently when first defined.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. The comments highlight key areas where we can improve the presentation of our evaluation results and agent design choices. We address each major comment point by point below and indicate the planned revisions.
Point-by-point responses
- Referee: [Evaluation] Evaluation section: the abstract states 'promising accuracy' on real-world data at two simulation levels, but provides no specific metrics, baselines, error bars, statistical tests, or details on how human-likeness was validated beyond aggregate match. This is load-bearing for the central claim that observed propagations reflect genuine social dynamics.
  Authors: We agree that greater quantitative detail is needed to support the evaluation claims. The manuscript currently reports alignment between simulated and real-world data at the individual agent level (emotion and attitude updates) and the population level (propagation of information, attitudes, and emotions) using two real social network datasets. To address this concern, we will revise the abstract for precision and expand the evaluation section to include specific metrics (such as correlation coefficients and mean absolute errors for propagation trends), comparisons against baselines like random-walk or rule-based diffusion models, error bars from repeated simulation runs, and appropriate statistical tests for significance. These changes will more rigorously substantiate the observed phenomena.
  revision: yes
- Referee: [Agent Design] Agent design and prompt engineering sections: the reliance on prompt engineering to emulate human perception, reasoning, and action is not accompanied by robustness checks (e.g., ablation across prompt variants or alternative LLMs) or micro-level comparisons (e.g., response time distributions, emotional valence shifts, or interaction selectivity against human data). Without these, population emergence could arise from generic LLM tendencies rather than modeled social processes.
  Authors: This concern about potential LLM artifacts versus intentionally modeled social processes is well-taken. Our agent architecture uses distinct prompt modules grounded in social science concepts for environmental perception, attitude updating via influence mechanisms, and selective interaction. We will add robustness checks in the revision, including ablations that isolate prompt components (e.g., emotion versus attitude modules) and results across alternative LLMs. For micro-level comparisons such as response time distributions or interaction selectivity, the present work emphasizes aggregate emergence; obtaining fine-grained human distributional data for direct matching would require new empirical studies outside the scope of this initial exploration. We will add an explicit limitations discussion and future-work section addressing this gap.
  revision: partial
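The metrics named in the first response (correlation coefficients, mean absolute errors, and error bars from repeated runs) could be computed along the following lines. This is a minimal sketch using only the Python standard library; the function names and the example series are made up for illustration and do not come from the paper or its datasets.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(x: Sequence[float], y: Sequence[float]) -> float:
    """Mean absolute difference between two equal-length series."""
    if len(x) != len(y) or not x:
        raise ValueError("series must be non-empty and of equal length")
    return mean([abs(a - b) for a, b in zip(x, y)])


def summarize_runs(observed: Sequence[float],
                   runs: List[Sequence[float]]) -> Tuple[float, float, float, float]:
    """Mean and sample standard deviation of r and MAE over repeated runs."""
    rs = [pearson_r(run, observed) for run in runs]
    maes = [mean_absolute_error(run, observed) for run in runs]
    return mean(rs), stdev(rs), mean(maes), stdev(maes)


# Made-up propagation trend (e.g. informed users per time step) and two runs.
observed = [1.0, 2.0, 3.0, 4.0]
runs = [[1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]]

r_mean, r_std, mae_mean, mae_std = summarize_runs(observed, runs)
print(f"r = {r_mean:.3f} +/- {r_std:.3f}, MAE = {mae_mean:.3f} +/- {mae_std:.3f}")
```

The second made-up run shows why both metrics are worth reporting: a trend shifted by a constant offset still correlates perfectly with the observed series while carrying a nonzero absolute error.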
Circularity Check
No circularity: simulation outputs validated on external real-world data
full rationale
The paper describes an LLM-agent simulation system built via prompt engineering and tuning to emulate individual human behaviors in social networks, then reports population-level emergence of information/attitude/emotion propagation. These outputs are compared directly to independent real-world social network datasets at two simulation levels, with no equations, fitted parameters, or self-referential definitions that reduce the claimed accuracy or emergence to the inputs by construction. The methodology is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: LLMs possess human-like capabilities in sensing, reasoning, and behaving that can be elicited via prompt engineering and tuning.
Forward citations
Cited by 17 Pith papers
- AgentSocialBench: Evaluating Privacy Risks in Human-Centered Agentic Social Networks
  AgentSocialBench demonstrates that privacy preservation is fundamentally harder in human-centered agentic social networks than in single-agent cases due to cross-domain coordination pressures and an abstraction parado...
- Mechanism Plausibility in Generative Agent-Based Modeling
  Introduces the Mechanism Plausibility Scale to distinguish generative sufficiency from mechanistic plausibility in LLM-based agent-based models.
- Graph World Models: Concepts, Taxonomy, and Future Directions
  The paper unifies emerging graph-based world models under a new paradigm and proposes a taxonomy organized by spatial, physical, and logical relational inductive biases.
- Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond
  Proposes a levels x laws taxonomy for world models in AI agents, defining L1-L3 capabilities across physical, digital, social, and scientific regimes while reviewing over 400 works to outline a roadmap for advanced ag...
- IntervenSim: Intervention-Aware Social Network Simulation for Opinion Dynamics
  IntervenSim is an intervention-aware social network simulation that couples source interventions with crowd interactions in a feedback loop, improving MAPE by 41.6% and DTW by 66.9% over prior static frameworks on rea...
- AgentMark: Utility-Preserving Behavioral Watermarking for Agents
  AgentMark watermarks agent planning behaviors with multi-bit identifiers via conditional sampling that preserves utility and works on black-box systems.
- Assessing Capabilities of Large Language Models in Social Media Analytics: A Multi-task Quest
  LLMs show mixed results on authorship verification, post generation, and attribute inference from Twitter data, with new frameworks and user studies establishing benchmarks for these analytics tasks.
- Topology-Aware LLM-Driven Social Simulation: A Unified Framework for Efficient and Realistic Agent Dynamics
  TopoSim integrates network topology into LLM agent simulations via backbone units and heterogeneous influence to cut token use 50-90% while improving fidelity to real-world structures.
- SOCIA-EVO: Automated Simulator Construction via Dual-Anchored Bi-Level Optimization
  SOCIA-EVO generates statistically consistent simulators by separating structural refinement from parameter calibration via bi-level optimization and falsifying strategies through execution feedback in a Bayesian-weigh...
- Beyond Individual Mimicry: Constructing Human-Like Social network with Graph-Augmented LLM Agents
  GraphMind equips LLM agents with graph awareness to construct human-like social networks, producing botnets that substantially degrade performance of both text-based and graph-based detectors.
- Overstating Attitudes, Ignoring Networks: LLM Biases in Simulating Misinformation Susceptibility
  LLM simulations of misinformation susceptibility overstate attitudinal associations and largely ignore personal network characteristics compared to human survey data.
- Cognitive Architectures for Language Agents
  CoALA is a modular cognitive architecture for language agents that organizes memory components, action spaces for internal and external interaction, and a generalized decision-making loop to support more systematic de...
- A Survey on Large Language Model based Autonomous Agents
  A survey of LLM-based autonomous agents that proposes a unified framework for their construction and reviews applications in social science, natural science, and engineering along with evaluation methods and future di...
- From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs
  The paper surveys human memory categories, maps them to LLM memory, and proposes a new three-dimension (object, form, time) categorization into eight quadrants to organize existing work and highlight open problems.
- Network Effects and Agreement Drift in LLM Debates
  LLM agents in controlled network debates show agreement drift toward specific opinion positions, requiring separation of structural effects from LLM biases before using them as human behavioral proxies.
- Large Language Model based Multi-Agents: A Survey of Progress and Challenges
  The paper surveys LLM-based multi-agent systems, covering simulated domains, agent profiling and communication, mechanisms for capacity growth, and common benchmarks.
- A Survey on the Memory Mechanism of Large Language Model based Agents
  A systematic review of memory designs, evaluation methods, applications, limitations, and future directions for LLM-based agents.