Recognition: no theorem link
Mechanism Plausibility in Generative Agent-Based Modeling
Pith reviewed 2026-05-14 19:21 UTC · model grok-4.3
The pith
A four-level scale separates whether LLM-based agent models reproduce social phenomena from whether they plausibly explain how those phenomena arise through mechanisms.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By combining recent LLM-ABM research with contemporary philosophy of science literature on mechanisms, the authors operationalize plausibility as a four-level scale. This scale separates the evaluation of a model's generative sufficiency, meaning its ability to reproduce a phenomenon, from its mechanistic plausibility, meaning how the phenomenon could be produced by related organized entities and activities. It also clarifies the distinct roles of predictive models versus explanatory models in agent-based simulations.
What carries the argument
The Mechanism Plausibility Scale, a four-level operationalization that evaluates generative sufficiency separately from mechanistic plausibility based on philosophy of science concepts of mechanisms.
If this is right
- Simulations can be classified according to whether they merely reproduce observed behaviors or demonstrate plausible mechanisms for producing them.
- Predictive models focus on capability to match data, while explanatory models require evidence of mechanistic pathways.
- Modelers gain a grounded framework to describe experiment characteristics and assess progress toward explanation.
- Evaluation of LLM-generated behaviors in social simulations becomes more structured by separating reproduction from explanation.
Where Pith is reading between the lines
- Researchers in other generative modeling domains could adapt the scale to assess mechanistic claims.
- The scale suggests new experimental designs that test for specific mechanisms in agent behaviors.
- Adoption might encourage more interdisciplinary work between computational modelers and philosophers of science.
Load-bearing premise
That concepts of mechanisms from philosophy of science can be directly turned into a practical four-level scale for assessing LLM agent behaviors without additional validation or adaptation to specific domains.
What would settle it
A study where independent raters apply the four-level scale to the same set of LLM-ABM papers and find that their ratings do not align on the mechanistic plausibility levels, or where models rated high on the scale fail to predict new mechanistic interventions.
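The inter-rater test described above reduces to a chance-corrected agreement statistic over the four levels. A minimal sketch, assuming Cohen's kappa as the agreement measure and two hypothetical raters (the ratings below are invented for illustration, not drawn from the paper):

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b, levels=(1, 2, 3, 4)):
    """Chance-corrected agreement between two raters on a discrete scale."""
    n = len(ratings_a)
    assert n == len(ratings_b) and n > 0
    # Observed agreement: fraction of items rated identically.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement if each rater assigned levels independently
    # according to their own marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum((freq_a[lv] / n) * (freq_b[lv] / n) for lv in levels)
    return (observed - expected) / (1 - expected)

# Hypothetical levels assigned to ten LLM-ABM papers by two independent raters.
rater_1 = [1, 2, 2, 3, 1, 4, 2, 3, 3, 1]
rater_2 = [1, 2, 3, 3, 1, 4, 2, 2, 3, 1]
kappa = cohens_kappa(rater_1, rater_2)  # ≈ 0.72 for these invented ratings
```

On the Landis and Koch benchmarks cited in the reference graph, values in 0.61–0.80 count as "substantial" agreement, so a result in that range would weigh against the disconfirmation scenario sketched above.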
Original abstract
Large language models (LLMs) can generate high-level diverse phenomena without explicitly programmed rules. This capability has led to their adoption within different agent-based models (ABMs) and social simulations. Recently, research has aim to test whether they are capable of generating different phenomena of interest, for example, human behavior on social media platforms or performance in game-theoretic scenarios. However, capability, prediction, and explanation are different -- drawing from the philosophy of science and mechanisms literature, \textit{explanation} requires showing, to some degree, how a phenomenon is produced by related organized entities and activities. For modelers, describing the characteristics of an experiment or whether a simulation provides progress in capability (or explanation), can be difficult without being grounded in potentially distant research areas. We integrate recent work on LLM-ABMs with contemporary philosophy of science literature and use it to operationalize a definition of `plausibility' in a four-level scale. Our scale separates the evaluation of a model's generative sufficiency (ability to reproduce a phenomenon) from its mechanistic plausibility (how the phenomenon could be produced), and clarifies the distinct roles of different models, such as predictive and explanatory ones. We introduce this as the Mechanism Plausibility Scale.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes integrating recent work on LLM-based agent-based models (ABMs) with philosophy-of-science literature on mechanisms to operationalize a four-level Mechanism Plausibility Scale. The scale is intended to separate evaluation of generative sufficiency (ability to reproduce a target phenomenon) from mechanistic plausibility (how the phenomenon is produced by organized entities and activities), thereby clarifying the distinct roles of predictive versus explanatory models.
Significance. If the scale can be made operational with concrete criteria, it would offer a useful conceptual tool for researchers working on generative ABMs to ground claims about explanation rather than mere reproduction. The integration of mechanism concepts from philosophy of science addresses a recognized gap in evaluating black-box LLM agents, and the distinction between sufficiency and plausibility could help structure future model assessments in social simulation.
major comments (2)
- [Abstract / Scale definition] Abstract and the section defining the scale: no explicit criteria, level definitions, or mapping from philosophy-of-science mechanism concepts (organized entities and activities producing a phenomenon) to observable properties of LLM-generated text or agent behaviors are supplied. The central claim that the scale cleanly separates generative sufficiency from mechanistic plausibility therefore rests on unstated interpretive assumptions rather than operational rules.
- [Abstract] Abstract: the manuscript states that the scale will be applied to phenomena such as social-media behavior and game-theoretic scenarios, yet provides no worked example, test case, or illustration of how any level would be assigned to an existing LLM-ABM output. Without such demonstrations the proposal remains conceptual and its usability for modelers cannot be assessed.
minor comments (2)
- [Abstract] Abstract: 'research has aim to test' should read 'research has aimed to test'.
- [Abstract] Abstract: the phrase 'different phenomena of interest' is vague; a single concrete reference to one of the cited domains would improve readability.
Simulated Author's Rebuttal
We thank the referee for the constructive report and for recognizing the potential value of integrating mechanism concepts from philosophy of science with LLM-based ABMs. We agree that the current manuscript is primarily conceptual and that explicit operational criteria plus a worked example are necessary to demonstrate usability. We address each major comment below and will incorporate the requested clarifications in a revised version.
Point-by-point responses
-
Referee: [Abstract / Scale definition] Abstract and the section defining the scale: no explicit criteria, level definitions, or mapping from philosophy-of-science mechanism concepts (organized entities and activities producing a phenomenon) to observable properties of LLM-generated text or agent behaviors are supplied. The central claim that the scale cleanly separates generative sufficiency from mechanistic plausibility therefore rests on unstated interpretive assumptions rather than operational rules.
Authors: We accept this assessment. The four-level scale is defined by reference to the degree of alignment with mechanistic explanation (organized entities and activities), but the manuscript does not yet supply concrete, observable criteria for assigning levels to LLM outputs. In the revision we will expand the scale-definition section to list explicit criteria for each level, including mappings such as: Level 1 requires only output matching the target phenomenon; Level 2 requires evidence of intermediate steps interpretable as activities; Level 3 requires identifiable entities whose interactions produce those activities; Level 4 requires consistency with independently validated mechanisms. These criteria will be stated in terms of observable properties of generated text or behavior traces. revision: yes
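Read as cumulative criteria, the proposed levels form a monotone checklist: a rating holds at level N only if all lower-level criteria also hold. A minimal sketch, under the assumption that each level's evidence can be reduced to a boolean flag (the `Evidence` class and its field names are ours, not the authors'):

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    """Hypothetical observable properties of one LLM-ABM experiment."""
    reproduces_phenomenon: bool        # output matches the target phenomenon
    interpretable_activities: bool     # intermediate steps readable as activities
    identifiable_entities: bool        # entities whose interactions produce those activities
    matches_validated_mechanism: bool  # consistent with independently validated mechanisms

def plausibility_level(e: Evidence) -> int:
    """Return the highest level whose criterion and all lower ones hold (0 if none)."""
    checklist = (
        e.reproduces_phenomenon,        # Level 1: generative sufficiency only
        e.interpretable_activities,     # Level 2
        e.identifiable_entities,        # Level 3
        e.matches_validated_mechanism,  # Level 4
    )
    level = 0
    for met in checklist:
        if not met:
            break
        level += 1
    return level

# A run that reproduces the phenomenon with interpretable intermediate steps,
# but without identifiable entities, sits at Level 2.
level = plausibility_level(Evidence(True, True, False, False))  # 2
```

The monotone structure is what makes the separation claim testable: a model can score Level 1 (generative sufficiency) while scoring nothing on the mechanistic criteria above it.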
-
Referee: [Abstract] Abstract: the manuscript states that the scale will be applied to phenomena such as social-media behavior and game-theoretic scenarios, yet provides no worked example, test case, or illustration of how any level would be assigned to an existing LLM-ABM output. Without such demonstrations the proposal remains conceptual and its usability for modelers cannot be assessed.
Authors: We agree that a concrete illustration is required to show how the scale functions in practice. Although the manuscript is a conceptual proposal, we will add a new subsection containing a worked example. Using a published LLM-ABM study on social-media posting behavior (or, alternatively, a game-theoretic coordination task), we will walk through the assignment of each level, citing specific output features that justify the rating and showing how the distinction between generative sufficiency and mechanistic plausibility is applied. revision: yes
Circularity Check
No circularity: Mechanism Plausibility Scale is an external operationalization, not a self-referential derivation.
full rationale
The paper proposes a four-level scale by integrating external philosophy-of-science literature on mechanisms with existing LLM-ABM work. No equations, fitted parameters, or self-citations appear in the provided text that would reduce the scale definition to its own inputs by construction. The separation of generative sufficiency from mechanistic plausibility is presented as a conceptual framework drawn from cited sources rather than an internal fit or renaming. This is a standard non-circular proposal of a new evaluative tool.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: explanation requires showing how a phenomenon is produced by related organized entities and activities.
invented entities (1)
- Mechanism Plausibility Scale (no independent evidence)
Reference graph
Works this paper leans on
-
[1]
William Agnew, A. Stevie Bergman, Jennifer Chien, Mark Díaz, Seliem El-Sayed, Jaylen Pittman, Shakir Mohamed, and Kevin R. McKee. 2024. The illusion of artificial inclusion. InProceedings of the CHI Conference on Human Factors in Computing Systems. 1–12. doi:10.1145/3613904.3642703 arXiv:2401.08572 [cs]
-
[2]
Elif Akata, Lion Schulz, Julian Coda-Forno, Seong Joon Oh, Matthias Bethge, and Eric Schulz. 2025. Playing repeated games with Large Language Models.Nature Human Behaviour(May 2025). doi:10.1038/s41562-025-02172-y arXiv:2305.16867 [cs]
-
[3]
Altera AL, Andrew Ahn, Nic Becker, Stephanie Carroll, Nico Christie, Manuel Cortes, Arda Demirci, Melissa Du, Frankie Li, Shuying Luo, Peter Y. Wang, Mathew Willows, Feitong Yang, and Guangyu Robert Yang. 2024. Project Sid: Many-agent simulations toward AI civilization. doi:10.48550/arXiv.2411.00114 arXiv:2411.00114 [cs]
-
[4]
Jacy Reese Anthis, Ryan Liu, Sean M. Richardson, Austin C. Kozlowski, Bernard Koch, James Evans, Erik Brynjolfsson, and Michael Bernstein. 2025. LLM Social Simulations Are a Promising Research Method. doi:10.48550/arXiv.2504.02234 arXiv:2504.02234 [cs]
-
[5]
Eckhart Arnold. 2013. Simulation Models of the Evolution of Cooperation as Proofs of Logical Possibilities. How Useful Are They? Etica E Politica 15, 2 (2013), 101–138. https://philarchive.org/rec/ARNSMO Publisher: University of Trieste, Department of Philosophy
-
[6]
Eckhart Arnold. 2015. How Models Fail: A Critical Look at the History of Computer Simulations of the Evolution of Cooperation. In Collective Agency and Cooperation in Natural and Artificial Systems, Catrin Misselhorn (Ed.). Springer International Publishing, Cham, 261–279. doi:10.1007/978-3-319-15515-9_14
-
[7]
Robert Axelrod. [n. d.]. The Evolution of Cooperation. https://ee.stanford.edu/~hellman/Breakthrough/book/pdfs/axelrod.pdf
-
[8]
N. Emrah Aydinonat. 2024. The puzzle of model-based explanation. InThe Routledge Handbook of Philosophy of Scientific Modeling(1 ed.). Routledge, London, 177–192. doi:10.4324/9781003205647-16
-
[9]
Paul Bartha. 2024. Analogy and Analogical Reasoning. InThe Stanford Encyclopedia of Philosophy(fall 2024 ed.), Edward N. Zalta and Uri Nodelman (Eds.). Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/fall2024/entries/reasoning-analogy/
-
[10]
James Bogen and James Woodward. 1988. Saving the Phenomena.The Philosophical Review97, 3 (1988), 303–352. jstor:2185445 doi:10.2307/2185445
-
[11]
Alisa Bokulich. 2014. How the Tiger Bush Got its Stripes: ‘How Possibly’ vs. ‘How Actually’ Model Explanations.The Monist97, 3 (July 2014), 321–338. doi:10.5840/monist201497321
-
[12]
Robert N. Brandon. 2014. Adaptation and Environment. Princeton University Press, Princeton. doi:10.1515/9781400860661
-
[13]
P. W. (Percy Williams) Bridgman. 1927. The Logic of Modern Physics. The Macmillan Company.
-
[14]
Runjin Chen, Andy Arditi, Henry Sleight, Owain Evans, and Jack Lindsey. 2025. Persona Vectors: Monitoring and Controlling Character Traits in Language Models. doi:10.48550/arXiv.2507.21509 arXiv:2507.21509 [cs]
FAccT ’26, June 25–28, 2026, Montreal, QC, Canada. Zhao et al.
-
[15]
Weize Chen, Yusheng Su, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chi-Min Chan, Heyang Yu, Yaxi Lu, Yi-Hsin Hung, Chen Qian, Yujia Qin, Xin Cong, Ruobing Xie, Zhiyuan Liu, Maosong Sun, and Jie Zhou. 2023. AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors. doi:10.48550/arXiv.2308.10848 arXiv:2308.10848 [cs]
-
[16]
Anthony Costarelli, Mat Allen, Roman Hauksson, Grace Sodunke, Suhas Hariharan, Carlson Cheng, Wenjie Li, and Arjun Yadav. 2024. GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents. doi:10.48550/arXiv.2406.06613 arXiv:2406.06613 [cs] version: 1
-
[17]
Carl Craver, James Tabery, and Phyllis Illari. 2024. Mechanisms in Science. In The Stanford Encyclopedia of Philosophy (fall 2024 ed.), Edward N. Zalta and Uri Nodelman (Eds.). Metaphysics Research Lab, Stanford University. https://plato.stanford.edu/archives/fall2024/entries/science-mechanisms/
-
[18]
Carl F. Craver. 2006. When mechanistic models explain.Synthese153, 3 (Dec. 2006), 355–376. doi:10.1007/s11229-006-9097-x
-
[19]
Carl F. Craver. 2009. Explaining the Brain. Oxford University Press.
-
[20]
Lee J. Cronbach and Paul E. Meehl. 1955. Construct validity in psychological tests.Psychological Bulletin52, 4 (1955), 281–302. doi:10.1037/h0040957 Place: US Publisher: American Psychological Association
-
[21]
Pierre Maurice Marie Duhem. 1954. The Aim and Structure of Physical Theory. Vol. 1. Princeton University Press. Pages: 85–87.
-
[22]
Frances Egan. 2025. Deflating Mental Representation (The Jean Nicod Lectures). MIT Press (open access).
- [23]
-
[24]
Joshua M. Epstein. 2006. Generative Social Science: Studies in Agent-Based Computational Modeling (stu - student edition ed.). Princeton University Press. http://www.jstor.org/stable/j.ctt7rxj1
-
[25]
Ronald Aylmer Fisher. 1999. The Genetical Theory of Natural Selection: by R.A. Fisher; edited with a foreword and notes by J.H. Bennett (a complete variorum ed.). Oxford University Press, Oxford.
-
[26]
Chen Gao, Xiaochong Lan, Zhihong Lu, Jinzhu Mao, Jinghua Piao, Huandong Wang, Depeng Jin, and Yong Li. 2025. S$^3$: Social-network Simulation System with Large Language Model-Empowered Agents. arXiv:2307.14984 [cs] doi:10.48550/arXiv.2307.14984
- [27]
-
[28]
Edward G. Carmines and Richard A. Zeller. 1979. Reliability and Validity Assessment. SAGE Publications, Inc. doi:10.4135/9781412985642
-
[29]
Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. Datasheets for Datasets. doi:10.48550/arXiv.1803.09010 arXiv:1803.09010 [cs]
-
[31]
Stuart Glennan. 2017. The New Mechanical Philosophy. Oxford University Press, Oxford.
-
[32]
Stuart S. Glennan. 1996. Mechanisms and the nature of causation.Erkenntnis44, 1 (Jan. 1996), 49–71. doi:10.1007/BF00172853
-
[33]
Claudius Graebner. 2018. How to Relate Models to Reality? An Epistemological Framework for the Validation and Verification of Computational Models.Journal of Artificial Societies and Social Simulation21, 3 (2018), 8
-
[34]
Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakan...
-
[35]
Olivia Guest and Iris van Rooij. 2025. Critical Artificial Intelligence Literacy for Psychologists. doi:10.31234/osf.io/dkrgj_v1
-
[36]
Fulin Guo. 2023. GPT in Game Theory Experiments. doi:10.48550/arXiv.2305.05516 arXiv:2305.05516 [econ]
- [37]
-
[38]
Wenyue Hua, Lizhou Fan, Lingyao Li, Kai Mei, Jianchao Ji, Yingqiang Ge, Libby Hemphill, and Yongfeng Zhang. 2024. War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars. doi:10.48550/arXiv.2311.17227 arXiv:2311.17227 [cs]
-
[39]
Brandon Jackson, B Cavello, Flynn Devine, Nick Garcia, Samuel J. Klein, Alex Krasodomski, Joshua Tan, and Eleanor Tursman. 2024. Public AI: Infrastructure for the common good. doi:10.5281/zenodo.13914560
-
[40]
Frank Jackson. 1982. Epiphenomenal Qualia.The Philosophical Quarterly32, 127 (April 1982), 127–136. doi:10.2307/2960077
-
[41]
Zhao Kaiya, Michelangelo Naim, Jovana Kondic, Manuel Cortes, Jiaxin Ge, Shuying Luo, Guangyu Robert Yang, and Andrew Ahn. 2023. Lyfe Agents: Generative Agents for Low-Cost Real-Time Social Interactions. arXiv:2310.02172 [cs] doi:10.48550/arXiv.2310.02172
-
[42]
David Michael Kaplan and Carl F. Craver. 2011. The Explanatory Force of Dynamical and Mathematical Models in Neuroscience: A Mechanistic Perspective*.Philosophy of Science78, 4 (2011), 601–627. doi:10.1086/661755 Publisher: [The University of Chicago Press, Philosophy of Science Association]
-
[43]
Kendrick N. Kay. 2018. Principles for models of neural information processing. NeuroImage 180 (Oct. 2018), 101–109. doi:10.1016/j.neuroimage.2017.08.016
-
[44]
Benjamin Kempinski, Ian Gemp, Kate Larson, Marc Lanctot, Yoram Bachrach, and Tal Kachman. 2025. Game of Thoughts: Iterative Reasoning in Game-Theoretic Domains with Large Language Models. InProceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems (AAMAS ’25). International Foundation for Autonomous Agents and Multiagent...
-
[45]
J. R. Landis and G. G. Koch. 1977. The measurement of observer agreement for categorical data.Biometrics33, 1 (March 1977), 159–174
-
[46]
Maik Larooij and Petter Törnberg. 2025. Do Large Language Models Solve the Problems of Agent-Based Modeling? A Critical Review of Generative Social Simulations. doi:10.48550/arXiv.2504.03274 arXiv:2504.03274 [cs]
-
[47]
Angeliki Lazaridou, Adhiguna Kuncoro, Elena Gribovskaya, Devang Agrawal, Adam Liska, Tayfun Terzi, Mai Gimenez, Cyprien de Masson d’Autume, Tomas Kocisky, Sebastian Ruder, Dani Yogatama, Kris Cao, Susannah Young, and Phil Blunsom. 2021. Mind the Gap: Assessing Temporal Generalization in Neural Language Models. doi:10.48550/arXiv.2102.01951 arXiv:2102.01951 [cs]
-
[48]
Huao Li, Yu Quan Chong, Simon Stepputtis, Joseph Campbell, Dana Hughes, Michael Lewis, and Katia Sycara. 2023. Theory of Mind for Multi-Agent Collaboration via Large Language Models. InProceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 180–192. doi:10.18653/v1/2023.emnlp-main.13 arXiv:2310.10701 [cs]
-
[49]
Xinyi Li, Yu Xu, Yongfeng Zhang, and Edward C. Malthouse. 2024. Large Language Model-driven Multi-Agent Simulation for News Diffusion Under Different Network Structures. doi:10.48550/arXiv.2410.13909 arXiv:2410.13909 [cs]
-
[50]
Yuxuan Li, Sauvik Das, and Hirokazu Shirado. 2025. What Makes LLM Agent Simulations Useful for Policy? Insights From an Iterative Design Engagement in Emergency Preparedness. doi:10.48550/arXiv.2509.21868 arXiv:2509.21868 [cs]
-
[51]
Yuxuan Li and Hirokazu Shirado. 2025. Spontaneous Giving and Calculated Greed in Language Models. doi:10.48550/arXiv.2502.17720 arXiv:2502.17720 [cs]
-
[52]
Yuhan Liu, Xiuying Chen, Xiaoqing Zhang, Xing Gao, Ji Zhang, and Rui Yan. 2024. From Skepticism to Acceptance: Simulating the Attitude Dynamics Toward Fake News. InProceedings of the Thirty-ThirdInternational Joint Conference on Artificial Intelligence. 7849–7857. doi:10.24963/ijcai.2024/873 arXiv:2403.09498 [cs]
-
[53]
Kenneth MacCorquodale and Paul E. Meehl. 1948. On a Distinction between Hypothetical Constructs and Intervening Variables. Psychological Review55, 2 (1948), 95–107. doi:10.1037/h0056029
-
[54]
Peter Machamer, Lindley Darden, and Carl F. Craver. 2000. Thinking about Mechanisms.Philosophy of Science67, 1 (2000), 1–25. https://www.jstor.org/stable/188611 Publisher: [The University of Chicago Press, Philosophy of Science Association]
-
[55]
Giordano De Marzo, Luciano Pietronero, and David Garcia. 2023. Emergence of Scale-Free Networks in Social Interactions among Large Language Models. doi:10.48550/arXiv.2312.06619 arXiv:2312.06619 [physics]
-
[56]
Michela Massimi. 2022. Perspectival Ontology: Between Situated Knowledge and Multiculturalism.The Monist105, 2 (March 2022), 214–228. doi:10.1093/monist/onab032
-
[57]
Michael D. Mauk. 2000. The potential effectiveness of simulations versus phenomenological models.Nature Neuroscience3, 7 (July 2000), 649–651. doi:10.1038/76606 Publisher: Nature Publishing Group
-
[58]
James W. McAllister. 1997. Phenomena and Patterns in Data Sets. Erkenntnis (1975-) 47, 2 (1997), 217–228. jstor:20012798 doi:10.1023/A:1005387021520
-
[59]
Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. 2019. Model Cards for Model Reporting. InProceedings of the Conference on Fairness, Accountability, and Transparency. 220–229. doi:10.1145/3287560.3287596 arXiv:1810.03993 [cs]
-
[60]
Mary S. Morgan and Margaret Morrison (Eds.). 1999.Models as Mediators: Perspectives on Natural and Social Science. Cambridge University Press, Cambridge. doi:10.1017/CBO9780511660108
-
[61]
Robert Northcott and Anna Alexandrova. 2015. Prisoner’s Dilemma Doesn’t Explain Much. In The Prisoner’s Dilemma: Classic Philosophical Arguments, Martin Peterson (Ed.). Cambridge University Press, 64–84. https://philarchive.org/rec/NORPDD
-
[62]
Joon Sung Park, Joseph C. O’Brien, Carrie J. Cai, Meredith Ringel Morris, Percy Liang, and Michael S. Bernstein. 2023. Generative Agents: Interactive Simulacra of Human Behavior. doi:10.48550/arXiv.2304.03442 arXiv:2304.03442 [cs]
-
[63]
Joon Sung Park, Lindsay Popowski, Carrie Cai, Meredith Ringel Morris, Percy Liang, and Michael S. Bernstein. 2022. Social Simulacra: Creating Populated Prototypes for Social Computing Systems.Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology(Oct. 2022), 1–18. doi:10.1145/3526113.3545616 Conference Name: UIST ’22: The 3...
-
[64]
Wendy S. Parker. 2020. Model Evaluation: An Adequacy-for-Purpose View. Philosophy of Science 87, 3 (July 2020), 457–477. doi:10.1086/708691
-
[65]
Debjit Paul, Robert West, Antoine Bosselut, and Boi Faltings. 2024. Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning. arXiv. doi:10.48550/ARXIV.2402.13950 Version Number: 4
-
[66]
Judea Pearl. 2009.Causality(2 ed.). Cambridge University Press, Cambridge. doi:10.1017/CBO9780511803161
-
[67]
Judea Pearl and Dana Mackenzie. 2018. The Book of Why: The New Science of Cause and Effect (1 ed.). Basic Books, Inc., USA.
-
[68]
Axel Pichler and Nils Reiter. 2022. From Concepts to Texts and Back: Operationalization as a Core Activity of Digital Humanities. Journal of Cultural Analytics 7, 4 (Dec. 2022). doi:10.22148/001c.57195
-
[69]
Willard Van Orman Quine. 1953. From a Logical Point of View. Harvard University Press, Cambridge.
-
[70]
Siyue Ren, Zhiyao Cui, Ruiqi Song, Zhen Wang, and Shuyue Hu. 2024. Emergence of Social Norms in Generative Agent Societies: Principles and Architecture. doi:10.48550/arXiv.2403.08251 arXiv:2403.08251 [cs]
-
[71]
Thomas C. Schelling. 1969. Models of Segregation. The American Economic Review 59, 2 (1969), 488–493. https://www.jstor.org/stable/1823701 Publisher: American Economic Association
-
[72]
Galit Shmueli. 2010. To Explain or to Predict?Statist. Sci.25, 3 (Aug. 2010). doi:10.1214/10-STS330
-
[73]
Flaminio Squazzoni, J. Gareth Polhill, Bruce Edmonds, Petra Ahrweiler, Patrycja Antosz, Geeske Scholz, Emile Chappin, Melania Borit, Harko Verhagen, Francesca Giardini, and Nigel Gilbert. 2020. Computational Models That Matter During a Global Pandemic Outbreak: A Call to Action.JASSS - The Journal of Artificial Societies and Social Simulation23, 2 (March ...
-
[74]
S. S. Stevens. 1935. The Operational Definition of Psychological Concepts.Psychological Review42, 6 (1935), 517–527. doi:10.1037/h0056973
-
[75]
Samarth Swarup. 2019. Adequacy: What Makes a Simulation Good Enough?. In2019 Spring Simulation Conference (SpringSim). 1–12. doi:10.23919/SpringSim.2019.8732895
-
[76]
Edward Bradford Titchener. 1910. A Text-Book of Psychology. MacMillan Co, New York, NY, US. xx, 565 pages. doi:10.1037/10907-000
-
[77]
Loïs Vanhée, Melania Borit, Peer-Olaf Siebers, Roger Cremades, Christopher Frantz, Önder Gürcan, František Kalvas, Denisa Reshef Kera, Vivek Nallur, Kavin Narasimhan, and Martin Neumann. 2025. Large Language Models for Agent-Based Modelling: Current and possible uses across the modelling cycle. doi:10.48550/arXiv.2507.05723 arXiv:2507.05723 [cs] version: 1
-
[78]
Elina Vessonen. 2021. Conceptual engineering and operationalism in psychology. Synthese 199, 3 (Dec. 2021), 10615–10637. doi:10.1007/s11229-021-03261-x
-
[79]
Lei Wang, Jingsen Zhang, Hao Yang, Zhi-Yuan Chen, Jiakai Tang, Zeyu Zhang, Xu Chen, Yankai Lin, Hao Sun, Ruihua Song, Xin Zhao, Jun Xu, Zhicheng Dou, Jun Wang, and Ji-Rong Wen. 2025. User Behavior Simulation with Large Language Model-based Agents.ACM Trans. Inf. Syst.43, 2 (Jan. 2025), 55:1–55:37. doi:10.1145/3708985
-
[80]
Zhilin Wang, Yu Ying Chiu, and Yu Cheung Chiu. 2023. Humanoid Agents: Platform for Simulating Human-like Generative Agents. doi:10.48550/arXiv.2310.05418 arXiv:2310.05418 [cs]
discussion (0)