An Infectious Disease Spread Simulation Based on Large Language Model Decision Making
Pith reviewed 2026-06-28 01:23 UTC · model grok-4.3
The pith
Income and education levels dominate variation in self-reported illness rates within an LLM-driven simulation of disease behavior across real city populations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By embedding LLM-generated decisions inside a spatially explicit synthetic population drawn from census data, the simulation demonstrates that income and education are the dominant drivers of self-reporting rates for influenza-like illness, with geography, LLM model choice, and message framing exerting smaller but consistent effects across San Francisco and Atlanta.
What carries the argument
A spatially grounded agent-based simulation framework that assigns LLM agents to census-based synthetic populations and generates self-reporting decisions under one of three scenarios: independent reasoning, household influence, or message framing.
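As a rough illustration of the mechanism described above, the following minimal sketch assigns synthetic agents to spatial units and aggregates their self-reporting decisions by group. Everything in it is an assumption for illustration: the `Agent` fields, the tract IDs and demographic shares, and especially `stub_decision`, which stands in for the LLM call with made-up probabilities. It is not the paper's implementation.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    tract: str      # spatial unit (hypothetical tract ID)
    income: str     # "low" or "high" (assumed coding)
    education: str  # "no_degree" or "degree" (assumed coding)


def stub_decision(agent, rng):
    """Stand-in for the LLM call; the probabilities are invented."""
    p = 0.3
    if agent.income == "high":
        p += 0.2
    if agent.education == "degree":
        p += 0.2
    return rng.random() < p


def build_population(tract_specs, rng):
    """Draw agents per tract from hypothetical demographic shares."""
    agents = []
    for tract, (n, p_high_income, p_degree) in tract_specs.items():
        for _ in range(n):
            income = "high" if rng.random() < p_high_income else "low"
            education = "degree" if rng.random() < p_degree else "no_degree"
            agents.append(Agent(tract, income, education))
    return agents


def reporting_rates(agents, decide, rng, key):
    """Share of agents who self-report, grouped by key(agent)."""
    counts = defaultdict(lambda: [0, 0])  # group -> [reports, total]
    for agent in agents:
        entry = counts[key(agent)]
        entry[0] += int(decide(agent, rng))
        entry[1] += 1
    return {group: r / t for group, (r, t) in counts.items()}


rng = random.Random(0)
# (agent count, share with high income, share with a degree); all made up
tracts = {"tract_A": (500, 0.7, 0.6), "tract_B": (500, 0.2, 0.3)}
population = build_population(tracts, rng)
by_tract = reporting_rates(population, stub_decision, rng, key=lambda a: a.tract)
by_income = reporting_rates(population, stub_decision, rng, key=lambda a: a.income)
```

Because the decision function is passed in as an argument, the same loop could in principle be rerun with a different decision rule per scenario, which is one way to read the scenario comparison the review describes.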
If this is right
- The generated synthetic data can directly support spatial epidemiological models that require heterogeneous behavioral inputs.
- Public-health analysts can use the framework to test how different message framings shift reporting rates before field deployment.
- Running the same agents under household-influence versus independent-reasoning rules isolates the contribution of social context to reporting behavior.
- Comparing outputs across the two cities isolates the contribution of how demographic groups are distributed geographically.
Where Pith is reading between the lines
- The same LLM-agent approach could be applied to other health behaviors such as vaccine uptake or mask compliance if demographic prompts are adjusted.
- Validation against real-time mobility or survey data would be needed to check whether the simulated spatial patterns match observed ones.
- If LLM decision quality improves with newer models, the relative influence of income and education might shift or become more stable.
Load-bearing premise
Large language models can produce realistic human decisions when given only demographic profiles and situational context.
What would settle it
Empirical data from actual residents of San Francisco and Atlanta showing that reporting rates do not vary primarily by income and education in the same pattern the LLM simulation produces.
Original abstract
Modelling individual decision-making during infectious disease outbreaks is crucial for understanding behavioural dynamics and informing effective public health interventions. Prior work has shown that large language models can simulate realistic human behaviour by generating agent decisions based on demographic prompts and situational context. We build on this foundation with a spatially grounded, agent-based simulation framework that integrates LLM-generated decisions about self-reported influenza-like illness into a census-based synthetic population of agents. Location is treated as a central feature: agents are assigned to spatial units within cities, capturing the spatial distributions of different demographic groups using real-world census data and enabling geographically diverse behavioural modelling. We implement and compare three decision scenarios, independent reasoning, household influence, and message framing, and simulate self-reporting outcomes in San Francisco and Atlanta. Results reveal that income and education are the dominant drivers of reporting rate variation, with smaller but consistent effects from geography, LLM model choice, and message framing. Our framework generates synthetic data that captures both social and geographic heterogeneity, supporting spatial epidemiological modelling and bias-aware behavioural analysis.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a spatially explicit agent-based simulation that assigns LLM-generated decisions on self-reporting influenza-like illness to agents in a census-derived synthetic population of San Francisco and Atlanta. Three decision scenarios (independent reasoning, household influence, message framing) are compared, with location treated as a core feature via real census spatial distributions. The central result is that income and education dominate variation in simulated reporting rates, while geography, LLM model choice, and framing exert smaller but consistent effects; the framework is positioned as a generator of synthetic data for spatial epidemiological modeling.
Significance. If the LLM behavioral mapping is internally consistent, the approach supplies a reproducible way to produce synthetic reporting data that embeds both demographic and geographic heterogeneity without new field collection. The use of real census data for spatial assignment and the explicit scenario comparisons are concrete strengths that could support downstream bias-aware epi models.
major comments (2)
- [Methods] Methods (LLM decision pipeline): the manuscript provides no description of prompt templates, temperature or sampling parameters, output parsing rules, or aggregation across multiple LLM calls per agent. These choices directly determine the reported effects of model choice and message framing and must be specified for the dominance claim to be reproducible.
- [Results] Results (driver dominance): the statement that income and education are the 'dominant drivers' is presented without supporting quantitative evidence such as regression coefficients, partial R² values, or variance decomposition across the simulated runs. It is therefore impossible to verify the relative magnitude of effects or to assess whether the smaller geography/LLM/framing effects are statistically distinguishable.
minor comments (1)
- [Title/Abstract] The title refers to an 'Infectious Disease Spread Simulation' yet the implemented model stops at self-reporting decisions; a clarifying sentence on whether transmission dynamics are actually simulated would improve scope alignment.
Simulated Author's Rebuttal
We thank the referee for their constructive comments. We address each major comment below and will revise the manuscript to improve reproducibility and quantitative rigor.
Point-by-point responses
- Referee: [Methods] Methods (LLM decision pipeline): the manuscript provides no description of prompt templates, temperature or sampling parameters, output parsing rules, or aggregation across multiple LLM calls per agent. These choices directly determine the reported effects of model choice and message framing and must be specified for the dominance claim to be reproducible.
  Authors: We agree that the LLM decision pipeline requires full specification for reproducibility. The revised manuscript will add the complete prompt templates for each of the three scenarios, the exact temperature and sampling parameters used (e.g., temperature=0.7, top_p=1.0), the output parsing procedure (keyword extraction from structured responses), and confirmation that a single call was issued per agent with no aggregation or ensembling.
  Revision: yes
- Referee: [Results] Results (driver dominance): the statement that income and education are the 'dominant drivers' is presented without supporting quantitative evidence such as regression coefficients, partial R² values, or variance decomposition across the simulated runs. It is therefore impossible to verify the relative magnitude of effects or to assess whether the smaller geography/LLM/framing effects are statistically distinguishable.
  Authors: We acknowledge that the dominance claim currently lacks quantitative backing. In the revision we will add logistic regression models fitted to the simulated reporting outcomes, report the resulting coefficients and partial R² values for income, education, geography, model choice, and framing, and include a variance decomposition across runs to quantify relative contributions and test whether the smaller effects are statistically distinguishable from zero.
  Revision: yes
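To make the requested quantitative evidence concrete, here is a minimal sketch of a one-way variance decomposition (eta squared: between-group sum of squares divided by total sum of squares) applied to binary reporting outcomes. This is a deliberately simpler stand-in for the logistic regression and partial R² the authors promise, not their analysis; the record layout, field names, and toy values are all hypothetical.

```python
from collections import defaultdict


def eta_squared(records, factor, outcome="reported"):
    """Between-group sum of squares over total sum of squares for one factor."""
    values = [float(r[outcome]) for r in records]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # no variation in the outcome to explain
    groups = defaultdict(list)
    for r in records:
        groups[r[factor]].append(float(r[outcome]))
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return ss_between / ss_total


# Toy records (made up): one row per simulated agent decision.
records = [
    {"income": "high", "city": "SF", "reported": 1},
    {"income": "high", "city": "ATL", "reported": 1},
    {"income": "high", "city": "SF", "reported": 1},
    {"income": "high", "city": "ATL", "reported": 0},
    {"income": "low", "city": "SF", "reported": 0},
    {"income": "low", "city": "ATL", "reported": 0},
    {"income": "low", "city": "ATL", "reported": 1},
    {"income": "low", "city": "SF", "reported": 0},
]

eta_income = eta_squared(records, "income")  # 0.25 on this toy data
eta_city = eta_squared(records, "city")      # 0.0 on this toy data
```

A one-way share like this ignores correlations between factors (for example, income and city), which is why the referee's request for partial R² or a joint regression is the stronger test of a dominance claim.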
Circularity Check
No significant circularity; simulation generates synthetic outputs from stated assumptions
The paper describes an agent-based simulation that feeds demographic and spatial prompts into LLMs to produce self-reporting decisions, then reports observed patterns (income/education as dominant drivers) inside those synthetic runs. No equations, fitted parameters, or self-citations are shown that reduce the reported results to the inputs by construction. The framework explicitly builds on external prior demonstrations of LLM behavioral simulation rather than deriving that capability internally. The central claim is therefore an empirical observation within the model, not a definitional or fitted tautology. This is a standard modeling exercise whose internal consistency does not require external ground truth for the reported synthetic patterns.
[47]
Not reporting your symptoms could result in wors- ening health, delayed treatment, and potential long-term complications
Andreas Züfle, Carola Wenk, Dieter Pfoser, Andrew Crooks, Joon-Seok Kim, Hamdi Kavak, Umar Manzoor, and Hyunjee Jin. 2023. Urban life: a model of people and places.Computational and Mathematical Organization Theory29, 1 (2023), 20–51. An Infectious Disease Spread Simulation Based on Large Language Model Decision Making KDD ’26, August 09–13, 2026, Jeju Is...
2023
-
[48]
Your background and personal circumstances are as follows: [You are under [AGE] years old, [GENDER] of [RACE] ethnicity living in [CITY]
Confidence Level: (Very Certain, Somewhat Certain, Uncertain) Prompt 2 Imagine yourself in the following situation: [From January to March 2030, a new flu strain, NEW FLU, emerged in this country, leading to the first reported cases and the World Health Organisation (WHO) declaring a pandemic.]. Your background and personal circumstances are as follows: [...
2030
-
[50]
Brief Reason: [one sentence, explain to me the rationale behind why you made this decision.] KDD ’26, August 09–13, 2026, Jeju Island, Republic of Korea Khaokaew et al. Prompt 3 Imagine yourself in the following situation: [From January to March 2030, a new flu strain, NEW FLU, emerged in this country, leading to the first reported cases and the World Hea...
2026
-
[53]
Personal Risk: Information provided by public health authorities at this time suggest that the mortality rate is around 1%
Brief Reason: [one sentence, explain to me the rationale behind why you made this decision.] Prompt 4 Imagine yourself in the following situation: [From January to March 2030, a new flu strain, NEW FLU, emerged in this country, leading to the first reported cases and the World Health Organisation (WHO) declaring a pandemic.]. Personal Risk: Information pr...
2030
-
[56]
Prompt 5 cont
Brief Reason: [one sentence, explain to me the rationale behind why you made this decision.] Prompt 5 Imagine yourself in the following situation: [From January to March 2030, a new flu strain, NEW FLU, emerged in this country, leading to the first reported cases and the World Health Organisation (WHO) declaring a pandemic.]. Prompt 5 cont. Personal Risk:...
2030
-
[57]
Confidence Level: (Very Certain, Somewhat Certain, Uncertain)
-
[58]
Reporting rate: [0-100% based on the persona]
-
[59]
Brief Reason: [one sentence, explain to me the rationale behind why you made this decision.] A.4 Demographic Reporting Rates Table 4 shows mean reporting rates and 95% CIs aggregated across five prompt variants. The wide CIs are driven by Prompt 5 (model- ing household influence via pre-generated decision banks), which consistently reduces the number of c...
2026
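The persona placeholders and response fields quoted in the prompt excerpts above suggest a fill-then-parse pipeline. The following minimal sketch shows one way that could look. The helper names (`fill_persona`, `parse_response`), the regular expressions, and the example response text are all hypothetical; the paper's actual parsing rules are not given in this review (the rebuttal only mentions keyword extraction from structured responses).

```python
import re

# Persona sentence as quoted in the prompt excerpts; placeholders in brackets.
PERSONA_TEMPLATE = (
    "You are under [AGE] years old, [GENDER] of [RACE] ethnicity living in [CITY]"
)

# Assumed patterns for the three response fields named in the prompts.
FIELD_PATTERNS = {
    "confidence": re.compile(
        r"Confidence Level:\s*(Very Certain|Somewhat Certain|Uncertain)", re.I
    ),
    "reporting_rate": re.compile(r"Reporting rate:\s*(\d{1,3}(?:\.\d+)?)\s*%?", re.I),
    "reason": re.compile(r"Brief Reason:\s*(.+)", re.I),
}


def fill_persona(template, **fields):
    """Replace [KEY] placeholders with the supplied values."""
    for key, value in fields.items():
        template = template.replace(f"[{key.upper()}]", str(value))
    return template


def parse_response(text):
    """Extract the response fields; missing or out-of-range values become None."""
    parsed = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        parsed[name] = match.group(1).strip() if match else None
    rate = parsed["reporting_rate"]
    if rate is not None:
        value = float(rate)
        parsed["reporting_rate"] = value if 0.0 <= value <= 100.0 else None
    return parsed


persona = fill_persona(
    PERSONA_TEMPLATE, age=40, gender="female", race="Asian", city="Atlanta"
)

# Hypothetical model output, written by hand for this example.
example = """Confidence Level: Somewhat Certain
Reporting rate: 65%
Brief Reason: I have paid sick leave, so reporting costs me little."""
result = parse_response(example)
```

Returning `None` for unparseable fields, rather than guessing, keeps malformed responses visible so they can be counted or re-queried instead of silently biasing the aggregated rates.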