WhatIf: Interactive Exploration of LLM-Powered Social Simulations for Policy Reasoning
Pith reviewed 2026-05-10 05:13 UTC · model grok-4.3
The pith
Interactive LLM social simulations let policymakers iteratively steer, compare, and inspect agent behaviors to test plans under uncertainty.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
WhatIf demonstrates that policymakers engage with LLM-powered social simulations as spaces for iterative branching and comparison: they reflect on tacit assumptions when agent behaviors violate expectations, surface previously unrecognized planning vulnerabilities, and ground their reasoning in inspectable agent-level cases. This supports designing such systems as interactive, shared reasoning environments rather than offline predictive tools, better aiding expert decisions under deep uncertainty.
What carries the argument
WhatIf, the interactive system supporting fluid steering of LLM agents, real-time scale, collaborative exploration, and multi-level interpretability of simulation outputs.
If this is right
- Users shift from evaluating fixed plans to iterative branching and comparison of policy options.
- Unexpected agent behaviors trigger reflection on unstated assumptions in the planners' mental models.
- Interaction surfaces previously unrecognized vulnerabilities in the proposed policies.
- Reasoning draws primarily from inspectable individual agent cases instead of aggregate statistics.
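The branch-and-compare pattern described above can be sketched in a few lines. This is a minimal illustration, not WhatIf's actual API: the LLM decision is stubbed by a hypothetical `decide` function, and all agent attributes and labels are invented.

```python
import copy

# Minimal sketch of branch-and-compare over an agent simulation.
# `decide` stands in for an LLM-generated agent decision; all names
# and parameters here are hypothetical, not WhatIf's real interface.

def decide(agent, instruction):
    """Stand-in for an LLM-generated agent decision."""
    return instruction if agent["trust"] > 0.5 else "shelter"

class SimulationBranch:
    def __init__(self, agents, label):
        self.agents = agents
        self.label = label
        self.log = []

    def branch(self, label):
        # Fork the current state so two policy variants run side by side.
        return SimulationBranch(copy.deepcopy(self.agents), label)

    def step(self, instruction):
        for agent in self.agents:
            self.log.append((agent["id"], decide(agent, instruction)))

    def compliance(self, instruction):
        # Fraction of logged decisions that followed the instruction.
        return sum(1 for _, act in self.log if act == instruction) / len(self.log)

agents = [{"id": i, "trust": i / 10} for i in range(10)]
base = SimulationBranch(agents, "baseline")
alt = base.branch("trust-building first")
for agent in alt.agents:
    agent["trust"] += 0.3  # hypothetical outreach campaign raises trust

base.step("evacuate")
alt.step("evacuate")
print(base.compliance("evacuate"), alt.compliance("evacuate"))  # 0.4 0.7
```

Comparing the two branches at the agent level (the per-agent `log`), rather than only through the aggregate compliance rate, mirrors the inspectability the findings emphasize.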
Where Pith is reading between the lines
- Combining the system with real-time external data sources could make agent responses more grounded for ongoing crises.
- The interactive approach could extend to policy domains like public health messaging or climate adaptation where population coordination is key.
- Repeated use sessions might allow users to refine the underlying simulation models based on observed discrepancies.
Load-bearing premise
The LLM-generated agent behaviors are realistic enough to prompt genuine reflection on real planning assumptions, and qualitative insights from five preparedness professionals generalize to broader policy needs.
What would settle it
A larger controlled study would settle it: if policymakers using WhatIf identified no more vulnerabilities, and revised their plans no more often, than those using static simulations or tabletop exercises, the claim would be undermined.
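Such a comparison could be analyzed with a simple permutation test on per-participant vulnerability counts. The numbers below are invented purely for illustration of the analysis, not results from any study.

```python
import random

# Hypothetical data for the controlled comparison described above:
# vulnerabilities identified per participant, WhatIf vs. a static-
# simulation condition. All numbers are invented for illustration.
whatif = [5, 7, 6, 8, 6]
static = [3, 4, 5, 3, 4]

def perm_test(a, b, iters=10_000, seed=0):
    """One-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(iters):
        rng.shuffle(pooled)
        diff = sum(pooled[:len(a)]) / len(a) - sum(pooled[len(a):]) / len(b)
        if diff >= observed:
            hits += 1
    return hits / iters  # estimated p-value

print(f"p = {perm_test(whatif, static):.4f}")
```

A permutation test suits the small samples such studies typically have, since it makes no normality assumption about the counts.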
Original abstract
Policymakers in domains such as emergency management, public health, and urban planning must make decisions under deep uncertainty, where outcomes depend on how large populations interpret information, coordinate, and adopt over time. Existing tools only partially support this process: tabletop exercises enable collaborative discussion but lack dynamic feedback, while computational simulations capture population dynamics but are designed for offline analysis. We present WhatIf, an interactive system that enables policymakers to steer, inspect, and compare LLM-powered social simulations in real time. Informed by a formative study in emergency preparedness planning, we derive four design requirements for interactive policy simulations: fluid steering, real-time scale, collaborative exploration, and multi-level interpretability. We developed WhatIf guided by these requirements and evaluated it with five preparedness professionals across three disaster evacuation scenarios. Our findings show that participants used the system as a space for iterative branching and comparison rather than evaluating fixed plans; reflected on tacit planning assumptions when agent behavior violated expectations; surfaced previously unrecognized planning vulnerabilities; and grounded their reasoning in inspectable agent-level cases rather than aggregate outputs alone. These findings suggest broader design implications for LLM-powered social simulation systems: designing such systems as interactive, shared reasoning environments -- rather than offline predictive tools -- can better support expert decision-making under deep uncertainty.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents WhatIf, an interactive system allowing policymakers to steer, inspect, and compare LLM-powered social simulations in real time for reasoning under deep uncertainty in domains like emergency management. Drawing on a formative study, it derives four design requirements (fluid steering, real-time scale, collaborative exploration, multi-level interpretability) and implements the system accordingly. A qualitative evaluation with five preparedness professionals across three disaster evacuation scenarios reports that participants used the tool for iterative branching and comparison, reflected on tacit assumptions when agent behaviors violated expectations, surfaced unrecognized vulnerabilities, and grounded reasoning in agent-level cases rather than aggregates. The authors conclude that such systems are better positioned as interactive shared reasoning environments than as offline predictive tools.
Significance. If the core findings hold, the work offers a concrete design contribution to HCI and policy informatics by demonstrating how real-time LLM simulation steering can surface planning assumptions and vulnerabilities that static or aggregate tools miss. The derivation of requirements from a domain-specific formative study and the emphasis on multi-level interpretability provide reusable guidance for future interactive simulation systems. However, the small-scale qualitative evaluation limits the strength of the broader design implications.
major comments (3)
- [Evaluation] Evaluation section: the reported findings rest on a qualitative study with five participants across three scenarios, yet no details are provided on study protocol, recruitment, session structure, data collection methods, or analysis approach. This absence makes it impossible to evaluate whether the observed reflections on tacit assumptions and vulnerabilities are attributable to the system's interactive features or to other factors.
- [Evaluation] The central claim that participants reflected on planning assumptions specifically because LLM agent behaviors violated expectations requires evidence that the generated behaviors are sufficiently realistic proxies for real populations. No validation is reported (e.g., expert ratings of trace realism against historical data, comparison to non-LLM baselines, or ablation of prompt effects), leaving open the possibility that reflections stem from LLM artifacts rather than transferable policy insights.
- [Discussion / Conclusion] The design implications (that interactive LLM simulations outperform offline tools for expert reasoning under uncertainty) are load-bearing for the paper's contribution, yet they are extrapolated from n=5 sessions without quantitative metrics, controls, or comparison conditions. This weakens the generalizability asserted in the abstract and conclusion.
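One concrete form of the validation the second major comment asks for is a distributional comparison of simulated traces against a historical record, e.g. via a two-sample Kolmogorov-Smirnov distance. The data below are invented for illustration; no such figures appear in the paper.

```python
# Sketch of one requested validation: compare simulated evacuation start
# times (minutes) against a historical record with a two-sample
# Kolmogorov-Smirnov distance. All data here are hypothetical.

def ks_statistic(sample_a, sample_b):
    """Maximum gap between the two empirical CDFs."""
    def cdf(sample, x):
        # Fraction of the sample at or below x.
        return sum(1 for v in sample if v <= x) / len(sample)
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(cdf(sample_a, x) - cdf(sample_b, x)) for x in points)

simulated = [4, 6, 7, 9, 10, 12, 15, 18]   # LLM-agent traces (invented)
historical = [5, 6, 8, 9, 11, 13, 14, 20]  # past-event record (invented)

print(f"KS distance = {ks_statistic(simulated, historical):.3f}")  # 0.125
```

A small KS distance would only show the distributions match on this statistic; it would not by itself rule out the LLM-artifact concern, but it is a cheap first check.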
minor comments (2)
- [Abstract] The abstract and introduction would benefit from a brief statement of the exact number of participants and scenarios to set expectations for the evaluation's scope.
- [System Description] Figure captions and system screenshots should explicitly label which interface elements correspond to the four design requirements (e.g., steering controls, agent inspection panels) to aid reader mapping.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback highlighting gaps in the evaluation methodology and the scope of our claims. We agree that additional details on the study protocol are necessary and that the qualitative findings from a small sample should not be overgeneralized. We will revise the manuscript to incorporate more methodological transparency and to qualify the design implications accordingly, while maintaining the focus on the interactive system's role in supporting reflection.
Point-by-point responses
Referee: [Evaluation] Evaluation section: the reported findings rest on a qualitative study with five participants across three scenarios, yet no details are provided on study protocol, recruitment, session structure, data collection methods, or analysis approach. This absence makes it impossible to evaluate whether the observed reflections on tacit assumptions and vulnerabilities are attributable to the system's interactive features or to other factors.
Authors: We agree that these details were omitted and should have been included. In the revised manuscript, we will expand the Evaluation section with a dedicated protocol description covering: recruitment through targeted outreach to emergency preparedness professionals; session structure as 60-minute individual sessions using a think-aloud protocol while exploring the three scenarios; data collection via screen recordings, audio transcripts, and brief post-session debriefs; and analysis via inductive thematic analysis to identify patterns in usage and reflection. These additions will allow readers to better assess the attribution of findings to the system's features. revision: yes
Referee: [Evaluation] The central claim that participants reflected on planning assumptions specifically because LLM agent behaviors violated expectations requires evidence that the generated behaviors are sufficiently realistic proxies for real populations. No validation is reported (e.g., expert ratings of trace realism against historical data, comparison to non-LLM baselines, or ablation of prompt effects), leaving open the possibility that reflections stem from LLM artifacts rather than transferable policy insights.
Authors: We acknowledge the validity of this point. The study did not conduct formal validation of LLM behavior realism (such as expert ratings or baseline comparisons), which is a genuine limitation. We will revise the Discussion to explicitly state that observed reflections may partly stem from LLM-specific characteristics rather than purely transferable insights, and that the system supports interrogation of assumptions within the generated simulations without claiming predictive fidelity. The contribution centers on the interactive exploration process rather than model accuracy. revision: partial
Referee: [Discussion / Conclusion] The design implications (that interactive LLM simulations outperform offline tools for expert reasoning under uncertainty) are load-bearing for the paper's contribution, yet they are extrapolated from n=5 sessions without quantitative metrics, controls, or comparison conditions. This weakens the generalizability asserted in the abstract and conclusion.
Authors: The work is framed as a qualitative design study, not a controlled comparative experiment. We will revise the abstract and conclusion to remove or qualify any implication of broad outperformance or generalizability, instead describing the findings as preliminary patterns observed in a small expert sample and calling for future studies with quantitative measures and controls. The reusable design requirements derived from the formative study remain the central contribution. revision: yes
- Partially addressed: the request for validation evidence (e.g., expert ratings or baseline comparisons) demonstrating that LLM agent behaviors are realistic proxies for real populations; no such validation was included in the study.
Circularity Check
No circularity: qualitative system design and user study
Full rationale
The paper presents WhatIf, an interactive system for LLM-powered social simulations, with design requirements derived from a formative study and evaluated via qualitative sessions with five professionals across three scenarios. Claims about usage patterns, reflection on assumptions, and design implications rest directly on reported participant behaviors and observations. No equations, fitted parameters, predictions, or derivations appear that could reduce to self-citations or inputs by construction. Self-citations, if present, are not load-bearing for the central claims, which are grounded in the new user study rather than prior author work.