A systematic audit of LLM-based AI societies finds that 89.7% of 39 studies violate at least one of six PIMMUR validity principles, with reproductions showing that many claimed collective behaviors disappear when controls are tightened.
Social science meets llms: How reliable are large language models in social simulations?
4 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
A data-driven method adaptively selects the number of LLM-simulated responses to form confidence sets with nominal coverage for human survey parameters and equates that number to the LLM's effective human-equivalent sample size.
AgentSociety is a large-scale LLM agent-based social simulator validated on polarization, UBI, disasters, and sustainability issues with alignment to real experiments.
A data-augmentation framework for conjoint analysis integrates LLM-generated data with human responses to yield consistent, asymptotically normal estimators and reported cost savings of 24.9-79.8% in two empirical studies.
citing papers explorer
-
The PIMMUR Principles: Ensuring Validity in Collective Behavior of LLM Societies
A systematic audit of LLM-based AI societies finds that 89.7% of 39 studies violate at least one of six PIMMUR validity principles, with reproductions showing that many claimed collective behaviors disappear when controls are tightened.
-
How Many Human Survey Respondents is a Large Language Model Worth? An Uncertainty Quantification Perspective
A data-driven method adaptively selects the number of LLM-simulated responses to form confidence sets with nominal coverage for human survey parameters and equates that number to the LLM's effective human-equivalent sample size.
-
AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society
AgentSociety is a large-scale LLM agent-based social simulator validated on polarization, UBI, disasters, and sustainability issues with alignment to real experiments.
-
Large Language Models for Market Research: A Data-augmentation Approach
A data-augmentation framework for conjoint analysis integrates LLM-generated data with human responses to yield consistent, asymptotically normal estimators and reported cost savings of 24.9-79.8% in two empirical studies.