Social science meets LLMs: How reliable are large language models in social simulations?

Huang, Yue, Zhengqing Yuan, Yujun Zhou, Kehan Guo, Xiangqi Wang, Haomin Zhuang, Weixiang Sun, Lichao Sun, Jindong Wang, Yanfang Ye, et al. · 2024 · arXiv 2410.23426

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

The PIMMUR Principles: Ensuring Validity in Collective Behavior of LLM Societies

cs.CL · 2025-09-22 · conditional · novelty 7.0

A systematic audit of LLM-based AI societies finds that 89.7% of 39 studies violate at least one of six PIMMUR validity principles, with reproductions showing that many claimed collective behaviors disappear when controls are tightened.

How Many Human Survey Respondents is a Large Language Model Worth? An Uncertainty Quantification Perspective

stat.ME · 2025-02-25 · unverdicted · novelty 6.0

A data-driven method adaptively selects the number of LLM-simulated responses to form confidence sets with nominal coverage for human survey parameters and equates that number to the LLM's effective human-equivalent sample size.

AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society

cs.SI · 2025-02-12 · unverdicted · novelty 6.0

AgentSociety is a large-scale LLM agent-based social simulator validated on polarization, UBI, disasters, and sustainability issues with alignment to real experiments.

Large Language Models for Market Research: A Data-augmentation Approach

cs.AI · 2024-12-26 · unverdicted · novelty 6.0

A data-augmentation framework for conjoint analysis integrates LLM-generated data with human responses to yield consistent, asymptotically normal estimators, with reported cost savings of 24.9% to 79.8% in two empirical studies.

citing papers explorer

The PIMMUR Principles: Ensuring Validity in Collective Behavior of LLM Societies
cs.CL · 2025-09-22 · conditional · none · ref 16

How Many Human Survey Respondents is a Large Language Model Worth? An Uncertainty Quantification Perspective
stat.ME · 2025-02-25 · unverdicted · none · ref 30

AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society
cs.SI · 2025-02-12 · unverdicted · none · ref 51

Large Language Models for Market Research: A Data-augmentation Approach
cs.AI · 2024-12-26 · unverdicted · none · ref 25
