Using large language models to simulate human behavioural experiments: Port of Mars

Oliver Slumbers, Joel Z Leibo, Marco A Janssen · 2025 · arXiv 2506.05555

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Can LLMs Think Like Consumers? Benchmarking Crowd-Level Reaction Reconstruction with ConsumerSimBench

cs.CL · 2026-05-16 · unverdicted · novelty 7.0

ConsumerSimBench evaluates 13 LLMs on reconstructing crowd reactions from 1,553 Chinese social-media topics using 23,122 auditable yes-no criteria, finding maximum coverage of 47.8% by Gemini-3.1-Pro.

Evaluating Cooperation in LLM Social Groups through Elected Leadership

cs.CL · 2026-04-13 · unverdicted · novelty 6.0

Elected leadership in LLM multi-agent simulations of common-pool resource governance raises social welfare scores by 55.4% and survival time by 128.6%.

citing papers explorer

Showing 2 of 2 citing papers.

Can LLMs Think Like Consumers? Benchmarking Crowd-Level Reaction Reconstruction with ConsumerSimBench
cs.CL · 2026-05-16 · unverdicted · none · ref 36
Evaluating Cooperation in LLM Social Groups through Elected Leadership
cs.CL · 2026-04-13 · unverdicted · none · ref 6
