4 Pith papers cite this work.
citing papers explorer
- Large language models converge on competitive rationality but diverge on cooperation across providers and generations
  LLMs converge on competitive rationality and coordination but diverge 48-fold on cooperation, with provider identity and generational shifts as dominant factors across 38 games.
- How Many Human Survey Respondents is a Large Language Model Worth? An Uncertainty Quantification Perspective
  A data-driven method adaptively selects the number of LLM-simulated responses to form confidence sets with nominal coverage for human survey parameters and equates that number to the LLM's effective human-equivalent sample size.
- What Would GPT Click: Practical Effects of Human-AI Behavioral Misalignment and the Cost of Synthetic Participants in User Experience
  GPT produces click distributions significantly different from real humans in 53% of UX first-click tasks, with prompting techniques like personas and chain-of-thought failing to improve alignment.
- Model-Free Assessment of Simulator Fidelity via Quantile Curves
  A model-free method builds confidence sets for latent parameters to proxy sim-to-real discrepancies and estimates the quantile function of that proxy to produce a distribution-level fidelity profile for simulators.