This human study did not involve human subjects: Validating LLM simulations as behavioral evidence

Hullman, J · 2026 · arXiv 2602.15785

5 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

support 1

representative citing papers

Post-training makes large language models less human-like

cs.CL · 2026-05-08 · unverdicted · novelty 6.0

Post-training reduces LLMs' behavioral alignment with humans across families and sizes, with the misalignment increasing in newer generations while persona induction fails to improve individual-level predictions.

When Reasoning Models Hurt Behavioral Simulation: A Solver-Sampler Mismatch in Multi-Agent LLM Negotiation

cs.LG · 2026-04-12 · unverdicted · novelty 6.0

Stronger reasoning models in LLMs reduce behavioral negotiation by defaulting to authority outcomes in multi-agent settings, unlike structured scaffolds that enable concessions.

When simulations look right but causal effects go wrong: Large language models as behavioral simulators

cs.CY · 2026-04-02 · unverdicted · novelty 6.0

LLMs reproduce observed attitudinal patterns in climate interventions reasonably well but diverge on causal effect estimates, with descriptive fit failing to predict causal accuracy across interventions and outcomes.

An Algebraic Exposition of the Theory of Dyadic Morality

cs.AI · 2026-05-15 · unverdicted · novelty 4.0

Algebraic formalization of dyadic morality via SCM with operators for moral judgment and applications to AI policy design.

Adaptive Querying with AI Persona Priors

stat.ML · 2026-05-01

citing papers explorer

Showing 5 of 5 citing papers.

Post-training makes large language models less human-like
cs.CL · 2026-05-08 · unverdicted · none · ref 12
Post-training reduces LLMs' behavioral alignment with humans across families and sizes, with the misalignment increasing in newer generations while persona induction fails to improve individual-level predictions.

When Reasoning Models Hurt Behavioral Simulation: A Solver-Sampler Mismatch in Multi-Agent LLM Negotiation
cs.LG · 2026-04-12 · unverdicted · none · ref 13
Stronger reasoning models in LLMs reduce behavioral negotiation by defaulting to authority outcomes in multi-agent settings, unlike structured scaffolds that enable concessions.

When simulations look right but causal effects go wrong: Large language models as behavioral simulators
cs.CY · 2026-04-02 · unverdicted · none · ref 27
LLMs reproduce observed attitudinal patterns in climate interventions reasonably well but diverge on causal effect estimates, with descriptive fit failing to predict causal accuracy across interventions and outcomes.

An Algebraic Exposition of the Theory of Dyadic Morality
cs.AI · 2026-05-15 · unverdicted · none · ref 11
Algebraic formalization of dyadic morality via SCM with operators for moral judgment and applications to AI policy design.

Adaptive Querying with AI Persona Priors
stat.ML · 2026-05-01 · unreviewed · ref 41
