Post-training makes large language models less human-like

2026 · cs.CL · arXiv 2605.07632

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Large language models (LLMs) are increasingly used as surrogates for human participants, but it remains unclear which models best capture human behavior and why. To address this, we introduce Psych-201, a novel dataset that enables us to measure behavioral alignment at scale. We find that post-training -- the stage that turns base models into useful assistants -- consistently reduces alignment with human behavior across model families, sizes, and objectives. Moreover, this misalignment widens in newer model generations even as base models continue to improve. Finally, we find that persona-induction -- a popular technique for eliciting human-like behavior by conditioning models on participant-specific information -- does not improve predictions at the level of individuals. Taken together, our results suggest that the very processes that are currently employed to turn LLMs into useful assistants also make them less accurate models of human behavior.

representative citing papers

Apparent Psychological Profiles of Large Language Models are Largely a Measurement Artifact

cs.AI · 2026-06-18 · unverdicted · novelty 7.0

Apparent psychological profiles of LLMs are largely measurement artifacts driven by directional response bias rather than actual traits.

citing papers explorer

Apparent Psychological Profiles of Large Language Models are Largely a Measurement Artifact

cs.AI · 2026-06-18 · unverdicted · none · ref 59 · internal anchor

Apparent psychological profiles of LLMs are largely measurement artifacts driven by directional response bias rather than actual traits.
