BehaviorBench is a benchmark for foundation models on behavioral tasks that reveals fine-tuned behavioral models outperform general models on distributional alignment while general models lead on individual-level accuracy.
AI Behavioral Science
1 Pith paper cite this work. Polarity classification is still indexing.
abstract
We outline a foundation for a new field of ``AI Behavioral Science,'' covering three perspectives. First, as AI becomes ubiquitous and is increasingly proprietary and opaque, it becomes vital to develop techniques for assessing AI behavior. We outline how tools developed to assess people's behaviors by social scientists can be used to assess and infer AI's behaviors biases, tendencies, and heuristics. Second, we also discuss how AI can change the ways in which we learn about human behavior. Beyond its computational power, AI offers new techniques for simulating, inferring, and predicting human behaviors that we outline and discuss. Third, as humans and AI are interacting in increasingly complex and intertwined systems, we need to understand the implications for the resulting economic and political outcomes. We outline issues that are increasingly pressing concerning the future of human-AI interactions and potential changes and disruptions that can ensue.
fields
cs.CL 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
BehaviorBench: Benchmarking Foundation Models for Behavioral Science Tasks
BehaviorBench is a benchmark for foundation models on behavioral tasks that reveals fine-tuned behavioral models outperform general models on distributional alignment while general models lead on individual-level accuracy.