ReplicatorBench evaluates LLM agents on replicating social and behavioral science claims across retrieval, computation, and interpretation stages, finding strength in experiment execution but weakness in resource retrieval.
There may also be useful code to help you with the replication
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.AI 1years
2026 1verdicts
ACCEPT 1representative citing papers
citing papers explorer
-
ReplicatorBench: Benchmarking LLM Agents for Replicability in Social and Behavioral Sciences
ReplicatorBench evaluates LLM agents on replicating social and behavioral science claims across retrieval, computation, and interpretation stages, finding strength in experiment execution but weakness in resource retrieval.