OmniToM is a new benchmark for Theory of Mind in LLMs that evaluates explicit belief extraction and seven-dimensional labeling from 895 stories, revealing an actor-specific belief-tracking bottleneck.
URL https://aclanthology.org/2024.ac l-long.847/
4 Pith papers cite this work. Polarity classification is still indexing.
4
Pith papers citing it
citation-role summary
background 2
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
Introduces BeliefTrack benchmark diagnosing three CBM failures in LLMs and shows RL with belief-state rewards cuts failure rates by 70.9% while representation steering cuts them by 46.1%.
LLM moral robustness under persona role-play is largely determined by model family with Claude models most consistent, while susceptibility shows little family dependence.