OmniToM is a new benchmark for Theory of Mind in LLMs that evaluates explicit belief extraction and seven-dimensional labeling from 895 stories, revealing an actor-specific belief-tracking bottleneck.
URL https://aclanthology.org/2024.acl-long.847/
4 Pith papers cite this work. Polarity classification is still indexing.
Citation roles: background (2)
Citation polarities: background (2)
Citing papers
- OmniToM: Benchmarking Theory of Mind in LLMs via Explicit Belief Modeling
  OmniToM is a new benchmark for Theory of Mind in LLMs that evaluates explicit belief extraction and seven-dimensional labeling from 895 stories, revealing an actor-specific belief-tracking bottleneck.
- When Should Models Change Their Minds? Contextual Belief Management in Large Language Models
  Introduces BeliefTrack benchmark diagnosing three CBM failures in LLMs and shows RL with belief-state rewards cuts failure rates by 70.9% while representation steering cuts them by 46.1%.
- Moral Susceptibility and Robustness under Persona Role-Play in Large Language Models
  LLM moral robustness under persona role-play is largely determined by model family with Claude models most consistent, while susceptibility shows little family dependence.