OSCToM uses RL-guided generation with an extended DSL and surrogate models to create nested belief conflict tasks, raising FANToM accuracy from 0.2% to 76% while being 6x more efficient.
ToMChallenges: A principle-guided dataset and diverse evaluation tasks for exploring theory of mind. arXiv preprint arXiv:2305.15068, 2023.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
OSCToM: RL-Guided Adversarial Generation for High-Order Theory of Mind