Proceedings of the National Academy of Sciences
2 Pith papers cite this work.
- Does Theory of Mind Improvement Really Benefit Human-AI Interactions? Empirical Findings from Interactive Evaluations
  Improvements in LLM Theory of Mind on static benchmarks do not reliably improve performance in dynamic, first-person human-AI interactions across goal-oriented and experience-oriented tasks.
- Modeling Pathology-Like Behavioral Patterns in Language Models Through Behavioral Fine-Tuning
  Fine-tuning LLMs on structured tasks inspired by maladaptive behaviors produces stable, context-general shifts in next-token distributions and response tendencies consistent with altered behavioral priors.