AttuneBench introduces a multi-turn conversation benchmark that uses participant annotations to evaluate LLM emotional intelligence, finding that model performance on each of emotion recognition, behavior classification, preference prediction, and response quality is largely independent of performance on the others.
Table 15 gives the full 11-model × 10-topic Composite matrix, allowing inspection of whether the topic-difficulty ordering is preserved at the per-model level.
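
One way such an inspection could be carried out is sketched below: rank-correlate each model's per-topic scores against the topic means across all models. This is an illustrative sketch, not the paper's procedure. The function names (`ranks`, `spearman`, `ordering_agreement`) and the `toy` values are hypothetical placeholders and are not taken from Table 15. Note also that the mean includes the model being compared, which slightly inflates agreement.

```python
from statistics import mean


def ranks(values):
    """Return average ranks (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation, computed as Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def ordering_agreement(matrix):
    """matrix maps model name -> list of per-topic scores (same topic order).

    Returns, per model, the Spearman correlation between that model's topic
    scores and the across-model mean score for each topic.
    """
    n_topics = len(next(iter(matrix.values())))
    topic_means = [mean(row[t] for row in matrix.values()) for t in range(n_topics)]
    return {model: spearman(row, topic_means) for model, row in matrix.items()}


# Placeholder values for illustration only; NOT the paper's results.
toy = {
    "model_a": [0.9, 0.7, 0.5],
    "model_b": [0.8, 0.6, 0.4],
    "model_c": [0.3, 0.6, 0.7],
}
agreement = ordering_agreement(toy)
```

Under this sketch, a value near 1 would suggest a model follows the overall topic-difficulty ordering, while a value near -1 would suggest it reverses it.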
AttuneBench: A Conversation-Based Benchmark for LLM Emotional Intelligence