SHREC is a new benchmark dataset of embodied human-robot conversations that shows substantial performance gaps in state-of-the-art foundation models on tasks involving social error detection and rationale generation.
Advancing social intelligence in ai agents: Technical challenges and open questions
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
verdicts
UNVERDICTED 2representative citing papers
SAVOIR combines prospective expected utility valuation with Shapley values for fair credit assignment in social dialogue RL, achieving SOTA on SOTOPIA where a 7B model matches or exceeds GPT-4o and Claude-3.5-Sonnet.
citing papers explorer
-
Social Human Robot Embodied Conversation (SHREC) Dataset: Benchmarking Foundational Models' Social Reasoning
SHREC is a new benchmark dataset of embodied human-robot conversations that shows substantial performance gaps in state-of-the-art foundation models on tasks involving social error detection and rationale generation.