MELD: A multimodal multi-party dataset for emotion recognition in conversations
7 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 7
roles: baseline 1
polarities: baseline 1
citing papers explorer
- Social Human Robot Embodied Conversation (SHREC) Dataset: Benchmarking Foundational Models' Social Reasoning
  SHREC is a new benchmark dataset of embodied human-robot conversations that shows substantial performance gaps in state-of-the-art foundation models on tasks involving social error detection and rationale generation.
- Navigating the Emotion Tree: Hierarchical Hyperbolic RAG for Multimodal Emotion Recognition
  HyperEmo-RAG uses hierarchical hyperbolic embeddings and graph-based evidence injection to outperform prior methods in multimodal emotion recognition.
- EmoMM: Benchmarking and Steering MLLM for Multimodal Emotion Recognition under Conflict and Missingness
  EmoMM benchmark reveals Video Contribution Collapse in MLLMs for emotion recognition under modality conflict and missingness, mitigated by CHASE head-level attention steering.
- Beyond Sentiment Classification: A Generative Framework for Emotion Intensity Evaluation in Text
  Authors build an emotional intensity dataset and fine-tune generative LLMs to predict continuous 0-100 scores, claiming outperformance over classification baselines plus generalization to sentiment and arousal.
- EmoS: A High-Fidelity Multimodal Benchmark for Fine-grained Streaming Emotional Understanding
  EmoS is a new high-fidelity benchmark for fine-grained streaming emotional understanding that produces measurable gains when used to fine-tune multimodal large language models.
- Beyond Isolated Utterances: Cue-Guided Interaction for Context-Dependent Conversational Multimodal Understanding
  CUCI-Net abstracts context-utterance dependency into an interpretation cue that combines local modality signals with global context and feeds it into the final multimodal interaction for context-conditioned predictions.
- A Multi-Agent Framework with Structured Reasoning and Reflective Refinement for Multimodal Empathetic Response Generation
  A multi-agent framework decomposes multimodal empathetic response generation into structured reasoning steps and uses global reflection to reduce emotional biases, outperforming prior methods on IEMOCAP and MELD benchmarks.