Tensor Fusion Network for Multimodal Sentiment Analysis
Multimodal sentiment analysis is an increasingly popular research area, which extends the conventional language-based definition of sentiment analysis to a multimodal setup where other relevant modalities accompany language. In this paper, we pose the problem of multimodal sentiment analysis as modeling intra-modality and inter-modality dynamics. We introduce a novel model, termed Tensor Fusion Network, which learns both such dynamics end-to-end. The proposed approach is tailored for the volatile nature of spoken language in online videos as well as accompanying gestures and voice. In the experiments, our model outperforms state-of-the-art approaches for both multimodal and unimodal sentiment analysis.
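The abstract does not say how the two kinds of dynamics are combined. A common way to capture both in a single representation is to take the outer product of the per-modality embeddings after appending a constant 1 to each. The sketch below illustrates that idea only. The function name `tensor_fusion`, the embedding sizes, and the example values are hypothetical and are not taken from the abstract.

```python
def tensor_fusion(language, visual, acoustic):
    """Three-way outer product of embeddings, each extended with a constant 1.

    Illustrative sketch only; the names and sizes here are assumptions.
    """
    l = list(language) + [1.0]
    v = list(visual) + [1.0]
    a = list(acoustic) + [1.0]
    # fused[i][j][k] = l[i] * v[j] * a[k]
    return [[[li * vj * ak for ak in a] for vj in v] for li in l]


# Hypothetical per-modality embeddings (the intra-modality side would
# normally come from separate sub-networks, omitted here).
z_l = [0.5, -1.0]   # language
z_v = [2.0]         # visual
z_a = [0.25, 4.0]   # acoustic

fused = tensor_fusion(z_l, z_v, z_a)

# Shape is (len(z_l) + 1) x (len(z_v) + 1) x (len(z_a) + 1) = 3 x 2 x 3.
unimodal_acoustic = fused[2][1]     # both other slots hit the appended 1
bimodal_lang_vis = fused[0][0][2]   # acoustic slot hits the appended 1
trimodal = fused[0][0][0]           # product of one feature per modality
```

Because of the appended 1, the fused tensor keeps each modality's own features (slices where the other two indices select the constant) next to the pairwise and three-way products. In this construction, that is how intra-modality and inter-modality information end up in one object. The sketch uses plain lists to stay dependency-free; with an array library the same operation is a single outer-product call.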
Forward citations
Cited by 3 Pith papers
- Disentangled Dual-Branch Graph Learning for Conversational Emotion Recognition
  A dual-branch graph model with modality disentanglement and speaker-aware hypergraphs outperforms baselines on IEMOCAP and MELD for multimodal conversational emotion recognition.
- Simultaneous Long-tailed Recognition and Multi-modal Fusion for Highly Imbalanced Multi-modal Data
  A multi-modal extension of multi-expert architectures uses confidence-guided fusion from modality-specific networks to handle long-tailed class imbalance across heterogeneous inputs.
- Mitigating Multimodal Inconsistency via Cognitive Dual-Pathway Reasoning for Intent Recognition
  CDPR uses an intuition pathway for cross-modal consensus and a reasoning pathway for quantifying and mitigating inconsistencies to improve multimodal intent recognition.