Tensor Fusion Network for Multimodal Sentiment Analysis

Amir Zadeh; Erik Cambria; Louis-Philippe Morency; Minghai Chen; Soujanya Poria

arxiv: 1707.07250 · v1 · pith:ICGKKAFNnew · submitted 2017-07-23 · 💻 cs.CL

classification 💻 cs.CL

keywords analysis, multimodal, sentiment, dynamics, fusion, language, model, network

Multimodal sentiment analysis is an increasingly popular research area, which extends the conventional language-based definition of sentiment analysis to a multimodal setup where other relevant modalities accompany language. In this paper, we pose the problem of multimodal sentiment analysis as modeling intra-modality and inter-modality dynamics. We introduce a novel model, termed Tensor Fusion Network, which learns both such dynamics end-to-end. The proposed approach is tailored for the volatile nature of spoken language in online videos as well as accompanying gestures and voice. In the experiments, our model outperforms state-of-the-art approaches for both multimodal and unimodal sentiment analysis.
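The abstract does not spell out how the fusion step works. As a minimal sketch, assuming the fusion is a three-way outer product of the language, visual, and acoustic embeddings, each extended with a constant 1, the idea can be illustrated in plain Python. The function name `tensor_fusion`, the embedding sizes, and the example values are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def tensor_fusion(z_language, z_visual, z_acoustic):
    """Return the 3-way outer product of three embeddings, each extended with 1.0.

    Assumption for illustration: the appended constant lets the single fused
    tensor keep unimodal and bimodal terms alongside the trimodal products.
    """
    zl = list(z_language) + [1.0]
    zv = list(z_visual) + [1.0]
    za = list(z_acoustic) + [1.0]
    # fused[i][j][k] = zl[i] * zv[j] * za[k]
    return [[[l * v * a for a in za] for v in zv] for l in zl]


# Hypothetical toy embeddings (sizes 2, 1, and 3).
z_l = [0.5, -1.0]
z_v = [2.0]
z_a = [3.0, 4.0, 0.0]

fused = tensor_fusion(z_l, z_v, z_a)

# Slices where the other two indices point at the appended 1.0
# reproduce the original unimodal embeddings.
acoustic_only = fused[-1][-1][:-1]
language_only = [plane[-1][-1] for plane in fused[:-1]]
```

In this sketch the fused tensor has shape (2+1) x (1+1) x (3+1). Entries whose indices all fall inside the original embeddings are trimodal products, while entries that hit one or two of the appended constants reduce to bimodal or unimodal terms, which is one way a single tensor can carry both intra-modality and inter-modality information.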

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Rethinking Gating Mechanism in Sparse MoE: Handling Arbitrary Modality Inputs with Confidence-Guided Gate
cs.LG 2025-05 unverdicted novelty 7.0

ConfSMoE adds expert-opinion imputation and detaches softmax routing scores to ground-truth task confidence to relieve expert collapse in SMoE without extra load-balance losses, evaluated on four real-world datasets.

Disentangled Dual-Branch Graph Learning for Conversational Emotion Recognition
cs.SD 2026-04 unverdicted novelty 6.0

A dual-branch graph model with modality disentanglement and speaker-aware hypergraphs outperforms baselines on IEMOCAP and MELD for multimodal conversational emotion recognition.

Disentangling Bias by Modeling Intra- and Inter-modal Causal Attention for Multimodal Sentiment Analysis
cs.LG 2025-08 unverdicted novelty 6.0

MMCI uses multi-relational graph modeling and attention-based disentanglement of causal versus shortcut features, combined with backdoor adjustment, to reduce bias and improve generalization in multimodal sentiment analysis.

Simultaneous Long-tailed Recognition and Multi-modal Fusion for Highly Imbalanced Multi-modal Data
cs.CV 2026-05 unverdicted novelty 5.0

A multi-modal extension of multi-expert architectures uses confidence-guided fusion from modality-specific networks to handle long-tailed class imbalance across heterogeneous inputs.

Mitigating Multimodal Inconsistency via Cognitive Dual-Pathway Reasoning for Intent Recognition
cs.MM 2026-05 unverdicted novelty 5.0

CDPR uses an intuition pathway for cross-modal consensus and a reasoning pathway for quantifying and mitigating inconsistencies to improve multimodal intent recognition.

Multimodal and Multi-view Models for Emotion Recognition
cs.CL 2019-06 unverdicted novelty 5.0

Multimodal training with attention and contrastive multi-view learning improves both combined and acoustic-only emotion recognition on IEMOCAP over prior acoustic baselines.