
arxiv: 1805.08660 · v1 · pith:CPM6CPWPnew · submitted 2018-05-22 · 💻 cs.CL

Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment

classification 💻 cs.CL
keywords: attention, modalities, multimodal, affective, affects, data, fusion, hierarchical

Multimodal affective computing, which aims to recognize and interpret human affects and subjective information from multiple data sources, remains challenging for two reasons: (i) it is hard to extract informative features that represent human affects from heterogeneous inputs; (ii) current fusion strategies fuse different modalities only at an abstract level, ignoring time-dependent interactions between modalities. To address these issues, we introduce a hierarchical multimodal architecture with attention and word-level fusion to classify utterance-level sentiment and emotion from text and audio data. Our model outperforms state-of-the-art approaches on published datasets, and we demonstrate that it can visualize and interpret the synchronized attention over modalities.
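As a rough illustration of what word-level fusion followed by attention can look like, the sketch below concatenates per-word text and audio features and then pools the words with softmax attention weights. It is not the authors' architecture: the function names (`fuse_word_level`, `attention_pool`), the fixed `context` vector, the toy feature values, and the use of simple concatenation are all assumptions made for illustration, and a real model would learn these parameters.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_word_level(text_feats, audio_feats):
    """Concatenate text and audio features word by word.

    Assumes both modalities are already aligned so that index i
    refers to the same word in each list (hypothetical setup).
    """
    if len(text_feats) != len(audio_feats):
        raise ValueError("modalities must be aligned to the same words")
    return [t + a for t, a in zip(text_feats, audio_feats)]

def attention_pool(word_vecs, context):
    """Score each word against a context vector, then take the
    attention-weighted sum to get one utterance-level vector."""
    scores = [
        sum(c * math.tanh(x) for c, x in zip(context, vec))
        for vec in word_vecs
    ]
    weights = softmax(scores)
    dim = len(word_vecs[0])
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, word_vecs))
        for i in range(dim)
    ]
    return pooled, weights

# Toy utterance of three words: 2-d text features, 1-d audio features.
text_feats = [[0.2, 0.1], [0.9, 0.4], [0.1, 0.0]]
audio_feats = [[0.0], [0.8], [0.1]]

fused = fuse_word_level(text_feats, audio_feats)
context = [1.0, 1.0, 1.0]  # hypothetical; learned in a trained model
utterance_vec, weights = attention_pool(fused, context)
```

The `weights` list is the kind of quantity that can be plotted per word to inspect which words, and which aligned audio segments, the classifier attends to.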
