
arxiv: 1808.07275 · v1 · pith:J327656Qnew · submitted 2018-08-22 · 💻 cs.AI · cs.CV · cs.MM

CentralNet: a Multilayer Approach for Multimodal Fusion

classification 💻 cs.AI cs.CV cs.MM
keywords modality, approach, multimodal, fusion, network, approaches, central, decisions

This paper proposes a novel multimodal fusion approach that aims to produce the best possible decisions by integrating information from multiple media. Most past multimodal approaches work either by projecting the features of different modalities into the same space or by coordinating the representations of each modality through constraints; our approach borrows from both views. More specifically, assuming that each modality can be processed by a separate deep convolutional network, so that decisions can be taken independently from each modality, we introduce a central network linking the modality-specific networks. This central network not only provides a common feature embedding but also regularizes the modality-specific networks through multi-task learning. The proposed approach is validated on four different computer vision tasks, on which it consistently improves on the accuracy of existing multimodal fusion approaches.
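The central-network idea in the abstract can be illustrated with a small sketch. The abstract does not give the fusion rule or the loss, so everything below is an assumption made for illustration: the central state is taken to be a weighted sum of its own previous state and the modality-specific hidden states, and the multi-task objective is taken to be the central cross-entropy plus weighted per-modality cross-entropies. The names `fuse`, `multitask_loss`, `alpha_central`, `alphas`, and `betas` are hypothetical, and plain Python lists stand in for convolutional feature maps.

```python
import math


def fuse(central, modality_states, alpha_central, alphas):
    """Assumed fusion rule: weighted sum of the central hidden state
    and the hidden state of each modality-specific network."""
    fused = [alpha_central * c for c in central]
    for alpha, state in zip(alphas, modality_states):
        fused = [f + alpha * s for f, s in zip(fused, state)]
    return fused


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of a single example against an integer class label."""
    return -math.log(softmax(logits)[label])


def multitask_loss(central_logits, modality_logits, label, betas):
    """Assumed multi-task objective: the central network's loss plus a
    weighted loss for each modality-specific network, so every branch
    is trained to make its own decision."""
    loss = cross_entropy(central_logits, label)
    for beta, logits in zip(betas, modality_logits):
        loss += beta * cross_entropy(logits, label)
    return loss


# Two hypothetical modalities (e.g. image and audio), hidden size 2.
central_state = [1.0, 2.0]
image_state = [1.0, 0.0]
audio_state = [0.0, 1.0]
fused_state = fuse(central_state, [image_state, audio_state], 0.5, [1.0, 2.0])

# Untrained branches with uniform logits over two classes.
total_loss = multitask_loss([0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]], 0, [1.0, 1.0])
```

In a trainable version the fusion weights would be learned together with the network parameters, and the per-modality loss terms are what would let the central network act as a regularizer on the modality-specific networks, as the abstract describes.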

