MFAS: Multimodal Fusion Architecture Search

Frédéric Jurie; Juan-Manuel Pérez-Rúa; Moez Baccouche; Stéphane Pateux; Valentin Vielzeuf

arXiv: 1903.06496 · v1 · pith:JUIDSP2Qnew · submitted 2019-03-15 · 💻 cs.LG · cs.CV · cs.NE

classification 💻 cs.LG cs.CV cs.NE

keywords: dataset, fusion, multimodal, search, architecture, architectures, problem, problems

We tackle the problem of finding good architectures for multimodal classification problems. We propose a novel and generic search space that spans a large number of possible fusion architectures. To find an optimal architecture for a given dataset in the proposed search space, we leverage an efficient sequential model-based exploration approach tailored to the problem. We demonstrate the value of posing multimodal fusion as a neural architecture search problem through extensive experiments on a toy dataset and two real multimodal datasets. We discover fusion architectures that exhibit state-of-the-art performance on problems of different domains and dataset sizes, including the NTU RGB+D dataset, the largest multimodal action recognition dataset available.
