CUNI System for the WMT18 Multimodal Translation Task

Dušan Variš; Jindřich Helcl; Jindřich Libovický

arxiv: 1811.04697 · v1 · pith:INBX6OP3new · submitted 2018-11-12 · 💻 cs.CL




keywords: multimodal, network, submission, features, methods, model, recurrent, self-attentive



We present our submission to the WMT18 Multimodal Translation Task. Its main feature is the use of a self-attentive network in place of a recurrent neural network. We evaluate two methods of incorporating visual features into the model: first, we include the image representation as an additional input to the network; second, we train the model to predict the visual features and use this prediction as an auxiliary objective. For our submission, we acquired additional data, both textual and multimodal. Both proposed methods yield significant improvements over recurrent networks and over self-attentive textual baselines.
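The second method in the abstract, predicting visual features as an auxiliary objective, can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything specific here is an assumption made for illustration: mean-pooling the encoder states, a single linear projection, a mean-squared-error visual loss, and the `aux_weight` mixing factor. All function and parameter names are hypothetical and are not taken from the authors' system.

```python
import math


def mean_pool(states):
    """Average a list of equal-length encoder state vectors into one vector."""
    n = len(states)
    return [sum(column) / n for column in zip(*states)]


def linear(vec, weights):
    """Project vec with a weight matrix given as a list of output rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def cross_entropy(probs, gold_index):
    """Negative log-probability assigned to the gold token."""
    return -math.log(probs[gold_index])


def multitask_loss(token_probs, gold_ids, encoder_states, proj,
                   image_features, aux_weight=0.5):
    """Translation loss plus a weighted visual-feature prediction loss.

    aux_weight is an assumed hyperparameter, not a value from the paper.
    """
    # Main objective: average cross-entropy over the target tokens.
    translation = sum(
        cross_entropy(p, g) for p, g in zip(token_probs, gold_ids)
    ) / len(gold_ids)
    # Auxiliary objective (assumed form): predict the image feature vector
    # from the pooled encoder states and penalize the squared error.
    predicted = linear(mean_pool(encoder_states), proj)
    visual = mse(predicted, image_features)
    return translation + aux_weight * visual, translation, visual


# Toy inputs: two source positions, two target tokens, 2-dimensional features.
total, translation, visual = multitask_loss(
    token_probs=[[0.5, 0.5], [0.25, 0.75]],
    gold_ids=[0, 0],
    encoder_states=[[1.0, 3.0], [3.0, 1.0]],
    proj=[[0.5, 0.0], [0.0, 0.5]],
    image_features=[1.0, 3.0],
)
```

In a setup of this kind the image is needed only while training, since the visual term affects the loss and not the decoding step. Whether the authors' system has this property is not stated in the abstract.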


