
arXiv: 1510.04709 · v2 · submitted 2015-10-15 · cs.CL · cs.CV · cs.LG · cs.NE


Multilingual Image Description with Neural Sequence Models

keywords: description, image, language, models, neural, sequence, source, aligned

In this paper we present an approach to multi-language image description bringing together insights from neural machine translation and neural image description. To create a description of an image for a given target language, our sequence generation models condition on feature vectors from the image, the description from the source language, and/or a multimodal vector computed over the image and a description in the source language. In image description experiments on the IAPR-TC12 dataset of images aligned with English and German sentences, we find significant and substantial improvements in BLEU4 and Meteor scores for models trained over multiple languages, compared to a monolingual baseline.
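
The conditioning idea in the abstract can be illustrated with a minimal sketch. It assumes one common design, in which the decoder's initial hidden state is a squashed sum of linear projections of whichever conditioning vectors are available (image features, a source-language description vector, or both). The abstract does not confirm this design, so it may differ from the paper's actual architecture. The function names, vector sizes, and random weights below are hypothetical stand-ins for learned parameters.

```python
import math
import random


def random_matrix(rows, cols, rng):
    """Random weights standing in for learned projection parameters (assumption)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, vec):
    """Multiply a (rows x cols) weight matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def initial_decoder_state(hidden_size, image_vec=None, source_vec=None, seed=0):
    """Build a decoder start state from the conditioning vectors that are given.

    Each available vector is projected to hidden_size dimensions, the
    projections are summed, and tanh squashes the result. With no
    conditioning vectors the state is all zeros, which corresponds to an
    unconditioned decoder in this sketch.
    """
    rng = random.Random(seed)
    pre_activation = [0.0] * hidden_size
    for vec in (image_vec, source_vec):
        if vec is None:
            continue
        projection = linear(random_matrix(hidden_size, len(vec), rng), vec)
        pre_activation = [p + q for p, q in zip(pre_activation, projection)]
    return [math.tanh(p) for p in pre_activation]


# Hypothetical inputs: a tiny image feature vector and a source-language vector.
image_vec = [0.5, -1.0, 0.25, 2.0]
source_vec = [1.0, 0.0, -0.5]

h0_both = initial_decoder_state(8, image_vec, source_vec)   # image + source language
h0_image = initial_decoder_state(8, image_vec=image_vec)    # image only
h0_none = initial_decoder_state(8)                          # no conditioning
```

In a full model the resulting state would seed a recurrent decoder that emits the target-language description word by word; that decoder and its training are omitted here.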



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Video-guided Machine Translation with Global Video Context

    cs.CV 2026-04 unverdicted novelty 4.0

    A globally video-guided multimodal translation framework retrieves semantically related video segments with a vector database and applies attention mechanisms to improve subtitle translation accuracy in long videos.