
arxiv: 1904.01356 · v1 · pith:4JVPPKB6new · submitted 2019-04-02 · 💻 cs.CV · cs.CL

Aiding Intra-Text Representations with Visual Context for Multimodal Named Entity Recognition

keywords: entity, model, named, posts, recognition, text, image, images

Abstract

With the massive explosion of social media such as Twitter and Instagram, people share billions of multimedia posts containing images and text every day. The text in these posts is typically short, informal, and noisy, which leads to ambiguities that can be resolved using the accompanying images. In this paper we explore the text-centric Named Entity Recognition task on these multimedia posts. We propose an end-to-end model that learns a joint representation of a text and an image. Our model extends the multi-dimensional self-attention technique so that the image helps to enhance the relationships between words. Experiments show that our model captures both textual and visual context with greater accuracy, achieving state-of-the-art results on a Twitter multimodal Named Entity Recognition dataset.
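The abstract does not spell out how the image enters the attention computation, so the following is only a minimal sketch of one plausible reading, not the authors' architecture. It implements feature-wise (multi-dimensional) additive self-attention, where each token pair gets a score vector rather than a scalar, and it adds a projected image vector inside the nonlinearity so that the visual context can change how words attend to each other. The function names, the parameter layout (`W`, `W1`, `W2`, `W3`, `b`), the `tanh` activation, and the random initialisation are all assumptions made for illustration.

```python
import math
import random

def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * xk for m, xk in zip(row, x)) for row in M]

def init_params(d, d_img, seed=0):
    """Hypothetical random parameters: W, W1, W2 are d x d, W3 is d x d_img."""
    rng = random.Random(seed)
    def rand(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return rand(d, d), rand(d, d), rand(d, d), rand(d, d_img), [0.0] * d

def visual_multidim_self_attention(tokens, image, params):
    """tokens: n word vectors of size d; image: one vector of size d_img.

    Returns n context vectors of size d. Each output feature is a softmax-
    weighted average of that feature over all tokens, with separate weights
    per feature dimension.
    """
    W, W1, W2, W3, b = params
    n, d = len(tokens), len(tokens[0])
    img_term = matvec(W3, image)            # assumed: image shared by all pairs
    left = [matvec(W1, x) for x in tokens]  # query-side projection
    right = [matvec(W2, x) for x in tokens] # key-side projection
    outputs = []
    for i in range(n):
        # scores[j][k]: relevance of feature k of token j to token i
        scores = []
        for j in range(n):
            hidden = [math.tanh(left[i][k] + right[j][k] + img_term[k] + b[k])
                      for k in range(d)]
            scores.append(matvec(W, hidden))
        out = []
        for k in range(d):
            col = [scores[j][k] for j in range(n)]
            m = max(col)                    # subtract max for numerical stability
            exps = [math.exp(s - m) for s in col]
            z = sum(exps)
            out.append(sum(e / z * tokens[j][k] for j, e in enumerate(exps)))
        outputs.append(out)
    return outputs

if __name__ == "__main__":
    rng = random.Random(0)
    words = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
    img = [rng.uniform(-1, 1) for _ in range(5)]
    for row in visual_multidim_self_attention(words, img, init_params(4, 5)):
        print([round(v, 3) for v in row])
```

In this sketch the image term sits inside the `tanh`; if it were added to the scores after the nonlinearity, it would be constant across `j` and the softmax would cancel it. A real model would learn the parameters and would typically take image features from a pretrained CNN.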
