Rich Image Captioning in the Wild

Chris Buehler; Chris Sienkiewicz; Chris Thrasher; Cornelia Carapcea; Jian Sun; Kenneth Tran; Lei Zhang; Xiaodong He

arxiv: 1603.09016 · v2 · pith:MD3K2IBJnew · submitted 2016-03-30 · 💻 cs.CV

Kenneth Tran, Xiaodong He, Lei Zhang, Jian Sun, Cornelia Carapcea, Chris Thrasher, Chris Buehler, Chris Sienkiewicz

classification 💻 cs.CV

keywords caption, model, challenges, image, quality, state-of-the-art, wild, addresses

We present an image caption system that addresses new challenges of automatically describing images in the wild. The challenges include high caption quality with respect to human judgments, out-of-domain data handling, and the low latency required in many applications. Built on top of a state-of-the-art framework, we developed a deep vision model that detects a broad range of visual concepts, an entity recognition model that identifies celebrities and landmarks, and a confidence model for the caption output. Experimental results show that our caption engine significantly outperforms previous state-of-the-art systems on both the in-domain dataset (i.e., MS COCO) and out-of-domain datasets.
