Discriminability objective for training descriptive captions

Brian Price; Gregory Shakhnarovich; Ruotian Luo; Scott Cohen

arxiv: 1803.04376 · v2 · pith:DZ3D7DL2new · submitted 2018-03-12 · 💻 cs.CV

Ruotian Luo, Brian Price, Scott Cohen, Gregory Shakhnarovich

classification 💻 cs.CV

keywords caption, captions, image, approach, captioning, discriminability, generated, loss

One property still lacking in image captions generated by contemporary methods is discriminability: the ability to tell two images apart given the caption for one of them. We propose a way to improve this aspect of caption generation. By incorporating into the captioning training objective a loss component directly related to the ability of a machine to disambiguate image/caption matches, we obtain systems that produce much more discriminative captions, according to human evaluation. Remarkably, our approach also improves other aspects of the generated captions, as reflected by a battery of standard scores such as BLEU and SPICE. Our approach is modular and can be applied to a variety of model/loss combinations commonly proposed for image captioning.
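
The idea of adding a discriminability term to a captioning objective can be illustrated with a small sketch. The abstract does not specify the form of the loss, so everything here is an assumption made for illustration: the similarity matrix `sim` (images in rows, captions in columns, matched pairs on the diagonal), the hinge-style contrastive loss, the `margin` value, the weight `lam`, and the function names `contrastive_hinge_loss` and `combined_objective` are all hypothetical and not taken from the paper.

```python
# Illustrative sketch only: the loss form, names, and weights are assumptions,
# not the method described in the paper.

def contrastive_hinge_loss(sim, margin=0.2):
    """Mean hinge loss over a batch of image/caption pairs.

    sim[i][j] is a similarity score between image i and caption j;
    sim[i][i] is the score of the matched pair. For each pair we penalize
    the hardest mismatched caption (same row) and the hardest mismatched
    image (same column) whenever it comes within `margin` of the match.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        matched = sim[i][i]
        # Hardest wrong caption for image i.
        wrong_caption = max(
            (max(0.0, margin + sim[i][j] - matched) for j in range(n) if j != i),
            default=0.0,
        )
        # Hardest wrong image for caption i.
        wrong_image = max(
            (max(0.0, margin + sim[j][i] - matched) for j in range(n) if j != i),
            default=0.0,
        )
        total += wrong_caption + wrong_image
    return total / n


def combined_objective(caption_loss, discriminability_loss, lam=1.0):
    """Weighted sum of a standard captioning loss and the discriminability term."""
    return caption_loss + lam * discriminability_loss


if __name__ == "__main__":
    # Toy 2x2 batch: caption 1 scores higher for image 0 than its own caption does.
    sim = [[0.5, 0.8],
           [0.1, 0.9]]
    d = contrastive_hinge_loss(sim, margin=0.2)
    print(combined_objective(caption_loss=2.0, discriminability_loss=d, lam=10.0))
```

Because the second term is computed separately from the captioning loss and only added in at the end, a sketch like this fits the abstract's claim of modularity: the same term could in principle be attached to different captioning models or base losses by changing only `caption_loss`.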