arXiv: 1708.02300 · v1 · submitted 2017-08-07 · cs.CL · cs.AI · cs.CV · cs.LG

Reinforced Video Captioning with Entailment Rewards

keywords: improvements, metrics, achieving, captioning, model, optimize, rewards, significant

Sequence-to-sequence models have shown promising improvements on the temporal task of video captioning, but they optimize word-level cross-entropy loss during training. First, using policy gradient and mixed-loss methods for reinforcement learning, we directly optimize sentence-level task-based metrics (as rewards), achieving significant improvements over the baseline, based on both automatic metrics and human evaluation on multiple datasets. Next, we propose a novel entailment-enhanced reward (CIDEnt) that corrects phrase-matching based metrics (such as CIDEr) to only allow for logically-implied partial matches and avoid contradictions, achieving further significant improvements over the CIDEr-reward model. Overall, our CIDEnt-reward model achieves the new state-of-the-art on the MSR-VTT dataset.
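
The two ideas in the abstract, a mixed cross-entropy and policy-gradient loss and an entailment-corrected reward, can be illustrated with a minimal sketch in plain Python. This is one plausible reading of the abstract and not the authors' implementation. The function names, the threshold-and-penalty form of the reward, the default values (`threshold`, `penalty`, `gamma`), and the scalar baseline are all assumptions made for illustration.

```python
import math

def cident_reward(cider, entail_prob, threshold=0.33, penalty=0.45):
    """Hypothetical entailment-corrected reward.

    Keeps the CIDEr score when the entailment probability is high enough,
    and subtracts a fixed penalty otherwise. The threshold and penalty
    defaults are illustrative assumptions.
    """
    if entail_prob < threshold:
        return cider - penalty
    return cider

def xe_loss(gold_token_probs):
    """Word-level cross-entropy: negative log-likelihood of the gold tokens."""
    return -sum(math.log(p) for p in gold_token_probs)

def reinforce_loss(sampled_token_probs, reward, baseline=0.0):
    """REINFORCE-style surrogate loss for one sampled caption.

    Computes -(reward - baseline) * log p(sampled caption), where the
    caption log-probability is the sum of its token log-probabilities.
    """
    log_prob = sum(math.log(p) for p in sampled_token_probs)
    return -(reward - baseline) * log_prob

def mixed_loss(xe, rl, gamma=0.9):
    """Weighted mix of the sentence-level (RL) and word-level (XE) losses."""
    return gamma * rl + (1.0 - gamma) * xe

# Example: a sampled caption with a good CIDEr score but low entailment
# receives a reduced reward before entering the policy-gradient term.
reward = cident_reward(cider=0.8, entail_prob=0.1)
loss = mixed_loss(
    xe=xe_loss([0.5, 0.25]),
    rl=reinforce_loss([0.5, 0.5], reward, baseline=0.2),
)
```

In a real system the token probabilities would come from the captioning model, the CIDEr score from a metric implementation, and the entailment probability from a separately trained entailment classifier. Here they are supplied as plain numbers so the arithmetic stays visible.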
