Dual Attention Networks for Multimodal Reasoning and Matching
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2019 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Position Focused Attention Network for Image-Text Matching
PFAN adds position-aware attention on image blocks to improve visual-text joint embeddings and reports state-of-the-art results on Flickr30K, MS-COCO, and a new Tencent-News dataset.