TextCaps: a dataset for image captioning with reading comprehension
1 paper cites this work. Polarity classification is still indexing.
citation-role summary: dataset 1
citation-polarity summary: (still indexing)
fields: cs.CV 1
years: 2023 1
verdicts: CONDITIONAL 1
roles: dataset 1
polarities: use dataset 1
representative citing papers:
- InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
Instruction tuning of BLIP-2 with an instruction-aware Query Transformer delivers state-of-the-art zero-shot performance on held-out vision-language datasets and strong finetuned results on downstream tasks.
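
As a concrete illustration of the zero-shot instruction-following usage that this summary describes, below is a minimal sketch of running InstructBLIP on a single image with a text-reading prompt (the kind of query TextCaps targets). It assumes the Hugging Face transformers InstructBLIP classes and the Salesforce/instructblip-vicuna-7b checkpoint; the image URL and prompt are placeholders and are not taken from this page.

```python
# Minimal sketch: zero-shot instruction-following inference with InstructBLIP.
# Assumes the Hugging Face `transformers` InstructBLIP integration and the
# `Salesforce/instructblip-vicuna-7b` checkpoint; the image URL and the prompt
# below are illustrative placeholders.
import requests
import torch
from PIL import Image
from transformers import InstructBlipProcessor, InstructBlipForConditionalGeneration

model_id = "Salesforce/instructblip-vicuna-7b"  # assumed checkpoint name
device = "cuda" if torch.cuda.is_available() else "cpu"

processor = InstructBlipProcessor.from_pretrained(model_id)
model = InstructBlipForConditionalGeneration.from_pretrained(model_id).to(device)

# Placeholder image; any RGB image works here.
url = "https://example.com/street_sign.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Instruction-style prompt, e.g. a TextCaps-like caption that requires reading
# the text visible in the image.
prompt = "Describe this image, including any text that appears in it."

inputs = processor(images=image, text=prompt, return_tensors="pt").to(device)
output_ids = model.generate(**inputs, max_new_tokens=60)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0].strip())
```

The same interface covers other held-out task formats by swapping the prompt (e.g. a question for zero-shot VQA), since the instruction text is fed both to the language model and to the instruction-aware Query Transformer.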