Visual Genome: Connecting language and vision using crowdsourced dense image annotations. International Journal of Computer Vision, 123(1):32–73, 2017

Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David A Shamma, et al · 2017

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

SteelDefectX: A Multi-Form Vision-Language Dataset and Benchmark for Steel Surface Defect Analysis

cs.CV · 2026-03-23 · unverdicted · novelty 7.0

SteelDefectX is a new multi-form vision-language dataset and benchmark for analyzing steel surface defects using 7,778 images across 25 categories.

EmoVerse: A MLLMs-Driven Emotion Representation Dataset for Interpretable Visual Emotion Analysis

cs.CV · 2025-11-16 · unverdicted · novelty 7.0

EmoVerse is a large open-source dataset enabling interpretable visual emotion analysis via B-A-S triplets, region grounding, and unified CES/DES representations created through an MLLM-driven pipeline.

citing papers explorer

Showing 2 of 2 citing papers.

SteelDefectX: A Multi-Form Vision-Language Dataset and Benchmark for Steel Surface Defect Analysis
cs.CV · 2026-03-23 · unverdicted · none · ref 15

EmoVerse: A MLLMs-Driven Emotion Representation Dataset for Interpretable Visual Emotion Analysis
cs.CV · 2025-11-16 · unverdicted · none · ref 18
