Vision Language Model-based Caption Evaluation Method Leveraging Visual Context Extraction

Koki Maeda, Shuhei Kurita, Taiki Miyanishi, Naoaki Okazaki · 2024 · arXiv 2402.17969

2 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

ITIScore: An Image-to-Text-to-Image Rating Framework for the Image Captioning Ability of MLLMs

cs.CV · 2026-04-04 · unverdicted · novelty 6.0

ITIScore evaluates MLLM image captions via image-to-text-to-image reconstruction consistency and aligns with human judgments on a new 40K-caption benchmark.

VC-Inspector: Advancing Reference-free Evaluation of Video Captions with Factual Analysis

cs.CV · 2025-09-20 · unverdicted · novelty 6.0

VC-Inspector introduces a lightweight open-source LMM and a controllable factual-error generation framework that achieves state-of-the-art correlation with human judgments on reference-free video caption evaluation.

citing papers explorer

Showing 2 of 2 citing papers.

ITIScore: An Image-to-Text-to-Image Rating Framework for the Image Captioning Ability of MLLMs

cs.CV · 2026-04-04 · unverdicted · none · ref 33

ITIScore evaluates MLLM image captions via image-to-text-to-image reconstruction consistency and aligns with human judgments on a new 40K-caption benchmark.

VC-Inspector: Advancing Reference-free Evaluation of Video Captions with Factual Analysis

cs.CV · 2025-09-20 · unverdicted · none · ref 25

VC-Inspector introduces a lightweight open-source LMM and a controllable factual-error generation framework that achieves state-of-the-art correlation with human judgments on reference-free video caption evaluation.
