Rigel is a self-distilled LLM-based metric for image and video caption evaluation that reports over 10-point gains on ActivityNet-Fact in reference-free settings.
In ACCV, pages 3570–3586
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background 1
citation-polarity summary
fields: cs.CV 2
years: 2026 2
verdicts: UNVERDICTED 2
roles: background 1
polarities: background 1
representative citing papers
ITIScore evaluates MLLM image captions via image-to-text-to-image reconstruction consistency and aligns with human judgments on a new 40K-caption benchmark.
citing papers explorer
- Rigel: Self-Distilled Score Adaptation for Image and Video Captioning Evaluation
  Rigel is a self-distilled LLM-based metric for image and video caption evaluation that reports over 10-point gains on ActivityNet-Fact in reference-free settings.
- ITIScore: An Image-to-Text-to-Image Rating Framework for the Image Captioning Ability of MLLMs
  ITIScore evaluates MLLM image captions via image-to-text-to-image reconstruction consistency and aligns with human judgments on a new 40K-caption benchmark.