ClaimDiff-RL introduces reference-conditioned atomic claim differences verified by a multimodal judge as the reward signal for fine-grained RL in long-form image captioning.
Vicrit: A verifiable reinforcement learning proxy task for visual perception in vlms
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
ClaimDiff-RL: Fine-Grained Caption Reinforcement Learning through Visual Claim Comparison
ClaimDiff-RL introduces reference-conditioned atomic claim differences verified by a multimodal judge as the reward signal for fine-grained RL in long-form image captioning.