CiF is a large new civil infrastructure segmentation dataset that shows zero-shot foundation models and domain-supervised models plateau at roughly 25% mAP, establishing infrastructure inspection as an open challenge for current visual AI.
LAION-5B: An open large-scale dataset for training next generation image-text models
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 4
representative citing papers
ClaimDiff-RL introduces reference-conditioned atomic claim differences verified by a multimodal judge as the reward signal for fine-grained RL in long-form image captioning.
ALLaVA creates 1.3M GPT4V-synthesized samples enabling 4B VLMs to achieve competitive results on 17 benchmarks and match 7B/13B models on some tasks.
A small language model fine-tuned on tool-augmented chain-of-thought data generated by a larger LLM learns to selectively call tools, delivering better content moderation accuracy at lower inference cost.