LAION-5B: An open large-scale dataset for training next generation image-text models, 2022
4 Pith papers cite this work.
citing papers explorer
- Cracks in the Foundation: A Civil Infrastructure Dataset to Challenge Vision Foundation Models
  CiF is a large new civil infrastructure segmentation dataset that shows zero-shot foundation models and domain-supervised models plateau at roughly 25% mAP, establishing infrastructure inspection as an open challenge for current visual AI.
- ALLaVA: Harnessing GPT4V-Synthesized Data for Lite Vision-Language Models
  ALLaVA creates 1.3M GPT4V-synthesized samples enabling 4B VLMs to achieve competitive results on 17 benchmarks and match 7B/13B models on some tasks.
- Tool-MCoT: Tool Augmented Multimodal Chain-of-Thought for Content Safety Moderation
  A small language model fine-tuned on tool-augmented chain-of-thought data generated by a larger LLM learns to selectively call tools, delivering better content moderation accuracy at lower inference cost.
- ClaimDiff-RL: Fine-Grained Caption Reinforcement Learning through Visual Claim Comparison