RedCaps: Web-curated image-text data created by the people, for the people

Karan Desai, Gaurav Kaul, Zubin Aysola, Justin Johnson · 2021 · arXiv 2111.11431

3 Pith papers cite this work. Polarity classification is still being indexed.


representative citing papers

LAION-5B: An open large-scale dataset for training next generation image-text models

cs.CV · 2022-10-16 · accept · novelty 7.0

LAION-5B is an openly released dataset of 5.85 billion CLIP-filtered image-text pairs that enables replication of foundational vision-language models.

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation

cs.CV · 2023-07-13 · unverdicted · novelty 6.0

InternVid supplies 7M videos and LLM captions to train ViCLIP, which reaches leading zero-shot action recognition and competitive retrieval performance.

NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild

cs.CV · 2026-04-13 · unverdicted · novelty 4.0

The NTIRE 2026 challenge provides a dataset of over 294,000 real and AI-generated images with 36 transformations to benchmark robust detection models.

citing papers explorer

Showing 3 of 3 citing papers.

LAION-5B: An open large-scale dataset for training next generation image-text models

cs.CV · 2022-10-16 · accept · none · ref 14

LAION-5B is an openly released dataset of 5.85 billion CLIP-filtered image-text pairs that enables replication of foundational vision-language models.

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation

cs.CV · 2023-07-13 · unverdicted · none · ref 16

InternVid supplies 7M videos and LLM captions to train ViCLIP, which reaches leading zero-shot action recognition and competitive retrieval performance.

NTIRE 2026 Challenge on Robust AI-Generated Image Detection in the Wild

cs.CV · 2026-04-13 · unverdicted · none · ref 15

The NTIRE 2026 challenge provides a dataset of over 294,000 real and AI-generated images with 36 transformations to benchmark robust detection models.
