YFCC100M: The new data in multimedia research. Communications of the ACM, 59(2):64–73

Bart Thomee, David A Shamma, Gerald Friedland, Benjamin Elizalde, Karl Ni, Douglas Poland, Damian Borth, Li-Jia Li · 2016

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

LAION-5B: An open large-scale dataset for training next generation image-text models

cs.CV · 2022-10-16 · accept · novelty 7.0

LAION-5B is an openly released dataset of 5.85 billion CLIP-filtered image-text pairs that enables replication of foundational vision-language models.

UpstreamQA: A Modular Framework for Explicit Reasoning on Video Question Answering Tasks

cs.CV · 2026-04-25 · unverdicted · novelty 5.0

UpstreamQA disentangles video reasoning by using LRMs for explicit upstream object identification and scene context before downstream LMM VideoQA, improving performance and interpretability on OpenEQA and NExTQA in some cases.

Scalable and Generalizable Correspondence Pruning via Geometry-Consistent Pre-training

cs.CV · 2024-06-09 · unverdicted · novelty 5.0

A geometry-consistent pre-training paradigm using masked inlier reconstruction and a dual-stream encoder produces more robust and generalizable correspondence pruning for camera pose estimation and related 3D tasks.

citing papers explorer

Showing 3 of 3 citing papers.

LAION-5B: An open large-scale dataset for training next generation image-text models
cs.CV · 2022-10-16 · accept · none · ref 82

UpstreamQA: A Modular Framework for Explicit Reasoning on Video Question Answering Tasks
cs.CV · 2026-04-25 · unverdicted · none · ref 29

Scalable and Generalizable Correspondence Pruning via Geometry-Consistent Pre-training
cs.CV · 2024-06-09 · unverdicted · none · ref 32
