WebVision Database: Visual Learning and Understanding from Web Data. arXiv preprint arXiv:1708.02862

2017 · cs.CV · arXiv 1708.02862

7 Pith papers cite this work. Polarity classification is still indexing.

abstract

In this paper, we present a study on learning visual recognition models from large scale noisy web data. We build a new database called WebVision, which contains more than $2.4$ million web images crawled from the Internet by using queries generated from the 1,000 semantic concepts of the benchmark ILSVRC 2012 dataset. Meta information along with those web images (e.g., title, description, tags, etc.) are also crawled. A validation set and test set containing human annotated images are also provided to facilitate algorithmic development. Based on our new database, we obtain a few interesting observations: 1) the noisy web images are sufficient for training a good deep CNN model for visual recognition; 2) the model learnt from our WebVision database exhibits comparable or even better generalization ability than the one trained from the ILSVRC 2012 dataset when being transferred to new datasets and tasks; 3) a domain adaptation issue (a.k.a., dataset bias) is observed, which means the dataset can be used as the largest benchmark dataset for visual domain adaptation. Our new WebVision database and relevant studies in this work would benefit the advance of learning state-of-the-art visual models with minimum supervision based on web data.

representative citing papers

Beyond Loss Values: Robust Dynamic Pruning via Loss Trajectory Alignment

cs.CV · 2026-04-08 · unverdicted · novelty 6.0

AlignPrune uses a Dynamic Alignment Score from loss trajectories to identify noisy samples more accurately than per-sample loss, improving pruning accuracy by up to 6.3% on noisy benchmarks.

CARE: Class-Adaptive Expert Consensus for Reliable Learning with Long-Tailed Noisy Labels

cs.CV · 2026-05-22 · unverdicted · novelty 5.0

CARE is a parameter-efficient framework that aggregates predictions from noisy labels, VLM text embeddings, and visual features with class-frequency-based agreement thresholds to rectify labels in long-tailed noisy datasets.

Automatic Dataset Construction (ADC): Sample Collection, Data Curation, and Beyond

cs.AI · 2024-08-21 · unverdicted · novelty 5.0

The ADC method automates the creation of large image classification datasets using LLMs and search engines, achieving 79% human agreement and reducing label noise on a 1 million image clothing dataset, while also releasing benchmarks for noise and bias issues.

Product Image Recognition with Guidance Learning and Noisy Supervision

cs.CV · 2019-07-26 · unverdicted · novelty 5.0

Presents the Product-90 noisy product image dataset and a guidance learning method that combines noisy labels with teacher soft labels to train CNNs, reporting gains over prior methods on Product-90 and three public noisy datasets.

Learning from Imperfect Text Guidance: Robust Long-Tail Visual Recognition with High-Noise Label

cs.CV · 2026-04-25 · unverdicted · novelty 5.0

Weak Teacher Supervision uses vision-language model text predictions and label discrepancy checks to mitigate high-noise label-image mismatches in long-tailed visual recognition.

See Through the Noise: Improving Domain Generalization in Gaze Estimation

cs.CV · 2026-04-17 · unverdicted · novelty 5.0

SeeTN builds a semantic embedding space with prototype transformation and affinity regularization to identify and correct noisy labels, yielding better cross-domain gaze estimation without hurting source accuracy.

EchoAlign: Bridging Generative and Discriminative Learning under Noisy Labels

cs.LG · 2024-05-21 · unverdicted · novelty 4.0

EchoAlign adjusts instances with controllable generative models to match noisy labels and selects reliable subsets, outperforming prior methods on benchmarks especially under 30% instance-dependent noise.

citing papers explorer

Beyond Loss Values: Robust Dynamic Pruning via Loss Trajectory Alignment

cs.CV · 2026-04-08 · unverdicted · none · ref 27

CARE: Class-Adaptive Expert Consensus for Reliable Learning with Long-Tailed Noisy Labels

cs.CV · 2026-05-22 · unverdicted · none · ref 132 · internal anchor

Automatic Dataset Construction (ADC): Sample Collection, Data Curation, and Beyond

cs.AI · 2024-08-21 · unverdicted · none · ref 14 · internal anchor

Product Image Recognition with Guidance Learning and Noisy Supervision

cs.CV · 2019-07-26 · unverdicted · none · ref 15 · internal anchor

Learning from Imperfect Text Guidance: Robust Long-Tail Visual Recognition with High-Noise Label

cs.CV · 2026-04-25 · unverdicted · none · ref 33

See Through the Noise: Improving Domain Generalization in Gaze Estimation

cs.CV · 2026-04-17 · unverdicted · none · ref 20

EchoAlign: Bridging Generative and Discriminative Learning under Noisy Labels

cs.LG · 2024-05-21 · unverdicted · none · ref 27 · internal anchor
