arxiv: 2312.00364 · v1 · pith:KOJ6TO6Nnew · submitted 2023-12-01 · 💻 cs.LG · cs.CV

Benchmarking Multi-Domain Active Learning on Image Classification

keywords: multi-domain, active, learning, benchmark, data, datasets, existing, image

Active learning aims to enhance model performance by strategically labeling informative data points. While extensively studied, its effectiveness on large-scale, real-world datasets remains underexplored. Existing research primarily focuses on single-source data, ignoring the multi-domain nature of real-world data. We introduce a multi-domain active learning benchmark to bridge this gap. Our benchmark demonstrates that traditional single-domain active learning strategies are often less effective than random selection in multi-domain scenarios. We also introduce CLIP-GeoYFCC, a novel large-scale image dataset built around geographical domains, in contrast to existing genre-based domain datasets. Analysis on our benchmark shows that all multi-domain strategies exhibit significant tradeoffs, with no strategy outperforming across all datasets or all metrics, emphasizing the need for future research.
