Deep Active Learning over the Long Tail
This paper is concerned with pool-based active learning for deep neural networks. Motivated by coreset dataset compression ideas, we present a novel active learning algorithm that queries consecutive points from the pool using farthest-first traversals in the space of neural activation over a representation layer. We show consistent and overwhelming improvement in sample complexity over passive learning (random sampling) for three datasets: MNIST, CIFAR-10, and CIFAR-100. In addition, our algorithm outperforms the traditional uncertainty sampling technique (obtained using softmax activations), and we identify cases where uncertainty sampling is only slightly better than random sampling.
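The farthest-first traversal mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the representation-layer activations have already been extracted into a NumPy array, uses Euclidean distance, and the names `farthest_first_traversal`, `embeddings`, `labeled_idx`, and `budget` are hypothetical.

```python
import numpy as np

def farthest_first_traversal(embeddings, labeled_idx, budget):
    """Greedily pick `budget` pool points, each one the farthest from
    everything labeled or picked so far (Euclidean distance assumed)."""
    embeddings = np.asarray(embeddings, dtype=float)
    n = len(embeddings)
    labeled_idx = list(labeled_idx)

    # Distance from every point to its nearest already-covered point.
    min_dist = np.full(n, np.inf)
    for i in labeled_idx:
        d = np.linalg.norm(embeddings - embeddings[i], axis=1)
        min_dist = np.minimum(min_dist, d)
    # Labeled points must never be queried again.
    min_dist[labeled_idx] = -np.inf

    selected = []
    for _ in range(min(budget, n - len(labeled_idx))):
        nxt = int(np.argmax(min_dist))  # farthest remaining pool point
        selected.append(nxt)
        d = np.linalg.norm(embeddings - embeddings[nxt], axis=1)
        min_dist = np.minimum(min_dist, d)
        min_dist[nxt] = -np.inf  # exclude the point just chosen
    return selected
```

In a pool-based loop, the returned indices would be sent for labeling, the network retrained, and the activations recomputed before the next query round. Each step costs one pass over the pool, so the sketch scales linearly in pool size per selected point.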
Forward citations
Cited by 3 Pith papers
- TinyUSFM: Towards Compact and Efficient Ultrasound Foundation Models
  TinyUSFM distills a large ultrasound foundation model into a lightweight version using feature-gradient coreset selection and domain-separated masked image modeling, matching performance on a new 18-dataset benchmark ...
- Discriminative Active Learning
  DAL poses batch active learning as a binary classification task between labeled and unlabeled data to select informative examples for labeling.
- Are Candidate Models Really Needed for Active Learning?
  Active learning with randomly initialized models achieves comparable results to traditional candidate-model methods, with low-confidence sampling proving most effective.