Deep Bayesian Active Learning for Natural Language Processing: Results of a Large-Scale Empirical Study

Aditya Siddhant; Zachary C. Lipton

arxiv: 1808.05697 · v3 · pith:6LVO4G2Znew · submitted 2018-08-16 · 💻 cs.CL · cs.LG · stat.ML

classification: cs.CL · cs.LG · stat.ML

keywords: learning, active, deep, multiple, acquisition, bayesian, empirical, functions

Abstract

Several recent papers investigate Active Learning (AL) for mitigating the data dependence of deep learning for natural language processing. However, the applicability of AL to real-world problems remains an open question. While in supervised learning practitioners can try many different methods, evaluating each against a validation set before selecting a model, AL affords no such luxury. Over the course of one AL run, an agent annotates its dataset, exhausting its labeling budget. Thus, given a new task, an active learner has no opportunity to compare models and acquisition functions. This paper provides a large-scale empirical study of deep active learning, addressing multiple tasks and, for each, multiple datasets, multiple models, and a full suite of acquisition functions. We find that across all settings, Bayesian active learning by disagreement, using uncertainty estimates provided either by Dropout or Bayes-by-Backprop, significantly improves over i.i.d. baselines and usually outperforms classic uncertainty sampling.
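To make the contrast between the two acquisition families in the abstract concrete, here is a minimal sketch in Python with NumPy. It assumes that T stochastic forward passes (for example, with Dropout left on at prediction time) have already produced class probabilities of shape (T, N, C). The abstract does not state the exact scoring rules the paper uses, so the mutual-information form of Bayesian active learning by disagreement and the least-confidence form of uncertainty sampling below are assumptions, as are the function names and the toy data.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy in nats along `axis`; eps guards against log(0)."""
    return -np.sum(p * np.log(p + eps), axis=axis)

def least_confidence(probs):
    """Classic uncertainty sampling (assumed form): 1 - max prob of the mean prediction.

    probs: array of shape (T, N, C) = (stochastic passes, examples, classes).
    """
    mean = probs.mean(axis=0)            # (N, C)
    return 1.0 - mean.max(axis=-1)       # (N,)

def bald(probs):
    """Disagreement score (assumed mutual-information form).

    Entropy of the averaged prediction minus the average per-pass entropy.
    It is high only when individual passes are confident but disagree.
    """
    mean = probs.mean(axis=0)                          # (N, C)
    return entropy(mean) - entropy(probs).mean(axis=0)  # (N,)

def select_batch(scores, k):
    """Indices of the k highest-scoring unlabeled examples."""
    return np.argsort(-scores)[:k]

# Hypothetical toy pool: T=2 passes, N=3 examples, C=2 classes.
probs = np.array([
    [[0.5, 0.5], [0.9, 0.1], [0.99, 0.01]],   # pass 1
    [[0.5, 0.5], [0.1, 0.9], [0.99, 0.01]],   # pass 2
])

print("least confidence:", least_confidence(probs))
print("disagreement    :", bald(probs))
print("picked by disagreement:", select_batch(bald(probs), 1))
```

In this toy pool, examples 0 and 1 receive the same least-confidence score because their averaged predictions are identical, while the disagreement score singles out example 1, where the two passes are individually confident but contradict each other.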

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Modeling the Uncertainty in Electronic Health Records: a Bayesian Deep Learning Approach
cs.LG · 2019-07 · unverdicted · novelty 3.0

Bayesian neural networks are used on EHR data to quantify prediction uncertainty from data noise, with experiments showing high-uncertainty cases degrade performance and can identify patients for data-quality intervention.