Autonomy and Reliability of Continuous Active Learning for Technology-Assisted Review

Gordon V. Cormack; Maura R. Grossman

arxiv: 1504.06868 · v1 · pith:VNVJQXDKnew · submitted 2015-04-26 · 💻 cs.IR · cs.LG

keywords: active, autonomy, continuous, cormack, documents, grossman, learning, review

Abstract

We enhance the autonomy of the continuous active learning method shown by Cormack and Grossman (SIGIR 2014) to be effective for technology-assisted review, in which documents from a collection are retrieved and reviewed, using relevance feedback, until substantially all of the relevant documents have been reviewed. Autonomy is enhanced through the elimination of topic-specific and dataset-specific tuning parameters, so that the sole input required by the user is, at the outset, a short query, topic description, or single relevant document; and, throughout the review, ongoing relevance assessments of the retrieved documents. We show that our enhancements consistently yield superior results to Cormack and Grossman's version of continuous active learning, and other methods, not only on average, but on the vast majority of topics from four separate sets of tasks: the legal datasets examined by Cormack and Grossman, the Reuters RCV1-v2 subject categories, the TREC 6 AdHoc task, and the construction of the TREC 2002 filtering test collection.
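The review loop described in the abstract (start from a short query, retrieve the top-ranked documents, collect relevance assessments, and feed them back into the ranking) can be illustrated with a minimal sketch. This is not the authors' implementation: the Rocchio-style term weighting, the whitespace tokenizer, the toy documents, the function names, and the stopping rule (`stop_after_misses` consecutive non-relevant documents) are all assumptions made for illustration only.

```python
def tokenize(text):
    """Lowercase whitespace tokenizer; returns the set of distinct terms."""
    return set(text.lower().split())


def continuous_review(docs, seed_query, assess, batch_size=1, stop_after_misses=2):
    """Illustrative relevance-feedback loop (hypothetical, not the paper's method).

    docs: dict mapping doc_id -> text
    seed_query: short query used to build the initial term weights
    assess: callable doc_id -> bool, standing in for the human reviewer
    stop_after_misses: assumed stopping rule, counted over consecutive
        non-relevant assessments
    """
    weights = {term: 1.0 for term in tokenize(seed_query)}
    unreviewed = set(docs)
    reviewed = []  # (doc_id, is_relevant) in review order
    misses = 0

    def score(doc_id):
        return sum(weights.get(t, 0.0) for t in tokenize(docs[doc_id]))

    while unreviewed and misses < stop_after_misses:
        # Re-rank everything not yet reviewed with the current weights;
        # ties are broken by doc_id so the order is deterministic.
        ranked = sorted(unreviewed, key=lambda d: (-score(d), d))
        for doc_id in ranked[:batch_size]:
            is_relevant = assess(doc_id)
            reviewed.append((doc_id, is_relevant))
            unreviewed.discard(doc_id)
            misses = 0 if is_relevant else misses + 1
            # Feedback step: reward terms of relevant documents,
            # penalize terms of non-relevant ones.
            delta = 1.0 if is_relevant else -0.5
            for term in tokenize(docs[doc_id]):
                weights[term] = weights.get(term, 0.0) + delta
    return reviewed


# Toy collection; the relevance labels play the role of the reviewer.
DOCS = {
    "d1": "merger agreement signed by both companies",
    "d2": "lunch menu for friday",
    "d3": "draft merger contract terms",
    "d4": "contract terms for office lease",
    "d5": "football scores from the weekend",
    "d6": "companies negotiate acquisition contract",
    "d7": "weather forecast for monday",
    "d8": "parking permit renewal",
}
RELEVANT = {"d1", "d3", "d6"}

result = continuous_review(DOCS, "merger", lambda d: d in RELEVANT)
for doc_id, is_relevant in result:
    print(doc_id, "relevant" if is_relevant else "not relevant")
```

On this toy data the loop reviews six of the eight documents and finds all three relevant ones, including `d6`, which shares no term with the seed query and is reached only through feedback from earlier assessments. The only inputs are the seed query and the ongoing assessments, which mirrors the interaction model in the abstract. The scoring model and stopping rule here are placeholders.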

Forward citations

Cited by 1 Pith paper

ReLeVAnT: Relevance Lexical Vectors for Accurate Legal Text Classification
cs.CL 2026-04 unverdicted novelty 4.0

ReLeVAnT achieves 99.3% accuracy and 98.7% F1 in binary legal document classification on LexGLUE via n-gram processing, contrastive score matching, and a shallow neural network after one-time keyword extraction.