
arxiv: 1806.10131 · v2 · pith:7KYAIRPGnew · submitted 2018-06-25 · 💻 cs.LG · cs.AI · stat.ML

Request-and-Reverify: Hierarchical Hypothesis Testing for Concept Drift Detection with Expensive Labels

classification 💻 cs.LG · cs.AI · stat.ML
keywords concept · drift · hierarchical · hypothesis · labels · methods · testing · detection

One important assumption underlying common classification models is the stationarity of the data. However, in real-world streaming applications, the data concept indicated by the joint distribution of feature and label is not stationary but drifting over time. Concept drift detection aims to detect such drifts and adapt the model so as to mitigate any deterioration in the model's predictive performance. Unfortunately, most existing concept drift detection methods rely on a strong and over-optimistic condition that the true labels are available immediately for all already classified instances. In this paper, a novel Hierarchical Hypothesis Testing framework with Request-and-Reverify strategy is developed to detect concept drifts by requesting labels only when necessary. Two methods, namely Hierarchical Hypothesis Testing with Classification Uncertainty (HHT-CU) and Hierarchical Hypothesis Testing with Attribute-wise "Goodness-of-fit" (HHT-AG), are proposed respectively under the novel framework. In experiments with benchmark datasets, our methods demonstrate overwhelming advantages over state-of-the-art unsupervised drift detectors. More importantly, our methods even outperform DDM (the widely used supervised drift detector) when we use significantly fewer labels.
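
The request-and-reverify idea in the abstract can be illustrated with a minimal two-layer sketch. The abstract does not state which statistical tests are used, so the choices here are assumptions made only for illustration: Layer I applies a Hoeffding-style bound to the mean classification uncertainty of two unlabeled windows, and Layer II runs a permutation test on 0/1 errors after labels are requested. All names (`hoeffding_shift`, `permutation_test`, `request_and_reverify`, `request_labels`) are hypothetical and are not taken from the paper.

```python
import math
import random


def hoeffding_shift(reference, recent, delta=0.01):
    """Layer I (label-free, assumed test): flag a candidate drift when the
    window means differ by more than a Hoeffding bound for values in [0, 1]."""
    n, m = len(reference), len(recent)
    eps = math.sqrt((1.0 / n + 1.0 / m) * math.log(2.0 / delta) / 2.0)
    gap = abs(sum(reference) / n - sum(recent) / m)
    return gap > eps


def permutation_test(errors_ref, errors_new, n_perm=1000, seed=0):
    """Layer II (assumed test): p-value for a difference in 0/1 error rate
    between the reference window and the newly labeled window."""
    rng = random.Random(seed)
    k = len(errors_ref)
    observed = abs(sum(errors_new) / len(errors_new) - sum(errors_ref) / k)
    pooled = list(errors_ref) + list(errors_new)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = abs(sum(pooled[k:]) / (len(pooled) - k) - sum(pooled[:k]) / k)
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def request_and_reverify(unc_ref, unc_new, errors_ref, request_labels,
                         delta=0.01, alpha=0.05):
    """Return (drift_confirmed, number_of_labels_requested).

    unc_ref / unc_new: per-instance uncertainty scores in [0, 1], no labels.
    errors_ref: 0/1 errors on an already labeled reference window.
    request_labels: hypothetical callback returning 0/1 errors for the
    recent window; it is called only if Layer I raises a candidate drift.
    """
    if not hoeffding_shift(unc_ref, unc_new, delta):
        return False, 0  # no candidate drift, so no labels are spent
    errors_new = request_labels()
    p_value = permutation_test(errors_ref, errors_new)
    return p_value < alpha, len(errors_new)
```

In this sketch labels are paid for only when the unlabeled test fires, and a Layer I alarm whose requested labels show no change in error rate is rejected as a false alarm rather than reported as drift.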



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Pitfalls of Unlabeled Disagreement-Based Drift Detection in Streaming Tree Ensembles

    cs.LG · 2026-05 · unverdicted · novelty 5.0

    Disagreement measures from label flipping in IDT ensembles underperform loss-based drift detectors in streaming tabular data due to the limited plasticity of tree models.