How to Evaluate the Quality of Unsupervised Anomaly Detection Algorithms?

Nicolas Goix (LTCI)

arxiv: 1607.01152 · v1 · pith:NFPAZ4A7new · submitted 2016-07-05 · 📊 stat.ML · cs.LG

keywords: criteria, algorithms, curves, data, anomaly, detection, labeled, accurately

When sufficient labeled data are available, classical criteria based on Receiver Operating Characteristic (ROC) or Precision-Recall (PR) curves can be used to compare the performance of unsupervised anomaly detection algorithms. However, in many situations, few or no data are labeled. This calls for alternative criteria that can be computed on unlabeled data. In this paper, two criteria that do not require labels are empirically shown to discriminate accurately (w.r.t. ROC- or PR-based criteria) between algorithms. These criteria are based on existing Excess-Mass (EM) and Mass-Volume (MV) curves, which generally cannot be estimated well in high dimensions. A methodology based on feature sub-sampling and aggregating is also described and tested, extending the use of these criteria to high-dimensional datasets and addressing major drawbacks inherent to standard EM and MV curves.
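
To make the idea in the abstract concrete, here is a minimal Python sketch of a label-free, EM-style criterion combined with feature sub-sampling and averaging. It is not the author's reference implementation. It assumes the usual Excess-Mass definition, EM(t) = max over thresholds u of [P(s(X) >= u) - t * Vol({x : s(x) >= u})], with the volume estimated by Monte Carlo sampling in the bounding box of the data. The function names (`em_curve`, `em_criterion_subsampled`, `centroid_scorer`, `inverted_scorer`), the grid of `t` values, the number of draws, and the subset size are all illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def em_curve(scores_data, scores_unif, volume, t_grid):
    """Empirical EM curve: EM(t) = max_u [ P_n(s >= u) - t * Vol(s >= u) ]."""
    thresholds = np.unique(scores_data)
    # Fraction of data points / uniform points inside each level set {s >= u}.
    mass = (scores_data[None, :] >= thresholds[:, None]).mean(axis=1)
    vol = (scores_unif[None, :] >= thresholds[:, None]).mean(axis=1) * volume
    em = (mass[None, :] - t_grid[:, None] * vol[None, :]).max(axis=1)
    # An empty level set (u = +inf) always achieves 0, so EM is never negative.
    return np.maximum(em, 0.0)


def em_criterion_subsampled(X, fit_scorer, n_draws=10, d_sub=2,
                            n_unif=2000, t_max_factor=5.0, seed=0):
    """Average area under the EM curve over random feature subsets.

    fit_scorer(X_train) must return a function mapping points to scores,
    with higher scores meaning "more normal". All defaults are illustrative.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    k = min(d_sub, d)
    areas = []
    for _ in range(n_draws):
        feats = rng.choice(d, size=k, replace=False)  # random feature subset
        Xs = X[:, feats]
        lo, hi = Xs.min(axis=0), Xs.max(axis=0)
        volume = float(np.prod(hi - lo))              # bounding-box volume
        U = rng.uniform(lo, hi, size=(n_unif, k))     # Monte Carlo sample
        scorer = fit_scorer(Xs)
        # Assumed grid of t values; the upper bound is an arbitrary choice.
        t_grid = np.linspace(0.0, t_max_factor / volume, 50)
        em = em_curve(scorer(Xs), scorer(U), volume, t_grid)
        # Trapezoidal area under the EM curve on the chosen grid.
        areas.append(np.sum((em[1:] + em[:-1]) / 2.0 * np.diff(t_grid)))
    return float(np.mean(areas))


def centroid_scorer(X_train):
    """Toy scorer: points close to the training centroid are 'normal'."""
    mu = X_train.mean(axis=0)
    return lambda Z: -np.linalg.norm(Z - mu, axis=1)


def inverted_scorer(X_train):
    """Deliberately bad toy scorer: points far from the centroid are 'normal'."""
    mu = X_train.mean(axis=0)
    return lambda Z: np.linalg.norm(Z - mu, axis=1)


rng = np.random.default_rng(42)
X = rng.normal(size=(500, 6))  # synthetic, unlabeled data
print("centroid scorer:", em_criterion_subsampled(X, centroid_scorer))
print("inverted scorer:", em_criterion_subsampled(X, inverted_scorer))
```

On this synthetic Gaussian data, the centroid-based scorer should obtain a larger value than the inverted one, which is the kind of label-free ranking the abstract describes. A Mass-Volume variant could be sketched along the same lines by minimizing the level-set volume subject to a mass constraint.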

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Analyzing Shapley Additive Explanations to Understand Anomaly Detection Algorithm Behaviors and Their Complementarity
cs.LG 2026-01 unverdicted novelty 7.0

SHAP attribution profiles can identify complementary anomaly detectors whose divergence in explanations predicts non-overlapping detections, enabling stronger ensembles when high individual performance is maintained.

Automatic Unsupervised Ensemble Outlier Model Selection--Extended Version
cs.LG 2026-05 unverdicted novelty 6.0

MetaEns trains on meta-datasets to predict marginal gains from adding models and uses a submodular-inspired objective with diversity discounting and risk regularization for greedy unsupervised ensemble selection, outp...