Conditional anomaly detection with soft harmonic functions

Branislav Kveton; Gregory F. Cooper; Hamed Valizadegan; Michal Valko; Milos Hauskrecht

arxiv: 2604.21462 · v1 · submitted 2026-04-23 · 💻 cs.LG

Michal Valko, Branislav Kveton, Hamed Valizadegan, Gregory F. Cooper, Milos Hauskrecht

Pith reviewed 2026-05-09 22:05 UTC · model grok-4.3

classification 💻 cs.LG

keywords conditional anomaly detection · soft harmonic functions · label confidence estimation · mislabeling detection · non-parametric anomaly detection · graph-based methods · electronic health records

The pith

The soft harmonic solution estimates label confidence to identify anomalous mislabeling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a non-parametric method for conditional anomaly detection that uses the soft harmonic solution to gauge how likely a given label is for each data point. This allows detection of instances with unusual responses or mislabels. The solution is further regularized to avoid flagging isolated examples or those near distribution boundaries. If this works, it provides a way to clean datasets or monitor decisions without parametric assumptions on the data distribution. The approach is tested on synthetic data, standard machine learning benchmarks, and a real electronic health record collection.

Core claim

The authors claim that solving the soft harmonic function on a graph constructed from the data yields an estimate of label confidence that can be used to flag conditional anomalies, and that adding regularization terms prevents spurious detections on the support boundary and isolated points. This is shown to work better than several baselines on multiple datasets including a real-world medical one.

What carries the argument

The soft harmonic solution, which computes label confidence by minimizing a regularized quadratic form over a similarity graph of the data points.
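
One common way to write such a regularized quadratic form in the graph-based semi-supervised learning literature is sketched below. The page itself gives no equations, so every symbol here is an assumption for illustration: ℓ is the vector of soft labels, y the observed labels in {-1, +1} (0 for unlabeled nodes), C a diagonal matrix of positive label-fit weights, L = D - W the Laplacian of the similarity graph W, and γ_g a regularization weight. The paper's exact objective may differ.

```latex
\begin{align}
\ell^{\star}
  &= \arg\min_{\ell \in \mathbb{R}^{n}}
     (\ell - y)^{\top} C \,(\ell - y) + \ell^{\top} K \,\ell,
  \qquad K = L + \gamma_g I, \\
0 &= 2\,C(\ell^{\star} - y) + 2\,K \ell^{\star}
  \;\Longrightarrow\;
  \ell^{\star} = (C + K)^{-1} C\, y
               = \left(I + C^{-1} K\right)^{-1} y .
\end{align}
```

The last equality holds only when every diagonal entry of C is strictly positive. Under this reading, the sign of each entry of ℓ* is the label the graph favors and its magnitude is the confidence in that label, so a point whose observed label disagrees with a confident ℓ* entry becomes a candidate conditional anomaly.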

Load-bearing premise

The soft harmonic solution after regularization separates anomalous mislabeling from normal variation without creating false positives on distribution boundaries or isolated points.
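
The isolated-point half of this premise can be illustrated with a small sketch. It assumes a Gaussian-kernel similarity graph, the linear system (C + L + γ_g I) ℓ = C y, and a query point treated as unlabeled (label 0, fit weight 0). The function name `soft_harmonic`, the toy data, and all parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np

def soft_harmonic(W, y, c, gamma_g):
    """Solve (C + L + gamma_g * I) l = C y for the soft label vector l.

    W: symmetric similarity matrix, y: labels (0 = unlabeled),
    c: per-node label-fit weights, gamma_g: regularization weight.
    This is an assumed formulation, not necessarily the paper's.
    """
    L = np.diag(W.sum(axis=1)) - W        # unnormalized graph Laplacian
    K = L + gamma_g * np.eye(len(y))      # regularized Laplacian
    C = np.diag(c)
    return np.linalg.solve(C + K, C @ y)

# Two tight 1-D clusters plus one isolated, unlabeled query point at x = 6.
x = np.array([-2.2, -2.1, -2.0, -1.9, -1.8, 1.8, 1.9, 2.0, 2.1, 2.2, 6.0])
y = np.array([-1.0] * 5 + [1.0] * 5 + [0.0])
c = np.array([1.0] * 10 + [0.0])

# Gaussian kernel with bandwidth 1, no self-loops.
W = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
np.fill_diagonal(W, 0.0)

plain = soft_harmonic(W, y, c, gamma_g=0.0)
regularized = soft_harmonic(W, y, c, gamma_g=0.1)

print("query confidence without regularization:", plain[-1])
print("query confidence with gamma_g = 0.1:   ", regularized[-1])
```

Without regularization the isolated query inherits an almost fully confident label from its distant neighbors, because its soft label is a weighted average that ignores how weak the weights are. With γ_g > 0 the confidence of weakly connected nodes shrinks toward zero, so such points are less likely to be flagged whichever label they carry.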

What would settle it

Run the method on a dataset where some labels are deliberately flipped to be anomalous and check if the flagged points match the flipped ones more accurately than baselines, without excess flags on boundary points.
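
A minimal sketch of this label-flip check follows. It assumes two Gaussian blobs as the dataset, a 3% flip rate, and a stand-in score -y_i ℓ_i computed from the same assumed soft harmonic system with every point labeled. The helper names `soft_harmonic_scores` and `precision_at_k` are hypothetical, and no baselines are included, so this shows only the protocol and not the paper's experiment.

```python
import numpy as np

def soft_harmonic_scores(X, y, c=1.0, gamma_g=0.1, sigma=1.0):
    """Score -y_i * l_i: positive when the graph disagrees with label y_i."""
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))   # Gaussian similarity graph
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W              # graph Laplacian
    n = len(y)
    # Assumed system with C = c * I: (c*I + L + gamma_g*I) l = c * y
    soft = np.linalg.solve((c + gamma_g) * np.eye(n) + L, c * y)
    return -y * soft

def precision_at_k(scores, flipped_idx):
    """Fraction of the k highest-scoring points that were truly flipped."""
    k = len(flipped_idx)
    top = np.argsort(-scores)[:k]
    return len(set(top.tolist()) & set(flipped_idx.tolist())) / k

rng = np.random.default_rng(0)
n_per_class = 100
X = np.vstack([
    rng.normal([-3.0, 0.0], 1.0, size=(n_per_class, 2)),
    rng.normal([3.0, 0.0], 1.0, size=(n_per_class, 2)),
])
y_true = np.concatenate([-np.ones(n_per_class), np.ones(n_per_class)])

# Flip 3% of the labels to create known conditional anomalies.
flipped = rng.choice(len(y_true), size=6, replace=False)
y_noisy = y_true.copy()
y_noisy[flipped] *= -1.0

scores = soft_harmonic_scores(X, y_noisy)
print("precision@k:", precision_at_k(scores, flipped))
```

To cover the second half of the test, the same `precision_at_k` call can be repeated with each baseline's scores, and the unflipped points among the top-ranked ones can be inspected to see whether they sit on the boundary of the data.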

Figures

Figures reproduced from arXiv: 2604.21462 by Branislav Kveton, Gregory F. Cooper, Hamed Valizadegan, Michal Valko, Milos Hauskrecht.

**Figure 2.** Synthetic Data. Top: A sample of datasets D1, D2, and D3. Bottom: Synthetic datasets after changing the labels of 3% of the examples.

**Figure 3.** Black dots depict the top five conditional anomalies based on the score for each of the methods on D3. The top five conditional anomalies …

**Figure 4.** Histogram of anomaly scores for 2 different tasks. The scores …

**Figure 5.** Medical Dataset: Varying graph size. Comparison of 1) SoftHAD …

**Figure 6.** Medical Dataset: Varying regularizer 1) γg for SoftHAD 2) cost c for SVM with RBF kernel. the two methods with scaling adjustment for this multi-task problem ( …

Original abstract

In this paper, we consider the problem of conditional anomaly detection that aims to identify data instances with an unusual response or a class label. We develop a new non-parametric approach for conditional anomaly detection based on the soft harmonic solution, with which we estimate the confidence of the label to detect anomalous mislabeling. We further regularize the solution to avoid the detection of isolated examples and examples on the boundary of the distribution support. We demonstrate the efficacy of the proposed method on several synthetic and UCI ML datasets in detecting unusual labels when compared to several baseline approaches. We also evaluate the performance of our method on a real-world electronic health record dataset where we seek to identify unusual patient-management decisions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper adapts soft harmonic functions with boundary and isolation regularization for conditional label anomaly detection, a reasonable extension but one whose empirical claims rest on thin evidence from the abstract.

This paper offers a non-parametric method for finding anomalous labels using a soft harmonic function with added regularization against isolated and boundary points. The idea is straightforward and the regularization is a sensible fix, but the lack of any quantitative results makes it hard to evaluate how well it works.

The new part is applying the soft harmonic solution specifically to estimate label confidence for anomaly scoring, then regularizing to handle cases where standard methods would fail. That targets a real issue in label propagation for anomaly detection.

It does well in choosing relevant datasets, including real electronic health records for patient management anomalies. The framing as conditional anomaly detection is clear.

The soft spots are in the evaluation. No numbers, no error bars, no details on how the regularization parameters were set or ablated. This makes the outperformance claim difficult to trust without the full results. The derivation itself seems fine and not circular.

Overall, this is for people focused on anomaly detection in labeled data, especially in applied domains like healthcare. A reader looking for new tools in semi-supervised anomaly detection could get some value from the approach. I would not cite this in my work in the next year, as it doesn't seem to break new ground beyond the regularization step. It should be sent for peer review. The construction is honest and the problem is worthwhile, so referees can help strengthen the experiments.

Referee Report

1 major / 2 minor

Summary. The paper proposes a non-parametric method for conditional anomaly detection that adapts the soft harmonic solution to estimate label confidence and thereby identify instances with anomalous (mis)labels. Regularization is added to suppress detections on isolated points and at the support boundary. The approach is evaluated on synthetic data, UCI benchmarks, and a real electronic health-record collection, where it is reported to outperform several baseline detectors.

Significance. If the empirical gains are reproducible and the regularization does not systematically inflate false positives near boundaries, the method would supply a lightweight graph-based alternative for label-anomaly detection in semi-supervised settings. The explicit handling of isolated and boundary cases addresses a known practical weakness of harmonic-function label propagation and could be useful in domains such as clinical decision auditing.

major comments (1)

The abstract asserts outperformance on synthetic, UCI, and EHR data yet supplies no numerical results, error bars, ablation tables, or description of how regularization parameters were selected. Because the central claim is that the regularized soft-harmonic estimator reliably separates anomalous mislabeling from normal variation, the absence of these quantitative details leaves the efficacy statement only weakly supported.

minor comments (2)

Notation for the soft-harmonic solution and the added regularization term should be introduced with explicit equations rather than prose descriptions alone.
The manuscript should clarify whether the graph construction (k-NN, kernel, etc.) is held fixed across all baselines or tuned per method.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our submission. We address the major comment below and outline the revisions we will implement to strengthen the empirical presentation.

Referee: The abstract asserts outperformance on synthetic, UCI, and EHR data yet supplies no numerical results, error bars, ablation tables, or description of how regularization parameters were selected. Because the central claim is that the regularized soft-harmonic estimator reliably separates anomalous mislabeling from normal variation, the absence of these quantitative details leaves the efficacy statement only weakly supported.

Authors: We agree that the abstract provides only a high-level statement of outperformance without accompanying numbers or methodological details on regularization. Although the body of the manuscript reports comparative results against baselines on the synthetic, UCI, and EHR collections, we acknowledge that the absence of error bars, ablation studies, and an explicit account of parameter selection weakens the support for the central claim. In the revised version we will add error bars derived from repeated runs to all performance tables, include ablation experiments isolating the effect of each regularization term, and describe the regularization-parameter selection procedure (grid search over a validation split). We will also insert a concise quantitative summary into the abstract to give readers immediate evidence of the reported gains. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation remains self-contained

The paper describes a non-parametric conditional anomaly detection method that adapts the soft harmonic solution from graph-based label propagation, adds an explicit regularization term targeting isolated points and support boundaries, and uses the resulting label confidence estimates to flag anomalous mislabelings. No equations, derivations, or claims in the abstract reduce the anomaly score to a fitted parameter renamed as a prediction, a self-citation chain that bears the central load, or any other enumerated circular pattern. The regularization step is presented as an independent modification rather than a re-expression of the input data, and the overall construction retains independent content beyond its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are described in sufficient detail to enumerate.

pith-pipeline@v0.9.0 · 5419 in / 1020 out tokens · 32498 ms · 2026-05-09T22:05:27.581231+00:00 · methodology

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages

[1] M. Markou and S. Singh, “Novelty detection: a review, part 1: statistical approaches,” Signal Process., vol. 83, no. 12, pp. 2481–2497, 2003.

[2] M. Markou and S. Singh, “Novelty detection: a review, part 2: neural network based approaches,” Signal Process., vol. 83, no. 12, pp. 2499–2521, 2003.

[3] V. Chandola, A. Banerjee, and V. Kumar, “Anomaly detection: A survey,” ACM Comput. Surv., vol. 41, pp. 15:1–15:58, July 2009. [Online]. Available: http://doi.acm.org/10.1145/1541880.1541882

[4] M. Hauskrecht, M. Valko, B. Kveton, S. Visweswaran, and G. Cooper, “Evidence-based anomaly detection,” in Annual American Medical Informatics Association Symposium, November 2007, pp. 319–324.

[5] M. Valko, G. Cooper, A. Seybert, S. Visweswaran, M. Saul, and M. Hauskrecht, “Conditional anomaly detection methods for patient-management alert systems,” in Workshop on Machine Learning in Health Care Applications in The 25th International Conference on Machine Learning, 2008.

[6] K. Das, J. Schneider, and D. B. Neill, “Anomaly pattern detection in categorical datasets,” in Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, ser. KDD ’08. New York, NY, USA: ACM, 2008, pp. 169–176.

[7] M. Hauskrecht, M. Valko, I. Batal, G. Clermont, S. Visweswaran, and G. Cooper, “Conditional outlier detection for clinical alerting,” Annual American Medical Informatics Association Symposium, 2010.

[8] S. Rubin, M. Christodorescu, V. Ganapathy, J. T. Giffin, L. Kruger, H. Wang, and N. Kidd, “An auctioning reputation system based on anomaly detection,” in Proceedings of the 12th ACM conference on Computer and communications security, ser. CCS ’05. New York, NY, USA: ACM, 2005, pp. 270–279.

[9] E. Aktolga, I. Ros, and Y. Assogba, “Detecting outlier sections in US congressional legislation,” in Proceedings of SIGIR, 2010, IR.

[10] N. A. Heard, D. J. Weston, K. Platanioti, and D. J. Hand, “Bayesian anomaly detection methods for social networks,” Annals of Applied Statistics, vol. 4, pp. 645–662, 2010.

[11] M. Kolar, L. Song, A. Ahmed, and E. P. Xing, “Estimating time-varying networks,” Annals of Applied Statistics, vol. 4, pp. 94–123, 2010.

[12] S. Papadimitriou and C. Faloutsos, “Cross-outlier detection,” in Advances in Spatial and Temporal Databases, 8th International Symposium, SSTD 2003, Santorini Island, Greece, July 24-27, 2003, Proceedings, T. Hadzilacos, Y. Manolopoulos, J. F. Roddick, and Y. Theodoridis, Eds., vol. 2750, 2003, pp. 199–213.

[13] X. Song, M. Wu, and C. Jermaine, “Conditional anomaly detection,” IEEE Transactions on Knowledge and Data Engineering, vol. 19, no. 5, pp. 631–645, 2007, fellow-Sanjay Ranka.

[14] M. Valko and M. Hauskrecht, “Distance metric learning for conditional anomaly detection,” in Twenty-First International Florida Artificial Intelligence Research Society Conference. AAAI Press, 2008.

[15] C. E. Brodley and M. A. Friedl, “Identifying mislabeled training data,” J. Artif. Intell. Res. (JAIR), vol. 11, pp. 131–167, 1999.

[16] S. Verbaeten and A. V. Assche, “Ensemble methods for noise elimination in classification problems,” in Proceeding of 4th International Workshop on Multiple Classifier Systems, 2003.

[17] Y. Jiang and Z.-H. Zhou, “Editing training data for knn classifiers with neural network ensemble,” in Lecture Notes in Computer Science 3173, 2004, pp. 356–361.

[18] J. Sanchez, R. Barandela, A. I. Marques, R. Alejo, and B. J., “Analysis of new techniques to obtain quality training sets,” Pattern Recognition Letters 24, pp. 1015–1022, 2003.

[19] H. Valizadegan and P.-N. Tan, “Kernel based detection of mislabeled training examples,” in Proceedings of the Seventh SIAM International Conference on Data Mining, April 26-28, 2007, Minneapolis, Minnesota, USA, 2007.

[20] D. Zhou, O. Bousquet, T. N. Lal, J. Weston, and B. Scholkopf, “Learning with local and global consistency,” Advances in Neural Information Processing Systems, vol. 16, pp. 321–328, 2004.

[21] X. Zhu, Z. Ghahramani, and J. Lafferty, “Semi-supervised learning using gaussian fields and harmonic functions,” in Proceedings of the 20th International Conference on Machine Learning, 2003, pp. 912–919.

[22] C. Cortes, M. Mohri, D. Pechyony, and A. Rastogi, “Stability of transductive regression algorithms,” in Proceedings of the 25th International Conference on Machine Learning, 2008, pp. 176–183.

[23] R. Gray and D. Neuhoff, “Quantization,” IEEE Transactions on Information Theory, vol. 44, no. 6, pp. 2325–2383, 1998.

[24] B. Scholkopf, J. C. Platt, J. Shawe-Taylor, A. J. Smola, and R. C. Williamson, “Estimating the support of a high-dimensional distribution,” Neural Computation, vol. 13, p. 2001, 1999.

[25] T. Hastie, R. Tibshirani, and J. H. Friedman, The Elements of Statistical Learning. Springer, August 2001.

[26] V. N. Vapnik, The nature of statistical learning theory. New York, NY, USA: Springer-Verlag New York, Inc., 1995.

[27] A. Asuncion and D. Newman, “UCI machine learning repository,” 2011. [Online]. Available: http://www.ics.uci.edu/~mlearn/MLRepository.html

[28] J. A. Hanley and B. J. McNeil, “The meaning and use of the area under a receiver operating characteristic (ROC) curve,” Radiology, vol. 143, no. 1, pp. 29–36, April 1982.

[29] U. Luxburg, “A tutorial on spectral clustering,” Statistics and Computing, vol. 17, no. 4, pp. 395–416, 2007.
