When Normality Shifts: Risk-Aware Test-Time Adaptation for Unsupervised Tabular Anomaly Detection
Pith reviewed 2026-05-12 05:18 UTC · model grok-4.3
The pith
RTTAD combines collaborative dual-task training with risk-aware test-time contrastive learning to adapt models to normality shifts without anomaly contamination.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
RTTAD holistically tackles normality shifts via a synergistic two-stage mechanism. During training, collaborative dual-task learning captures multi-level representations to establish a robust normal prior. During testing, a Test-Time Contrastive Learning (TTCL) module explicitly accounts for adaptation risk by selectively updating the model using high-confidence pseudo-normal samples while constraining anomalous ones. Additionally, TTCL incorporates a k-nearest neighbor-based contrastive objective to refine embedding distributions, thereby further enhancing the model's discriminative capacity.
What carries the argument
The Test-Time Contrastive Learning (TTCL) module, which selectively updates the model on high-confidence pseudo-normal samples and applies a k-nearest-neighbor contrastive objective to refine embeddings while blocking anomalous contamination.
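The gate-then-refine loop can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names `select_pseudo_normals` and `knn_contrastive_loss`, the threshold `tau`, and the loss form are placeholders, since the paper's risk-aware scoring function and contrastive objective are not specified here.

```python
import numpy as np

def select_pseudo_normals(scores, tau):
    """Risk-aware gate: keep only samples whose anomaly score falls below
    the confidence threshold tau; the rest are held out of the update."""
    return np.flatnonzero(scores < tau)

def knn_contrastive_loss(embeddings, selected, k=3):
    """Toy kNN objective: mean distance from each selected embedding to its
    k nearest selected neighbors; minimizing it tightens the normal cluster."""
    z = embeddings[selected]
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)              # a point is not its own neighbor
    return float(np.sort(d, axis=1)[:, :k].mean())

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 8))              # embeddings of an unlabeled test batch
scores = rng.uniform(size=100)               # anomaly scores from the trained model
kept = select_pseudo_normals(scores, tau=0.3)
loss = knn_contrastive_loss(emb, kept, k=3)  # would be backpropagated in practice
```

In a real pipeline the loss would drive gradient updates of the encoder, while samples above `tau` are excluded (or explicitly constrained) so they cannot contaminate the adaptation step.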
If this is right
- Collaborative dual-task learning during training supplies multi-level representations that support later selective adaptation.
- Risk-aware selection of only high-confidence pseudo-normals prevents anomaly contamination during test-time updates.
- The k-nearest-neighbor contrastive objective further refines embedding distributions and improves separation of normal and anomalous points.
- The combined pipeline reaches state-of-the-art detection performance on 15 tabular datasets.
Where Pith is reading between the lines
- The same confidence-gated selection idea could be added to other unsupervised detectors facing deployment shifts without requiring full retraining.
- If the pseudo-normal selection holds up, the method points toward tighter integration of training and testing phases across anomaly detection pipelines.
- Applying the approach to data streams with gradual normality drift would test how well it tracks evolving normal patterns over time.
- Similar risk-aware modules might reduce contamination risks in related tasks such as unsupervised domain adaptation or online clustering.
Load-bearing premise
High-confidence pseudo-normal samples identified at test time contain few enough anomalies that updating on them improves rather than harms the model's ability to distinguish anomalies.
What would settle it
A stress test that deliberately feeds the confidence selector a test set containing many anomalies, then checks whether detection accuracy falls below that of the unadapted baseline model.
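That check could be run with a harness along these lines; the `auroc` helper is standard rank-based AUROC, and the synthetic score vectors are stand-ins for real detector outputs, not results from the paper.

```python
import numpy as np

def auroc(scores, labels):
    """Rank-based AUROC: probability a random anomaly outscores a random normal
    (no tie handling; fine for continuous scores)."""
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(1)
labels = rng.binomial(1, 0.4, size=500)           # 40% anomalies: heavy contamination
baseline = rng.normal(size=500) + 1.5 * labels    # stand-in for unadapted model scores
adapted = rng.normal(size=500) + 1.5 * labels     # stand-in for post-TTCL scores
passes = auroc(adapted, labels) >= auroc(baseline, labels)  # the settling criterion
```

If `passes` is false under heavy contamination, adaptation harmed the model and the load-bearing premise fails.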
Original abstract
Unsupervised tabular anomaly detection methods typically learn feature patterns from normal samples during training and subsequently identify samples that deviate from these patterns as anomalies during testing. However, in practical scenarios, the limited scale and diversity of training data often lead to an incomplete characterization of normal patterns. While test-time adaptation offers a remedy, its isolated focus on test-time optimization ignores the critical synergy with training-phase learning. Furthermore, indiscriminate adaptation to unlabeled test data inevitably triggers anomaly contamination, preventing the model from fully realizing its discriminative capability between normal and anomalous samples. To address these issues, we propose RTTAD, a Risk-aware Test-time adaptation method for unsupervised Tabular Anomaly Detection. RTTAD holistically tackles normality shifts via a synergistic two-stage mechanism. During training, collaborative dual-task learning captures multi-level representations to establish a robust normal prior. During testing, a Test-Time Contrastive Learning (TTCL) module explicitly accounts for adaptation risk by selectively updating the model using high-confidence pseudo-normal samples while constraining anomalous ones. Additionally, TTCL incorporates a k-nearest neighbor-based contrastive objective to refine embedding distributions, thereby further enhancing the model's discriminative capacity. Extensive experiments on 15 tabular datasets demonstrate that RTTAD achieves state-of-the-art overall detection performance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes RTTAD, a two-stage risk-aware test-time adaptation framework for unsupervised tabular anomaly detection. It combines collaborative dual-task learning in training to build a robust normal prior with a Test-Time Contrastive Learning (TTCL) module that performs selective model updates on high-confidence pseudo-normal samples identified via risk-aware scoring and a k-nearest-neighbor contrastive objective, aiming to handle normality shifts without anomaly contamination. Experiments on 15 tabular datasets are claimed to yield state-of-the-art detection performance.
Significance. If the core mechanisms hold and the reported gains are reproducible, the work would offer a practical advance in tabular anomaly detection by explicitly addressing the synergy between training priors and test-time adaptation while mitigating contamination risks, which is a common failure mode in existing TTA methods for this domain.
major comments (1)
- [§4, Algorithm 1] The central claim that TTCL enables safe adaptation and SOTA performance rests on the assumption that high-confidence pseudo-normal samples are sufficiently pure; however, the manuscript provides no contamination-rate analysis, no ablation studies on noisy pseudo-labels, and no worst-case evaluation when the initial training prior is weak. Even modest anomaly leakage into the update set could cause the kNN contrastive objective to degrade rather than refine the embedding space, directly undermining the performance numbers on the 15 datasets.
minor comments (1)
- [Abstract] The abstract states SOTA results on 15 datasets but omits any mention of baselines, metrics, statistical significance tests, or ablation controls; these details should be summarized early in the introduction or experiments section for clarity.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback on our manuscript. We address the major comment below and outline the revisions we will make.
Point-by-point responses
Referee: [§4, Algorithm 1] The central claim that TTCL enables safe adaptation and SOTA performance rests on the assumption that high-confidence pseudo-normal samples are sufficiently pure; however, the manuscript provides no contamination-rate analysis, no ablation studies on noisy pseudo-labels, and no worst-case evaluation when the initial training prior is weak. Even modest anomaly leakage into the update set could cause the kNN contrastive objective to degrade rather than refine the embedding space, directly undermining the performance numbers on the 15 datasets.
Authors: We appreciate the referee highlighting the importance of validating the purity of the high-confidence pseudo-normal samples selected by the risk-aware scoring. The TTCL module is explicitly designed to mitigate contamination by using risk-aware scoring to prioritize samples aligned with the normal prior from collaborative dual-task learning and by constraining updates on lower-confidence (potentially anomalous) samples via the selective mechanism. However, we acknowledge that the current manuscript does not include explicit contamination-rate measurements, dedicated ablations on noisy pseudo-labels, or worst-case evaluations under weakened training priors. In the revised version, we will add: (1) quantitative analysis of contamination rates in the selected update sets across all 15 datasets, (2) ablation experiments that inject controlled levels of anomaly leakage into the pseudo-normal set and report the resulting impact on detection performance and embedding quality, and (3) additional experiments simulating weaker initial priors (e.g., by subsampling the training data) to evaluate robustness of the kNN contrastive objective. These additions will directly demonstrate that the selective update strategy prevents degradation and supports the reported gains.
Revision: yes
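The proposed noisy-pseudo-label ablation could be set up along these lines; the function name `inject_leakage` and the leakage rates are hypothetical, chosen only to illustrate the controlled-contamination protocol.

```python
import numpy as np

def inject_leakage(pseudo_normal_idx, anomaly_idx, rate, rng):
    """Contaminate a pseudo-normal update set with a controlled fraction of
    true anomalies, so the effect on detection performance can be measured."""
    n_leak = int(rate * len(pseudo_normal_idx))
    leaked = rng.choice(anomaly_idx, size=n_leak, replace=False)
    return np.concatenate([pseudo_normal_idx, leaked])

rng = np.random.default_rng(0)
normals = np.arange(0, 80)        # indices the selector marked pseudo-normal
anomalies = np.arange(80, 100)    # held-out true-anomaly indices
for rate in (0.0, 0.05, 0.1, 0.2):
    update_set = inject_leakage(normals, anomalies, rate, rng)
    # ...re-run test-time adaptation on update_set and record detection AUROC...
```

Plotting detection performance against the leakage rate would show directly how much contamination the selective update strategy tolerates before it degrades below the unadapted baseline.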
Circularity Check
No circularity: RTTAD components are independently specified and experimentally validated
full rationale
The abstract and description introduce a two-stage architecture (collaborative dual-task learning for normal prior + risk-aware TTCL with kNN contrastive objective and selective pseudo-normal updates) whose definitions and loss terms are presented as novel design choices rather than reductions of fitted quantities or self-citations. No equations, uniqueness theorems, or ansatzes are shown to be defined in terms of the target performance metrics. Performance claims rest on external experiments across 15 datasets, satisfying the self-contained criterion.
Axiom & Free-Parameter Ledger
free parameters (2)
- confidence threshold for pseudo-normal selection
- k in k-nearest-neighbor contrastive objective
axioms (1)
- Domain assumption: High-confidence pseudo-normal samples identified from unlabeled test data are sufficiently clean of anomalies
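Sensitivity of the two free parameters could be probed with a simple grid sweep; the grid values and the `run_rttad` stub below are illustrative placeholders, not settings from the paper.

```python
import itertools

# illustrative grids for the two ledger parameters
taus = [0.1, 0.2, 0.3]   # confidence threshold for pseudo-normal selection
ks = [3, 5, 10]          # k in the k-nearest-neighbor contrastive objective

def run_rttad(tau, k):
    """Stub for one full train / adapt / evaluate cycle at a given setting."""
    return {"tau": tau, "k": k, "auroc": None}   # record the real metric here

results = [run_rttad(t, k) for t, k in itertools.product(taus, ks)]
```

A flat performance surface over this grid would support robustness of the method; strong sensitivity to `tau` in particular would sharpen the contamination concern raised in the referee report.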
Reference graph
Works this paper leans on
[1] T. Fernando, H. Gammulle, S. Denman, S. Sridharan, and C. Fookes, "Deep learning for medical anomaly detection – a survey," ACM Computing Surveys (CSUR), vol. 54, no. 7, pp. 1–37, 2021.
[2] H. Qiao and G. Pang, "Truncated affinity maximization: One-class homophily modeling for graph anomaly detection," Advances in Neural Information Processing Systems, vol. 36, pp. 49490–49512, 2023.
[3] H. Qiao, Q. Wen, X. Li, E.-P. Lim, and G. Pang, "Generative semi-supervised graph anomaly detection," in Advances in Neural Information Processing Systems, 2024.
[4] C. Niu, H. Qiao, C. Chen, L. Chen, and G. Pang, "Zero-shot generalist graph anomaly detection with unified neighborhood prompts," arXiv preprint arXiv:2410.14886, 2024.
[5] K. G. Al-Hashedi and P. Magalingam, "Financial fraud detection applying data mining techniques: A comprehensive review from 2009 to 2019," Computer Science Review, vol. 40, p. 100402, 2021.
[6] H. Qiao, H. Tong, B. An, I. King, C. Aggarwal, and G. Pang, "Deep graph anomaly detection: A survey and new perspectives," arXiv preprint arXiv:2409.09957, 2024.
[7] H. Qiao, C. Niu, L. Chen, and G. Pang, "AnomalyGFM: Graph foundation model for zero/few-shot anomaly detection," arXiv preprint arXiv:2502.09254, 2025.
[8] J. Liu, G. Xie, J. Wang, S. Li, C. Wang, F. Zheng, and Y. Jin, "Deep industrial image anomaly detection: A survey," Machine Intelligence Research, vol. 21, no. 1, pp. 104–135, 2024.
[9] B. Schölkopf, R. C. Williamson, A. Smola, J. Shawe-Taylor, and J. Platt, "Support vector method for novelty detection," Advances in Neural Information Processing Systems, vol. 12, 1999.
[10] L. Ruff, R. Vandermeulen, N. Goernitz, L. Deecke, S. A. Siddiqui, A. Binder, E. Müller, and M. Kloft, "Deep one-class classification," in International Conference on Machine Learning. PMLR, 2018, pp. 4393–4402.
[11] B. Liu, P.-N. Tan, and J. Zhou, "Unsupervised anomaly detection by robust density estimation," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, 2022, pp. 4101–4108.
[12] M. Ali, P. Scandurra, F. Moretti, and H. H. R. Sherazi, "Anomaly detection in public street lighting data using unsupervised clustering," IEEE Transactions on Consumer Electronics, 2024.
[13] Z. Li, Y. Zhao, X. Hu, N. Botta, C. Ionescu, and G. H. Chen, "ECOD: Unsupervised outlier detection using empirical cumulative distribution functions," IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 12, pp. 12181–12193, 2022.
[14] T. Schlegl, P. Seeböck, S. M. Waldstein, U. Schmidt-Erfurth, and G. Langs, "Unsupervised anomaly detection with generative adversarial networks to guide marker discovery," in International Conference on Information Processing in Medical Imaging. Springer, 2017, pp. 146–157.
[15] D. Gong, L. Liu, V. Le, B. Saha, M. R. Mansour, S. Venkatesh, and A. v. d. Hengel, "Memorizing normality to detect anomaly: Memory-augmented deep autoencoder for unsupervised anomaly detection," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, pp. 1705–1714.
[16] T. Shenkar and L. Wolf, "Anomaly detection for tabular data with internal contrastive learning," in International Conference on Learning Representations, 2022.
[17] J. Yin, Y. Qiao, Z. Zhou, X. Wang, and J. Yang, "MCM: Masked cell modeling for anomaly detection in tabular data," in The Twelfth International Conference on Learning Representations, 2024.
[18] H. Ye, H. Zhao, W. Fan, M. Zhou, D. dan Guo, and Y. Chang, "DRL: Decomposed representation learning for tabular anomaly detection," in The Thirteenth International Conference on Learning Representations, 2025.
[19] D. M. Tax and R. P. Duin, "Support vector data description," Machine Learning, vol. 54, pp. 45–66, 2004.
[20] S. Goyal, A. Raghunathan, M. Jain, H. V. Simhadri, and P. Jain, "DROCC: Deep robust one-class classification," in International Conference on Machine Learning. PMLR, 2020, pp. 3711–3721.
[21] F. V. Massoli, F. Falchi, A. Kantarci, Ş. Aktı, H. K. Ekenel, and G. Amato, "MOCCA: Multilayer one-class classification for anomaly detection," IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 6, pp. 2313–2323, 2021.
[22] H. Xu, Y. Wang, S. Jian, Q. Liao, Y. Wang, and G. Pang, "Calibrated one-class classification for unsupervised time series anomaly detection," IEEE Transactions on Knowledge and Data Engineering, 2024.
[23] M. M. Breunig, H.-P. Kriegel, R. T. Ng, and J. Sander, "LOF: Identifying density-based local outliers," in Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, 2000, pp. 93–104.
[24] B. Zong, Q. Song, M. R. Min, W. Cheng, C. Lumezanu, D. Cho, and H. Chen, "Deep autoencoding Gaussian mixture model for unsupervised anomaly detection," in International Conference on Learning Representations, 2018.
[25] T. Schlegl, P. Seeböck, S. M. Waldstein, G. Langs, and U. Schmidt-Erfurth, "f-AnoGAN: Fast unsupervised anomaly detection with generative adversarial networks," Medical Image Analysis, vol. 54, pp. 30–44, 2019.
[26] V. Zavrtanik, M. Kristan, and D. Skočaj, "DRAEM – a discriminatively trained reconstruction embedding for surface anomaly detection," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 8330–8339.
[27] M. Z. Zaheer, A. Mahmood, M. H. Khan, M. Segu, F. Yu, and S.-I. Lee, "Generative cooperative learning for unsupervised video anomaly detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 14744–14754.
[28] X. Zhang, N. Li, J. Li, T. Dai, Y. Jiang, and S.-T. Xia, "Unsupervised surface anomaly detection with diffusion probabilistic model," in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 6782–6791.
[29] J. Guo, L. Jia, W. Zhang, H. Li et al., "ReContrast: Domain-specific anomaly detection via contrastive reconstruction," Advances in Neural Information Processing Systems, vol. 36, 2024.
[30] L. Bergman and Y. Hoshen, "Classification-based anomaly detection for general data," arXiv preprint arXiv:2005.02359, 2020.
[31] C. Qiu, T. Pfrommer, M. Kloft, S. Mandt, and M. Rudolph, "Neural transformation learning for deep anomaly detection beyond images," in International Conference on Machine Learning. PMLR, 2021, pp. 8703–8714.
[32] X. Liu, X. Liu, B. Hu, W. Ji, F. Xing, J. Lu, J. You, C.-C. J. Kuo, G. El Fakhri, and J. Woo, "Subtype-aware unsupervised domain adaptation for medical diagnosis," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, 2021, pp. 2189–2197.
[33] Y. Liu, P. Kothari, B. Van Delft, B. Bellot-Gurlet, T. Mordan, and A. Alahi, "TTT++: When does self-supervised test-time training fail or thrive?" Advances in Neural Information Processing Systems, vol. 34, pp. 21808–21820, 2021.
[34] M. Shu, W. Nie, D.-A. Huang, Z. Yu, T. Goldstein, A. Anandkumar, and C. Xiao, "Test-time prompt tuning for zero-shot generalization in vision-language models," Advances in Neural Information Processing Systems, vol. 35, pp. 14274–14289, 2022.
[35] M. Jang, S.-Y. Chung, and H. W. Chung, "Test-time adaptation via self-training with nearest neighbor information," in International Conference on Learning Representations (ICLR), 2023.
[36] D. Kim, S. Park, and J. Choo, "When model meets new normals: Test-time adaptation for unsupervised time-series anomaly detection," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, 2024, pp. 13113–13121.
[37] S. Cohen, N. Goldshlager, L. Rokach, and B. Shapira, "Boosting anomaly detection using unsupervised diverse test-time augmentation," Information Sciences, vol. 626, pp. 821–836, 2023.
[38] S. Patra and S. B. Taieb, "An evidence-based post-hoc adjustment framework for anomaly detection under data contamination," in The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.
[39] L. Perini, P.-C. Bürkner, and A. Klami, "Estimating the contamination factor's distribution in unsupervised anomaly detection," in International Conference on Machine Learning. PMLR, 2023, pp. 27668–27679.
[40] S. Rayana, "ODDS library," 2016. [Online]. Available: https://odds.cs.stonybrook.edu
[41] S. Han, X. Hu, H. Huang, M. Jiang, and Y. Zhao, "ADBench: Anomaly detection benchmark," Advances in Neural Information Processing Systems, vol. 35, pp. 32142–32159, 2022.
[42] M. Dragoi, E. Burceanu, E. Haller, A. Manolache, and F. Brad, "AnoShift: A distribution shift benchmark for unsupervised anomaly detection," Advances in Neural Information Processing Systems, vol. 35, pp. 32854–32867, 2022.
[43] F. T. Liu, K. M. Ting, and Z.-H. Zhou, "Isolation forest," in 2008 Eighth IEEE International Conference on Data Mining. IEEE, 2008, pp. 413–422.
[44] H. Xu, G. Pang, Y. Wang, and Y. Wang, "Deep isolation forest for anomaly detection," IEEE Transactions on Knowledge and Data Engineering, vol. 35, no. 12, pp. 12591–12604, 2023.
[45] H. Xu, Y. Wang, J. Wei, S. Jian, Y. Li, and N. Liu, "Fascinating supervisory signals and where to find them: Deep anomaly detection with scale learning," in International Conference on Machine Learning. PMLR, 2023, pp. 38655–38673.
[46] A. Goodge, B. Hooi, S.-K. Ng, and W. S. Ng, "LUNAR: Unifying local outlier detection methods via graph neural networks," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, 2022, pp. 6737–6745.
[47] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga et al., "PyTorch: An imperative style, high-performance deep learning library," Advances in Neural Information Processing Systems, vol. 32, 2019.
[48] Y. Zhao, Z. Nasrullah, and Z. Li, "PyOD: A Python toolbox for scalable outlier detection," Journal of Machine Learning Research, vol. 20, no. 96, pp. 1–7, 2019.
[49] R. F. Woolson, "Wilcoxon signed-rank test," Wiley Encyclopedia of Clinical Trials, pp. 1–3, 2007.