pith. machine review for the scientific record.

arxiv: 2605.09685 · v1 · submitted 2026-05-10 · 💻 cs.LG · cs.AI

Recognition: 2 theorem links


Learning Unified Representations of Normalcy for Time Series Anomaly Detection

Alireza Tavakkoli, Nicholas G. Murray, Prithul Sarker, Sushmita Sarker

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 04:31 UTC · model grok-4.3

classification: 💻 cs.LG · cs.AI
keywords: anomaly detection · multivariate time series · score-based generative models · unsupervised learning · normal data distribution · time-dependent network · ODE reconstruction · early detection

The pith

A score-based generative model learns unified normal representations to detect time series anomalies earlier and more accurately

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces U²AD, a framework that uses score-based generative modeling to learn the distribution of normal multivariate time series data. It employs a novel time-dependent score network and a unified training objective to capture both local and global temporal contexts, defining the manifold of normal samples. Reconstruction is then performed with a deterministic ODE solver, and anomalies are flagged as deviations from the learned distribution. A sympathetic reader would care because robust separation of normal and anomalous patterns, without prior knowledge of anomalies, could improve reliability in monitoring systems where early identification matters.

Core claim

The authors establish that U²AD learns the underlying data distribution of normal samples via score-based generative modeling. A novel time-dependent score network and a unified training objective together delineate the manifold of normal data while accounting for both local and global temporal contexts; reconstruction is then performed by a deterministic sampling process using an ordinary differential equation (ODE) solver.

What carries the argument

Score-based generative modeling, with a novel time-dependent score network and a unified training objective that together delineate the manifold of normal data while accounting for local and global temporal contexts.
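As a concrete anchor for the machinery above, here is a minimal denoising score matching sketch under a VP-SDE perturbation kernel. The beta schedule, the closed-form scores, and all names are illustrative assumptions, not the authors' U²AD network:

```python
import numpy as np

rng = np.random.default_rng(0)

# VP-SDE perturbation kernel with a linear beta schedule (illustrative
# choice): x(t) = alpha(t)*x(0) + sigma(t)*eps, with alpha^2 + sigma^2 = 1.
def vp_alpha_sigma(t, beta_min=0.1, beta_max=20.0):
    log_alpha = -0.25 * t**2 * (beta_max - beta_min) - 0.5 * t * beta_min
    alpha = np.exp(log_alpha)
    return alpha, np.sqrt(1.0 - alpha**2)

# Denoising score matching: regress the model score onto the conditional
# score of the perturbation kernel, -(x(t) - alpha*x(0))/sigma^2 = -eps/sigma.
def dsm_loss(score_fn, x0, t):
    alpha, sigma = vp_alpha_sigma(t)
    eps = rng.standard_normal(x0.shape)
    xt = alpha * x0 + sigma * eps
    return np.mean((score_fn(xt, t) + eps / sigma) ** 2)

# For "normal" data x0 ~ N(0, 1) the marginal of x(t) stays N(0, 1)
# (because alpha^2 + sigma^2 = 1), so the true marginal score is s(x) = -x.
x0 = rng.standard_normal(10_000)
loss_true = dsm_loss(lambda x, t: -x, x0, t=0.5)
loss_zero = dsm_loss(lambda x, t: np.zeros_like(x), x0, t=0.5)
print(loss_true < loss_zero)  # the true score attains the lower DSM loss
```

Minimizing this objective over many noise levels is what lets a score network delineate the normal manifold without ever seeing an anomaly.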

If this is right

  • U²AD outperforms current state-of-the-art methods in detection accuracy on multivariate time series.
  • The approach identifies anomalies at significantly earlier stages of their occurrence.
  • Learning the normal data distribution via the proposed network and objective enables better separation from anomalous patterns.
  • Deterministic reconstruction with an ODE solver supports precise anomaly flagging after manifold learning.
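The last bullet can be illustrated end to end on a toy distribution. This is a sketch of the generic recipe (perturb, integrate the probability-flow ODE backward with a score estimate, score by reconstruction error), with a closed-form Gaussian score standing in for a trained network; nothing here is the paper's implementation:

```python
import numpy as np

# Toy "normal" manifold: N(M, S0^2). Its score under VP-SDE perturbation
# is available in closed form and stands in for a trained score network;
# all constants here are illustrative assumptions.
M, S0 = 3.0, 0.1
B_MIN, B_MAX = 0.1, 20.0

def beta(t):
    return B_MIN + t * (B_MAX - B_MIN)

def alpha_sigma(t):
    log_alpha = -0.25 * t**2 * (B_MAX - B_MIN) - 0.5 * t * B_MIN
    a = np.exp(log_alpha)
    return a, np.sqrt(1.0 - a**2)

def score(x, t):
    # Marginal at time t is N(alpha*M, (alpha*S0)^2 + sigma^2).
    a, s = alpha_sigma(t)
    return -(x - a * M) / ((a * S0) ** 2 + s**2)

def reconstruct(x0, T=1.0, steps=1000):
    # Push the input to time T via the kernel mean (noise-free, so the
    # round trip is deterministic), then integrate the probability-flow
    # ODE dx/dt = -0.5*beta(t)*(x + score(x, t)) back to t = 0 with
    # explicit Euler steps.
    a_T, _ = alpha_sigma(T)
    x, dt = a_T * x0, T / steps
    for i in range(steps, 0, -1):
        t = i * dt
        x = x + dt * 0.5 * beta(t) * (x + score(x, t))
    return x

normal_err = abs(reconstruct(3.05) - 3.05)   # near the normal manifold
anomaly_err = abs(reconstruct(6.00) - 6.00)  # far from it
print(normal_err < anomaly_err)  # anomalies reconstruct poorly
```

The score pulls every trajectory back onto the learned normal manifold, so a point that began far from it lands far from where it started; that gap is exactly the deviation a reconstruction-based anomaly score flags.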

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The local-global context handling could make the method adaptable to time series with irregular sampling or varying lengths.
  • The unified normal representation might reduce false positives in noisy real-world environments.
  • Adapting the time-dependent score network for streaming data could support online anomaly detection applications.

Load-bearing premise

That training a score-based generative model exclusively on normal time series samples will produce a representation distinctly separable from anomalous patterns by incorporating local and global temporal contexts.

What would settle it

If benchmark experiments on standard multivariate time series datasets show that U²AD does not achieve higher detection accuracy or earlier anomaly identification than existing methods, the central performance claims would be disproven.

Figures

Figures reproduced from arXiv: 2605.09685 by Alireza Tavakkoli, Nicholas G. Murray, Prithul Sarker, Sushmita Sarker.

Figure 1: Anomaly detection for different kinds of anomalies, i.e. global, contextual, shapelet, seasonal, and trend.
Figure 2: Overview of our proposed framework through denoising score matching and reverse sampling. Left and …
Figure 3: Architecture of the U²AD framework. Input time series x(0) ∈ R^(N×d) are perturbed via VP-SDE to generate noisy samples x(t). The Dual-Pathway Score Network consists of K stacked layers that disentangle global and local temporal dependencies. The global context pathway utilizes multi-head self-attention to capture long-range correlations (ψ), while the local context pathway employs a cosine similarity-based …
Figure 4: Anomaly detection results for 4 benchmark datasets. The figures in the top row depict input time series, while …
Original abstract

The core challenge in unsupervised anomaly detection is identifying abnormal patterns without prior knowledge of their characteristics. While existing methods have addressed aspects of this problem, they often struggle to learn a robust representation of the normal data distribution that is distinct from anomalous patterns. In this paper, we present a novel framework, Unified Unsupervised Anomaly Detection ($\text{U}^2\text{AD}$), that comprehensively addresses anomaly detection in multivariate time series. Our approach learns the underlying data distribution of normal samples by utilizing score-based generative modeling. We introduce a novel time-dependent score network and a unified training objective that together delineate the manifold of normal data while considering both local and global temporal contexts. Reconstruction is then performed via a deterministic sampling process using an ordinary differential equation solver. Our extensive experimental evaluations demonstrate that $\text{U}^2\text{AD}$ not only outperforms current state-of-the-art methods in detection accuracy but also identifies anomalies at significantly earlier stages of their occurrence.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper proposes U²AD, a framework for unsupervised anomaly detection in multivariate time series. It learns the normal data distribution via score-based generative modeling using a novel time-dependent score network and a unified training objective that incorporates both local and global temporal contexts. Reconstruction for anomaly scoring is performed with a deterministic ODE solver. The central claims are that this yields higher detection accuracy than current SOTA methods and identifies anomalies at significantly earlier stages, supported by extensive experiments.

Significance. If the empirical results hold, the work advances score-based generative approaches to time series anomaly detection by explicitly modeling temporal contexts in the normal manifold. The emphasis on earlier detection has clear practical value for monitoring applications. The method follows standard practices in the score-based anomaly detection literature and appears internally consistent.

major comments (1)
  1. [Experimental section] Around the claims of earlier detection: the strongest claim is that anomalies are identified at significantly earlier stages, but the manuscript should explicitly define the timing metric (e.g., lead time in steps or a normalized early-detection score) and show how it is computed uniformly across all baselines to substantiate the cross-method comparison.
minor comments (2)
  1. [Abstract] The claim of outperformance from 'extensive experimental evaluations' would be stronger if the abstract briefly named the datasets, number of baselines, and key metrics (accuracy, early-detection timing) rather than leaving all details to the body.
  2. [§3] Method: the notation and architecture diagram for the time-dependent score network would benefit from an explicit equation showing how the time embedding is injected, to improve clarity and reproducibility.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive assessment of our work and the recommendation for minor revision. The feedback on clarifying the early-detection metric is constructive and will improve the manuscript's clarity and reproducibility. We address the comment below.

Point-by-point responses
  1. Referee: [Experimental section] Around the claims of earlier detection: the strongest claim is that anomalies are identified at significantly earlier stages, but the manuscript should explicitly define the timing metric (e.g., lead time in steps or a normalized early-detection score) and show how it is computed uniformly across all baselines to substantiate the cross-method comparison.

    Authors: We agree that an explicit definition and uniform computation procedure for the early-detection timing metric are required to substantiate the cross-method comparisons. In the revised manuscript, we will add a dedicated paragraph (and supporting equation) in the Experimental Setup subsection of Section 4. The metric will be defined as lead time: for each anomaly instance, the number of time steps between the first detection (when the anomaly score exceeds a fixed, method-agnostic threshold calibrated on validation normal data) and the ground-truth onset of the anomaly. We will specify that this computation is performed identically for all baselines by (i) using the same threshold selection protocol, (ii) applying the same post-processing (e.g., no smoothing unless explicitly part of a baseline), and (iii) averaging lead times only over true-positive detections. Pseudocode and a small illustrative table will be included to ensure transparency. revision: yes
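The lead-time metric proposed in this response is straightforward to pin down in code. A minimal sketch (the helper name and threshold protocol are assumptions, not the manuscript's):

```python
import numpy as np

def lead_times(scores, onsets, threshold):
    """Lead time per anomaly: number of steps from the ground-truth onset
    to the first detection at or after it; anomalies that are never
    detected are skipped, so the average runs over true positives only."""
    detections = np.flatnonzero(np.asarray(scores) >= threshold)
    leads = []
    for onset in onsets:
        later = detections[detections >= onset]
        if later.size:
            leads.append(int(later[0] - onset))
    return leads

# Illustrative anomaly scores with ground-truth onsets at steps 4 and 10:
scores = [0.1, 0.2, 0.1, 0.2, 0.3, 0.9, 0.8, 0.2, 0.1, 0.2, 0.95, 0.3]
print(lead_times(scores, onsets=[4, 10], threshold=0.5))  # [1, 0]
```

Running the same function, with the same threshold-selection protocol, over every baseline's score series is what makes the cross-method comparison the referee asked for well defined.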

Circularity Check

0 steps flagged

No significant circularity in derivation chain

Full rationale

The paper introduces a score-based generative framework (U²AD) that trains a time-dependent score network on normal time-series samples using a unified objective incorporating local and global context, followed by deterministic ODE-based reconstruction for anomaly scoring. This is a standard unsupervised setup in which the model learns the normal manifold empirically; anomaly separability and earlier detection are reported as experimental outcomes on public datasets against baselines, not as mathematical identities. No equations reduce a claimed result to its own fitted inputs by construction, no uniqueness theorems are imported via self-citation, and no ansatz or renaming of known patterns is presented as a derivation. The central claims rest on empirical validation rather than tautological definitions, placing the work in the normal non-circular category for ML method papers.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

Based solely on the abstract, the central claim rests on standard assumptions in generative modeling for time series; no explicit free parameters are detailed, and the only invented entity is the proposed time-dependent score network.

axioms (2)
  • domain assumption Score-based generative modeling can effectively capture the underlying distribution of normal multivariate time series data.
    Invoked as the core of the U²AD approach in the abstract.
  • domain assumption A deterministic sampling process via ODE solver can reconstruct data to identify deviations from the normal manifold.
    Stated as the reconstruction step following training.
invented entities (1)
  • time-dependent score network (no independent evidence)
    purpose: To learn unified representations considering both local and global temporal contexts for delineating the normal data manifold.
    Introduced as novel component in the abstract.

pith-pipeline@v0.9.0 · 5469 in / 1470 out tokens · 42047 ms · 2026-05-12T04:31:21.421761+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

65 extracted references · 65 canonical work pages · 2 internal anchors

  1. [1] Daehyung Park, Yuuna Hoshi, and Charles C. Kemp. A multimodal anomaly detector for robot-assisted feeding using an LSTM-based variational autoencoder. IEEE Robotics and Automation Letters, 3(3):1544–1551, 2018.
  2. [2] Ya Su, Youjian Zhao, Chenhao Niu, Rong Liu, Wei Sun, and Dan Pei. Robust anomaly detection for multivariate time series through stochastic recurrent neural network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2828–2837, 2019.
  3. [3] Bin Zhou, Shenghua Liu, Bryan Hooi, Xueqi Cheng, and Jing Ye. BeatGAN: Anomalous rhythm detection using adversarially generated time series. In IJCAI, pages 4433–4439, 2019.
  4. [4] Dan Li, Dacheng Chen, Baihong Jin, Lei Shi, Jonathan Goh, and See-Kiong Ng. MAD-GAN: Multivariate anomaly detection for time series data with generative adversarial networks. In International Conference on Artificial Neural Networks, pages 703–716. Springer, 2019.
  5. [5] Takehisa Yairi, Naoya Takeishi, Tetsuo Oda, Yuta Nakajima, Naoki Nishimura, and Noboru Takata. A data-driven health monitoring method for satellite housekeeping data based on probabilistic clustering and dimensionality reduction. IEEE Transactions on Aerospace and Electronic Systems, 53(3):1384–1401, 2017.
  6. [6] Bo Zong, Qi Song, Martin Renqiang Min, Wei Cheng, Cristian Lumezanu, Daeki Cho, and Haifeng Chen. Deep autoencoding Gaussian mixture model for unsupervised anomaly detection. In International Conference on Learning Representations, 2018.
  7. [7] Lukas Ruff, Robert Vandermeulen, Nico Goernitz, Lucas Deecke, Shoaib Ahmed Siddiqui, Alexander Binder, Emmanuel Müller, and Marius Kloft. Deep one-class classification. In International Conference on Machine Learning, pages 4393–4402. PMLR, 2018.
  8. [8] Lifeng Shen, Zhuocong Li, and James Kwok. Timeseries anomaly detection using temporal hierarchical one-class network. Advances in Neural Information Processing Systems, 33:13016–13026, 2020.
  9. [9] Jiehui Xu, Haixu Wu, Jianmin Wang, and Mingsheng Long. Anomaly Transformer: Time series anomaly detection with association discrepancy. arXiv preprint arXiv:2110.02642, 2021.
  10. [10] Zhihao Dai, Ligang He, Shuanghua Yang, and Matthew Leeke. SARAD: Spatial association-aware anomaly detection and diagnosis for multivariate time series. Advances in Neural Information Processing Systems, 37:48371–48410, 2024.
  11. [11] Junho Song, Keonwoo Kim, Jeonglyul Oh, and Sungzoon Cho. MEMTO: Memory-guided transformer for multivariate time series anomaly detection. Advances in Neural Information Processing Systems, 36:57947–57963, 2023.
  12. [12] Kwei-Herng Lai, Daochen Zha, Junjie Xu, Yue Zhao, Guanchu Wang, and Xia Hu. Revisiting time series outlier detection: Definitions and benchmarks. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1), 2021.
  13. [13] Aapo Hyvärinen and Peter Dayan. Estimation of non-normalized statistical models by score matching. Journal of Machine Learning Research, 6(4), 2005.
  14. [14] Yang Song, Sahaj Garg, Jiaxin Shi, and Stefano Ermon. Sliced score matching: A scalable approach to density and score estimation. In Uncertainty in Artificial Intelligence, pages 574–584. PMLR, 2020.
  15. [15] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press, 2016.
  16. [16] Zengyi Li, Yubei Chen, and Friedrich T. Sommer. Learning energy-based models in high-dimensional spaces with multi-scale denoising score matching. arXiv preprint arXiv:1910.07762, 2019.
  17. [17] Qiang Liu, Jason Lee, and Michael Jordan. A kernelized Stein discrepancy for goodness-of-fit tests. In International Conference on Machine Learning, pages 276–284. PMLR, 2016.
  18. [18] Pascal Vincent. A connection between score matching and denoising autoencoders. Neural Computation, 23(7):1661–1674, 2011.
  19. [19] Crispin W. Gardiner et al. Handbook of Stochastic Methods, volume 3. Springer Berlin, 1985.
  20. [20] Bernt Øksendal. Stochastic Differential Equations. Springer, 2003.
  21. [21] Brian D. O. Anderson. Reverse-time diffusion equation models. Stochastic Processes and their Applications, 12(3):313–326, 1982.
  22. [22] Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-based generative modeling through stochastic differential equations. arXiv preprint arXiv:2011.13456, 2020.
  23. [23] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.
  24. [24] Dimitra Maoutsa, Sebastian Reich, and Manfred Opper. Interacting particle solutions of Fokker–Planck equations through gradient–log–density estimation. Entropy, 22(8):802, 2020.
  25. [25] Kyle Hundman, Valentino Constantinou, Christopher Laporte, Ian Colwell, and Tom Soderstrom. Detecting spacecraft anomalies using LSTMs and nonparametric dynamic thresholding. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 387–395, 2018.
  26. [26] Aditya P. Mathur and Nils Ole Tippenhauer. SWaT: A water treatment testbed for research and training on ICS security. In 2016 International Workshop on Cyber-physical Systems for Smart Water Networks (CySWater), pages 31–36. IEEE, 2016.
  27. [27] Jonathan Goh, Sridhar Adepu, Khurum Nazir Junejo, and Aditya Mathur. A dataset to support research in the design of secure water treatment systems. In Critical Information Infrastructures Security: 11th International Conference, CRITIS 2016, Paris, France, October 10–12, 2016, Revised Selected Papers 11, pages 88–99. Springer, 2017.
  28. [28] Ahmed Abdulaal, Zhuanghua Liu, and Tomer Lancewicki. Practical approach to asynchronous multivariate time series anomaly detection and localization. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pages 2485–2494, 2021.
  29. [29] Zhihan Li, Youjian Zhao, Jiaqi Han, Ya Su, Rui Jiao, Xidao Wen, and Dan Pei. Multivariate time series anomaly detection and interpretation using hierarchical inter-metric and temporal embedding. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pages 3220–3230, 2021.
  30. [30] Hongzuo Xu, Guansong Pang, Yijie Wang, and Yongjun Wang. Deep isolation forest for anomaly detection. IEEE Transactions on Knowledge and Data Engineering, 35(12):12591–12604, 2023.
  31. [31] Shreshth Tuli, Giuliano Casale, and Nicholas R. Jennings. TranAD: Deep transformer networks for anomaly detection in multivariate time series data. arXiv preprint arXiv:2201.07284, 2022.
  32. [32] Yiyuan Yang, Chaoli Zhang, Tian Zhou, Qingsong Wen, and Liang Sun. DCdetector: Dual attention contrastive representation learning for time series anomaly detection. arXiv preprint arXiv:2306.10347, 2023.
  33. [33] Yuhang Chen, Chaoyun Zhang, Minghua Ma, Yudong Liu, Ruomeng Ding, Bowen Li, Shilin He, Saravan Rajmohan, Qingwei Lin, and Dongmei Zhang. ImDiffusion: Imputed diffusion models for multivariate time series anomaly detection. arXiv preprint arXiv:2307.00754, 2023.
  34. [34] Haixu Wu, Tengge Hu, Yong Liu, Hang Zhou, Jianmin Wang, and Mingsheng Long. TimesNet: Temporal 2D-variation modeling for general time series analysis. arXiv preprint arXiv:2210.02186, 2022.
  35. [35] John Paparrizos, Paul Boniol, Themis Palpanas, Ruey S. Tsay, Aaron Elmore, and Michael J. Franklin. Volume Under the Surface: A new accuracy evaluation measure for time-series anomaly detection. Proceedings of the VLDB Endowment, 15(11):2774–2787, 2022.
  36. [36] Keval Doshi, Shatha Abudalou, and Yasin Yilmaz. Reward once, penalize once: Rectifying time series anomaly detection. In 2022 International Joint Conference on Neural Networks (IJCNN), pages 1–8. IEEE, 2022.
  37. [37] Marco Fraccaro, Søren Kaae Sønderby, Ulrich Paquet, and Ole Winther. Sequential neural models with stochastic layers. Advances in Neural Information Processing Systems, 29, 2016.
  38. [38] Alberto Rodriguez, David Bourne, Mathew Mason, Gregory F. Rossano, and JianJun Wang. Failure detection in assembly: Force signature analysis. In 2010 IEEE International Conference on Automation Science and Engineering, pages 210–215. IEEE, 2010.
  39. [39] Daehyung Park, Hokeun Kim, Yuuna Hoshi, Zackory Erickson, Ariel Kapusta, and Charles C. Kemp. A multimodal execution monitor with anomaly classification for robot-assisted feeding. In 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 5406–5413. IEEE, 2017.
  40. [40] Md Abul Bashar and Richi Nayak. TAnoGAN: Time series anomaly detection with generative adversarial networks. In 2020 IEEE Symposium Series on Computational Intelligence (SSCI), pages 1778–1785. IEEE, 2020.
  41. [41] Julien Audibert, Pietro Michiardi, Frédéric Guyard, Sébastien Marti, and Maria A. Zuluaga. USAD: Unsupervised anomaly detection on multivariate time series. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 3395–3404, 2020.
  42. [42] Chuxu Zhang, Dongjin Song, Yuncong Chen, Xinyang Feng, Cristian Lumezanu, Wei Cheng, Jingchao Ni, Bo Zong, Haifeng Chen, and Nitesh V. Chawla. A deep neural network for unsupervised anomaly detection and diagnosis in multivariate time series data. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 1409–1416, 2019.
  43. [43] Alexander Geiger, Dongyu Liu, Sarah Alnegheimish, Alfredo Cuesta-Infante, and Kalyan Veeramachaneni. TadGAN: Time series anomaly detection using generative adversarial networks. In 2020 IEEE International Conference on Big Data (Big Data), pages 33–43. IEEE, 2020.
  44. [44] Markus M. Breunig, Hans-Peter Kriegel, Raymond T. Ng, and Jörg Sander. LOF: Identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, pages 93–104, 2000.
  45. [45] Bo Zong, Qi Song, Martin Renqiang Min, Wei Cheng, Cristian Lumezanu, Daeki Cho, and Haifeng Chen. Deep autoencoding Gaussian mixture model for unsupervised anomaly detection. In International Conference on Learning Representations, 2018.
  46. [46] Liangwei Zhang, Jing Lin, and Ramin Karim. Adaptive kernel density-based anomaly detection for nonlinear systems. Knowledge-Based Systems, 139:50–63, 2018.
  47. [47] David M. J. Tax and Robert P. W. Duin. Support vector data description. Machine Learning, 54:45–66, 2004.
  48. [48] Jonathan Ho, Ajay Jain, and Pieter Abbeel. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33:6840–6851, 2020.
  49. [49] Yusuke Tashiro, Jiaming Song, Yang Song, and Stefano Ermon. CSDI: Conditional score-based diffusion models for probabilistic time series imputation. Advances in Neural Information Processing Systems, 34:24804–24816, 2021.
  50. [50] Kashif Rasul, Calvin Seward, Ingmar Schuster, and Roland Vollgraf. Autoregressive denoising diffusion models for multivariate probabilistic time series forecasting. In International Conference on Machine Learning, pages 8857–8868. PMLR, 2021.
  51. [51] Yan Li, Xinjiang Lu, Yaqing Wang, and Dejing Dou. Generative time series forecasting with diffusion, denoise, and disentanglement. Advances in Neural Information Processing Systems, 35:23009–23022, 2022.
  52. [52] Haksoo Lim, Sewon Park, Minjung Kim, Jaehoon Lee, Seonkyu Lim, and Noseong Park. MadSGM: Multivariate anomaly detection with score-based generative models. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, pages 1411–1420, 2023.
  53. [53] Guojin Zhong, Jin Yuan, Zhiyong Li, Long Chen, et al. Multi-resolution decomposable diffusion model for non-stationary time series anomaly detection. In The Thirteenth International Conference on Learning Representations, 2025.
  54. [54] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32, 2019.
  55. [55] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  56. [56] Haowen Xu, Wenxiao Chen, Nengwen Zhao, Zeyan Li, Jiahao Bu, Zhihan Li, Ying Liu, Youjian Zhao, Dan Pei, Yang Feng, et al. Unsupervised anomaly detection via variational auto-encoder for seasonal KPIs in web applications. In Proceedings of the 2018 World Wide Web Conference, pages 187–196, 2018.
  57. [57] Dennis Wagner, Tobias Michels, Florian C. F. Schulz, Arjun Nair, Maja Rudolph, and Marius Kloft. TimeSeAD: Benchmarking deep multivariate time-series anomaly detection. Transactions on Machine Learning Research, 2023.
  58. [58] Astha Garg, Wenyu Zhang, Jules Samaran, Ramasamy Savitha, and Chuan-Sheng Foo. An evaluation of anomaly detection and diagnosis in multivariate time series. IEEE Transactions on Neural Networks and Learning Systems, 33(6):2508–2517, 2021.
  59. [59] Hongzuo Xu, Yijie Wang, Songlei Jian, Qing Liao, Yongjun Wang, and Guansong Pang. Calibrated one-class classification for unsupervised time series anomaly detection. arXiv preprint arXiv:2207.12201, 2022.
  60. [60] Fei Tony Liu, Kai Ming Ting, and Zhi-Hua Zhou. Isolation forest. In 2008 Eighth IEEE International Conference on Data Mining, pages 413–422. IEEE, 2008.
  61. [61] Chris U. Carmona, François-Xavier Aubet, Valentin Flunkert, and Jan Gasthaus. Neural contextual anomaly detection for time series. arXiv preprint arXiv:2107.07702, 2021.
  62. [62] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention, MICCAI 2015, pages 234–241. Springer, 2015.
  63. [63] Robert Tibshirani, Guenther Walther, and Trevor Hastie. Estimating the number of clusters in a data set via the gap statistic. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2):411–423, 2001.
