COGNOS: Universal Enhancement for Time Series Anomaly Detection via Constrained Gaussian-Noise Optimization and Smoothing

Peng Chang; Shihao Tian; Wenlong Shang; Xutong Wan

arxiv: 2511.06894 · v3 · submitted 2025-11-10 · 💻 cs.LG · cs.AI

COGNOS: Universal Enhancement for Time Series Anomaly Detection via Constrained Gaussian-Noise Optimization and Smoothing

Wenlong Shang , Shihao Tian , Xutong Wan , Peng Chang This is my paper

Pith reviewed 2026-05-17 23:46 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords time series anomaly detectionGaussian white noise regularizationKalman smootherreconstruction residualsmodel-agnostic frameworkadaptive filteringanomaly score denoising

0 comments

The pith

Forcing reconstruction residuals into Gaussian white noise lets a Kalman smoother clean up anomaly scores in time series detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Reconstruction methods for spotting anomalies in time series data typically use mean squared error to train models, but this creates residuals that are statistically messy and lead to unreliable anomaly scores. COGNOS adds a regularization during training to make those residuals behave like Gaussian white noise. This property then lets an adaptive Kalman smoother denoise the scores in a robust way. The whole thing attaches to any existing model without internal changes, and tests on benchmarks show it lifts performance of current top methods.

Core claim

The authors establish that a training constraint enforcing Gaussian white noise residuals creates the right conditions for an Adaptive Residual Kalman Smoother to act as a robust estimator, yielding denoised anomaly scores that improve detection when added to any reconstruction-based backbone.

What carries the argument

Gaussian-White Noise Regularization that constrains residuals to a Gaussian white noise distribution, enabling the Adaptive Residual Kalman Smoother to denoise raw anomaly scores.

If this is right

COGNOS can be added to existing state-of-the-art models to boost their anomaly detection results.
The statistical regularization addresses the root cause of noisy scores in reconstruction approaches.
Adaptive filtering combined with the regularization produces more stable and accurate anomaly scores.
The method works across multiple benchmark datasets without model-specific tuning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This could extend to improving reconstruction tasks in other fields like audio or video anomaly detection.
It highlights the value of shaping error distributions explicitly rather than hoping for good behavior.
Testing with different smoothing algorithms might reveal even stronger combinations.

Load-bearing premise

That forcing reconstruction residuals to conform to a Gaussian white noise distribution during training creates an ideal precondition for the Adaptive Residual Kalman Smoother to produce meaningfully better anomaly scores across arbitrary backbone models.

What would settle it

A direct test where the Gaussian regularization is enforced but the resulting anomaly scores after smoothing show no improvement in precision or recall on standard time series anomaly benchmarks compared to the original backbone.

Figures

Figures reproduced from arXiv: 2511.06894 by Peng Chang, Shihao Tian, Wenlong Shang, Xutong Wan.

**Figure 1.** Figure 1: We present an analysis of TimesNet model on the SWaT dataset: (a) The resulting anomaly score is highly noisy, creating a poor signal-to-noise ratio where true anomalous deviations are masked. (b) The autocorrelation plot shows significant temporal correlation remaining in the residuals, suggesting the model failed to capture all predictable patterns. (c) The Q-Q plot reveals that reconstruction residuals… view at source ↗

**Figure 2.** Figure 2: COGNOS: Backbones are trained with GWNR Loss, and ARKS is used during inference to generate stable anomaly scores Ractive = R ⊙ M. 4.1.2. TIME DOMAIN: GLOBAL ENERGY CONSTRAINT To ensure faithful reconstruction, we employ the Mean Squared Error as the primary anchor. This term minimizes the total energy of the residual, forcing the model to capture the fundamental signal structure: LMSE = 1 N ∥R∥ 2 F , (3) … view at source ↗

**Figure 3.** Figure 3: Qualitative Analysis on Signal Purity and Statistical Physics. Top Row: Anomaly score dynamics on GECCO and PSM datasets. Bottom Row: Statistical diagnostics of reconstruction residuals [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗

**Figure 4.** Figure 4: The impact of the Confidence Level (1−α) on KAN-AD backbone across the SMAP and SWAN datasets. tection: the statistically flawed and noisy anomaly scores generated by standard MSE-based training and inference. Experiments demonstrate that COGNOS is model-agnostic and highly effective, delivering substantial performance gains. Our results confirm this improvement arises from the strong synergy between our r… view at source ↗

**Figure 5.** Figure 5: Residual analysis on the SWaT dataset using TimesNet. Left: vanilla method, Right: COGNOS. 0 10 20 30 40 50 Lag 1.00 0.75 0.50 0.25 0.00 0.25 0.50 0.75 1.00 Autocorrelation Autocorrelation for Each Channel 0 10 20 30 40 50 Lag 1.00 0.75 0.50 0.25 0.00 0.25 0.50 0.75 1.00 Autocorrelation Autocorrelation for Each Channel 2 1 0 1 2 Standard Gaussain Distribution Quantiles 5.0 2.5 0.0 2.5 5.0 7.5 10.0 12.5 Nor… view at source ↗

**Figure 6.** Figure 6: Residual analysis on the PSM dataset using TimeMixer++. Left: vanilla method, Right: COGNOS. 18 [PITH_FULL_IMAGE:figures/full_fig_p018_6.png] view at source ↗

**Figure 7.** Figure 7: Residual analysis on the SMAP dataset using CrossAD. Left: vanilla method, Right: COGNOS. 0 100 200 300 400 Time Step Ch1 Ch2 Ch3 Ch4 Ch5 Ch6 Ch7 Ch8 Ch9 Ch10 Ch11 Ch12 Ch13 Ch14 Ch15 Ch16 Ch17 Ch18 Ch19 Ch20 Ch21 Ch22 Ch23 Ch24 Ch25 Channels 0.00 0.01 0.02 0.03 Log(Anomaly Score) 0 100 200 300 400 Time Step Ch1 Ch2 Ch3 Ch4 Ch5 Ch6 Ch7 Ch8 Ch9 Ch10 Ch11 Ch12 Ch13 Ch14 Ch15 Ch16 Ch17 Ch18 Ch19 Ch20 Ch21 Ch2… view at source ↗

**Figure 8.** Figure 8: Qualitative comparison of anomaly scores on PSM using TimeMixer++. Left: vanilla method, Right: COGNOS. 0 100 200 300 400 Time Step Ch1 Ch2 Ch3 Ch4 Ch5 Ch6 Ch7 Ch8 Ch9 Channels 0.000 0.005 0.010 0.015 Log(Anomaly Score) 0 100 200 300 400 Time Step Ch1 Ch2 Ch3 Ch4 Ch5 Ch6 Ch7 Ch8 Ch9 Channels 0.0 0.1 0.2 0.3 Log(Anomaly Score) [PITH_FULL_IMAGE:figures/full_fig_p019_8.png] view at source ↗

**Figure 9.** Figure 9: Qualitative comparison of anomaly scores on GECCO using ModernTCN. Left: vanilla method, Right: COGNOS. 0 100 200 300 400 Time Step Ch1 Ch2 Ch3 Ch4 Ch5 Ch6 Ch7 Ch8 Ch9 Ch10 Ch11 Ch12 Ch13 Ch14 Ch15 Ch16 Ch17 Ch18 Ch19 Ch20 Ch21 Ch22 Ch23 Ch24 Ch25 Channels 0 1 2 3 Log(Anomaly Score) 0 100 200 300 400 Time Step Ch1 Ch2 Ch3 Ch4 Ch5 Ch6 Ch7 Ch8 Ch9 Ch10 Ch11 Ch12 Ch13 Ch14 Ch15 Ch16 Ch17 Ch18 Ch19 Ch20 Ch21 C… view at source ↗

**Figure 10.** Figure 10: Qualitative comparison of anomaly scores on PSM using KAN-AD. Left: vanilla method, Right: COGNOS. 19 [PITH_FULL_IMAGE:figures/full_fig_p019_10.png] view at source ↗

read the original abstract

Reconstruction-based methods are a dominant paradigm in time series anomaly detection (TSAD), however, their near-universal reliance on Mean Squared Error (MSE) loss results in statistically flawed reconstruction residuals. This fundamental weakness leads to noisy, unstable anomaly scores, hindering reliable detection. To address this, we propose Constrained Gaussian-Noise Optimization and Smoothing (COGNOS), a universal, model-agnostic enhancement framework that tackles this issue at its source. COGNOS introduces a novel Gaussian-White Noise Regularization strategy during training, which directly constrains the model's output residuals to conform to a Gaussian white noise distribution. This engineered statistical property creates the ideal precondition for our second contribution: Adaptive Residual Kalman Smoother that operates as a statistically robust estimator to denoise the raw anomaly scores. Extensive experiments on multiple benchmarks demonstrate that COGNOS consistently enhances the performance of state-of-the-art backbones significantly, validating the efficacy of coupling statistical regularization with adaptive filtering.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

COGNOS adds Gaussian white noise regularization during training plus Kalman smoothing on residuals as a plug-in upgrade for reconstruction-based time series anomaly detectors, but the evidence that this pairing actually improves separability remains thin.

read the letter

The main thing to know is that the paper targets the statistical messiness of MSE residuals in reconstruction models and tries to fix it with a training constraint that pushes residuals toward Gaussian white noise, then applies an adaptive Kalman smoother to clean the anomaly scores afterward. This combo is presented as model-agnostic and easy to add to existing backbones.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes COGNOS, a model-agnostic enhancement for reconstruction-based time series anomaly detection. It adds a Gaussian-White Noise Regularization term during training that constrains reconstruction residuals to follow a Gaussian white noise distribution, which is presented as creating an ideal precondition for a subsequent Adaptive Residual Kalman Smoother that denoises raw anomaly scores. The authors claim that this combination yields consistent and significant performance gains when applied to state-of-the-art backbones across multiple benchmarks.

Significance. If the regularization term can be shown to enforce uncorrelated Gaussian residuals specifically on normal data while preserving or increasing separability on anomalies, and if the Kalman smoother then produces reliably better scores, the framework would constitute a useful plug-in improvement for existing reconstruction-based TSAD pipelines. The coupling of statistical regularization with adaptive filtering is conceptually coherent, but its practical value rests on concrete evidence that the precondition actually holds and translates into measurable detection gains.

major comments (2)

[Abstract] Abstract: the central claim that the Gaussian-White Noise Regularization 'creates the ideal precondition' for the Adaptive Residual Kalman Smoother is load-bearing, yet the abstract supplies no mathematical formulation of the constraint (moment matching, autocorrelation penalty, distributional divergence, or otherwise). Without this, it is impossible to verify whether the term enforces white-noise statistics on normal residuals or merely permits trivial adjustments that leave the statistical flaws of MSE unaddressed.
[Abstract] Abstract / Experiments: the assertion that 'extensive experiments on multiple benchmarks demonstrate that COGNOS consistently enhances the performance of state-of-the-art backbones significantly' is presented without any quantitative results, tables, baseline comparisons, or statistical significance tests. This absence prevents assessment of whether the reported gains are robust or whether they could be explained by the smoother alone rather than the claimed coupling with the regularization.

minor comments (1)

[Abstract] The acronym COGNOS is introduced without an explicit expansion that maps each word to the two technical contributions (regularization and smoother).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We address each major comment below and outline the revisions we will make to strengthen the manuscript.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim that the Gaussian-White Noise Regularization 'creates the ideal precondition' for the Adaptive Residual Kalman Smoother is load-bearing, yet the abstract supplies no mathematical formulation of the constraint (moment matching, autocorrelation penalty, distributional divergence, or otherwise). Without this, it is impossible to verify whether the term enforces white-noise statistics on normal residuals or merely permits trivial adjustments that leave the statistical flaws of MSE unaddressed.

Authors: We agree that the abstract would be clearer with a brief indication of how the regularization is formulated. The full mathematical definition appears in Section 3.2 of the manuscript, where the regularization term combines moment matching (to enforce zero mean and unit variance) with an autocorrelation penalty (to enforce uncorrelated residuals) applied to reconstruction errors on normal segments. This is not a trivial adjustment; it is designed to hold specifically for normal data while leaving anomaly-induced deviations intact. In the revised version we will add a short clause to the abstract summarizing this formulation at a high level. revision: yes
Referee: [Abstract] Abstract / Experiments: the assertion that 'extensive experiments on multiple benchmarks demonstrate that COGNOS consistently enhances the performance of state-of-the-art backbones significantly' is presented without any quantitative results, tables, baseline comparisons, or statistical significance tests. This absence prevents assessment of whether the reported gains are robust or whether they could be explained by the smoother alone rather than the claimed coupling with the regularization.

Authors: We acknowledge that the abstract, owing to length constraints, does not contain numerical results. The manuscript already provides these details in Section 4, including tables with F1/AUC improvements across backbones and datasets, baseline comparisons, and ablation studies that isolate the contribution of the regularization from the smoother alone. To address the concern directly, we will revise the abstract to include a concise quantitative highlight (e.g., “yielding consistent relative gains of 8–18 % in F1-score”) and will add a short sentence noting that ablations confirm the necessity of both components. Full tables and significance tests remain in the experimental section. revision: partial

Circularity Check

0 steps flagged

No significant circularity; regularization and smoother are independent contributions

full rationale

The paper's core derivation introduces a Gaussian-white-noise regularization term during training to enforce a statistical property on reconstruction residuals, followed by a separate Adaptive Residual Kalman Smoother applied to the resulting anomaly scores. These steps are presented as sequential but distinct: the regularization is not defined in terms of the smoother's output, nor is the smoother's improvement defined by construction from the regularization parameters. No equations reduce the claimed performance gain to a fitted input on the evaluation data, no self-citation chain bears the central claim, and no uniqueness theorem or ansatz is smuggled in. The experimental validation on benchmarks is external to the derivation itself, making the chain self-contained against the provided description.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review performed on abstract only; no explicit free parameters, invented entities, or non-standard axioms are named. The central premise rests on the domain assumption that MSE residuals are statistically flawed and that Gaussian white noise is the correct target distribution.

axioms (1)

domain assumption Reconstruction-based TSAD methods universally rely on MSE loss, producing statistically flawed residuals that hinder reliable detection.
Stated as the fundamental weakness in the opening sentence of the abstract.

pith-pipeline@v0.9.0 · 5473 in / 1304 out tokens · 46766 ms · 2026-05-17T23:46:13.656663+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

GWNR Loss ... Spectral Whitening ... Wavelet MMD ... Adaptive Residual Kalman Smoother (ARKS) ... LG-SSM ... Circuit Breaker
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean absolute_floor_iff_bare_distinguishability unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Wold Decomposition Theorem ... residuals approximate Gaussian white noise

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

39 extracted references · 39 canonical work pages

[1]

, " * write output.state after.block = add.period write newline

ENTRY address archivePrefix author booktitle chapter edition editor eid eprint howpublished institution isbn journal key month note number organization pages publisher school series title type volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION init.state.consts #0 'before.a...

work page
[2]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...

work page
[3]

Abdulaal, A.; Liu, Z.; and Lancewicki, T. 2021. Practical approach to asynchronous multivariate time series anomaly detection and localization. In Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining, 2485--2494

work page 2021
[4]

Campos, D.; Zhang, M.; Yang, B.; Kieu, T.; Guo, C.; and Jensen, C. S. 2023. Lightts: Lightweight time series classification with adaptive ensemble distillation. Proceedings of the ACM on Management of Data, 1(2): 1--27

work page 2023
[5]

Z.; Webb, G

Darban, Z. Z.; Webb, G. I.; Pan, S.; Aggarwal, C. C.; and Salehi, M. 2025. CARLA: Self-supervised contrastive representation learning for time series anomaly detection. Pattern Recognition, 157: 110874

work page 2025
[6]

K.; Li, X.; and Guan, C

Eldele, E.; Ragab, M.; Chen, Z.; Wu, M.; Kwoh, C. K.; Li, X.; and Guan, C. 2021. Time-Series Representation Learning via Temporal and Contextual Contrasting. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2352--2359. International Joint Conferences on Artificial Intelligence Organization

work page 2021
[7]

Franceschi, J.-Y.; Dieuleveut, A.; and Jaggi, M. 2019. Unsupervised scalable representation learning for multivariate time series. Advances in Neural Information Processing Systems, 32

work page 2019
[8]

Huang, X.; Chen, W.; Hu, B.; and Mao, Z. 2025. Graph Mixture of Experts and Memory-augmented Routers for Multivariate Time Series Anomaly Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 39(16): 17476--17484

work page 2025
[9]

Huang, X.; Zhang, F.; Wang, R.; Lin, X.; Liu, H.; and Fan, H. 2023. KalmanAE: Deep embedding optimized Kalman filter for time series anomaly detection. IEEE Transactions on Instrumentation and Measurement, 72: 1--11

work page 2023
[10]

M.; and Rossi, D

Huet, A.; Navarro, J. M.; and Rossi, D. 2022. Local Evaluation of Time Series Anomaly Detection Algorithms. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD '22, 635–645. New York, NY, USA: Association for Computing Machinery. ISBN 9781450393850

work page 2022
[11]

Hundman, K.; Constantinou, V.; Laporte, C.; Colwell, I.; and Soderstrom, T. 2018. Detecting spacecraft anomalies using lstms and nonparametric dynamic thresholding. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, 387--395

work page 2018
[12]

Kalman, R. E. 1960. A new approach to linear filtering and prediction problems

work page 1960
[13]

Kendall, A.; Gal, Y.; and Cipolla, R. 2018. Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

work page 2018
[14]

S.; Zhao, Y.; Huang, F.; and Zheng, K

Kieu, T.; Yang, B.; Guo, C.; Jensen, C. S.; Zhao, Y.; Huang, F.; and Zheng, K. 2022. Robust and explainable autoencoders for unsupervised time series outlier detection. In 2022 IEEE 38th International conference on data engineering (ICDE), 3038--3050. IEEE

work page 2022
[15]

Liu, F.; Zhou, X.; Cao, J.; Wang, Z.; Wang, T.; Wang, H.; and Zhang, Y. 2022 a . Anomaly Detection in Quasi-Periodic Time Series Based on Automatic Data Segmentation and Attentional LSTM-CNN. IEEE Transactions on Knowledge and Data Engineering, 34(6): 2626--2640

work page 2022
[16]

X.; and Dustdar, S

Liu, S.; Yu, H.; Liao, C.; Li, J.; Lin, W.; Liu, A. X.; and Dustdar, S. 2022 b . Pyraformer: Low-Complexity Pyramidal Attention for Long-Range Time Series Modeling and Forecasting. In The Tenth International Conference on Learning Representations

work page 2022
[17]

Liu, Y.; Hu, T.; Zhang, H.; Wu, H.; Wang, S.; Ma, L.; and Long, M. 2024. iTransformer: Inverted Transformers Are Effective for Time Series Forecasting. In The Twelfth International Conference on Learning Representations

work page 2024
[18]

Ma, M.; Fu, L.; Zhai, Z.; and Sun, R.-B. 2024. Transformer based Kalman Filter with EM algorithm for time series prediction and anomaly detection of complex systems. Measurement, 229: 114378

work page 2024
[19]

P.; and Tippenhauer, N

Mathur, A. P.; and Tippenhauer, N. O. 2016. SWaT: A water treatment testbed for research and training on ICS security. In 2016 international workshop on cyber-physical systems for smart water networks (CySWater), 31--36. IEEE

work page 2016
[20]

H.; Sinthong, P.; and Kalagnanam, J

Nie, Y.; Nguyen, N. H.; Sinthong, P.; and Kalagnanam, J. 2023. A Time Series is Worth 64 Words: Long-term Forecasting with Transformers. In The Eleventh International Conference on Learning Representations

work page 2023
[21]

E.; Tung, F.; and Striebel, C

Rauch, H. E.; Tung, F.; and Striebel, C. T. 1965. Maximum likelihood estimates of linear dynamic systems. AIAA Journal, 3(8): 1445--1450

work page 1965
[22]

Tonekaboni, S.; Eytan, D.; and Goldenberg, A. 2021. Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding. In The Ninth International Conference on Learning Representations

work page 2021
[23]

N.; Kaiser, .; and Polosukhin, I

Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, .; and Polosukhin, I. 2017. Attention is all you need. Advances in Neural Information Processing Systems, 30

work page 2017
[24]

Wang, H.; Peng, J.; Huang, F.; Wang, J.; Chen, J.; and Xiao, Y. 2023. Micn: Multi-scale local and global context modeling for long-term series forecasting. In The Eleventh International Conference on Learning Representations

work page 2023
[25]

Y.; and ZHOU, J

Wang, S.; Wu, H.; Shi, X.; Hu, T.; Luo, H.; Ma, L.; Zhang, J. Y.; and ZHOU, J. 2024 a . TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting. In The Twelfth International Conference on Learning Representations

work page 2024
[26]

Wang, Y.; Wu, H.; Dong, J.; Liu, Y.; Long, M.; and Wang, J. 2024 b . Deep Time Series Models: A Comprehensive Survey and Benchmark

work page 2024
[27]

Wu, H.; Hu, T.; Liu, Y.; Zhou, H.; Wang, J.; and Long, M. 2023. TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis. In The Eleventh International Conference on Learning Representations

work page 2023
[28]

Wu, H.; Xu, J.; Wang, J.; and Long, M. 2021. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Advances in Neural Information Processing Systems, 34: 22419--22430

work page 2021
[29]

Wu, X.; Qiu, X.; Li, Z.; Wang, Y.; Hu, J.; Guo, C.; Xiong, H.; and Yang, B. 2025. CATCH: Channel-Aware Multivariate Time Series Anomaly Detection via Frequency Patching. In The Thirteenth International Conference on Learning Representations

work page 2025
[30]

Xu, H.; Chen, W.; Zhao, N.; Li, Z.; Bu, J.; Li, Z.; Liu, Y.; Zhao, Y.; Pei, D.; Feng, Y.; Chen, J.; Wang, Z.; and Qiao, H. 2018. Unsupervised Anomaly Detection via Variational Auto-Encoder for Seasonal KPIs in Web Applications. In Proceedings of the 2018 World Wide Web Conference, WWW '18, 187–196. Republic and Canton of Geneva, CHE: International World W...

work page 2018
[31]

Xu, J.; Wu, H.; Wang, J.; and Long, M. 2022. Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy. In The Tenth International Conference on Learning Representations

work page 2022
[32]

Yang, C.; Chen, X.; Sun, L.; Yang, H.; and Wu, Y. 2023 a . Enhancing Representation Learning for Periodic Time Series with Floss: A Frequency Domain Regularization Approach. CoRR, abs/2308.01011

work page arXiv 2023
[33]

Yang, Y.; Zhang, C.; Zhou, T.; Wen, Q.; and Sun, L. 2023 b . DCdetector: Dual Attention Contrastive Representation Learning for Time Series Anomaly Detection. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD '23, 3033–3045. New York, NY, USA: Association for Computing Machinery. ISBN 9798400701030

work page 2023
[34]

Yao, Y.; Ma, J.; Feng, S.; and Ye, Y. 2024. SVD-AE: An asymmetric autoencoder with SVD regularization for multivariate time series anomaly detection. Neural Networks, 170: 535--547

work page 2024
[35]

Yu, J.; Gao, X.; Li, B.; Zhai, F.; Lu, J.; Xue, B.; Fu, S.; and Xiao, C. 2024. A filter-augmented auto-encoder with learnable normalization for robust multivariate time series anomaly detection. Neural networks, 170: 478--493

work page 2024
[36]

Yue, Z.; Wang, Y.; Duan, J.; Yang, T.; Huang, C.; Tong, Y.; and Xu, B. 2022. TS2Vec: Towards Universal Representation of Time Series. Proceedings of the AAAI Conference on Artificial Intelligence, 36(8): 8980--8987

work page 2022
[37]

I.; Pan, S.; Aggarwal, C.; and Salehi, M

Zamanzadeh Darban, Z.; Webb, G. I.; Pan, S.; Aggarwal, C.; and Salehi, M. 2024. Deep learning for time series anomaly detection: A survey. ACM Computing Surveys, 57(1): 1--42

work page 2024
[38]

Zeng, A.; Chen, M.; Zhang, L.; and Xu, Q. 2023. Are Transformers Effective for Time Series Forecasting? Proceedings of the AAAI Conference on Artificial Intelligence, 37(9): 11121--11128

work page 2023
[39]

Zhou, T.; Ma, Z.; Wen, Q.; Sun, L.; Yao, T.; Yin, W.; Jin, R.; et al. 2022. Film: Frequency improved legendre memory model for long-term time series forecasting. Advances in Neural Information Processing Systems, 35: 12677--12690

work page 2022

[1] [1]

, " * write output.state after.block = add.period write newline

ENTRY address archivePrefix author booktitle chapter edition editor eid eprint howpublished institution isbn journal key month note number organization pages publisher school series title type volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION init.state.consts #0 'before.a...

work page

[2] [2]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...

work page

[3] [3]

Abdulaal, A.; Liu, Z.; and Lancewicki, T. 2021. Practical approach to asynchronous multivariate time series anomaly detection and localization. In Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining, 2485--2494

work page 2021

[4] [4]

Campos, D.; Zhang, M.; Yang, B.; Kieu, T.; Guo, C.; and Jensen, C. S. 2023. Lightts: Lightweight time series classification with adaptive ensemble distillation. Proceedings of the ACM on Management of Data, 1(2): 1--27

work page 2023

[5] [5]

Z.; Webb, G

Darban, Z. Z.; Webb, G. I.; Pan, S.; Aggarwal, C. C.; and Salehi, M. 2025. CARLA: Self-supervised contrastive representation learning for time series anomaly detection. Pattern Recognition, 157: 110874

work page 2025

[6] [6]

K.; Li, X.; and Guan, C

Eldele, E.; Ragab, M.; Chen, Z.; Wu, M.; Kwoh, C. K.; Li, X.; and Guan, C. 2021. Time-Series Representation Learning via Temporal and Contextual Contrasting. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2352--2359. International Joint Conferences on Artificial Intelligence Organization

work page 2021

[7] [7]

Franceschi, J.-Y.; Dieuleveut, A.; and Jaggi, M. 2019. Unsupervised scalable representation learning for multivariate time series. Advances in Neural Information Processing Systems, 32

work page 2019

[8] [8]

Huang, X.; Chen, W.; Hu, B.; and Mao, Z. 2025. Graph Mixture of Experts and Memory-augmented Routers for Multivariate Time Series Anomaly Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 39(16): 17476--17484

work page 2025

[9] [9]

Huang, X.; Zhang, F.; Wang, R.; Lin, X.; Liu, H.; and Fan, H. 2023. KalmanAE: Deep embedding optimized Kalman filter for time series anomaly detection. IEEE Transactions on Instrumentation and Measurement, 72: 1--11

work page 2023

[10] [10]

M.; and Rossi, D

Huet, A.; Navarro, J. M.; and Rossi, D. 2022. Local Evaluation of Time Series Anomaly Detection Algorithms. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD '22, 635–645. New York, NY, USA: Association for Computing Machinery. ISBN 9781450393850

work page 2022

[11] [11]

Hundman, K.; Constantinou, V.; Laporte, C.; Colwell, I.; and Soderstrom, T. 2018. Detecting spacecraft anomalies using lstms and nonparametric dynamic thresholding. In Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, 387--395

work page 2018

[12] [12]

Kalman, R. E. 1960. A new approach to linear filtering and prediction problems

work page 1960

[13] [13]

Kendall, A.; Gal, Y.; and Cipolla, R. 2018. Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

work page 2018

[14] [14]

S.; Zhao, Y.; Huang, F.; and Zheng, K

Kieu, T.; Yang, B.; Guo, C.; Jensen, C. S.; Zhao, Y.; Huang, F.; and Zheng, K. 2022. Robust and explainable autoencoders for unsupervised time series outlier detection. In 2022 IEEE 38th International conference on data engineering (ICDE), 3038--3050. IEEE

work page 2022

[15] [15]

Liu, F.; Zhou, X.; Cao, J.; Wang, Z.; Wang, T.; Wang, H.; and Zhang, Y. 2022 a . Anomaly Detection in Quasi-Periodic Time Series Based on Automatic Data Segmentation and Attentional LSTM-CNN. IEEE Transactions on Knowledge and Data Engineering, 34(6): 2626--2640

work page 2022

[16] [16]

X.; and Dustdar, S

Liu, S.; Yu, H.; Liao, C.; Li, J.; Lin, W.; Liu, A. X.; and Dustdar, S. 2022 b . Pyraformer: Low-Complexity Pyramidal Attention for Long-Range Time Series Modeling and Forecasting. In The Tenth International Conference on Learning Representations

work page 2022

[17] [17]

Liu, Y.; Hu, T.; Zhang, H.; Wu, H.; Wang, S.; Ma, L.; and Long, M. 2024. iTransformer: Inverted Transformers Are Effective for Time Series Forecasting. In The Twelfth International Conference on Learning Representations

work page 2024

[18] [18]

Ma, M.; Fu, L.; Zhai, Z.; and Sun, R.-B. 2024. Transformer based Kalman Filter with EM algorithm for time series prediction and anomaly detection of complex systems. Measurement, 229: 114378

work page 2024

[19] [19]

P.; and Tippenhauer, N

Mathur, A. P.; and Tippenhauer, N. O. 2016. SWaT: A water treatment testbed for research and training on ICS security. In 2016 international workshop on cyber-physical systems for smart water networks (CySWater), 31--36. IEEE

work page 2016

[20] [20]

H.; Sinthong, P.; and Kalagnanam, J

Nie, Y.; Nguyen, N. H.; Sinthong, P.; and Kalagnanam, J. 2023. A Time Series is Worth 64 Words: Long-term Forecasting with Transformers. In The Eleventh International Conference on Learning Representations

work page 2023

[21] [21]

E.; Tung, F.; and Striebel, C

Rauch, H. E.; Tung, F.; and Striebel, C. T. 1965. Maximum likelihood estimates of linear dynamic systems. AIAA Journal, 3(8): 1445--1450

work page 1965

[22] [22]

Tonekaboni, S.; Eytan, D.; and Goldenberg, A. 2021. Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding. In The Ninth International Conference on Learning Representations

work page 2021

[23] [23]

N.; Kaiser, .; and Polosukhin, I

Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, .; and Polosukhin, I. 2017. Attention is all you need. Advances in Neural Information Processing Systems, 30

work page 2017

[24] [24]

Wang, H.; Peng, J.; Huang, F.; Wang, J.; Chen, J.; and Xiao, Y. 2023. Micn: Multi-scale local and global context modeling for long-term series forecasting. In The Eleventh International Conference on Learning Representations

work page 2023

[25] [25]

Y.; and ZHOU, J

Wang, S.; Wu, H.; Shi, X.; Hu, T.; Luo, H.; Ma, L.; Zhang, J. Y.; and ZHOU, J. 2024 a . TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting. In The Twelfth International Conference on Learning Representations

work page 2024

[26] [26]

Wang, Y.; Wu, H.; Dong, J.; Liu, Y.; Long, M.; and Wang, J. 2024 b . Deep Time Series Models: A Comprehensive Survey and Benchmark

work page 2024

[27] [27]

Wu, H.; Hu, T.; Liu, Y.; Zhou, H.; Wang, J.; and Long, M. 2023. TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis. In The Eleventh International Conference on Learning Representations

work page 2023

[28] [28]

Wu, H.; Xu, J.; Wang, J.; and Long, M. 2021. Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Advances in Neural Information Processing Systems, 34: 22419--22430

work page 2021

[29] [29]

Wu, X.; Qiu, X.; Li, Z.; Wang, Y.; Hu, J.; Guo, C.; Xiong, H.; and Yang, B. 2025. CATCH: Channel-Aware Multivariate Time Series Anomaly Detection via Frequency Patching. In The Thirteenth International Conference on Learning Representations

work page 2025

[30] [30]

Xu, H.; Chen, W.; Zhao, N.; Li, Z.; Bu, J.; Li, Z.; Liu, Y.; Zhao, Y.; Pei, D.; Feng, Y.; Chen, J.; Wang, Z.; and Qiao, H. 2018. Unsupervised Anomaly Detection via Variational Auto-Encoder for Seasonal KPIs in Web Applications. In Proceedings of the 2018 World Wide Web Conference, WWW '18, 187–196. Republic and Canton of Geneva, CHE: International World W...

work page 2018

[31] [31]

Xu, J.; Wu, H.; Wang, J.; and Long, M. 2022. Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy. In The Tenth International Conference on Learning Representations

work page 2022

[32] [32]

Yang, C.; Chen, X.; Sun, L.; Yang, H.; and Wu, Y. 2023 a . Enhancing Representation Learning for Periodic Time Series with Floss: A Frequency Domain Regularization Approach. CoRR, abs/2308.01011

work page arXiv 2023

[33] [33]

Yang, Y.; Zhang, C.; Zhou, T.; Wen, Q.; and Sun, L. 2023 b . DCdetector: Dual Attention Contrastive Representation Learning for Time Series Anomaly Detection. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD '23, 3033–3045. New York, NY, USA: Association for Computing Machinery. ISBN 9798400701030

work page 2023

[34] [34]

Yao, Y.; Ma, J.; Feng, S.; and Ye, Y. 2024. SVD-AE: An asymmetric autoencoder with SVD regularization for multivariate time series anomaly detection. Neural Networks, 170: 535--547

work page 2024

[35] [35]

Yu, J.; Gao, X.; Li, B.; Zhai, F.; Lu, J.; Xue, B.; Fu, S.; and Xiao, C. 2024. A filter-augmented auto-encoder with learnable normalization for robust multivariate time series anomaly detection. Neural networks, 170: 478--493

work page 2024

[36] [36]

Yue, Z.; Wang, Y.; Duan, J.; Yang, T.; Huang, C.; Tong, Y.; and Xu, B. 2022. TS2Vec: Towards Universal Representation of Time Series. Proceedings of the AAAI Conference on Artificial Intelligence, 36(8): 8980--8987

work page 2022

[37] [37]

I.; Pan, S.; Aggarwal, C.; and Salehi, M

Zamanzadeh Darban, Z.; Webb, G. I.; Pan, S.; Aggarwal, C.; and Salehi, M. 2024. Deep learning for time series anomaly detection: A survey. ACM Computing Surveys, 57(1): 1--42

work page 2024

[38] [38]

Zeng, A.; Chen, M.; Zhang, L.; and Xu, Q. 2023. Are Transformers Effective for Time Series Forecasting? Proceedings of the AAAI Conference on Artificial Intelligence, 37(9): 11121--11128

work page 2023

[39] [39]

Zhou, T.; Ma, Z.; Wen, Q.; Sun, L.; Yao, T.; Yin, W.; Jin, R.; et al. 2022. Film: Frequency improved legendre memory model for long-term time series forecasting. Advances in Neural Information Processing Systems, 35: 12677--12690

work page 2022