PaAno+: Multiscale Encoding and Cross-Variable Attention for Time Series Anomaly Detection

Hongbing Wang; Wenchao Liu; Xiangguang Xiong; XiaoDong Liu; Youji Zhu

arxiv: 2606.20055 · v1 · pith:5VWCKSEUnew · submitted 2026-06-18 · 💻 cs.LG

Pith reviewed 2026-06-26 18:25 UTC · model grok-4.3

classification 💻 cs.LG

keywords time series anomaly detection · multiscale encoding · cross-variable attention · patch representation learning · lightweight model · self-supervised pretext task · TSB-AD benchmark

The pith

PaAno+ adds multiscale convolutions and cross-variable attention to improve time series anomaly detection accuracy and efficiency.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents PaAno+, a lightweight model for time series anomaly detection aimed at industrial and medical monitoring. It builds a multiscale feature-extraction backbone from convolutional kernels with different receptive fields, aggregates the scales through cross-scale adaptive attention with residual connections, and adds a cross-variable fusion attention module to model dependencies between variables. A custom pretext task, temporal patch-window sorting, is combined with a triplet loss to learn more discriminative patch embeddings. On the TSB-AD benchmark the paper reports state-of-the-art results on both univariate and multivariate tasks, with gains across metrics including VUS-PR, at a computational cost low enough for edge deployment.
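As a rough illustration of what a cross-variable attention step might compute, the sketch below mixes per-variable embeddings with scaled dot-product attention and a residual connection. It is written in pure Python and assumes identity projections (no learned query, key, or value weights); the function names and this simplification are hypothetical and are not taken from the paper.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_variable_attention(embeddings):
    """Mix per-variable embeddings with scaled dot-product attention.

    embeddings: list of V vectors (one per variable), each of length D.
    Assumption: queries, keys and values are the embeddings themselves.
    Returns (fused, weights) with fused[i] = embeddings[i] + sum_j w[i][j] * embeddings[j].
    """
    d = len(embeddings[0])
    scale = math.sqrt(d)
    fused, weights = [], []
    for q in embeddings:
        # Similarity of this variable to every variable (including itself).
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in embeddings]
        w = softmax(scores)
        mixed = [
            sum(w[j] * embeddings[j][t] for j in range(len(embeddings)))
            for t in range(d)
        ]
        # Residual connection: keep the variable's own embedding.
        fused.append([x + y for x, y in zip(q, mixed)])
        weights.append(w)
    return fused, weights

# Three variables with 2-dimensional embeddings; the first two point in similar directions.
emb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
fused, weights = cross_variable_attention(emb)
```

In this toy input the first variable attends more strongly to the second (similar) variable than to the third, which is the kind of inter-variable dependency such a module is meant to expose.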

Core claim

The central claim is that a patch-oriented encoder using differentiated convolutional kernels for multiscale temporal features, followed by cross-scale adaptive attention aggregation and a dedicated cross-variable fusion attention module, together with a patch-window sorting pretext task and triplet loss, produces superior anomaly detection accuracy on the TSB-AD benchmark for univariate and multivariate series while remaining computationally lightweight.

What carries the argument

The multiscale convolutional backbone with cross-scale adaptive attention aggregation combined with the cross-variable fusion attention module, which captures hierarchical temporal patterns and explicit inter-variable correlations.
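A minimal sketch of the multiscale-plus-aggregation idea, under stated assumptions: fixed averaging kernels of odd sizes 3, 5 and 7 stand in for learned convolutions, and the "adaptive" scale weights are a softmax over each scale's mean absolute activation. Both choices, and all function names, are illustrative placeholders rather than the paper's actual design.

```python
import math

def conv1d_same(series, kernel):
    """1-D correlation with zero padding; assumes an odd kernel length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(series) + [0.0] * half
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(series))
    ]

def multiscale_features(series, kernel_sizes=(3, 5, 7)):
    """One feature map per kernel size (averaging kernels replace learned ones)."""
    return [conv1d_same(series, [1.0 / k] * k) for k in kernel_sizes]

def cross_scale_aggregate(series, feature_maps):
    """Softmax-weight the scales by mean absolute activation, then add a residual."""
    energies = [sum(abs(v) for v in fm) / len(fm) for fm in feature_maps]
    m = max(energies)
    exps = [math.exp(e - m) for e in energies]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [
        x + sum(w * fm[i] for w, fm in zip(weights, feature_maps))
        for i, x in enumerate(series)
    ]
    return fused, weights

# A patch containing a single spike, viewed at three receptive-field sizes.
patch = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
maps = multiscale_features(patch)
fused, weights = cross_scale_aggregate(patch, maps)
```

Smaller kernels keep the spike sharp while larger ones spread it out, so the weighted sum carries information from several temporal resolutions; the residual term keeps the raw input in the output.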

If this is right

The model enables real-time anomaly inference on resource-limited hardware.
Detection performance improves on both univariate and multivariate tasks relative to prior lightweight approaches.
The learned patch embeddings become more discriminative through the sorting pretext and triplet loss.
The architecture remains compact enough for practical deployment without the overhead of large transformer models.
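The patch-sorting pretext task mentioned above is not specified in detail on this page. One plausible reading, sketched here purely as an assumption, is to cut a window into patches, shuffle them, and ask the model to recover each patch's original position. The helper names below are hypothetical.

```python
import random

def make_patches(series, patch_len):
    """Split a series into consecutive non-overlapping patches (tail remainder dropped)."""
    n = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n)]

def make_sorting_sample(patches, rng):
    """Shuffle patches; the target lists each shuffled patch's original position."""
    order = list(range(len(patches)))
    rng.shuffle(order)
    shuffled = [patches[i] for i in order]
    return shuffled, order

def restore(shuffled, order):
    """Undo the shuffle given the true (or predicted) original positions."""
    restored = [None] * len(shuffled)
    for pos, original_index in enumerate(order):
        restored[original_index] = shuffled[pos]
    return restored

rng = random.Random(0)          # fixed seed only for reproducibility of the demo
window = list(range(12))        # stand-in for one normal training window
patches = make_patches(window, patch_len=3)
shuffled, target = make_sorting_sample(patches, rng)
```

A model trained on such samples would predict `target` from `shuffled`; `restore` shows that a correct prediction recovers the original temporal order.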

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same multiscale-plus-cross-variable pattern could be tested on other sequential data such as sensor streams in robotics.
The patch-sorting pretext task might transfer to self-supervised pretraining for forecasting or imputation tasks.
Hybrid systems could combine this lightweight encoder with occasional calls to larger models only on uncertain cases.

Load-bearing premise

The TSB-AD benchmark and its chosen metrics including VUS-PR represent real-world industrial and medical time series anomaly detection under complex conditions.

What would settle it

A controlled experiment showing that PaAno+ does not outperform strong baselines on a fresh collection of industrial or medical time series drawn from settings absent from TSB-AD would falsify the claim of broad superiority.

Figures

Figures reproduced from arXiv: 2606.20055 by Hongbing Wang, Wenchao Liu, Xiangguang Xiong, XiaoDong Liu, Youji Zhu.

**Figure 1.** Classification of common defects found in the current dataset. Anomalies are marked in red.

Adjacent text in the paper (§3 Method, §3.1 Problem Definition): This work addresses the task of semi-supervised time series anomaly detection. Under this paradigm, the training set consists exclusively of normal time series samples. The model learns the distribution and temporal evolution patterns of normal data to identify anomalous time points…

**Figure 2.** Architecture of the PaAno+ multivariate time-series anomaly detection system.

Adjacent text in the paper: Insufficient contextual information: Several abnormal samples lack adjacent normal background data, hindering the model from capturing feature discrepancies between normal and abnormal patterns. Unrealistic anomaly ratio: Benchmark datasets assume an excessively high proportion of anomalous samples, which is inconsistent with the…

**Figure 3.** Training workflow of the PaAno+ model.

Adjacent text in the paper: where 𝑀 represents the batch size, and the distance function dist(⋅, ⋅) uses the cosine distance, dist(𝑎, 𝑏) = 1 − cos(𝑎, 𝑏). 𝛿 = 0.5 is the margin parameter, used to constrain the minimum difference in feature distances between positive and negative samples. This loss function requires that the feature distance between anchor points and negative samples be at least 𝛿…
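The text accompanying Figure 3 describes a triplet loss with cosine distance and margin 𝛿 = 0.5, but the equation itself is cut off on this page. The sketch below uses the standard hinge form, mean over 𝑀 triplets of max(0, dist(anchor, positive) − dist(anchor, negative) + 𝛿), which is an assumption about the paper's exact formula. Embeddings are assumed to be nonzero vectors.

```python
import math

def cosine_distance(a, b):
    """dist(a, b) = 1 - cos(a, b); assumes both vectors are nonzero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def triplet_loss(anchors, positives, negatives, margin=0.5):
    """Mean hinge loss over a batch of M triplets (standard form, assumed)."""
    losses = [
        max(0.0, cosine_distance(a, p) - cosine_distance(a, n) + margin)
        for a, p, n in zip(anchors, positives, negatives)
    ]
    return sum(losses) / len(losses)

# Well-separated triplet: the negative is orthogonal to the anchor, so the hinge is inactive.
easy = triplet_loss([[1.0, 0.0]], [[1.0, 0.0]], [[0.0, 1.0]])
# Collapsed triplet: the negative equals the anchor, so the loss equals the margin.
hard = triplet_loss([[1.0, 0.0]], [[1.0, 0.0]], [[1.0, 0.0]])
```

The two cases show the role of the margin: the loss vanishes once the negative is at least 𝛿 farther from the anchor than the positive, and otherwise penalizes the shortfall.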

**Figure 4.** Parameter sensitivity analysis of the PaAno+ model’s Top-𝑘 values and memory bank size on the TSB-AD-U and TSB-AD-M Eval datasets.

Adjacent text in the paper (§4.7.2, Cross-Variable Attention Contributions): To explore the contribution of cross-variable fusion attention to multivariate anomaly detection, an ablated variant (w/o Attention) is established by removing the cross-variable attention module while preserving the multiscale enco…

**Figure 5.** Sensitivity analysis of the performance of univariate and multivariate time-series anomaly detection with respect to window length 𝑇. All results are presented as percentages (%).

Adjacent text in the paper: The nearest-neighbor number 𝑘 and the memory compression ratio are two critical control parameters for the updating mechanism. Model performance fluctuates slightly when 𝑘 varies from 1 to 5. The setting 𝑘 = 3 achieves a favorab…
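The page mentions a nearest-neighbor number 𝑘 and a memory bank but does not show how they combine into an anomaly score. A common pattern, used here only as an assumption, is to score a patch embedding by its mean cosine distance to the 𝑘 closest embeddings stored from normal training data. The function names and toy vectors are hypothetical.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are nonzero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def topk_anomaly_score(query, memory_bank, k=3):
    """Mean cosine distance from the query to its k nearest memory-bank entries."""
    distances = sorted(cosine_distance(query, m) for m in memory_bank)
    k = min(k, len(distances))  # guard against a bank smaller than k
    return sum(distances[:k]) / k

# Toy memory bank of embeddings from normal patches (illustrative values only).
bank = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [1.0, 0.1]]
normal_score = topk_anomaly_score([1.0, 0.05], bank, k=3)
anomalous_score = topk_anomaly_score([0.0, 1.0], bank, k=3)
```

Averaging over several neighbors rather than taking only the closest one makes the score less sensitive to a single stray entry in the bank, which is consistent with the reported stability for 𝑘 between 1 and 5.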

Original abstract

Time-series anomaly detection has significant practical value for industrial and medical monitoring, as well as other critical domains. Current Transformer- and large-model-based detection approaches incur excessive computational overhead, while existing lightweight alternatives are constrained by insufficient feature extraction and inadequate modeling of dependencies across multivariate variables. To mitigate the above drawbacks, this study develops a lightweight, efficient anomaly detection model, dubbed PaAno, within the patch-oriented representation learning paradigm. In the encoder module, a multiscale feature-extraction backbone is constructed using convolutional kernels with differentiated receptive fields to capture hierarchical temporal characteristics; subsequent cross-scale adaptive attention aggregation, combined with residual connection optimization, further stabilizes feature representation learning. A cross-variable fusion attention module is embedded to explicitly characterize inter-variable correlations, empowering the model to identify anomalous patterns amid intricate operational conditions. Moreover, a novel pretext task based on temporal patch-window sorting is customized to uncover intrinsic structural properties of time series, and triplet loss is leveraged to optimize the patch embedding space for enhanced feature discrimination. Extensive experiments on the TSB-AD benchmark demonstrate that the proposed PaAno achieves state-of-the-art detection accuracy on both univariate and multivariate tasks, yielding significant performance gains across evaluation metrics, including VUS-PR, relative to the original PaAno. Leveraging a compact network design, the presented model achieves favorable computational efficiency, enabling deployment on resource-limited terminals for real-time anomaly inference.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

PaAno+ adds multiscale convs, cross-scale and cross-variable attention plus a patch-sorting pretext to the prior PaAno, but the abstract supplies zero numbers, baselines or ablations to back the SOTA claim on TSB-AD.

The new pieces are a multiscale convolutional backbone with different kernel sizes, cross-scale adaptive attention with residuals, a cross-variable fusion attention block, and a temporal patch-window sorting pretext task trained with triplet loss. These sit on top of the authors' earlier PaAno and target the usual complaints about transformer cost and weak multivariate modeling in lightweight detectors.

The framing is clear enough: industrial and medical monitoring needs something that runs on limited hardware while still picking up hierarchical patterns and variable correlations. The architecture choices line up with that goal and avoid overclaiming theoretical novelty.

The main weakness is that the abstract asserts state-of-the-art results and significant gains on VUS-PR and other metrics without showing a single number, table, baseline list, ablation, or statistical test. That makes it impossible to judge whether the added modules actually move the needle or whether the gains are within noise. The case for TSB-AD's representativeness is also thin: the abstract gives no information on anomaly length distributions, noise levels, or how well the data match real operational conditions, so the practical-deployment story rests on an unexamined assumption.

This is for people who build deployable time-series monitors and want to see the latest lightweight patch-based variants. A reader could pick up the module ideas, but the missing empirical detail limits how much weight to give the results.

I would not send it to peer review in its current form; the central empirical claim needs the actual experiments and comparisons before a referee can do useful work.

Referee Report

1 major / 1 minor

Summary. The paper proposes PaAno+, a lightweight patch-oriented model for time-series anomaly detection. It introduces a multiscale convolutional encoder with differentiated receptive fields, cross-scale adaptive attention aggregation with residuals, a cross-variable fusion attention module, and a pretext task based on temporal patch-window sorting optimized via triplet loss. The central claim is that PaAno+ achieves state-of-the-art detection accuracy on both univariate and multivariate tasks on the TSB-AD benchmark, with significant gains (including on VUS-PR) over the original PaAno while remaining computationally efficient for resource-limited deployment.

Significance. If the reported gains hold under scrutiny, the work could supply a practical, deployable alternative to heavy Transformer-based detectors for industrial and medical monitoring. The multiscale backbone and explicit cross-variable modeling target documented weaknesses in prior lightweight methods, and the emphasis on efficiency is a clear strength for real-time inference.

major comments (1)

[§4] §4 (Experiments) and abstract: the SOTA claim and practical-value framing rest on the untested assumption that TSB-AD faithfully captures 'intricate operational conditions,' variable correlations, and anomaly patterns from the target domains. No analysis of anomaly-type diversity, length distributions, or noise characteristics is supplied; a concrete test would be to stratify results by these factors or evaluate on a controlled perturbation of TSB-AD.
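The stratification suggested in this comment could be organized along the following lines. This is a sketch only: the record fields, the bucket edges, and the numbers are made-up placeholders, not values from the paper or from TSB-AD.

```python
from collections import defaultdict

def length_bucket(anomaly_length, edges=(10, 100)):
    """Map an anomaly length to a coarse bucket; the edges are hypothetical."""
    if anomaly_length < edges[0]:
        return "short"
    if anomaly_length < edges[1]:
        return "medium"
    return "long"

def stratified_mean(records, key_fn, metric="vus_pr"):
    """Average a per-series metric within each stratum returned by key_fn."""
    groups = defaultdict(list)
    for rec in records:
        groups[key_fn(rec)].append(rec[metric])
    return {name: sum(vals) / len(vals) for name, vals in groups.items()}

# Placeholder per-series records (NOT results from the paper).
records = [
    {"anomaly_length": 5, "vus_pr": 0.25},
    {"anomaly_length": 8, "vus_pr": 0.75},
    {"anomaly_length": 50, "vus_pr": 0.5},
    {"anomaly_length": 400, "vus_pr": 1.0},
]
by_length = stratified_mean(records, lambda r: length_bucket(r["anomaly_length"]))
```

The same `stratified_mean` helper accepts any key function, so anomaly type or an estimated noise level could be substituted for length without changing the aggregation.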

minor comments (1)

[Abstract] Abstract: the sentence reporting the TSB-AD results refers to 'the proposed PaAno' (and the model is first introduced as 'dubbed PaAno'), while the title uses PaAno+; standardize nomenclature for clarity.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed and constructive comment on the experimental section. We address the concern point-by-point below and outline the planned revisions.

Referee: [§4] §4 (Experiments) and abstract: the SOTA claim and practical-value framing rest on the untested assumption that TSB-AD faithfully captures 'intricate operational conditions,' variable correlations, and anomaly patterns from the target domains. No analysis of anomaly-type diversity, length distributions, or noise characteristics is supplied; a concrete test would be to stratify results by these factors or evaluate on a controlled perturbation of TSB-AD.

Authors: We acknowledge that the manuscript does not include an explicit stratification or perturbation analysis of TSB-AD. TSB-AD aggregates multiple established real-world datasets chosen to reflect diverse operational conditions, anomaly types, and variable correlations across domains; our consistent gains (including on VUS-PR) across its univariate and multivariate subsets provide supporting evidence for the claims. To directly address the point, the revised version will add a short subsection in §4 summarizing TSB-AD's documented characteristics (anomaly-type coverage, length distributions, and noise profiles) based on the benchmark's original construction and metadata. We view a full controlled perturbation study as valuable future work rather than a requirement for the current claims, as it would entail new experiments outside the scope of the present evaluation. This constitutes a partial revision.

revision: partial

Circularity Check

0 steps flagged

No mathematical derivation or self-referential predictions present

The paper is entirely empirical: it describes a model architecture (multiscale encoder, cross-variable attention, pretext task) and reports benchmark results on TSB-AD. No equations, first-principles derivations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text. The central claim reduces to experimental performance numbers rather than any construction that equates output to input by definition. This is the normal non-circular outcome for a benchmark-driven methods paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no equations, derivations, or modeling choices, so no free parameters, axioms, or invented entities can be identified.

pith-pipeline@v0.9.1-grok · 5787 in / 1114 out tokens · 22754 ms · 2026-06-26T18:25:00.665950+00:00 · methodology


[1]

Z. H. Yue, Y. J. Wang, J. Y. Duan, T. M. Yang, C. R. Huang, Y. H. Tong, and B. X. Xu. Ts2vec: Towards universal representation of time series. In AAAI, 2022. URL https://cdn.aaai.org/ojs/20881/20881-13-24894-1-2-20220628.pdf

[2]

F. Jia, K. Wang, Y. Zheng, D. Cao, and Y. Liu. Gpt4mts: Prompt-based large language model for multimodal time-series forecasting. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 23343–23351, 2024. doi: 10.1609/aaai.v38i21.30383

[3]

J. Paparrizos, P. Boniol, T. Palpanas, R. S. Tsay, A. Elmore, and M. J. Franklin. Volume under the surface: A new accuracy evaluation measure for time-series anomaly detection. Proc. VLDB Endow., 15(11):2774–2787, 2022

[4]

Y. Wang, L. Zhang, T. Si, G. Bishop, and H. Gong. Anomaly detection in high-dimensional time series data with scaled bregman divergence. Algorithms, 18:62, 2025. doi: 10.3390/a18020062

[5]

Q. Liu and J. Paparrizos. The elephant in the room: Towards a reliable time-series anomaly detection benchmark. In Advances in Neural Information Processing Systems, volume 37, pages 108231–108261, 2024

[6]

J. Park and S. Kang. Paano: Patch-based representation learning for time-series anomaly detection. In Proceedings of the International Conference on Learning Representations (ICLR), 2026. URL https://openreview.net/forum?id=NXThkM7Iym

[7]

N. Y. Lu, F. R. Gao, Y. Yang, and F. L. Wang. Pca-based modeling and on-line monitoring strategy for uneven-length batch processes. Industrial and Engineering Chemistry Research, 43(13):3343–3352, 2004. doi: 10.1002/aic.10024

[8]

E. J. Candès, X. D. Li, Y. Ma, and J. Wright. Robust principal component analysis? Journal of the ACM, 58(3):1–37, 2011

[9]

T. Yairi, Y. Kato, and K. Hori. Fault detection by mining association rules from house-keeping data. In Proceedings of the 6th International Symposium on Artificial Intelligence, Robotics and Automation in Space (i-SAIRAS), pages 18–21,

[10]

doi: 10.1.1.102.7045

[11]

J. Paparrizos and L. Gravano. k-shape: Efficient and accurate clustering of time series. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pages 1855–1870, 2015. doi: 10.1145/2723372.2737793

[12]

Z. He, X. Xu, and S. Deng. Discovering cluster-based local outliers. Pattern Recognition Letters, 24(9-10):1641–1650, 2003

[13]

Z. Li, H. Ma, and Y. Mei. A unifying method for outlier and change detection from data streams based on local polynomial fitting. In Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), pages 150–161, 2007. doi: 10.1007/978-3-540-71701-0_17

[14]

H. Ren, B. Xu, and Y. Wang. Time-series anomaly detection service at microsoft. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 3009–3017, 2019. doi: 10.1145/3292500.3342043

[15]

F. T. Liu, K. M. Ting, and Z. H. Zhou. Isolation forest. In 2008 Eighth IEEE International Conference on Data Mining, pages 413–422, 2008. doi: 10.1109/ICDM.2008.17

[16]

S. Hariri, M. C. Kind, and R. J. Brunner. Extended isolation forest. IEEE Transactions on Knowledge and Data Engineering, 33(4):1479–1489, 2021. doi: 10.1109/TKDE.2019.2947676

[17]

M. M. Breunig, H. P. Kriegel, R. T. Ng, and J. Sander. Lof: identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, pages 93–104, 2000. doi: 10.1145/342009.335388

[18]

S. Ramaswamy, R. Rastogi, and K. Shim. Efficient algorithms for mining outliers from large data sets. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, pages 427–438, 2000. doi: 10.1145/342009.335437

[19]

M. Goldstein and A. Dengel. Histogram-based outlier score (hbos): A fast unsupervised anomaly detection algorithm. KI-2012: poster and demo track, 1:59–63, 2012

[20]

Z. Li, Y. Zhao, N. Botta, C. Ionescu, and X. Hu. Copod: Copula-based outlier detection. In 2020 IEEE International Conference on Data Mining, pages 1118–1123, 2020. doi: 10.1109/ICDM50108.2020.00135

[21]

C. C. M. Yeh, Y. Zhu, L. Ulanova, N. Begum, Y. Ding, H. A. Dau, D. F. Silva, A. Mueen, and E. Keogh. Matrix profile i: All pairs similarity joins for time series: A unifying view that includes motifs, discords and shapelets. In 2016 IEEE 16th International Conference on Data Mining, pages 1317–1322, 2016. doi: 10.1109/ICDM.2016.0179

[22]

P. Boniol and T. Palpanas. Series2graph: Graph-based subsequence anomaly detection for time series. Proceedings of the VLDB Endowment, 13(12):1821–1834, 2020. doi: 10.14778/3407790.3407792

[23]

P. Boniol, J. Paparrizos, T. Palpanas, and M. J. Franklin. Sand: Streaming subsequence anomaly detection. Proceedings of the VLDB Endowment, 14(10):1717–1729, 2021. doi: 10.14778/3467861.3467863

[24]

Z. Wang, W. Yan, and T. Oates. Time series classification from scratch with deep neural networks: A strong baseline. In 2017 International Joint Conference on Neural Networks, pages 1578–1585, 2017. doi: 10.1109/IJCNN.2017.7966039

[25]

M. Munir, S. A. Siddiqui, A. Dengel, and S. Ahmed. Deepant: A deep learning approach for unsupervised anomaly detection in time series. IEEE Access, 7:1991–2005, 2019. doi: 10.1109/ACCESS.2018.2886457

[26]

H. X. Wu, T. G. Hu, Y. Liu, H. Zhou, J. M. Wang, and M. S. Long. Timesnet: Temporal 2d-variation modeling for general time series analysis. In International Conference on Learning Representations (ICLR), 2023. URL https://openreview.net/forum?id=ju_Uqw384Oq

[27]

P. Malhotra, L. Vig, G. Shroff, and P. Agarwal. Long short-term memory networks for anomaly detection in time series. In European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), 2015

[28]

M. Sakurada and T. Yairi. Anomaly detection using autoencoders with nonlinear dimensionality reduction. In MLSDA 2nd Workshop on Machine Learning for Sensory Data Analysis, pages 4–11, 2014. doi: 10.1145/2689746.2689747

[29]

J. Audibert, P. Michiardi, F. Guyard, S. Marti, and M. A. Zuluaga. Usad: Unsupervised anomaly detection on multivariate time series. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 3395–3404, 2020. doi: 10.1145/3394486.3403392

[30]

Y. Su, Y. J. Zhao, C. H. Niu, R. Liu, W. Sun, and D. Pei. Robust anomaly detection for multivariate time series through stochastic recurrent neural network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2828–2837, 2019. doi: 10.1145/3292500.3330672

[31]

Z. J. Xu, A. L. Zeng, and Q. Xu. Fits: Modeling time series with 10k parameters. In International Conference on Learning Representations (ICLR), 2024. URL https://openreview.net/forum?id=h8eTPQz2jI

[32]

J. H. Xu, H. X. Wu, J. M. Wang, and M. S. Long. Anomaly transformer: Time series anomaly detection with association discrepancy. In International Conference on Learning Representations (ICLR), 2022. URL https://openreview.net/forum?id=LzQQ89U1qm_

[33]

S. Tuli, G. Casale, and N. R. Jennings. Tranad: Deep transformer networks for anomaly detection in multivariate time series data. Proceedings of the VLDB Endowment, 15(6):1201–1214, 2022. doi: 10.14778/3514061.3514067

[34]

Y. Y. Yang, C. L. Zhang, T. Zhou, Q. S. Wen, and L. Sun. Dcdetector: Dual attention contrastive representation learning for time series anomaly detection. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 3033–3045, 2023. doi: 10.1145/3580305.3599295

[35]

Y. Q. Nie, N. H. Nguyen, P. Sinthong, and J. Kalagnanam. A time series is worth 64 words: Long-term forecasting with transformers. In International Conference on Learning Representations (ICLR), 2023. URL https://openreview.net/forum?id=Jbdc0vTOcol

[36]

Y. Liu, T. G. Hu, H. R. Zhang, H. X. Wu, S. Y. Wang, L. T. Ma, and M. S. Long. itransformer: Inverted transformers are effective for time series forecasting. In International Conference on Learning Representations (ICLR), 2024. URL https://openreview.net/pdf?id=JePfAI8fah

[37]

M. Goswami, K. Szafer, A. Choudhry, Y. F. Cai, S. Li, and A. Dubrawski. Moment: A family of open time-series foundation models. In Proceedings of the 41st International Conference on Machine Learning (ICML), 2024. URL https://openreview.net/pdf?id=FVvf69a5rx

[38]

A. Das, W. H. Kong, R. Sen, and Y. C. Zhou. A decoder-only foundation model for time-series forecasting. In Proceedings of the 41st International Conference on Machine Learning (ICML), 2024. URL https://openreview.net/forum?id=jn2iTJas6h

[39]

A. F. Ansari, L. Stella, A. C. Turkmen, X. Y. Zhang, P. Mercado, H. B. Shen, O. Shchur, S. S. Rangapuram, S. P. Arango, S. Kapoor, J. Zschiegner, D. C. Maddix, H. Wang, M. W. Mahoney, K. Torkkola, A. G. Wilson, M. Bohlke-Schneider, and B. Wang. Chronos: Learning the language of time series. Transactions on Machine Learning Research (TMLR), 2024. URL https:/...

[40]

K. Rasul, A. Ashok, A. R. Williams, H. Ghonia, R. Bhagwatkar, A. Khorasani, B. M. J. Darvishi, G. Adamopoulos, R. Riachi, N. Hassen, M. Biloš, S. Garg, A. Schneider, N. Chapados, A. Drouin, V. Zantedeschi, Y. Nevmyvaka, and I. Rish. Lag-llama: Towards foundation models for probabilistic time series forecasting. In R0-FoMo Workshop at NeurIPS 2023, 2023. UR...

[41]

Z. Z. Darban, G. I. Webb, S. R. Pan, C. C. Aggarwal, and M. Salehi. Deep learning for time series anomaly detection: A survey. ACM Computing Surveys, 57(1):1–42, 2024. doi: 10.1145/3735790.3735791

[42]

B. Schölkopf, A. J. Smola, R. C. Williamson, and P. L. Bartlett. New support vector algorithms. Neural Computation, 12(5):1207–1245, 2000. doi: 10.1162/089976600300015565

[43]

X. J. Wu, X. F. Qiu, Z. Y. Li, Y. H. Wang, J. L. Hu, C. J. Guo, H. Xiong, and B. Yang. Catch: Channel-aware multivariate time series anomaly detection via frequency patching. In ICLR, 2025. URL https://openreview.net/forum?id=OY7NBoHUcy

[44]

Z. Zhong, Z. Yu, Y. Yang, W. Wang, K. Yang, and C. L. P. Chen. Patchad: A lightweight patch-based mlp-mixer for time series anomaly detection. IEEE Transactions on Big Data, 11(6):3460–3473, 2025. doi: 10.1109/TBDATA.2025.3596745

[45]

F. H. Ismail, B. Lucas, G. Forestier, C. Pelletier, D. F. Schmidt, J. Weber, G. I. Webb, L. Idoumghar, P. A. Muller, and F. Petitjean. Inceptiontime: Finding alexnet for time series classification. Data Mining and Knowledge Discovery, 2020. doi: 10.1007/s10618-020-00710-y

[46]

S. Xia, W. Sun, X. Zou, P. Chen, D. Ma, H. Xu, M. Chen, and H. Li. Mfam-ad: An anomaly detection model for multivariate time series using attention mechanism. PeerJ Computer Science, 2024. doi: 10.7717/peerj-cs.2201

[47]

B. Zhang, B. Qi, J. Wang, and G. Liang. An improved gaussian mixture-probability hypothesis density filter for underwater multiple target tracking in dense clutter scenario. In 2024 OES China Ocean Acoustics, pages 1–7, 2024. doi: 10.1109/COA58979.2024.10723628

[48]

H. F. Lee, Z. X. Zeng, Z. P. Qiu, W. F. Zhu, and R. L. Xiao. Cscad: Modeling cross-scale sequence correlations for multivariate time series anomaly detection. Information Processing and Management, 2025. doi: 10.1016/j.ipm.2025.104315

[49]

W. S. Gao, X. Y. Wang, Y. Wang, and X. C. Jing. Dual-stream attention-enhanced memory networks for video anomaly detection. Sensors, 25(17), 2025. doi: 10.3390/s25175496

[50]

Z. Z. Darban, G. I. Webb, S. Pan, C. C. Aggarwal, and M. Salehi. Carla: Self-supervised contrastive representation learning for time series anomaly detection. Pattern Recognition, 157, 2025. doi: 10.1016/j.patcog.2024.110874

[51]

X. Y. Yang, Z. G. Zhang, and R. Y. Cui. Timeclr: A self-supervised contrastive learning framework for univariate time series representation. Knowledge-Based Systems, 245, 2022. doi: 10.1016/j.knosys.2022.108606

[52]

G. Woo, C. H. Liu, D. Sahoo, A. Kumar, and S. C. H. Hoi. Cost: Contrastive learning of disentangled seasonal-trend representations for time series forecasting. In ICLR, 2022. URL https://openreview.net/forum?id=PilZY3omXV2

[53]

X. Zhang, Z. Y. Zhao, T. Tsiligkaridis, and M. Zitnik. Tf-c: Time-frequency contrastive learning for time series. In NeurIPS, 2022. URL https://openreview.net/forum?id=OJ4mMfGKLN

[54]

Y. S. Dai, H. Wang, K. Rafferty, I. Spence, and B. Quinn. Tdsrl: Time series dual self-supervised representation learning for anomaly detection from different perspectives, 2024. URL https://hdl.handle.net/10419/289582

[55]

Q. Wang, H. Q. Zhu, W. Zhang, F. Jiang, X. L. Wang, and H. Huang. Maet: A generalizable masked autoencoding framework for anomaly detection in time-series data. Journal of Signal Processing Systems, 97:281–291, 2025. doi: 10.1007/s11265-025-01968-5

[56]

Y. C. Fang, J. D. Xie, Y. Zhao, L. Chen, Y. J. Gao, and K. Zheng. Tfmae: Temporal-frequency masked autoencoders for time series anomaly detection. In Proceedings of the 40th IEEE International Conference on Data Engineering (ICDE), Utrecht, Netherlands, pages 1228–1241, 2024. doi: 10.1109/ICDE60146.2024.00099

[57]

J. Kim, K. Park, S. Yun, and S. Lee. Ppt: Patch order do matters in time series pretext task. In Proceedings of the International Conference on Learning Representations (ICLR), Singapore, 2025. URL https://openreview.net/forum?id=7zwIEbSTDy

[58]

L. Yang and S. Hong. Unsupervised time-series representation learning with iterative bilinear temporal-spectral fusion. In Proceedings of the 39th International Conference on Machine Learning, PMLR, volume 162, pages 25038–25054, 2022. URL http://proceedings.mlr.press/v162/yang22e.html

[59]

Z. J. Zhong, Z. W. Yu, X. Xi, Y. Xu, W. M. Cao, Y. Y. Yang, K. X. Yang, and J. You. Simad: A simple dissimilarity-based approach for time-series anomaly detection. IEEE Transactions on Neural Networks and Learning Systems, 36(11):19669–19680, 2025. doi: 10.1109/TNNLS.2025.3590220

[60]

A. Zeng, M. Chen, L. Zhang, and Q. Xu. Are transformers effective for time series forecasting? In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 11121–11128, 2023

[61]

Q. C. Shentu, B. B. Li, K. Zhao, Y. Shu, Z. W. Rao, L. J. Pan, B. Yang, and C. J. Guo. Towards a general time series anomaly detector with adaptive bottlenecks and dual adversarial decoders. In International Conference on Learning Representations (ICLR), 2025. URL https://openreview.net/forum?id=aKcd7ImG5e

[62]

B. B. Li, Q. C. Shentu, Y. Shu, H. Zhang, M. Li, N. Jin, B. Yang, and C. J. Guo. Crossad: Time series anomaly detection with cross-scale associations and cross-window modeling. In Advances in Neural Information Processing Systems (NeurIPS 2025), 2025. URL https://nips.cc/virtual/2025/loc/san-diego/poster/116814