Matrix Profile for Anomaly Detection on Multidimensional Time Series

Chin-Chia Michael Yeh, Audrey Der, Uday Singh Saini, Vivian Lai, Yan Zheng, Junpeng Wang, Xin Dai, Zhongfang Zhuang, Yujie Fan, Huiyuan Chen, Prince Osei Aboagye, Liang Wang, Wei Zhang, Eamonn Keogh

arXiv: 2409.09298 · v2 · submitted 2024-09-14 · 💻 cs.LG · cs.AI · cs.DB

Pith reviewed 2026-05-23 21:02 UTC · model grok-4.3

classification 💻 cs.LG cs.AI cs.DB

keywords matrix profile, anomaly detection, multidimensional time series, time series data mining, unsupervised anomaly detection, supervised anomaly detection, semi-supervised anomaly detection, nearest neighbor search

The pith

Among the methods benchmarked, the Matrix Profile is the only one that maintains high performance across unsupervised, supervised, and semi-supervised setups on multidimensional time series.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how to adapt the Matrix Profile for anomaly detection when time series have multiple dimensions, such as readings from several sensors at once. It tests ways to reduce the resulting three-dimensional distance information into a usable profile and extends the approach to locate k-nearest neighbors. Benchmarks against 19 other methods on 119 datasets show that only the Matrix Profile achieves strong results no matter whether labels are unavailable, fully available, or partially available. A reader would care because many practical monitoring tasks involve exactly this mix of multiple channels and uncertain label access.

Core claim

The authors show that condensing the n by n by d distance tensor into a profile vector, combined with a k-nearest-neighbor extension, produces the sole method among the tested baselines that delivers high performance in unsupervised, supervised, and semi-supervised regimes across the full set of 119 multidimensional time series anomaly detection datasets.

What carries the argument

The multidimensional Matrix Profile obtained by condensing the n x n x d pairwise subsequence distance tensor into a one-dimensional profile vector for nearest-neighbor anomaly scoring.
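
As a rough illustration of the object described above, the following brute-force sketch builds the n x n x d tensor for a short series and condenses it into a profile vector. It is a sketch under stated assumptions, not the authors' implementation: the function names, the z-normalization step, the exclusion-zone width, and the condensation rule (per-dimension nearest-neighbor distance, averaged across dimensions) are illustrative choices that this page does not confirm.

```python
import numpy as np


def znorm(rows):
    """Z-normalize each row; constant rows are mapped to zeros."""
    mu = rows.mean(axis=1, keepdims=True)
    sd = rows.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0
    return (rows - mu) / sd


def distance_tensor(ts, m):
    """ts has shape (length, d). Returns an (n, n, d) tensor with n = length - m + 1."""
    length, d = ts.shape
    n = length - m + 1
    tensor = np.empty((n, n, d))
    for c in range(d):
        # All length-m subsequences of dimension c, one per row.
        subs = znorm(np.lib.stride_tricks.sliding_window_view(ts[:, c], m))
        diff = subs[:, None, :] - subs[None, :, :]
        tensor[:, :, c] = np.sqrt((diff ** 2).sum(axis=2))
    return tensor


def condense(tensor, m):
    """Assumed rule: nearest-neighbor distance per dimension, then mean over dimensions."""
    n = tensor.shape[0]
    idx = np.arange(n)
    # Assumed exclusion zone: ignore a subsequence and its close temporal neighbors.
    trivial = np.abs(idx[:, None] - idx[None, :]) < max(1, m // 2)
    masked = np.where(trivial[:, :, None], np.inf, tensor)
    per_dim_nn = masked.min(axis=1)  # shape (n, d)
    return per_dim_nn.mean(axis=1)   # shape (n,)
```

The tensor grows as n² · d, so this form only suits short series; it is meant to make the shapes concrete rather than to be efficient.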

Load-bearing premise

The 119 datasets and 19 baseline implementations are representative enough that the observed performance consistency will hold for new multidimensional time series.

What would settle it

A collection of multidimensional time series datasets, distinct from the 119 used, on which at least one of the 19 baseline methods matches or exceeds the Matrix Profile performance in all three learning setups.
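
The test described above can be phrased as a simple check over a results table. The sketch below assumes a hypothetical table that maps each method to one score per learning setup, with higher scores being better; the function name, the method labels, and the numbers in the usage example are placeholders, not results from the paper.

```python
SETUPS = ("unsupervised", "supervised", "semi-supervised")


def baselines_matching_everywhere(scores, target="MP"):
    """Return the baselines that match or exceed `target` in every setup.

    scores: {method_name: {setup_name: score}}, higher is better (assumed).
    """
    ref = scores[target]
    return sorted(
        name
        for name, per_setup in scores.items()
        if name != target and all(per_setup[s] >= ref[s] for s in SETUPS)
    )


# Placeholder numbers for illustration only.
example = {
    "MP": {"unsupervised": 0.80, "supervised": 0.82, "semi-supervised": 0.81},
    "baseline_A": {"unsupervised": 0.85, "supervised": 0.70, "semi-supervised": 0.83},
    "baseline_B": {"unsupervised": 0.80, "supervised": 0.90, "semi-supervised": 0.81},
}
print(baselines_matching_everywhere(example))  # → ['baseline_B']
```

A non-empty result on a fresh dataset collection would be the kind of evidence that counts against the headline claim.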

Figures

Figures reproduced from arXiv: 2409.09298 by Audrey Der, Chin-Chia Michael Yeh, Eamonn Keogh, Huiyuan Chen, Junpeng Wang, Liang Wang, Prince Osei Aboagye, Uday Singh Saini, Vivian Lai, Wei Zhang, Xin Dai, Yan Zheng, Yujie Fan, Zhongfang Zhuang.

**Figure 2.** The Matrix Profile summarizes the pairwise distance matrix of a …

**Figure 3.** In this example, based on the sorted results, the anomaly is most …

**Figure 5.** Diagram labels: Pairwise Distance Tensor (axes n1-m+1, n2-m+1, d); 1. find nearest neighbor; 2. sort; Matrix Profile.

**Figure 4.** The Matrix Profile uses the post-sorting strategy to summarize the …

**Figure 6.** Different dimensions of the multidimensional Matrix Profile can be …

**Figure 8.** This figure presents the pre-sorting and post-sorting multidimensional …

**Figure 7.** This figure illustrates how the four variants of the multidimensional …

**Figure 9.** This figure demonstrates how different kNN finding algorithms scale with the parameter k and the time series length n2. The proposed kNN finding algorithm is more efficient than the baseline methods.

**Figure 10.** Total benchmark runtime for various number-of-processes settings.

**Figure 11.** The relationship between anomaly detection performance and …

Abstract

The Matrix Profile (MP), a versatile tool for time series data mining, has been shown to be effective in time series anomaly detection (TSAD). This paper delves into the problem of anomaly detection in multidimensional time series, a common occurrence in real-world applications. For instance, in a manufacturing factory, multiple sensors installed across the site collect time-varying data for analysis. The Matrix Profile, named for its role in profiling the matrix storing pairwise distances between subsequences of a univariate time series, becomes complex in multidimensional scenarios. If the input univariate time series has n subsequences, the pairwise distance matrix is an n x n matrix. In a multidimensional time series with d dimensions, the pairwise distance information must be stored in an n x n x d tensor. In this paper, we first analyze different strategies for condensing this tensor into a profile vector. We then investigate the potential of extending the MP to efficiently find k-nearest neighbors for anomaly detection. Finally, we benchmark the multidimensional MP against 19 baseline methods on 119 multidimensional TSAD datasets. The experiments cover three learning setups: unsupervised, supervised, and semi-supervised. MP is the only method that consistently delivers high performance across all setups. To ensure complete transparency and facilitate future research, our full Matrix Profile-based implementation, which includes newly added evaluations against the TSB-AD benchmark, is publicly available at: https://github.com/mcyeh/mmpad_tsb
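
To make the k-nearest-neighbor idea concrete, here is a minimal brute-force sketch that scores each subsequence by the distance to its k-th nearest neighbor, given a precomputed distance matrix. It is not the efficient kNN algorithm the paper proposes, which this page does not describe; the function name `knn_scores` and the `exclusion` parameter are hypothetical, and using the k-th neighbor distance as the anomaly score is an assumption.

```python
import numpy as np


def knn_scores(dist, k, exclusion=0):
    """Distance from each row's subsequence to its k-th nearest neighbor.

    dist: (n_query, n_ref) matrix of pairwise subsequence distances.
    exclusion: entries with |i - j| < exclusion are ignored (assumed
    trivial-match handling for a self-join; use 0 to disable).
    """
    d = np.array(dist, dtype=float)  # copy so the caller's matrix is untouched
    if exclusion > 0:
        i = np.arange(d.shape[0])[:, None]
        j = np.arange(d.shape[1])[None, :]
        d[np.abs(i - j) < exclusion] = np.inf
    # np.partition puts the k-th smallest value of each row at index k - 1.
    return np.partition(d, k - 1, axis=1)[:, k - 1]


# Tiny hand-made distance matrix for illustration only.
toy = np.array([
    [0.0, 1.0, 2.0, 3.0],
    [1.0, 0.0, 5.0, 6.0],
    [2.0, 5.0, 0.0, 1.0],
    [3.0, 6.0, 1.0, 0.0],
])
print(knn_scores(toy, k=2, exclusion=1))  # → [2. 5. 2. 3.]
```

`np.partition` performs a selection rather than a full sort, which is sufficient here because only the k-th smallest value per row is needed.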

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper shows a multidimensional Matrix Profile extension that is the only one of 19 methods to stay strong across unsupervised, supervised, and semi-supervised anomaly detection on 119 datasets, but the result rests on whether those datasets represent the broader space.

The main thing to know is that the authors condense the n x n x d distance tensor into a profile vector, add a k-NN variant, and then benchmark the result against 19 baselines on 119 datasets. Their headline result is that only the Matrix Profile approach holds up well in all three learning regimes at once. The public code release, which per the abstract also includes evaluations against the TSB-AD benchmark, helps make that claim checkable.

Referee Report

2 major / 1 minor

Summary. The paper extends the Matrix Profile (MP) to multidimensional time series anomaly detection by analyzing strategies to condense the n×n×d pairwise distance tensor into a profile vector and by extending MP for k-nearest neighbor search. It benchmarks the resulting method against 19 baselines on 119 multidimensional TSAD datasets, covering unsupervised, supervised, and semi-supervised setups, and claims that MP is the only method that consistently delivers high performance across all three setups.

Significance. If the empirical results hold under representative conditions, the work would position the multidimensional MP as a simple, versatile, and robust baseline for TSAD that performs reliably across learning paradigms, which could reduce the need for paradigm-specific method selection in sensor-based applications. The public release of the full implementation (including TSB-AD evaluations) is a clear strength for reproducibility.

major comments (2)

[Abstract] The central claim that MP is the only method delivering consistently high performance across all setups requires that the 119 datasets and 19 baselines are representative of the space of dimensionality d, subsequence lengths, inter-dimension correlations, and anomaly characteristics. No selection criteria, diversity statistics, or coverage analysis are supplied, so it is impossible to determine whether the observed consistency gap is intrinsic or an artifact of the collection.
[Abstract] The description of the multidimensional MP states that different condensation strategies for the n×n×d tensor are analyzed and that MP is extended to find k-nearest neighbors, yet supplies no concrete definitions of the condensation functions, the distance measure, or the anomaly scoring rule. These omissions are load-bearing for both reproducibility and for interpreting why MP outperforms the baselines.

minor comments (1)

[Abstract] The abstract refers to 'high performance' without naming the evaluation metrics or any statistical tests used to support the consistency claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment point by point below, with commitments to revisions where they strengthen clarity and reproducibility.

Referee: [Abstract] The central claim that MP is the only method delivering consistently high performance across all setups requires that the 119 datasets and 19 baselines are representative of the space of dimensionality d, subsequence lengths, inter-dimension correlations, and anomaly characteristics. No selection criteria, diversity statistics, or coverage analysis are supplied, so it is impossible to determine whether the observed consistency gap is intrinsic or an artifact of the collection.

Authors: The abstract mentions TSB-AD only in connection with the additional evaluations in the public code release, so it leaves the provenance of the 119 datasets unstated. We will revise the abstract to name the benchmark the datasets come from and to note its coverage of varying d, lengths, inter-dimension correlations, and anomaly types, in support of the generalizability of the consistency claim. The 19 baselines were selected to represent leading methods across the three learning paradigms. While the original submission did not include new diversity statistics, we can add a reference to the benchmark's dataset characterization in the experiments section during revision.
Revision: partial
Referee: [Abstract] The description of the multidimensional MP states that different condensation strategies for the n×n×d tensor are analyzed and that MP is extended to find k-nearest neighbors, yet supplies no concrete definitions of the condensation functions, the distance measure, or the anomaly scoring rule. These omissions are load-bearing for both reproducibility and for interpreting why MP outperforms the baselines.

Authors: We agree the abstract is high-level and omits specifics. Section 3 of the full manuscript defines the condensation strategies (the pre-sorting and post-sorting variants shown in the figures), uses Euclidean distance for the n×n×d tensor, and bases anomaly scores on the resulting profile (with the kNN extension computing distances to the k-th neighbor). To address the concern, we will revise the abstract to concisely reference these elements (e.g., 'analyzing pre-sorting and post-sorting condensation with Euclidean distance and a kNN extension') while remaining within length constraints. The public GitHub implementation further supports reproducibility.
Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical benchmark on external datasets and baselines

The paper presents an empirical study extending the Matrix Profile to multidimensional time series anomaly detection. It analyzes condensation strategies for the distance tensor, explores k-NN extensions, and reports performance on 119 multidimensional TSAD datasets against 19 baselines across unsupervised, supervised, and semi-supervised setups. The central claim (MP's consistent high performance) is grounded in these external comparisons rather than any internal derivation, fitted parameter, or self-citation chain. No equations or predictions reduce to inputs by construction; the work is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on standard subsequence distance calculations and the assumption that the chosen condensation functions preserve anomaly signal; no new free parameters, invented entities, or ad-hoc axioms are introduced in the abstract.

axioms (1)

Domain assumption: Euclidean distance between subsequences remains a meaningful similarity measure when extended across multiple dimensions.
Invoked when the n x n x d tensor is formed from pairwise distances.
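
Written out, the assumption concerns a per-dimension distance of the following form. The notation ($T^{(c)}$ for dimension $c$ of the series, $L$ for its length, $m$ for the subsequence length) is introduced here for illustration and is not taken from the paper. Matrix Profile work commonly z-normalizes each subsequence before taking the distance; whether that applies here is not stated on this page.

```latex
D_{i,j,c} = \left\lVert T^{(c)}_{i:i+m-1} - T^{(c)}_{j:j+m-1} \right\rVert_2,
\qquad 1 \le i, j \le n, \quad 1 \le c \le d, \quad n = L - m + 1 .
```

Each slice $D_{:,:,c}$ is the n x n pairwise distance matrix of a single dimension, so the assumption amounts to treating the d slices as comparable inputs to the condensation step.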

[1] [1]

Matrix profile i: all pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets,

C.-C. M. Yeh, Y . Zhu, L. Ulanova, N. Begum, Y . Ding, H. A. Dau, D. F. Silva, A. Mueen, and E. Keogh, “Matrix profile i: all pairs similarity joins for time series: a unifying view that includes motifs, discords and shapelets,” in 2016 IEEE 16th international conference on data mining (ICDM). Ieee, 2016, pp. 1317–1322

work page 2016

[2] [2]

Matrix profile xxiv: scaling time series anomaly detection to trillions of datapoints and ultra-fast arriving data streams,

Y . Lu, R. Wu, A. Mueen, M. A. Zuluaga, and E. Keogh, “Matrix profile xxiv: scaling time series anomaly detection to trillions of datapoints and ultra-fast arriving data streams,” in Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining , 2022, pp. 1173–1182

work page 2022

[3] [3]

Hot sax: Efficiently finding the most un- usual time series subsequence,

E. Keogh, J. Lin, and A. Fu, “Hot sax: Efficiently finding the most un- usual time series subsequence,” in Fifth IEEE International Conference on Data Mining (ICDM’05) . Ieee, 2005, pp. 8–pp

work page 2005

[4] [4]

M. C.-C. Yeh, Towards a near universal time series data mining tool: Introducing the matrix profile. University of California, Riverside, 2018

work page 2018

[5] [5]

Sketching multidimensional time series for fast discord mining,

C.-C. M. Yeh, Y . Zheng, M. Pan, H. Chen, Z. Zhuang, J. Wang, L. Wang, W. Zhang, J. M. Phillips, and E. Keogh, “Sketching multidimensional time series for fast discord mining,” arXiv preprint arXiv:2311.03393 , 2023

work page arXiv 2023

[6] [6]

Matrix profile xxviii: Discovering multi- dimensional time series anomalies with k of n anomaly detection,

S. Tafazoli and E. Keogh, “Matrix profile xxviii: Discovering multi- dimensional time series anomalies with k of n anomaly detection,” in Proceedings of the 2023 SIAM International Conference on Data Mining (SDM). SIAM, 2023, pp. 685–693

work page 2023

[7] [7]

Anomaly detection in time series: a comprehensive evaluation,

S. Schmidl, P. Wenig, and T. Papenbrock, “Anomaly detection in time series: a comprehensive evaluation,” Proceedings of the VLDB Endowment, vol. 15, no. 9, pp. 1779–1797, 2022

work page 2022

[8] [8]

Adaptive anomaly detection in chaotic time series with a spatially aware echo state network,

N. Heim and J. E. Avery, “Adaptive anomaly detection in chaotic time series with a spatially aware echo state network,” arXiv preprint arXiv:1909.01709, 2019

work page arXiv 1909

[9] [9]

Detecting spacecraft anomalies using lstms and nonparametric dynamic thresholding,

K. Hundman, V . Constantinou, C. Laporte, I. Colwell, and T. Soder- strom, “Detecting spacecraft anomalies using lstms and nonparametric dynamic thresholding,” in Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining , 2018, pp. 387–395

work page 2018

[10] [10]

Deepant: A deep learning approach for unsupervised anomaly detection in time series,

M. Munir, S. A. Siddiqui, A. Dengel, and S. Ahmed, “Deepant: A deep learning approach for unsupervised anomaly detection in time series,” Ieee Access, vol. 7, pp. 1991–2005, 2018

work page 1991

[11] [11]

Robust anomaly detection for multivariate time series through stochastic recurrent neural network,

Y . Su, Y . Zhao, C. Niu, R. Liu, W. Sun, and D. Pei, “Robust anomaly detection for multivariate time series through stochastic recurrent neural network,” in Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining , 2019, pp. 2828– 2837

work page 2019

[12] [12]

Robust PCA for Anomaly Detection in Cyber Networks

R. Paffenroth, K. Kay, and L. Servi, “Robust pca for anomaly detection in cyber networks,” arXiv preprint arXiv:1801.01571 , 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[13] T. Yairi, Y. Kato, and K. Hori, “Fault detection by mining association rules from house-keeping data,” in Proceedings of the 6th International Symposium on Artificial Intelligence, Robotics and Automation in Space, vol. 18. Citeseer, 2001, p. 21

[14] S. Ramaswamy, R. Rastogi, and K. Shim, “Efficient algorithms for mining outliers from large data sets,” in Proceedings of the 2000 ACM SIGMOD international conference on Management of data, 2000, pp. 427–438

[15] Z. He, X. Xu, and S. Deng, “Discovering cluster-based local outliers,” Pattern recognition letters, vol. 24, no. 9-10, pp. 1641–1650, 2003

[16] M. M. Breunig, H.-P. Kriegel, R. T. Ng, and J. Sander, “Lof: identifying density-based local outliers,” in Proceedings of the 2000 ACM SIGMOD international conference on Management of data, 2000, pp. 93–104

[17] M.-L. Shyu, S.-C. Chen, K. Sarinnapakorn, and L. Chang, “A novel anomaly detection scheme based on principal component classifier,” in Proceedings of the IEEE foundations and new directions of data mining workshop. IEEE Press, 2003, pp. 172–179

[18] M. Goldstein and A. Dengel, “Histogram-based outlier score (hbos): A fast unsupervised anomaly detection algorithm,” KI-2012: poster and demo track, vol. 1, pp. 59–63, 2012

[19] J. Tang, Z. Chen, A. W.-C. Fu, and D. W. Cheung, “Enhancing effectiveness of outlier detections for low density patterns,” in Advances in Knowledge Discovery and Data Mining: 6th Pacific-Asia Conference, PAKDD 2002, Taipei, Taiwan, May 6–8, 2002, Proceedings 6. Springer, 2002, pp. 535–548

[20] H. Song, Z. Jiang, A. Men, B. Yang et al., “A hybrid semi-supervised anomaly detection model for high-dimensional data,” Computational intelligence and neuroscience, vol. 2017, 2017

[21] J. Li, W. Pedrycz, and I. Jamal, “Multivariate time series anomaly detection: A framework of hidden markov models,” Applied Soft Computing, vol. 60, pp. 229–240, 2017

[22] Z. Li, Y. Zhao, N. Botta, C. Ionescu, and X. Hu, “Copod: copula-based outlier detection,” in 2020 IEEE international conference on data mining (ICDM). IEEE, 2020, pp. 1118–1123

[23] S. Hariri, M. C. Kind, and R. J. Brunner, “Extended isolation forest,” IEEE transactions on knowledge and data engineering, vol. 33, no. 4, pp. 1479–1489, 2019

[24] F. T. Liu, K. M. Ting, and Z.-H. Zhou, “Isolation forest,” in 2008 eighth IEEE international conference on data mining. IEEE, 2008, pp. 413–422

[25] Z. Cheng, C. Zou, and J. Dong, “Outlier detection using isolation forest and local outlier factor,” in Proceedings of the conference on research in adaptive and convergent systems, 2019, pp. 161–168

[26] P.-F. Marteau, S. Soheily-Khah, and N. Béchet, “Hybrid isolation forest - application to intrusion detection,” arXiv preprint arXiv:1705.03800, 2017

[27] D. Baumgartner, H. Langseth, H. Ramampiaro, and K. Engø-Monsen, “mtads: Multivariate time series anomaly detection benchmark suites,” in 2023 IEEE International Conference on Big Data (BigData). IEEE, 2023, pp. 588–597

[28] C.-C. M. Yeh, N. Kavantzas, and E. Keogh, “Matrix profile vi: Meaningful multidimensional motif discovery,” in 2017 IEEE international conference on data mining (ICDM). IEEE, 2017, pp. 565–574

[29] C.-C. M. Yeh, Y. Zheng, J. Wang, H. Chen, Z. Zhuang, W. Zhang, and E. Keogh, “Error-bounded approximate time series joins using compact dictionary representations of time series,” in Proceedings of the 2022 SIAM International Conference on Data Mining (SDM). SIAM, 2022, pp. 181–189

[30] Y. Zhu, Z. Zimmerman, N. S. Senobari, C.-C. M. Yeh, G. Funning, A. Mueen, P. Brisk, and E. Keogh, “Matrix profile ii: Exploiting a novel algorithm and gpus to break the one hundred million barrier for time series motifs and joins,” in 2016 IEEE 16th international conference on data mining (ICDM). IEEE, 2016, pp. 739–748

[31] D. R. Musser, “Introspective sorting and selection algorithms,” Software: Practice and Experience, vol. 27, no. 8, pp. 983–993, 1997

[32] C. A. R. Hoare, “Algorithm 65: find,” Communications of the ACM, vol. 4, no. 7, pp. 321–322, 1961

[33] The Author(s), “Project website,” 2024, https://sites.google.com/view/mp4ad

[34] P. Wenig, S. Schmidl, and T. Papenbrock, “Timeeval: A benchmarking toolkit for time series anomaly detection algorithms,” Proceedings of the VLDB Endowment, vol. 15, no. 12, pp. 3678–3681, 2022

[35] J. Hutchins, “CalIt2 Building People Counts,” UCI Machine Learning Repository, 2006, DOI: https://doi.org/10.24432/C5NG78

[36] M. Bachlin, M. Plotnik, D. Roggen, I. Maidan, J. M. Hausdorff, N. Giladi, and G. Troster, “Wearable assistant for parkinson’s disease patients with the freezing of gait symptom,” IEEE Transactions on Information Technology in Biomedicine, vol. 14, no. 2, pp. 436–446, 2009

[37] V. Jacob, F. Song, A. Stiegler, B. Rad, Y. Diao, and N. Tatbul, “Exathlon: A benchmark for explainable anomaly detection over time series,” arXiv preprint arXiv:2010.05073, 2020

[38] A. von Birgelen and O. Niggemann, “Anomaly detection and localization for cyber-physical production systems with self-organizing maps,” IMPROVE-Innovative Modelling Approaches for Production Systems to Raise Validatable Efficiency: Intelligent Methods for the Factory of the Future, pp. 55–71, 2018

[39] A. L. Goldberger, L. A. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley, “Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals,” Circulation, vol. 101, no. 23, pp. e215–e220, 2000

[40] J. A. Hanley and B. J. McNeil, “The meaning and use of the area under a receiver operating characteristic (roc) curve,” Radiology, vol. 143, no. 1, pp. 29–36, 1982

[41] A. P. Bradley, “The use of the area under the roc curve in the evaluation of machine learning algorithms,” Pattern recognition, vol. 30, no. 7, pp. 1145–1159, 1997

[42] N. Tatbul, T. J. Lee, S. Zdonik, M. Alam, and J. Gottschlich, “Precision and recall for time series,” Advances in neural information processing systems, vol. 31, 2018

[43] Y. Lu, T. V. A. Srinivas, T. Nakamura, M. Imamura, and E. Keogh, “Matrix profile xxx: Madrid: A hyper-anytime and parameter-free algorithm to find time series anomalies of all lengths,” in 2023 IEEE International Conference on Data Mining (ICDM). IEEE, 2023, pp. 1199–1204

[44] C.-C. M. Yeh, Y. Fan, X. Dai, U. S. Saini, V. Lai, P. O. Aboagye, J. Wang, H. Chen, Y. Zheng, Z. Zhuang et al., “Rpmixer: Shaking up time series forecasting with random projections for large spatial-temporal data,” in Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024, pp. 3919–3930

[45] C.-C. M. Yeh, X. Dai, H. Chen, Y. Zheng, Y. Fan, A. Der, V. Lai, Z. Zhuang, J. Wang, L. Wang et al., “Toward a foundation model for time series data,” in Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023, pp. 4400–4404

[46] A. Der, C.-C. M. Yeh, X. Dai, H. Chen, Y. Zheng, Y. Fan, Z. Zhuang, V. Lai, J. Wang, L. Wang et al., “A systematic evaluation of generated time series and their effects in self-supervised pretraining,” arXiv preprint arXiv:2408.07869, 2024