Matrix Profile for Anomaly Detection on Multidimensional Time Series
Pith reviewed 2026-05-23 21:02 UTC · model grok-4.3
The pith
In a benchmark against 19 baseline methods, the Matrix Profile is the only anomaly detection method that maintains high performance across unsupervised, supervised, and semi-supervised setups on multidimensional time series.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors show that condensing the n x n x d distance tensor into a profile vector, combined with a k-nearest-neighbor extension, produces the only method among those tested that delivers high performance in the unsupervised, supervised, and semi-supervised regimes across the full set of 119 multidimensional time series anomaly detection datasets.
What carries the argument
The multidimensional Matrix Profile obtained by condensing the n x n x d pairwise subsequence distance tensor into a one-dimensional profile vector for nearest-neighbor anomaly scoring.
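One way to write this object down, as a sketch under assumed notation: the symbols T, m, x_i^{(c)}, g, and E(i) are introduced here for illustration and do not appear in the source, and the order of operations (condense across dimensions first, then take the nearest neighbor) is likewise an assumption.

```latex
% Assumed notation: a series of length T with d dimensions, window length m,
% n = T - m + 1 subsequences, and x_i^{(c)} the i-th length-m subsequence
% of dimension c. Euclidean distance is taken from the rebuttal and the
% axiom ledger further down the page.
\[
  D_{i,j,c} = \bigl\lVert x_i^{(c)} - x_j^{(c)} \bigr\rVert_2,
  \qquad 1 \le i, j \le n,\quad 1 \le c \le d
\]
% g is a condensation function over the d dimensions (for example min, mean,
% or max); E(i) is a hypothetical exclusion set of trivial matches near i.
\[
  P_i = \min_{j \notin E(i)} \; g\bigl(D_{i,j,1}, \dots, D_{i,j,d}\bigr)
\]
```

Under this reading, a large P_i means that subsequence i has no close neighbor anywhere else in the series, which is what makes it an anomaly candidate.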
Load-bearing premise
The 119 datasets and 19 baseline implementations are representative enough that the observed performance consistency will hold for new multidimensional time series.
What would settle it
A collection of multidimensional time series datasets, distinct from the 119 used, on which at least one of the 19 baseline methods matches or exceeds the Matrix Profile's performance in all three learning setups.
Abstract
The Matrix Profile (MP), a versatile tool for time series data mining, has been shown to be effective in time series anomaly detection (TSAD). This paper delves into the problem of anomaly detection in multidimensional time series, a common occurrence in real-world applications. For instance, in a manufacturing factory, multiple sensors installed across the site collect time-varying data for analysis. The Matrix Profile, named for its role in profiling the matrix storing pairwise distances between subsequences of a univariate time series, becomes complex in multidimensional scenarios. If the input univariate time series has n subsequences, the pairwise distance matrix is an n x n matrix. In a multidimensional time series with d dimensions, the pairwise distance information must be stored in an n x n x d tensor. In this paper, we first analyze different strategies for condensing this tensor into a profile vector. We then investigate the potential of extending the MP to efficiently find k-nearest neighbors for anomaly detection. Finally, we benchmark the multidimensional MP against 19 baseline methods on 119 multidimensional TSAD datasets. The experiments cover three learning setups: unsupervised, supervised, and semi-supervised. MP is the only method that consistently delivers high performance across all setups. To ensure complete transparency and facilitate future research, our full Matrix Profile-based implementation, which includes newly added evaluations against the TSB-AD benchmark, is publicly available at: https://github.com/mcyeh/mmpad_tsb
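The univariate case described in the abstract can be illustrated with a brute-force sketch. This is not the authors' implementation: the function names, the plain (non-normalized) Euclidean distance, and the exclusion zone of half a window used to mask trivial matches are all assumptions made for illustration, and the quadratic memory cost makes it suitable only for short series.

```python
import numpy as np


def subsequences(ts, m):
    """Return all n = len(ts) - m + 1 sliding windows of length m as an (n, m) view."""
    ts = np.asarray(ts, dtype=float)
    return np.lib.stride_tricks.sliding_window_view(ts, m)


def matrix_profile_1d(ts, m, exclusion=None):
    """Brute-force profile: distance from each subsequence to its nearest neighbor."""
    subs = subsequences(ts, m)
    n = subs.shape[0]
    if exclusion is None:
        exclusion = m // 2  # assumed exclusion zone, not taken from the source
    # Full n x n pairwise Euclidean distance matrix.
    diff = subs[:, None, :] - subs[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Mask trivial matches: windows that overlap heavily with the query itself.
    idx = np.arange(n)
    dist[np.abs(idx[:, None] - idx[None, :]) <= exclusion] = np.inf
    # "Profiling" the matrix: keep only the row-wise minimum.
    return dist.min(axis=1)


# Toy series: a clean sine wave with a short bump injected at t = 200..209.
t = np.arange(400)
ts = np.sin(2 * np.pi * t / 25)
ts[200:210] += 2.0

profile = matrix_profile_1d(ts, m=25)
top = int(np.argmax(profile))  # start index of the highest-scoring window
print(profile.shape, top)
```

On this toy input the highest-scoring window overlaps the injected bump, since every clean window has a near-exact repeat one period away while the windows covering the bump do not.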
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper extends the Matrix Profile (MP) to multidimensional time series anomaly detection by analyzing strategies to condense the n×n×d pairwise distance tensor into a profile vector and extending MP for k-nearest neighbor search. It benchmarks the resulting method against 19 baselines on 119 multidimensional TSAD datasets from TSB-AD, covering unsupervised, supervised, and semi-supervised setups, and claims that MP is the only method that consistently delivers high performance across all three setups.
Significance. If the empirical results hold under representative conditions, the work would position the multidimensional MP as a simple, versatile, and robust baseline for TSAD that performs reliably across learning paradigms, which could reduce the need for paradigm-specific method selection in sensor-based applications. The public release of the full implementation (including TSB-AD evaluations) is a clear strength for reproducibility.
major comments (2)
- [Abstract] Abstract: the central claim that MP is the only method delivering consistently high performance across all setups requires that the 119 datasets and 19 baselines are representative of the space of dimensionality d, subsequence lengths, inter-dimension correlations, and anomaly characteristics. No selection criteria, diversity statistics, or coverage analysis are supplied, so it is impossible to determine whether the observed consistency gap is intrinsic or an artifact of the collection.
- [Abstract] Abstract: the description of the multidimensional MP states that different condensation strategies for the n×n×d tensor are analyzed and that MP is extended to find k-nearest neighbors, yet supplies no concrete definitions of the condensation functions, the distance measure, or the anomaly scoring rule. These omissions are load-bearing for both reproducibility and for interpreting why MP outperforms the baselines.
minor comments (1)
- [Abstract] The abstract refers to 'high performance' without naming the evaluation metrics or any statistical tests used to support the consistency claim.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major comment point by point below, with commitments to revisions where they strengthen clarity and reproducibility.
- Referee: [Abstract] Abstract: the central claim that MP is the only method delivering consistently high performance across all setups requires that the 119 datasets and 19 baselines are representative of the space of dimensionality d, subsequence lengths, inter-dimension correlations, and anomaly characteristics. No selection criteria, diversity statistics, or coverage analysis are supplied, so it is impossible to determine whether the observed consistency gap is intrinsic or an artifact of the collection.
  Authors: The 119 datasets are taken directly from the TSB-AD benchmark, which was constructed to span a broad range of multidimensional time series characteristics including varying d, lengths, inter-dimension correlations, and anomaly types. We will revise the abstract to explicitly cite TSB-AD and note its coverage to support the generalizability of the consistency claim. The 19 baselines were selected to represent leading methods across the three learning paradigms. While the original submission did not include new diversity statistics, we can add a reference to the TSB-AD paper's dataset characterization in the experiments section during revision.
  Revision: partial
- Referee: [Abstract] Abstract: the description of the multidimensional MP states that different condensation strategies for the n×n×d tensor are analyzed and that MP is extended to find k-nearest neighbors, yet supplies no concrete definitions of the condensation functions, the distance measure, or the anomaly scoring rule. These omissions are load-bearing for both reproducibility and for interpreting why MP outperforms the baselines.
  Authors: We agree the abstract is high-level and omits specifics. Section 3 of the full manuscript defines the condensation strategies (min, mean, and max across dimensions), uses Euclidean distance for the n×n×d tensor, and bases anomaly scores on the resulting profile (with the kNN extension computing distances to the k-th neighbor). To address the concern, we will revise the abstract to concisely reference these elements (e.g., 'analyzing min/mean condensation with Euclidean distance and kNN extension') while remaining within length constraints. The public GitHub implementation further ensures full reproducibility.
  Revision: yes
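The elements named in this response (per-dimension Euclidean distances, min/mean/max condensation across dimensions, and scoring by the distance to the k-th neighbor) can be sketched as follows. This is a hedged illustration rather than the paper's algorithm: `distance_tensor` and `knn_profile` are hypothetical names, the exclusion zone is an assumed detail, and condensing across dimensions before selecting neighbors is one possible reading of the description.

```python
import numpy as np


def distance_tensor(X, m):
    """X has shape (T, d). Return the (n, n, d) tensor of per-dimension
    Euclidean distances between length-m subsequences, with n = T - m + 1."""
    X = np.asarray(X, dtype=float)
    T, d = X.shape
    n = T - m + 1
    D = np.empty((n, n, d))
    for c in range(d):
        subs = np.lib.stride_tricks.sliding_window_view(X[:, c], m)  # (n, m)
        sq = (subs ** 2).sum(axis=1)
        # Squared distances via the Gram matrix; clip tiny negatives from rounding.
        d2 = sq[:, None] + sq[None, :] - 2.0 * subs @ subs.T
        D[:, :, c] = np.sqrt(np.maximum(d2, 0.0))
    return D


def knn_profile(D, k=1, how="mean", exclusion=0):
    """Condense the tensor across dimensions, then score each subsequence by
    the distance to its k-th nearest neighbor (larger = more anomalous)."""
    agg = {"min": np.min, "mean": np.mean, "max": np.max}[how]
    M = agg(D, axis=2)  # condensed (n, n) distance matrix
    n = M.shape[0]
    idx = np.arange(n)
    # Assumed exclusion zone: ignore trivial matches close to the query window.
    M = np.where(np.abs(idx[:, None] - idx[None, :]) <= exclusion, np.inf, M)
    # np.partition places the k-th smallest value of each row at position k - 1.
    return np.partition(M, k - 1, axis=1)[:, k - 1]


# Toy data: three phase-shifted sine waves, with a bump in dimension 1 only.
t = np.arange(300)
X = np.stack([np.sin(2 * np.pi * (t + s) / 20) for s in (0, 5, 10)], axis=1)
X[150:160, 1] += 2.0

D = distance_tensor(X, m=20)
scores = knn_profile(D, k=2, how="mean", exclusion=10)
top = int(np.argmax(scores))
print(D.shape, top)
```

The choice of condensation function matters in this sketch: with `how="min"`, an anomaly confined to a single dimension is hidden whenever some other dimension still has a close match, whereas `"mean"` and `"max"` let it surface.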
Circularity Check
No circularity: empirical benchmark on external datasets and baselines
The paper presents an empirical study extending the Matrix Profile to multidimensional time series anomaly detection. It analyzes condensation strategies for the distance tensor, explores k-NN extensions, and reports performance on 119 public TSB-AD datasets against 19 independently published baselines across unsupervised, supervised, and semi-supervised setups. The central claim (MP's consistent high performance) is grounded in these external comparisons rather than any internal derivation, fitted parameter, or self-citation chain. No equations or predictions reduce to inputs by construction; the work is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Euclidean distance between subsequences remains a meaningful similarity measure when extended across multiple dimensions