
arXiv: 2409.15030 · v2 · submitted 2024-09-23 · 💻 cs.LG · cs.CR · cs.ET · cs.IT · math.IT · quant-ph

Anomaly Detection from a Tensor Train Perspective

Pith reviewed 2026-05-23 20:40 UTC · model grok-4.3

classification 💻 cs.LG · cs.CR · cs.ET · cs.IT · math.IT · quant-ph
keywords anomaly detection · tensor train · tensor networks · data compression · cybersecurity · digits dataset · faces dataset

The pith

Tensor Train compression preserves the structure of normal data while deleting the structure of anomalous data, which is what enables detection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a series of algorithms that use Tensor Train representations to compress datasets in a manner that keeps the structure of normal data but removes the structure of anomalous data. This selective compression allows anomalies to be identified as data points whose structure does not survive the process. The methods are general and can be used with any tensor network. They were evaluated on image datasets consisting of digits and faces as well as a cybersecurity dataset for attack detection, showing practical utility in spotting deviations without prior anomaly examples.

Core claim

The central claim is that by using data compression in a Tensor Train representation, algorithms can preserve the structure of normal data while deleting the structure of anomalous data, thereby enabling anomaly detection. These algorithms are applicable to any tensor network representation and have been tested on the digits and Olivetti faces datasets as well as a cybersecurity dataset.

What carries the argument

Tensor Train representation for data compression that selectively preserves normal structure and deletes anomalous structure.
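
To make the mechanism concrete, the following is a minimal sketch of the standard TT-SVD construction (sequential truncated SVDs), written in Python with NumPy. It is not taken from the paper: the function names `tt_svd` and `tt_to_full` and the `rel_tol` parameter are assumptions chosen for illustration, and the paper's own algorithms may truncate or score differently.

```python
import numpy as np


def tt_svd(tensor, rel_tol=1e-2):
    """Split `tensor` into Tensor Train cores of shape (r_left, dim, r_right).

    Illustrative assumption: at each step, singular values that are not
    larger than rel_tol times the largest one are discarded.
    """
    dims = tensor.shape
    cores = []
    rank = 1
    mat = tensor.reshape(rank * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        # Always keep at least one singular value so the train stays connected.
        keep = max(1, int(np.sum(s > rel_tol * s[0])))
        cores.append(u[:, :keep].reshape(rank, dims[k], keep))
        # Carry the remaining factor (singular values times V^T) to the next site.
        mat = s[:keep, None] * vt[:keep]
        rank = keep
        if k < len(dims) - 2:
            mat = mat.reshape(rank * dims[k + 1], -1)
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores


def tt_to_full(cores):
    """Contract the cores back into a dense tensor."""
    full = cores[0]
    for core in cores[1:]:
        # Contract the right bond of `full` with the left bond of `core`.
        full = np.tensordot(full, core, axes=1)
    return full.reshape(tuple(core.shape[1] for core in cores))


def reconstruction_error(tensor, rel_tol=1e-2):
    """Frobenius norm of the difference between a tensor and its compressed form."""
    return float(np.linalg.norm(tensor - tt_to_full(tt_svd(tensor, rel_tol))))
```

Under this sketch, a tensor that is exactly low-rank (for example an outer product of vectors) comes back with all bond dimensions equal to 1 and reconstructs to numerical precision, while raising `rel_tol` trades reconstruction accuracy for smaller cores.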

If this is right

  • The algorithms can be applied to any tensor network representation.
  • Effectiveness is demonstrated on digits and Olivetti faces image datasets.
  • The methods detect cyber-attacks in a cybersecurity dataset.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This approach could be extended to other types of high-dimensional data where tensor decompositions are suitable.
  • It may complement existing anomaly detection methods by providing a structure-based perspective.

Load-bearing premise

Anomalous data possesses a distinct structure that can be selectively deleted in Tensor Train compression while normal data structure is preserved.

What would settle it

A test where normal and anomalous data show no difference in how their structures are affected by Tensor Train compression, resulting in no separation between them.
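
The separation test described above can be scored with the AUROC, the metric the paper's figures report. The sketch below is only an illustration: the helper name `auroc` and the convention that a higher score means "more anomalous" are assumptions, not the paper's definitions. A value near 0.5 would correspond to the no-separation outcome.

```python
import numpy as np


def auroc(scores_normal, scores_anomalous):
    """Probability that a random anomaly outscores a random normal point.

    Ties count as one half. Assumes higher score = more anomalous
    (an illustrative convention, not taken from the paper).
    """
    normal = np.asarray(scores_normal, dtype=float)
    anomalous = np.asarray(scores_anomalous, dtype=float)
    # Compare every anomalous score against every normal score.
    greater = (anomalous[:, None] > normal[None, :]).sum()
    ties = (anomalous[:, None] == normal[None, :]).sum()
    return float((greater + 0.5 * ties) / (anomalous.size * normal.size))
```

The pairwise comparison is quadratic in the number of points, which is acceptable for a sanity check; a rank-based implementation would be preferable on large test sets.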

Figures

Figures reproduced from arXiv: 2409.15030 by Aitor Moreno Fdez. de Leceta, Alejandro Mata Ali, Jorge López Rubio.

Figure 1. a) Tensor T of order 5 in tensor networks notation, b) TT representation of a tensor of order 5 in tensor networks notation.
Figure 2. Process of creation of the TT representation.
Figure 3. Example of the TT representation generation.
Figure 4. AUROC results obtained with ACGCTNAD for different τ values considering each type of digit as normal and all others as anomalous. In green we mark the maximum possible value (1) and in red the value 0.5. (Plot text: axis labels "type" and "max AUROC"; min=0.74, max=1.0.)
Figure 5. Maximum AUROC achieved with ACGCTNAD for each type of digit. In green we mark the maximum possible value (1) and in red the minimum possible value (0.5). (Plot text: panels annotated "AUROC max=1.000".)
Figure 8. AUROC results obtained with ACGCTNAD without scaling for different τ values considering each type of digit as normal and all others as anomalous. In green we mark the maximum possible value (1) and in red the value 0.5. (Plot text: axis labels "type" and "max AUROC"; min=0.84, max=0.98.)
Figure 9. Maximum AUROC achieved with ACGCTNAD for each type of digit. In green we mark the maximum possible value (1) and in red the minimum possible value (0.5).
Accepted in Quantum 9999-99-99, click title to verify. Published under CC-BY 4.0.
Figure 10. Maximum AUROC achieved with ACGCTNAD for each type of face with standard scaler. In green we mark the maximum possible value (1) and in red the minimum possible value (0.5). (Adjacent paper text: "In this first test we will not apply any scaler and we will put together a set of 4990 normal training data with a set of 20000 data to be tested.")
Figure 11. ROC curves obtained for different values of …
Figure 12. ROC curves obtained for different values of …
Figure 13. ROC curves obtained for τ = 0.01 with method ACGCTNAD without and with a standard scaler, respectively, without training dataset.

Original abstract

We present a series of algorithms in tensor networks for anomaly detection in datasets, by using data compression in a Tensor Train representation. These algorithms consist of preserving the structure of normal data in compression and deleting the structure of anomalous data. The algorithms can be applied to any tensor network representation. We test the effectiveness of the methods with digits and Olivetti faces datasets and a cybersecurity dataset to determine cyber-attacks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper presents a series of algorithms for anomaly detection that operate via data compression in a Tensor Train (TT) representation. The central idea is that compression can be performed so as to preserve the structure of normal data while deleting the structure of anomalous data; the methods are stated to apply to any tensor network representation. Effectiveness is tested on the digits and Olivetti faces image datasets as well as a cybersecurity dataset for detecting attacks.

Significance. If the claimed selective deletion of anomalous structure can be shown to arise from the compression procedure itself without post-hoc tuning or labels, the work would supply a new tensor-network route to unsupervised anomaly detection. The absence of an explicit, dataset-independent criterion for the distinction, however, leaves the practical significance dependent on whether the reported experiments demonstrate a non-circular separation.

major comments (2)
  1. [Abstract] Abstract and introduction: the claim that TT compression 'preserves the structure of normal data' while 'deleting the structure of anomalous data' is load-bearing for the anomaly-detection interpretation, yet no explicit rank-truncation rule, error metric, or invariance argument is supplied that would guarantee a measurable difference on the basis of structure alone.
  2. The weakest assumption identified in the stress-test note remains unaddressed: if rank selection or tolerance thresholds are chosen after inspecting reconstruction errors on mixed data, the procedure becomes circular; the manuscript must demonstrate that the separation criterion can be fixed a priori or derived from normal data only.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments. We address each major point below, clarifying the methodological details present in the manuscript and indicating revisions to improve explicitness.

Point-by-point responses
  1. Referee: [Abstract] Abstract and introduction: the claim that TT compression 'preserves the structure of normal data' while 'deleting the structure of anomalous data' is load-bearing for the anomaly-detection interpretation, yet no explicit rank-truncation rule, error metric, or invariance argument is supplied that would guarantee a measurable difference on the basis of structure alone.

    Authors: The methods section details the TT-SVD procedure with a fixed relative tolerance on singular values, chosen from the spectrum of normal-data tensors alone to retain dominant low-rank factors. The error metric is the Frobenius reconstruction error after truncation. The invariance follows from the hierarchical low-rank constraint of the TT format, which matches the compressible structure of normal samples but not that of anomalies. We will revise the abstract and introduction to state this truncation rule and metric explicitly.
    Revision: yes

  2. Referee: The weakest assumption identified in the stress-test note remains unaddressed: if rank selection or tolerance thresholds are chosen after inspecting reconstruction errors on mixed data, the procedure becomes circular; the manuscript must demonstrate that the separation criterion can be fixed a priori or derived from normal data only.

    Authors: Parameter selection (ranks and tolerance) is performed exclusively on the normal-data training subset before any mixed test data are examined; the stress-test note varies these a-priori values to assess robustness but does not tune them on mixed data. We will add an explicit subsection documenting this protocol and confirming that the criterion is fixed from normal data only.
    Revision: yes
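
The protocol the simulated authors describe, fixing the decision rule from normal data before any mixed data is seen, can be sketched as follows. The quantile rule, the function names `fit_threshold` and `flag_anomalies`, and the default `quantile=0.99` are hypothetical choices for illustration; nothing here is quoted from the manuscript.

```python
import numpy as np


def fit_threshold(normal_train_scores, quantile=0.99):
    """Fix the decision threshold from normal-only training scores.

    Hypothetical rule: take a high quantile of the anomaly scores
    (for example reconstruction errors) observed on normal data.
    """
    scores = np.asarray(normal_train_scores, dtype=float)
    return float(np.quantile(scores, quantile))


def flag_anomalies(test_scores, threshold):
    """Flag test points whose score exceeds the pre-fixed threshold."""
    return np.asarray(test_scores, dtype=float) > threshold
```

Because `fit_threshold` never receives test scores, the circularity the referee worries about cannot arise inside this step; it would have to enter through how the scores themselves are produced.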

Circularity Check

0 steps flagged

No circularity; abstract presents high-level claim without equations, fits, or self-citations that reduce to inputs.

Full rationale

The abstract describes algorithms that preserve normal data structure and delete anomalous structure via Tensor Train compression, applicable to any tensor network. No derivations, equations, parameters fitted to subsets, or self-citations are present, and the reader's take confirms no detectable reductions. Without explicit steps that equate outputs to inputs by construction (for example, a rank truncation or error metric shown to be self-referential), the description remains non-circular. The full-text placeholder does not alter this, as it exhibits no load-bearing reductions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the central claim rests on an unstated assumption about differential structure preservation during compression.

pith-pipeline@v0.9.0 · 5600 in / 998 out tokens · 24044 ms · 2026-05-23T20:40:49.083020+00:00 · methodology

