
arxiv: 1907.00701 · v1 · pith:QY3NOXFMnew · submitted 2019-06-28 · 💻 cs.LG

Anomaly Subsequence Detection with Dynamic Local Density for Time Series

Pith reviewed 2026-05-25 13:51 UTC · model grok-4.3

classification 💻 cs.LG
keywords anomaly subsequence detection · time series · dynamic local density · Time Split Tree · ensemble learning · outlier detection · trend preservation

The pith

Dynamic local density estimation detects anomaly subsequences in time series more accurately by preserving trend information.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a DLDE model that uses a Time Split Tree to divide time series dynamically for local density estimation in anomaly subsequence detection. This avoids the information loss typical of dimensionality reduction techniques and sidesteps issues like time drift and parameter tuning. Ensemble learning is added to offset randomness introduced by hash functions and variable segment choices. Tests across multiple dataset types show the approach beats prior methods with notably higher accuracy. Readers interested in reliable monitoring of sequential data would see value in a method that keeps trend details intact while improving detection performance.

Core claim

The DLDE model dynamically divides time series using Time Split Tree to estimate local density without losing trend information, and applies ensemble learning to neutralize the randomness from hash functions and segment choices, yielding higher accuracy in anomaly subsequence detection than state-of-the-art methods on varied datasets.

What carries the argument

Dynamic Local Density Estimation (DLDE) with Time Split Tree for dynamic division and ensemble learning to reduce randomness impact.

If this is right

  • Anomaly detection in high-dimensional time series becomes feasible without separate dimensionality reduction steps.
  • Applications in sensor monitoring or financial tracking gain improved detection rates while retaining original sequence trends.
  • Fewer manual parameter adjustments are needed compared with methods that rely on fixed reductions or tuning.
  • Ensemble strategies can be reused to stabilize other segment-based or hash-dependent detectors.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The dynamic partitioning idea could extend to online streaming anomaly detection where data arrives continuously.
  • Similar tree-based division might help in related sequence tasks such as motif discovery or change-point detection.
  • If the ensemble step proves robust, the method may reduce sensitivity to initial random seeds across other anomaly frameworks.

Load-bearing premise

Dynamically dividing time series with the Time Split Tree preserves trend information without loss and ensemble learning fully neutralizes the randomness from hash functions and segment choices.

What would settle it

A controlled test on time series where known trend features are removed or altered, showing the DLDE accuracy no longer exceeds baseline methods.
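One way to set up such a control is sketched below, under the assumption that "removing trend features" can be approximated by subtracting a least-squares linear fit. That is only one simple reading of "trend" and is not taken from the paper; `detrend_linear` is a hypothetical helper name.

```python
def detrend_linear(series):
    """Return residuals after subtracting the least-squares line (assumed notion of 'trend')."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    # Closed-form slope of the ordinary least-squares fit against the index t.
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx if sxx else 0.0
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


# A detector would then be scored on both the raw and the detrended series.
raw = [1.0, 3.0, 5.0, 7.0]
flat = detrend_linear(raw)
```

Running the same detector on `raw` and on `flat` and comparing its accuracy on each would give the kind of contrast the test calls for.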

Figures

Figures reproduced from arXiv: 1907.00701 by Ao Yin, Chunkai Zhang, Yingyang Chen.

Figure 1
Figure 1: (a) The DTW calculation matrix with adjustment window R (green window); C and Q are the two time series. (b) An example of the DTW calculation schematic in all data sets.
3 The Proposed Algorithm. Based on the analysis of dynamic time warping similarity calculation in Section 2, we propose a time series anomaly subsequence detection algorithm, Dynamic Local Density Estimation (DLDE), to di…
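The adjustment window R in the Figure 1 caption limits which cells of the DTW matrix are evaluated. A minimal sketch of windowed DTW follows, assuming a band of half-width `r` around the diagonal and absolute difference as the local cost. Neither choice is confirmed by the caption, and `dtw_window` is a hypothetical name.

```python
import math


def dtw_window(q, c, r):
    """DTW distance between sequences q and c, restricted to cells with |i - j| <= r."""
    n, m = len(q), len(c)
    r = max(r, abs(n - m))  # the band must be wide enough to reach the final cell
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        # Only visit columns inside the window around the diagonal.
        for j in range(max(1, i - r), min(m, i + r) + 1):
            cost = abs(q[i - 1] - c[j - 1])  # assumed local cost
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

With `r = 0` the computation degenerates to a point-by-point comparison along the diagonal; widening `r` lets the alignment absorb small time shifts at higher cost in computation.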
Figure 2
Figure 2: The structure of TSTree. A circle node represents an internal node, and a rectangle node represents a leaf node.
Definition 3 (Hash Function). The data set $Q_{t_1 \sim t_d}$ at $d$ time points can be mapped to $d$ hash tables $HashTable_{t_1 \sim t_d}$ by the hash function (Equation (1)). If two data points have the same hash function value, the two data points are similar. $hash(p) = \left\lfloor \frac{p + r}{w} \right\rfloor$ (1) where $p$ is the time point, $w$ is the …
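Reading Equation (1) as the floor of (p + r) / w, the bucketing step can be sketched as below. The excerpt is cut off before it defines `r` and `w`, so treating `w` as a bucket width and `r` as a random offset drawn from [0, w) is an assumption borrowed from common locality-sensitive hashing practice, as is treating `p` as the numeric value being hashed. Both function names are hypothetical.

```python
import math
import random
from collections import defaultdict


def hash_value(p, r, w):
    """Bucket index floor((p + r) / w); r and w are interpreted as offset and width (assumed)."""
    return math.floor((p + r) / w)


def build_hash_table(values, w, rng):
    """Group indices of values that fall into the same bucket."""
    r = rng.uniform(0.0, w)  # assumption: one random offset per table
    table = defaultdict(list)
    for idx, p in enumerate(values):
        table[hash_value(p, r, w)].append(idx)
    return table


table = build_hash_table([0.1, 10.0, 0.1], w=2.0, rng=random.Random(0))
```

Under this reading, points sharing a bucket are treated as similar, so sparsely populated buckets are candidates for low local density.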
Figure 3
Figure 3: Comparison of various experiments on AUC.
4.3 Parameter analysis. Dynamic window m. In the DLDE algorithm, the dynamic time segment window is randomly divided. To ensure the stability of the algorithm, we use the idea of ensemble learning: the series is randomly divided m times, and the final test result of the algorithm is the average over these runs. In this experiment, the sensitivity o…
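The averaging over m random divisions described in the Figure 3 excerpt can be sketched as follows. The per-run scorer here is a deliberately simple placeholder (deviation from the segment mean), not the paper's density estimate, and the names `random_segments`, `score_once`, and `ensemble_scores` are hypothetical.

```python
import random
from statistics import mean


def random_segments(n, k, rng):
    """Split range(n) into k non-empty contiguous segments at random cut points."""
    cuts = sorted(rng.sample(range(1, n), k - 1))
    bounds = [0] + cuts + [n]
    return [(bounds[i], bounds[i + 1]) for i in range(k)]


def score_once(series, k, rng):
    """Placeholder scorer: absolute deviation of each point from its segment mean."""
    scores = [0.0] * len(series)
    for lo, hi in random_segments(len(series), k, rng):
        mu = mean(series[lo:hi])
        for i in range(lo, hi):
            scores[i] = abs(series[i] - mu)
    return scores


def ensemble_scores(series, k, m, seed=0):
    """Average the per-point scores of m independently randomized divisions."""
    rng = random.Random(seed)
    runs = [score_once(series, k, rng) for _ in range(m)]
    return [mean(col) for col in zip(*runs)]
```

Averaging across runs means no single unlucky cut placement decides the final score, which is the stability argument the excerpt makes for choosing m greater than one.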
Figure 4
Figure 4: The parameter analysis of dynamic window m and hash number h.
Computational time. In order to calculate the consumption time, we selected 8 data sets for testing. We calculate the percentage of calculation time for the five methods in each data set. As can be seen from the …
Figure 5
Figure 5: Comparison results of each algorithm on different length time series data.
Figure 6
Figure 6: The results on ECG data chfdb01 275 and chfdb13 45590.
Original abstract

Anomaly subsequence detection is to detect inconsistent data, which always contains important information, among time series. Due to the high dimensionality of the time series, traditional anomaly detection often requires a large time overhead; furthermore, even if the dimensionality reduction techniques can improve the efficiency, they will lose some information and suffer from time drift and parameter tuning. In this paper, we propose a new anomaly subsequence detection with Dynamic Local Density Estimation (DLDE) to improve the detection effect without losing the trend information by dynamically dividing the time series using Time Split Tree. In order to avoid the impact of the hash function and the randomness of dynamic time segments, ensemble learning is used. Experimental results on different types of data sets verify that the proposed model outperforms the state-of-art methods, and the accuracy has big improvement.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper proposes Dynamic Local Density Estimation (DLDE) for anomaly subsequence detection in time series. It dynamically divides series via a Time Split Tree to preserve trend information (avoiding losses from dimensionality reduction or time drift), applies ensemble learning to mitigate randomness from hash functions and segment choices, and reports that experiments on multiple dataset types show outperformance over state-of-the-art methods with substantial accuracy gains.

Significance. If the preservation of trend information and neutralization of randomness hold under rigorous controls, the method could advance efficient anomaly detection for high-dimensional series by sidestepping common pitfalls of reduction techniques. The combination of dynamic partitioning and ensembling is a reasonable engineering response to the stated problems, but the significance cannot be assessed without the missing quantitative validation of the core assumptions.

major comments (2)
  1. [Abstract] The central empirical claim ('outperforms the state-of-art methods, and the accuracy has big improvement') is stated without any reference to specific datasets, baselines, metrics, error bars, or statistical tests. This is load-bearing for the outperformance assertion and prevents verification of whether gains are attributable to DLDE.
  2. [Abstract] The manuscript asserts that the Time Split Tree 'improve[s] the detection effect without losing the trend information' and that ensemble learning avoids 'the impact of the hash function and the randomness of dynamic time segments,' yet supplies no supporting analysis (e.g., autocorrelation or trend-statistic comparisons pre/post-split, or variance reduction across ensemble members). These properties are load-bearing for attributing any accuracy gains to the proposed procedure rather than partitioning or hashing artifacts.
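The variance-reduction check requested in the second comment could look roughly like the sketch below. It compares the spread of a metric across single randomized runs with the spread across averages of m runs. `toy_metric` is a synthetic stand-in (a noisy value around 0.8), not a DLDE result, and averaging a scalar metric is a simplification of averaging per-point scores before computing the metric.

```python
import random
from statistics import mean, stdev


def toy_metric(seed):
    """Hypothetical stand-in for the AUC of one randomized detector run."""
    return 0.8 + random.Random(seed).gauss(0.0, 0.05)


def spread_single_vs_ensemble(metric_for_seed, m, trials):
    """Return (std of single runs, std of m-run averages) over `trials` repetitions."""
    singles = [metric_for_seed(s) for s in range(trials)]
    # Use disjoint seeds so ensemble members are not reused from the single runs.
    ensembles = [
        mean([metric_for_seed(trials + t * m + i) for i in range(m)])
        for t in range(trials)
    ]
    return stdev(singles), stdev(ensembles)


single_sd, ensemble_sd = spread_single_vs_ensemble(toy_metric, m=10, trials=200)
```

Reporting both numbers per data set, with the real detector in place of `toy_metric`, would show how much of the run-to-run randomness the ensemble actually removes.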

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed review and constructive suggestions. We address the major comments on the abstract below and indicate the revisions we will make.

Point-by-point responses
  1. Referee: The central empirical claim ('outperforms the state-of-art methods, and the accuracy has big improvement') is stated without any reference to specific datasets, baselines, metrics, error bars, or statistical tests. This is load-bearing for the outperformance assertion and prevents verification of whether gains are attributable to DLDE.

    Authors: The abstract is intended as a high-level summary, with full details provided in the experimental section of the manuscript. To improve clarity and verifiability, we will revise the abstract to reference the specific datasets, baselines, and metrics used in the evaluation. revision: yes

  2. Referee: The manuscript asserts that the Time Split Tree 'improve[s] the detection effect without losing the trend information' and that ensemble learning avoids 'the impact of the hash function and the randomness of dynamic time segments,' yet supplies no supporting analysis (e.g., autocorrelation or trend-statistic comparisons pre/post-split, or variance reduction across ensemble members). These properties are load-bearing for attributing any accuracy gains to the proposed procedure rather than partitioning or hashing artifacts.

    Authors: While the overall performance improvements are demonstrated through experiments, we agree that direct supporting analyses for trend preservation and randomness reduction would strengthen the attribution of gains to DLDE. We will incorporate additional analyses, such as trend statistic comparisons and ensemble variance metrics, in the revised manuscript. revision: yes

Circularity Check

0 steps flagged

No circularity; DLDE is an independent algorithmic procedure with no self-referential reductions

full rationale

The paper introduces DLDE as a new procedure that dynamically partitions time series via Time Split Tree and applies ensemble learning to address hash/segment randomness. No equations, parameter fits, or derivations appear that reduce any claimed prediction or result to its own inputs by construction. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The central claims rest on the novelty of the partitioning and ensembling steps rather than tautological re-expression of prior quantities, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Review is abstract-only; ledger entries are inferred from stated claims. The central claim rests on the unverified assumption that the dynamic splitting preserves information and ensemble mitigates randomness.

axioms (2)
  • domain assumption: Dynamic division via Time Split Tree preserves trend information without loss
    Explicitly invoked in the abstract as the reason the method improves detection without the drawbacks of dimensionality reduction.
  • domain assumption: Ensemble learning neutralizes impact of hash function and segment randomness
    Stated directly as the justification for using ensemble to avoid randomness effects.


