Recognition: 2 theorem links
MSTN: A Lightweight and Fast Model for General Time Series Analysis
Pith reviewed 2026-05-17 04:30 UTC · model grok-4.3
The pith
The Multi-scale Temporal Network uses early aggregation of convolutional features, sequence modeling, and self-gated fusion to set new performance marks on time series tasks while staying under one million parameters.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
MSTN is grounded in an Early Temporal Aggregation principle. It integrates a multi-scale convolutional encoder that captures fine-grained local structure, a sequence modeling module that learns long-range dependencies through recurrent or attention-based mechanisms, and a self-gated fusion stage that uses squeeze-excitation and a single dense layer to dynamically reweight and fuse multi-scale representations, enabling flexible modeling of temporal patterns spanning milliseconds to extended horizons.
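For concreteness, a minimal PyTorch sketch of this three-part design follows, reconstructed from the description above alone; every kernel size, channel count, and the choice of a BiLSTM sequence module is an illustrative assumption, not the authors' reference implementation.

```python
# A minimal sketch of an MSTN-style model, assuming parallel 1D convolutions
# for the multi-scale encoder and a BiLSTM for the sequence module.
import torch
import torch.nn as nn


class MultiScaleConvEncoder(nn.Module):
    """Parallel 1D convolutions with different kernel sizes (assumed scales)."""

    def __init__(self, in_channels: int, channels_per_scale: int,
                 kernel_sizes=(3, 5, 9)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv1d(in_channels, channels_per_scale, k, padding=k // 2)
            for k in kernel_sizes
        )

    def forward(self, x):  # x: (batch, channels, time)
        # Early temporal aggregation: concatenate scale-specific features
        # along the channel axis before any sequence modeling.
        return torch.cat([torch.relu(b(x)) for b in self.branches], dim=1)


class MSTNSketch(nn.Module):
    def __init__(self, in_channels=1, channels_per_scale=16, hidden=64):
        super().__init__()
        self.encoder = MultiScaleConvEncoder(in_channels, channels_per_scale)
        enc_dim = 3 * channels_per_scale
        # Sequence module: BiLSTM variant (the paper also allows attention).
        self.seq = nn.LSTM(enc_dim, hidden, batch_first=True,
                           bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)  # task head, e.g. one-step forecast

    def forward(self, x):  # x: (batch, time, channels)
        z = self.encoder(x.transpose(1, 2)).transpose(1, 2)
        out, _ = self.seq(z)
        return self.head(out[:, -1])  # predict from the last time step


model = MSTNSketch()
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")  # well under 1M at these assumed sizes
```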
What carries the argument
The self-gated fusion stage that uses squeeze-excitation and a dense layer to dynamically reweight and combine outputs from the multi-scale convolutional encoder and the sequence module.
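A sketch of just this fusion stage, under the same caveat: the reduction ratio and the placement of the gate are assumptions on my part, with only the squeeze-excitation-plus-dense structure taken from the paper's description.

```python
# Assumed squeeze-excitation gating over concatenated multi-scale features,
# followed by the single dense fusion layer the paper describes.
import torch
import torch.nn as nn


class SelfGatedFusion(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Squeeze-excitation: global pooling -> bottleneck -> channel gates.
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )
        self.fuse = nn.Linear(channels, channels)  # the single dense layer

    def forward(self, x):  # x: (batch, time, channels), scales concatenated
        squeezed = x.mean(dim=1)            # squeeze over the time axis
        weights = self.gate(squeezed)       # per-channel importance in [0, 1]
        return self.fuse(x * weights.unsqueeze(1))  # reweight, then fuse


fusion = SelfGatedFusion(channels=48)
y = fusion(torch.randn(2, 128, 48))
print(y.shape)  # torch.Size([2, 128, 48])
```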
If this is right
- Achieves state-of-the-art results on 33 of 40 datasets across imputation, long-term forecasting, short-term forecasting, classification, and cross-dataset generalization.
- Keeps model size to roughly 278,000 parameters in the BiLSTM variant and under 1 million in the Transformer variant.
- Delivers inference in under one second and often in milliseconds, supporting low-latency deployment.
- Avoids the computational cost of long-context models while still capturing both local fluctuations and slow trends.
Where Pith is reading between the lines
- The same hybrid structure might transfer to other sequential domains such as audio signals or sensor streams where events occur at mismatched time scales.
- Further simplification of the fusion stage could produce even smaller variants suitable for microcontrollers.
- Online or continual learning versions could be tested by feeding streaming data directly into the multi-scale encoder without full retraining.
Load-bearing premise
The specific combination of multi-scale convolutional encoder, sequence module, and self-gated fusion will generalize to new time series distributions without requiring extensive per-dataset hyperparameter retuning or suffering from benchmark overfitting.
What would settle it
Evaluating MSTN on a newly assembled set of time series datasets that contain abrupt high-magnitude events or temporal scale distributions clearly outside the range of the original 40 benchmarks and checking whether the reported accuracy gains disappear.
Original abstract
Real-world time series often exhibit strong non-stationarity, complex nonlinear dynamics, and behavior expressed across multiple temporal scales, from rapid local fluctuations to slow-evolving long-range trends. However, many contemporary architectures impose rigid, fixed-scale structural priors -- such as patch-based tokenization, predefined receptive fields, or frozen backbone encoders -- which can over-regularize temporal dynamics and limit adaptability to abrupt high-magnitude events. To handle this, we introduce the Multi-scale Temporal Network (MSTN), a hybrid neural architecture grounded in an Early Temporal Aggregation principle. MSTN integrates three complementary components: (i) a multi-scale convolutional encoder that captures fine-grained local structure; (ii) a sequence modeling module that learns long-range dependencies through either recurrent or attention-based mechanisms; and (iii) a self-gated fusion stage incorporating squeeze-excitation and a single dense layer to dynamically reweight and fuse multi-scale representations. This design enables MSTN to flexibly model temporal patterns spanning milliseconds to extended horizons, while avoiding the computational burden typically associated with long-context models. Across extensive benchmarks covering imputation, long term forecasting, short term forecasting, classification, and cross-dataset generalization, MSTN achieves state-of-the-art performance, establishing new best results on 33 of 40 datasets, while remaining lightweight ($\sim$278,520 params for MSTN-BiLSTM and $\sim$950,776 $\approx$ 1M for MSTN-Transformer) and suitable for low-latency inference ($<$1 sec, often in milliseconds), resource-constrained deployment.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the Multi-scale Temporal Network (MSTN), a hybrid architecture consisting of a multi-scale convolutional encoder, a sequence modeling module (BiLSTM or Transformer), and a self-gated fusion stage using squeeze-excitation and a dense layer. Grounded in an Early Temporal Aggregation principle, MSTN is designed to capture dynamics across multiple temporal scales in non-stationary time series. The central claim is that MSTN achieves state-of-the-art results on 33 of 40 datasets spanning imputation, long-term forecasting, short-term forecasting, classification, and cross-dataset generalization, while using only ~278k–950k parameters and achieving sub-second inference.
Significance. If the performance claims can be substantiated with fixed hyperparameters, proper statistical controls, and evidence against benchmark overfitting, the work would provide a useful lightweight general-purpose model for time series that avoids the overhead of long-context transformers while adapting to varying scales. The reported model sizes and inference speeds are practically relevant for edge deployment.
major comments (2)
- [Abstract and Experimental Results] The abstract states new best results on 33/40 datasets, but the manuscript supplies no information on baseline implementations, statistical testing, data splits, or ablation controls. This directly affects the soundness of the central empirical claim.
- [Experimental Evaluation] It is not reported whether a single global hyperparameter set (number of convolutional scales, hidden dimensions, fusion parameters, learning rate) was used across all 40 datasets or whether per-dataset tuning occurred. This distinction is load-bearing for the generalization argument, because per-dataset optimization could account for the reported win rate without demonstrating architectural superiority on unseen distributions.
minor comments (2)
- [Abstract] Parameter counts are written as ~278,520 and ~950,776 ≈ 1M; adopt consistent scientific notation or round to the nearest 10k for readability.
- [Introduction] The phrase 'Early Temporal Aggregation principle' is used without a formal definition or explicit contrast to standard multi-scale convolution; a short clarifying paragraph would improve accessibility.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback on our manuscript introducing MSTN. The comments regarding experimental transparency are important, and we address each major point below with clarifications and commitments to revision.
Point-by-point responses
Referee: [Abstract and Experimental Results] The abstract states new best results on 33/40 datasets, but the manuscript supplies no information on baseline implementations, statistical testing, data splits, or ablation controls. This directly affects the soundness of the central empirical claim.
Authors: We acknowledge that the current version of the manuscript does not provide sufficient detail on these aspects of the experimental protocol. In the revised manuscript, we will add a comprehensive Experimental Setup subsection that specifies: the sources and exact configurations used for all baseline models; the statistical testing procedures (including multiple random seeds, reporting of means and standard deviations, and significance tests such as paired t-tests); the precise train/validation/test splits for each of the 40 datasets; and additional ablation experiments isolating the contributions of the multi-scale convolutional encoder, sequence modeling module, and self-gated fusion stage. These additions will directly substantiate the central empirical claims. revision: yes
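A minimal sketch of the statistical protocol the authors commit to here: several random seeds, mean and standard deviation reporting, and a paired t-test against a baseline. The per-seed error values below are placeholders for illustration, not results from the paper.

```python
# Multi-seed comparison with a paired t-test, as promised in the rebuttal.
import numpy as np
from scipy import stats

# Per-seed test errors (e.g., MSE) for MSTN and one baseline; illustrative only.
mstn_errors = np.array([0.311, 0.308, 0.315, 0.309, 0.312])
baseline_errors = np.array([0.334, 0.329, 0.338, 0.331, 0.336])

print(f"MSTN:     {mstn_errors.mean():.3f} +/- {mstn_errors.std(ddof=1):.3f}")
print(f"baseline: {baseline_errors.mean():.3f} +/- {baseline_errors.std(ddof=1):.3f}")

# Paired test: seeds are matched across models, so compare per-seed differences.
t_stat, p_value = stats.ttest_rel(mstn_errors, baseline_errors)
print(f"paired t = {t_stat:.2f}, p = {p_value:.4f}")
```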
Referee: [Experimental Evaluation] It is not reported whether a single global hyperparameter set (number of convolutional scales, hidden dimensions, fusion parameters, learning rate) was used across all 40 datasets or whether per-dataset tuning occurred. This distinction is load-bearing for the generalization argument, because per-dataset optimization could account for the reported win rate without demonstrating architectural superiority on unseen distributions.
Authors: The manuscript does not explicitly document this distinction. To clarify, our experiments used a fixed global hyperparameter configuration for the core architectural elements across all 40 datasets: three convolutional scales, hidden dimension of 64 for the BiLSTM variant and 128 for the Transformer variant, and standardized fusion parameters in the self-gated stage. The learning rate received only modest, category-level adjustments (e.g., 1e-3 for forecasting tasks) solely to ensure convergence stability, without any per-dataset grid search or extensive optimization. This protocol was deliberately chosen to support the generalization claim. We will revise the paper to include an explicit hyperparameter table and a statement confirming the limited, non-per-dataset nature of any adjustments. revision: yes
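Assuming the revision takes the shape described, the fixed global configuration could be captured in a single small config object; the field names below are mine, and only the values come from the response above.

```python
# A sketch of the stated global hyperparameter set. Anything beyond these
# fields (batch size, optimizer details) is deliberately left out.
from dataclasses import dataclass


@dataclass(frozen=True)
class MSTNConfig:
    num_scales: int = 3            # convolutional scales, fixed across datasets
    hidden_bilstm: int = 64        # BiLSTM variant hidden dimension
    hidden_transformer: int = 128  # Transformer variant hidden dimension
    lr_forecasting: float = 1e-3   # category-level learning rate only


config = MSTNConfig()
print(config)
```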
Circularity Check
No circularity in derivation or empirical claims
Full rationale
The paper proposes a hybrid neural architecture (multi-scale conv encoder + sequence module + self-gated fusion) motivated by addressing fixed-scale priors in prior models, then reports empirical results on public benchmarks. No equations, predictions, or uniqueness theorems are present that reduce by construction to inputs, fitted parameters, or self-citations. Performance numbers are standard train/test evaluations on external datasets and do not constitute a derivation chain. This is a self-contained empirical contribution against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (2)
- number of convolutional scales
- hidden dimension and layer counts
axioms (1)
- Domain assumption: Early Temporal Aggregation enables flexible modeling of patterns from milliseconds to long horizons without over-regularization.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
Relation between the paper passage and the cited Recognition theorem is unclear. Linked passage:
MSTN integrates three complementary components: (i) a multi-scale convolutional encoder that captures fine-grained local structure; (ii) a sequence modeling module that learns long-range dependencies through either recurrent or attention-based mechanisms; and (iii) a self-gated fusion stage incorporating squeeze-excitation and a single dense layer to dynamically reweight and fuse multi-scale representations.
- IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · tag: unclear
Relation between the paper passage and the cited Recognition theorem is unclear. Linked passage:
This design enables MSTN to flexibly model temporal patterns spanning milliseconds to extended horizons, while avoiding the computational burden typically associated with long-context models.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] M. A. Morid, O. R. L. Sheng, J. Dunbar, Time series prediction using deep learning methods in healthcare, 14 (1) (Jan. 2023). doi:10.1145/3531326.
- [2] A. Kadiyala, A. Kumar, Multivariate time series models for prediction of air quality inside a public transportation bus using available software, Environmental Progress & Sustainable Energy 33 (2) (2014) 337–341.
- [3] A. Gruca, F. Serva, L. Lliso, P. Rípodas, X. Calbet, P. Herruzo, J. Pihrt, R. Raevskyi, P. Šimánek, M. Choma, et al., Weather4cast at NeurIPS 2022: Super-resolution rain movie prediction under spatio-temporal shifts, in: NeurIPS 2022 Competition Track, PMLR, 2022, pp. 292–313.
- [4] E. G. Kardakos, M. C. Alexiadis, S. I. Vagropoulos, C. K. Simoglou, P. N. Biskas, A. G. Bakirtzis, Application of time series and artificial neural network models in short-term forecasting of PV power generation, in: 2013 48th International Universities' Power Engineering Conference (UPEC), 2013, pp. 1–6. doi:10.1109/UPEC.2013.6714975.
- [5]
- [6]
- [7] H. Wu, T. Hu, Y. Liu, H. Zhou, J. Wang, M. Long, TimesNet: Temporal 2D-variation modeling for general time series analysis (2023). arXiv:2210.02186.
- [8] B. Lim, S. Zohren, Time-series forecasting with deep learning: a survey, Philosophical Transactions of the Royal Society A 379 (2194) (2021) 20200209. doi:10.1098/rsta.2020.0209.
- [9] Y. Nie, N. H. Nguyen, P. Sinthong, J. Kalagnanam, A time series is worth 64 words: Long-term forecasting with transformers (2023). arXiv:2211.14730.
- [10]
- [11] J.-Y. Franceschi, A. Dieuleveut, M. Jaggi, Unsupervised scalable representation learning for multivariate time series, in: Advances in Neural Information Processing Systems (NeurIPS), 2019, pp. 4652–4663.
- [12] S. Bai, J. Z. Kolter, V. Koltun, An empirical evaluation of generic convolutional and recurrent networks for sequence modeling, CoRR abs/1803.01271 (2018). arXiv:1803.01271.
- [13] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (8) (1997) 1735–1780. doi:10.1162/neco.1997.9.8.1735.
- [14] G. Lai, W.-C. Chang, Y. Yang, H. Liu, Modeling long- and short-term temporal patterns with deep neural networks (2018). arXiv:1703.07015.
- [15] Y. He, J. Zhao, Temporal convolutional networks for anomaly detection in time series, Journal of Physics: Conference Series 1213 (4) (2019) 042050. doi:10.1088/1742-6596/1213/4/042050.
- [16] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, I. Polosukhin, Attention is all you need (2023). arXiv:1706.03762.
- [17] H. Zhou, S. Zhang, J. Peng, S. Zhang, J. Li, H. Xiong, W. Zhang, Informer: Beyond efficient transformer for long sequence time-series forecasting, in: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35, 2021, pp. 11106–11115.
- [18]
- [19]
- [20] C. Chang, W.-Y. Wang, W.-C. Peng, T.-F. Chen, LLM4TS: Aligning pre-trained LLMs as data-efficient time-series forecasters, ACM Trans. Intell. Syst. Technol. 16 (3) (Apr. 2025). doi:10.1145/3719207.
- [21] M. Jin, S. Wang, L. Ma, Z. Chu, J. Y. Zhang, X. Shi, P.-Y. Chen, Y. Liang, Y.-F. Li, S. Pan, Q. Wen, Time-LLM: Time series forecasting by reprogramming large language models (2024). arXiv:2310.01728.
- [22]
- [23]
- [24] W. Han, T. Zhu, L. Chen, H. Ning, Y. Luo, Y. Wan, MCformer: Multivariate time series forecasting with mixed-channels transformer, IEEE Internet of Things Journal 11 (17) (2024) 28320–28329. doi:10.1109/JIOT.2024.3401697.
- [25] M. Alharthi, K. Mahmood, S. Patel, A. Mahmood, EMTSF: Extraordinary mixture of SOTA models for time series forecasting (2025). arXiv:2510.23396.
- [26]
- [27] Y. Liu, T. Hu, H. Zhang, H. Wu, S. Wang, L. Ma, M. Long, iTransformer: Inverted transformers are effective for time series forecasting (2024). arXiv:2310.06625.
- [28]
- [29] M. Rodegast, et al., Motorcycle collision dataset (2024). doi:10.18419/darus-3301. URL https://darus.uni-stuttgart.de/dataset.xhtml?persistentId=doi:10.18419/darus-3301
- [30] A. Trindade, ElectricityLoadDiagrams20112014, UCI Machine Learning Repository (2015). doi:10.24432/C58C86.
- [31] O. Köllé, Wetterstation. Weather., technical report and dataset, Max-Planck-Institut für Biogeochemie (BGC Jena), Germany (2025). Data freely available at https://www.bgc-jena.mpg.de/wetter/
- [32] A. Bagnall, H. A. Dau, J. Lines, M. Flynn, J. Large, A. Bostrom, P. Southam, E. Keogh, The UEA multivariate time series classification archive, 2018 (2018). arXiv:1811.00075.
- [33] A. Boubezoul, F. Dufour, S. Bouaziz, S. Espié, Corrigendum to "Dataset on powered two wheelers fall and critical events detection", Data in Brief 30 (2020) 105577. doi:10.1016/j.dib.2020.105577.
- [34] J. Reyes-Ortiz, D. Anguita, A. Ghio, L. Oneto, X. Parra, Human Activity Recognition Using Smartphones, UCI Machine Learning Repository (2013). doi:10.24432/C54S4K.
- [35] A. Reiss, PAMAP2 Physical Activity Monitoring, UCI Machine Learning Repository (2012). doi:10.24432/C5NW2H.
- [36] O. I. Dissanayake, S. E. McPherson, J. Allyndrée, E. Kennedy, P. Cunningham, L. Riaboff, ActBeCalf: Accelerometer-based multivariate time-series dataset for calf behavior classification, Data in Brief 60 (2025) 111462. doi:10.1016/j.dib.2025.111462.
- [37] N. Davari, B. Veloso, R. Ribeiro, J. Gama, MetroPT-3 Dataset, UCI Machine Learning Repository (2021). doi:10.24432/C5VW3R.
- [38]
- [39] P. Rodegast, S. Maier, J. Kneifl, J. Fehr, On using machine learning algorithms for motorcycle collision detection, Discover Applied Sciences 6 (6) (2024) 326.
- [40] F. Elwy, R. Aburukba, A. R. Al-Ali, A. A. Nabulsi, A. Tarek, A. Ayub, M. Elsayeh, Data-driven safe deliveries: The synergy of IoT and machine learning in shared mobility, Future Internet 15 (10) (2023).
- [41] D. P. Ismi, S. Panchoo, M. Murinto, K-means clustering based filter feature selection on high dimensional data, International Journal of Advances in Intelligent Informatics 2 (2016) 38–45. URL https://api.semanticscholar.org/CorpusID:43897444
- [42] A. Reiss, D. Stricker, Introducing a new benchmarked dataset for activity monitoring, in: 2012 16th International Symposium on Wearable Computers, 2012, pp. 108–109. doi:10.1109/ISWC.2012.13.
- [43] N. Davari, B. Veloso, R. P. Ribeiro, P. M. Pereira, J. Gama, Predictive maintenance based on anomaly detection using deep learning for air production unit in the railway industry, in: 2021 IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA), 2021, pp. 1–10. doi:10.1109/DSAA53316.2021.9564181.
- [44]
- [45] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization (2017). arXiv:1412.6980.