
arxiv: 2506.08412 · v1 · submitted 2025-06-10 · 💻 cs.LG

Learning to Hear Broken Motors: Signature-Guided Data Augmentation for Induction-Motor Diagnostics

Pith reviewed 2026-05-19 11:14 UTC · model grok-4.3

classification 💻 cs.LG
keywords induction motor diagnostics · data augmentation · motor current signature analysis · fault detection · machine learning · frequency domain synthesis · anomaly generation

The pith

Signature-guided synthesis creates realistic motor faults in the frequency domain to train better diagnostic models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Signature-Guided Data Augmentation, a method that starts with healthy motor current signals and adds synthetic anomalies directly in the frequency domain. These anomalies are shaped by the known patterns from Motor Current Signature Analysis so they match the physics of real faults like broken rotor bars or bearing issues. The resulting dataset trains supervised machine learning models without needing large collections of actual faulty motors or full physics simulations. A sympathetic reader cares because real industrial motors rarely provide enough labeled fault examples, making pure data-driven approaches unreliable in practice. If the method holds, it lets diagnostic systems combine the physical grounding of signature analysis with the flexibility of modern ML classifiers.

Core claim

An unsupervised framework called Signature-Guided Data Augmentation synthesizes physically plausible faults directly in the frequency domain of healthy current signals, guided by Motor Current Signature Analysis, thereby enabling hybrid supervised ML models that achieve superior diagnostic accuracy and reliability for three-phase induction motors without computationally intensive simulations.

What carries the argument

Signature-Guided Data Augmentation (SGDA), which injects frequency-domain anomalies into healthy signals according to motor physics signatures to generate training examples.
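
As a rough illustration of what injecting frequency-domain anomalies into a healthy signal could look like, the sketch below builds a synthetic healthy current, transforms it, adds a pair of sidebands around the supply frequency, and transforms back. Everything specific here is an assumption made for illustration and is not taken from the paper: the helper names (`dft`, `idft`, `inject_sidebands`), the 4 Hz sideband offset, and the 5% relative amplitude. A naive O(N²) transform is used only to keep the example dependency-free; a real pipeline would more plausibly use an FFT library.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT library)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Inverse of dft(); returns the real part of the reconstructed signal."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def inject_sidebands(signal, fs, f_supply, offset_hz, rel_amp):
    """Add sidebands at f_supply +/- offset_hz, scaled relative to the supply peak.

    Hypothetical helper: the offset and amplitude rules are illustrative
    assumptions, not the paper's actual perturbation rules.
    Assumes every target bin k satisfies 0 < k < n / 2.
    """
    n = len(signal)
    spec = dft(signal)
    k0 = round(f_supply * n / fs)        # bin index of the supply frequency
    peak = abs(spec[k0])
    for f in (f_supply - offset_hz, f_supply + offset_hz):
        k = round(f * n / fs)
        spec[k] += rel_amp * peak        # positive-frequency bin
        spec[n - k] += rel_amp * peak    # mirror bin keeps the output real
    return idft(spec)


fs, n, f_supply = 256, 256, 50           # 1 s of data, so bins are 1 Hz apart
healthy = [math.sin(2 * math.pi * f_supply * t / fs) for t in range(n)]
faulty = inject_sidebands(healthy, fs, f_supply, offset_hz=4, rel_amp=0.05)

# Check the sideband-to-supply ratio in the spectrum of the augmented signal.
mag = [abs(c) for c in dft(faulty)]
print(round(mag[46] / mag[50], 3), round(mag[54] / mag[50], 3))
```

Writing to both bin `k` and its mirror `n - k` preserves the conjugate symmetry of the spectrum, which is why the inverse transform yields a real-valued time signal that can be fed to a classifier like any recorded current.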

If this is right

  • Diagnostic models trained this way generalize across varying motor loads and speeds without additional real-fault data.
  • The approach reduces reliance on expensive field data collection or time-domain simulations for building training sets.
  • Hybrid systems gain both the interpretability of signature analysis and the pattern-recognition power of supervised classifiers.
  • Industrial applications become feasible because the method works from readily available healthy signals.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same frequency-domain injection idea might transfer to other vibration-based diagnostics for pumps or gearboxes.
  • Online systems could periodically refresh the synthetic fault set using the latest healthy baseline to track gradual degradation.
  • If the signatures prove robust, the technique could lower the barrier for deploying ML diagnostics in smaller facilities that lack fault archives.

Load-bearing premise

The frequency-domain anomalies created from healthy signals match the patterns that appear when real motors develop faults in the field.

What would settle it

Train a model on SGDA-augmented healthy data, then test it on a held-out set of real motor currents recorded from motors with documented faults and check whether classification accuracy matches or exceeds models trained directly on those real faulty recordings.

Figures

Figures reproduced from arXiv: 2506.08412 by Aleksandr Khizhik, Artem Ryzhikov, Denis Derkach, Saraa Ali, Stepan Svirin.

Figure 1: Overview of the SGDA framework. During training (left), only current signals from healthy motors are …
Figure 2: FFT spectra of motor current signals. (a) Normal condition with only base harmonics. (b) ITSC fault …
Figure 3: Illustration of SGDA augmentation. Left: original FFT spectrum of a healthy signal. Middle: augmented …
Figure 4: Binary classification performance with and without majority voting across load and phase. Majority voting …
Figure 5: Confusion matrices for ResNet18: (left) binary classification, (right) multiclass classification. Results …
Figure 6: F1 mean ± std for full training across all phases and load levels. Experiment 2: Training on Phase 1 at 100% Load Only. Here, the classifier was trained using a highly constrained configuration: Phase 1 at 100% load. Testing, however, was performed across all other loads and phases to evaluate the model’s generalization under minimal training diversity. As shown in …
Figure 7: Signal-level normalized confusion matrix for the multiclass task under full training. Misclassification occurs …
Figure 8: F1 mean ± std for limited training on one phase and one load.
Figure 9: Signal-level normalized confusion matrix for the multiclass task trained only on Phase 1 @ 100% load.
Figure 10: Normalized confusion matrices for segment-level classification on the second motor. Left: Binary classifi…
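
The captions distinguish segment-level from signal-level results and mention majority voting. One plausible reading, sketched below, is that each recording is split into segments, each segment is classified separately, and the recording takes the most common segment label. This reading, the function name `majority_vote`, and the example labels are all assumptions for illustration; the captions do not spell out the paper's voting rule.

```python
from collections import Counter


def majority_vote(segment_labels):
    """Return the most frequent label among a recording's segment predictions.

    On a tie, the label seen first wins; that is an arbitrary choice for
    this sketch, not a rule taken from the paper.
    """
    if not segment_labels:
        raise ValueError("need at least one segment prediction")
    return Counter(segment_labels).most_common(1)[0][0]


# Hypothetical segment-level predictions for two recordings.
predictions = {
    "recording_a": ["healthy", "fault", "healthy", "healthy"],
    "recording_b": ["fault", "fault", "healthy"],
}
signal_level = {name: majority_vote(p) for name, p in predictions.items()}
print(signal_level)
```

Voting trades time resolution for robustness: an isolated misclassified segment no longer flips the decision for the whole recording.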
Original abstract

The application of machine learning (ML) algorithms in the intelligent diagnosis of three-phase engines has the potential to significantly enhance diagnostic performance and accuracy. Traditional methods largely rely on signature analysis, which, despite being a standard practice, can benefit from the integration of advanced ML techniques. In our study, we innovate by combining ML algorithms with a novel unsupervised anomaly generation methodology that takes into account the engine physics model. We propose Signature-Guided Data Augmentation (SGDA), an unsupervised framework that synthesizes physically plausible faults directly in the frequency domain of healthy current signals. Guided by Motor Current Signature Analysis, SGDA creates diverse and realistic anomalies without resorting to computationally intensive simulations. This hybrid approach leverages the strengths of both supervised ML and unsupervised signature analysis, achieving superior diagnostic accuracy and reliability along with wide industrial application. The findings highlight the potential of our approach to contribute significantly to the field of engine diagnostics, offering a robust and efficient solution for real-world applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes Signature-Guided Data Augmentation (SGDA), an unsupervised framework that synthesizes physically plausible faults by perturbing frequency components of healthy induction-motor current signals according to Motor Current Signature Analysis (MCSA) rules. The augmented data is used to train supervised ML classifiers for fault diagnosis, with the central claim that this hybrid physics-ML approach yields superior diagnostic accuracy, reliability, and broad industrial applicability compared to traditional signature analysis or purely data-driven methods.

Significance. If the synthetic anomalies prove representative of real faults, the method could mitigate data scarcity in industrial motor diagnostics by generating diverse, physics-constrained training examples without expensive simulations, thereby improving generalization of ML models to field conditions. The explicit use of MCSA domain knowledge to guide augmentation is a clear strength that distinguishes it from generic data-augmentation techniques.

major comments (2)
  1. [Abstract] Abstract: the assertion of 'superior diagnostic accuracy and reliability' is unsupported by any reported metrics, baselines, cross-validation protocol, or real-fault test-set description, so the central performance claim cannot be assessed from the manuscript as written.
  2. [Method] SGDA method description (likely §3 or §4): the frequency-domain perturbation rules are defined from MCSA physics, yet no quantitative distributional comparison (e.g., Wasserstein distance on selected harmonic bins, sideband amplitudes, or phase statistics) between the generated anomalies and actual field-collected fault signatures is provided; without this, the generalization claim that models trained on SGDA-augmented data will perform on real motors rests on an unverified assumption.
minor comments (2)
  1. [Method] Add explicit parameter values or pseudocode for the frequency perturbation step to support reproducibility.
  2. [Results] Include side-by-side spectral plots of healthy, SGDA-synthetic, and real-fault signals with clear labeling of the perturbed components.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments highlight important areas for improving clarity and validation. We respond to each major comment below and indicate the revisions we will make.

point-by-point responses
  1. Referee: [Abstract] Abstract: the assertion of 'superior diagnostic accuracy and reliability' is unsupported by any reported metrics, baselines, cross-validation protocol, or real-fault test-set description, so the central performance claim cannot be assessed from the manuscript as written.

    Authors: We agree that the abstract states the performance claim without sufficient supporting detail. The body of the manuscript reports accuracy improvements from SGDA-augmented classifiers versus baselines, using repeated k-fold cross-validation on a combination of augmented healthy signals and held-out real fault recordings. To address the concern, we will revise the abstract to include concise references to the key metrics, baseline comparisons, and evaluation protocol.
    revision: yes

  2. Referee: [Method] SGDA method description (likely §3 or §4): the frequency-domain perturbation rules are defined from MCSA physics, yet no quantitative distributional comparison (e.g., Wasserstein distance on selected harmonic bins, sideband amplitudes, or phase statistics) between the generated anomalies and actual field-collected fault signatures is provided; without this, the generalization claim that models trained on SGDA-augmented data will perform on real motors rests on an unverified assumption.

    Authors: The referee correctly notes the absence of quantitative distributional metrics. Our current validation relies on qualitative spectral alignment with MCSA-predicted fault harmonics and sidebands, together with downstream classifier performance on real test data. We will add a quantitative comparison section that reports Wasserstein distances and statistics on harmonic amplitudes and phase distributions between SGDA-generated signatures and field-collected fault examples.
    revision: yes
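
The distributional comparison discussed in the second point could be sketched as follows for a single feature, such as the amplitude of one sideband. For two one-dimensional samples of equal size, the Wasserstein-1 distance reduces to the mean absolute difference between the sorted values. The function name `wasserstein_1d` and the amplitude values are made up for illustration; for unequal sample sizes or weighted samples, `scipy.stats.wasserstein_distance` is the usual tool.

```python
def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size empirical samples.

    In one dimension with equal sample sizes, the optimal transport plan
    pairs sorted values, so the distance is the mean absolute difference.
    """
    if len(a) != len(b) or not a:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


# Made-up sideband amplitudes in dB relative to the supply peak.
synthetic_db = [-42.0, -40.5, -38.0, -41.0]
real_db = [-39.0, -41.5, -37.0, -40.0]
print(wasserstein_1d(synthetic_db, real_db))  # 1.0
```

A distance near zero on each monitored harmonic would support the premise that synthetic and real fault signatures are similarly distributed; a large distance on any one of them points to where the perturbation rules diverge from field data.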

Circularity Check

0 steps flagged

No circularity: augmentation guided by external MCSA domain knowledge

full rationale

The paper defines SGDA as synthesizing frequency-domain anomalies in healthy signals using established Motor Current Signature Analysis physics rules. This is an input from external domain knowledge rather than a self-referential definition or a fitted parameter renamed as a prediction. No equations or steps in the abstract reduce the claimed diagnostic gains to quantities defined by the method's own outputs. The hybrid supervised-unsupervised claim rests on empirical evaluation against real faults, not on internal consistency alone. This is the common case of a self-contained method evaluated against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that frequency signatures derived from the motor physics model can be used to synthesize representative faults without further empirical validation.

axioms (1)
  • domain assumption: Motor Current Signature Analysis supplies reliable frequency patterns that can be added to healthy signals to produce physically plausible faults.
    Invoked when the method states that SGDA is guided by Motor Current Signature Analysis to create realistic anomalies.

pith-pipeline@v0.9.0 · 5713 in / 1223 out tokens · 51326 ms · 2026-05-19T11:14:09.755251+00:00 · methodology



Appendix A. MCSA-based Fault Frequency Extraction Formulas

  • Rotor Bar Defect Frequencies: f = f1 (1 ± 2ns), wh...
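
To make the rotor bar expression f = f1 (1 ± 2ns) concrete, the sketch below evaluates it for a few values of n. The source text is cut off before the symbols are defined, so the meanings used here are assumptions based on the usual MCSA convention: f1 is the supply frequency, s is the per-unit slip, and n is a positive integer. The function name `rotor_bar_sidebands` and the example operating point (50 Hz, 3% slip) are likewise illustrative.

```python
def rotor_bar_sidebands(f1, slip, max_n=2):
    """Evaluate f = f1 * (1 +/- 2*n*s) for n = 1..max_n.

    Assumed symbol meanings (the source is truncated before defining them):
    f1 = supply frequency in Hz, s = per-unit slip, n = positive integer.
    Returns a list of (n, lower_sideband_hz, upper_sideband_hz) tuples.
    """
    return [(n, f1 * (1 - 2 * n * slip), f1 * (1 + 2 * n * slip))
            for n in range(1, max_n + 1)]


for n, lower, upper in rotor_bar_sidebands(f1=50.0, slip=0.03):
    print(f"n={n}: {lower:.1f} Hz, {upper:.1f} Hz")
# n=1: 47.0 Hz, 53.0 Hz
# n=2: 44.0 Hz, 56.0 Hz
```

Because the sidebands scale with slip, their positions move with load, which is one reason a fixed-frequency template would not generalize across operating conditions.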