Deep Fault Diagnosis for Rotating Machinery with Scarce Labeled Samples

Jing Tian; Jing Zhang; Tao Wen; Xiaobin Xu; Xiaohui Yang; Yong Rao

arxiv: 1907.09411 · v1 · pith:DFCGVCNUnew · submitted 2019-07-13 · 📡 eess.SP

Pith reviewed 2026-05-24 22:07 UTC · model grok-4.3

classification 📡 eess.SP

keywords fault diagnosis, rotating machinery, scarce labeled samples, support vector machine, convolutional neural network, pseudo-labeling, spectrogram, vibration signal

The pith

SVM models can label extra samples from spectrogram features to let a 2D CNN diagnose rotating machinery faults more accurately when labeled data is scarce.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes DFD to handle fault diagnosis in rotating machinery when only scarce labeled vibration samples exist. It computes STFT spectrograms, builds a feature pool, trains multiple SVMs on different feature subsets, and selects the best ones via validation performance. Those SVMs then assign pseudo-labels to unlabeled samples, which are merged with the original labels to create an augmented training set. A 2D CNN is finally trained on this set to learn improved features and a classifier. A sympathetic reader would care because early fault detection prevents costly breakdowns, yet collecting large labeled datasets is expensive in industrial settings.
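The first phase, turning a raw vibration signal into an STFT spectrogram, can be sketched as follows. This is a minimal, dependency-free illustration rather than the authors' implementation: the frame length, hop size, Hann window, sampling rate, and toy signal are all assumptions, and the function names are hypothetical. In practice a library routine such as `scipy.signal.stft` would normally replace the naive DFT used here.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitude(signal, frame_len=64, hop=32):
    """Return a magnitude spectrogram as rows[frequency_bin][frame]."""
    window = hann(frame_len)
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + i] * window[i] for i in range(frame_len)]
        # Naive DFT of one windowed frame (O(n^2); acceptable for a sketch).
        spectrum = [
            abs(sum(seg[t] * cmath.exp(-2j * math.pi * k * t / frame_len)
                    for t in range(frame_len)))
            for k in range(n_bins)
        ]
        frames.append(spectrum)
    # Transpose so the first axis is frequency and the second is time.
    return [list(col) for col in zip(*frames)]


# Toy "vibration" signal: a 50 Hz tone sampled at 800 Hz (illustrative values).
fs = 800
signal = [math.sin(2 * math.pi * 50 * n / fs) for n in range(512)]
spec = stft_magnitude(signal)

# The strongest frequency bin should sit at 50 Hz / (fs / frame_len) = bin 4.
peak_bin = max(range(len(spec)), key=lambda k: sum(spec[k]))
```

With these settings each bin spans 12.5 Hz, so the 50 Hz tone lands exactly on bin 4; a real bearing signal would spread energy across many bins and frames, which is what makes the 2D representation useful as CNN input.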

Core claim

DFD works in three phases: spectrograms of raw signals yield a pool of time-frequency features; candidate SVMs trained on feature combinations with scarce labels are ranked on a validation set and the top models predict labels for unlabeled data; the resulting augmented training set trains a 2D CNN that outperforms both the original SVMs and a vanilla CNN trained only on the scarce labels.
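The second phase (train candidate models on feature combinations, rank them on a validation set, pseudo-label the unlabeled pool, and merge into an augmented training set) might look roughly like the sketch below. Several things here are assumptions and not taken from the paper: a nearest-centroid classifier stands in for the SVM so the example needs no third-party library (a real pipeline could use `sklearn.svm.SVC`), only the single best candidate is kept whereas the paper selects several top models, and all names and toy values are hypothetical.

```python
from itertools import combinations


class NearestCentroid:
    """Stand-in classifier (NOT an SVM); keeps the sketch dependency-free."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, v in enumerate(row):
                acc[i] += v
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
        return self

    def predict(self, X):
        def nearest(row):
            return min(self.centroids,
                       key=lambda c: sum((a - b) ** 2
                                         for a, b in zip(row, self.centroids[c])))
        return [nearest(row) for row in X]


def select(X, cols):
    """Keep only the feature columns listed in cols."""
    return [[row[c] for c in cols] for row in X]


def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def build_augmented_set(X_lab, y_lab, X_val, y_val, X_unlab, max_subset=2):
    """Pick the best feature subset on validation data, then pseudo-label X_unlab."""
    n_feat = len(X_lab[0])
    candidates = []
    for k in range(1, max_subset + 1):
        for cols in combinations(range(n_feat), k):
            model = NearestCentroid().fit(select(X_lab, cols), y_lab)
            score = accuracy(model.predict(select(X_val, cols)), y_val)
            candidates.append((score, cols, model))
    best_score, best_cols, best_model = max(candidates, key=lambda c: c[0])
    pseudo = best_model.predict(select(X_unlab, best_cols))
    # Augmented training set: scarce true labels plus pseudo-labels.
    return best_cols, best_score, X_lab + X_unlab, y_lab + pseudo


# Toy data: feature 0 separates the classes, feature 1 is misleading.
X_lab = [[0.0, 5.0], [0.2, 1.0], [1.0, 4.0], [1.2, 0.0]]
y_lab = ["normal", "normal", "fault", "fault"]
X_val = [[0.1, 0.0], [1.1, 5.0]]
y_val = ["normal", "fault"]
X_unlab = [[0.05, 3.0], [0.9, 2.0], [1.3, 4.5]]

best_cols, best_score, ats_X, ats_y = build_augmented_set(
    X_lab, y_lab, X_val, y_val, X_unlab)
```

On this toy data the subset containing only feature 0 wins the validation ranking, which mirrors the idea that validation performance picks out the most discriminative features before any pseudo-label is issued.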

What carries the argument

The augmented training set (ATS) formed by combining scarce labeled samples with pseudo-label predictions from the highest-performing SVM models.

If this is right

The CNN learns more discriminative features than either the SVMs or a CNN trained solely on scarce labels.
Diagnostic performance exceeds that of the selected SVM models alone.
The overall pipeline remains computationally efficient enough for real-time monitoring.
The method transfers diagnostic expertise encoded in hand-crafted features into the deep network without requiring additional manual labeling.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same pseudo-labeling step could be applied to other vibration or sensor-based classification problems where domain-specific features are already well studied.
If SVM predictions vary across runs, adding a confidence threshold or ensemble voting on the pseudo-labels might reduce noise in the augmented set.
Extending the feature pool with additional signal-processing techniques could increase the chance of selecting even stronger SVM teachers.
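The ensemble-voting idea in the second extension above could be prototyped along these lines. It is a sketch of one possible filter, not something the paper describes: the agreement threshold, the function name, and the fault class names are all hypothetical.

```python
from collections import Counter


def vote_pseudo_labels(predictions, min_agreement=0.6):
    """Majority-vote pseudo-labels across models.

    predictions: list of per-model label lists, all of equal length.
    Returns (sample_index, label) pairs for samples whose most common label
    reaches the required share of votes; other samples stay unlabeled.
    """
    n_models = len(predictions)
    kept = []
    for idx, votes in enumerate(zip(*predictions)):
        label, count = Counter(votes).most_common(1)[0]
        if count / n_models >= min_agreement:
            kept.append((idx, label))
    return kept


# Predictions from three hypothetical selected models on four unlabeled samples.
model_a = ["inner", "outer", "normal", "inner"]
model_b = ["inner", "outer", "outer", "ball"]
model_c = ["inner", "normal", "outer", "normal"]

kept = vote_pseudo_labels([model_a, model_b, model_c])
# Sample 3 receives three different votes, so it is excluded from the augmented set.
```

Dropping low-agreement samples trades a smaller augmented set for cleaner pseudo-labels; whether that trade helps the CNN would have to be checked empirically.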

Load-bearing premise

The predictions from the selected SVM models on unlabeled samples are accurate enough to serve as reliable pseudo-labels that improve CNN training rather than introduce harmful errors.

What would settle it

If a CNN trained on the augmented training set shows lower test accuracy than a vanilla CNN trained only on the original scarce labeled samples, the central claim does not hold.

Figures

Figures reproduced from arXiv: 1907.09411 by Jing Tian, Jing Zhang, Tao Wen, Xiaobin Xu, Xiaohui Yang, Yong Rao.

**Figure 2.** (a) The feature candidate pool F (b) The pretrained model on F (c) The model candidate pool

**Figure 4.** (a) Exemplar waveforms (b) Exemplar spectrograms

**Figure 5.** (a) Visualization of Tim-Var (b) Visualization of Fre-Var

**Figure 6.** The inference times with different settings of batch size in case 1

**Figure 9.** The spectrograms corresponding to the above waveform

**Figure 7.** The test platform of Bearing Fault dataset

**Figure 10.** (a) Visualization of Tim-Var (b) Visualization of Fre-Mea

read the original abstract

Detecting faults in rotating machinery early and accurately is crucial for the operational safety of modern manufacturing systems. In this paper, we propose a novel Deep fault diagnosis (DFD) method for rotating machinery with scarce labeled samples. DFD tackles the challenging problem by transferring knowledge from shallow models, which is based on the idea that shallow models trained with different hand-crafted features can reveal the latent prior knowledge and diagnostic expertise and have good generalization ability even with scarce labeled samples. DFD can be divided into three phases. First, a spectrogram of the raw vibration signal is calculated by applying a Short-time Fourier transform (STFT). From those spectrograms, discriminative time-frequency domain features can be extracted and used to form a feature pool. Then, several candidate Support vector machine (SVM) models are trained with different combinations of features in the feature pool with scarce labeled samples. By evaluating the pretrained SVM models on the validation set, the most discriminative features and best-performing SVM models can be selected, which are used to make predictions on the unlabeled samples. The predicted labels preserve the expert knowledge originally carried by the SVM model. They are combined with the scarce fine labeled samples to form an Augmented training set (ATS). Finally, a novel 2D deep Convolutional neural network (CNN) model is trained on the ATS to learn more discriminative features and a better classifier. Experimental results on two fault diagnosis datasets demonstrate the effectiveness of the proposed DFD, which achieves better performance than SVM models and the vanilla deep CNN model trained on scarce labeled samples. Moreover, it is computationally efficient and is promising for real-time rotating machinery fault diagnosis.
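The abstract does not say which time-frequency features make up the feature pool, so the following is only a guess at what such a pool could contain: simple statistics of spectrogram energy along the frequency axis and along the time axis. The feature names, the choice of statistics, and the toy spectrogram are hypothetical and should not be read as the paper's definitions.

```python
from statistics import mean, pvariance


def feature_pool(spec):
    """Compute illustrative features from a magnitude spectrogram.

    spec: rows[frequency_bin][frame], i.e. one row per frequency bin.
    """
    per_bin_energy = [sum(row) for row in spec]          # one value per frequency bin
    per_frame_energy = [sum(col) for col in zip(*spec)]  # one value per time frame
    return {
        "freq_mean": mean(per_bin_energy),
        "freq_var": pvariance(per_bin_energy),
        "time_mean": mean(per_frame_energy),
        "time_var": pvariance(per_frame_energy),
    }


# Toy spectrogram: 2 frequency bins x 3 frames, constant over time.
spec = [[1.0, 1.0, 1.0],
        [3.0, 3.0, 3.0]]
pool = feature_pool(spec)
```

Because the toy spectrogram does not change from frame to frame, its time variance is zero while its frequency variance is not; candidate models trained on different subsets of such a dictionary would see different views of the same signal.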

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper describes a pipeline using STFT features, SVM ensemble for pseudo-labeling, then 2D CNN, but the abstract supplies no numbers or checks on whether the pseudo-labels are reliable.

read the letter

The main takeaway is a practical three-phase setup for rotating machinery fault diagnosis when labeled samples are scarce. Compute STFT spectrograms from vibration signals, build a feature pool, train multiple SVMs on different feature subsets with the few labels, pick the best performers on a validation split, apply those SVMs to label the rest of the data, and feed the augmented set to a 2D CNN. The abstract says this beats plain SVM and a vanilla CNN on two datasets and runs fast enough for real-time use.

Referee Report

2 major / 2 minor

Summary. The paper proposes a Deep Fault Diagnosis (DFD) method for rotating machinery under scarce labeled samples. It computes STFT spectrograms, extracts a pool of time-frequency features, trains multiple SVMs on different feature combinations using the limited labels, selects the best SVMs via validation performance, generates pseudo-labels on unlabeled samples, augments the training set, and trains a 2D CNN on the augmented set. The central claim is that this yields better fault diagnosis performance than standalone SVMs or a vanilla CNN trained only on the scarce labels, demonstrated on two datasets.

Significance. If the empirical gains are robust, the approach offers a practical semi-supervised strategy for industrial fault diagnosis by injecting diagnostic expertise from shallow models into deep learning when labels are expensive to obtain. It targets a real constraint in rotating machinery monitoring and could inform other domains with limited supervision, provided the pseudo-label mechanism is shown to be reliable rather than a source of noise.

major comments (2)

[method description (DFD phases)] The method description (three phases of DFD): the performance gain over the vanilla CNN is attributed to the pseudo-labels generated by the selected SVMs, yet no measurement of pseudo-label accuracy on the unlabeled samples is reported, nor is there an ablation that trains the CNN with and without the augmented set. With very few labeled samples the validation set used for SVM selection is necessarily small, so the assumption that the selected SVMs produce sufficiently accurate pseudo-labels on the target distribution is unverified and load-bearing for the central claim.
[Experimental results] Experimental results section: the abstract and results assert superior performance on two fault diagnosis datasets, but the manuscript supplies no concrete metrics (accuracy, F1, etc.), sample sizes, number of labeled vs. unlabeled samples, cross-validation protocol, or statistical significance tests. Without these details the central empirical claim cannot be evaluated.

minor comments (2)

[abstract] The abstract states 'a novel 2D deep Convolutional neural network (CNN) model' with inconsistent capitalization of 'Convolutional'.
[method description] Notation for the Augmented training set (ATS) is introduced but not used consistently in later sections when describing the CNN training.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and will revise the paper accordingly.

read point-by-point responses

Referee: [method description (DFD phases)] The method description (three phases of DFD): the performance gain over the vanilla CNN is attributed to the pseudo-labels generated by the selected SVMs, yet no measurement of pseudo-label accuracy on the unlabeled samples is reported, nor is there an ablation that trains the CNN with and without the augmented set. With very few labeled samples the validation set used for SVM selection is necessarily small, so the assumption that the selected SVMs produce sufficiently accurate pseudo-labels on the target distribution is unverified and load-bearing for the central claim.

Authors: We agree an ablation study would strengthen the evidence. We will add a comparison of CNN performance trained on scarce labels alone versus the augmented set in the revision. Direct pseudo-label accuracy cannot be computed without ground-truth labels on the unlabeled data; downstream CNN gains provide the supporting evidence. SVM selection employs k-fold cross-validation on the labeled samples to address the small validation set concern. revision: partial
Referee: [Experimental results] Experimental results section: the abstract and results assert superior performance on two fault diagnosis datasets, but the manuscript supplies no concrete metrics (accuracy, F1, etc.), sample sizes, number of labeled vs. unlabeled samples, cross-validation protocol, or statistical significance tests. Without these details the central empirical claim cannot be evaluated.

Authors: We will revise the experimental section to explicitly report all metrics (accuracy, F1), sample sizes (labeled vs. unlabeled per dataset), the cross-validation protocol, and any statistical significance tests. These details exist in our experiments and will be clearly presented. revision: yes
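The k-fold cross-validation mentioned in the first response can be illustrated with a short sketch. The simulated rebuttal gives no protocol details, so the fold count, the contiguous (unshuffled, unstratified) split, and the toy one-dimensional classifier standing in for an SVM are all assumptions; names are hypothetical. A real experiment would more likely shuffle and stratify, for example with scikit-learn's `StratifiedKFold`.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; fold sizes differ by at most one."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size


def mean_cv_accuracy(fit_predict, X, y, k=3):
    """fit_predict(X_train, y_train, X_val) -> predicted labels for X_val."""
    scores = []
    for train, val in k_fold_indices(len(X), k):
        pred = fit_predict([X[i] for i in train], [y[i] for i in train],
                           [X[i] for i in val])
        truth = [y[i] for i in val]
        scores.append(sum(p == t for p, t in zip(pred, truth)) / len(truth))
    return sum(scores) / len(scores)


def nearest_class_mean(X_train, y_train, X_val):
    """Toy 1-D classifier (stand-in for an SVM): assign the closest class mean."""
    means = {}
    for label in set(y_train):
        vals = [x[0] for x, t in zip(X_train, y_train) if t == label]
        means[label] = sum(vals) / len(vals)
    return [min(means, key=lambda c: abs(x[0] - means[c])) for x in X_val]


# Six labeled samples, interleaved so every fold contains both classes.
X = [[0.1], [0.9], [0.2], [1.1], [0.0], [1.0]]
y = ["normal", "fault", "normal", "fault", "normal", "fault"]

folds = list(k_fold_indices(7, 3))
score = mean_cv_accuracy(nearest_class_mean, X, y, k=3)
```

Averaging over folds lets every scarce labeled sample serve in validation once, which is the usual argument for preferring it to a single small hold-out split when ranking candidate models.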

standing simulated objections not resolved

Direct measurement of pseudo-label accuracy on the unlabeled samples, as ground-truth labels are unavailable by definition.
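One possible way around this objection, offered here as an assumption about experimental setup and not as anything the paper or the rebuttal states: on a benchmark dataset where every sample is actually labeled and the "unlabeled" pool is created by withholding labels, the withheld labels can be used afterwards to score the pseudo-labels. The sketch below reports overall and per-class agreement; the function name and the toy labels are hypothetical.

```python
from collections import defaultdict


def pseudo_label_report(pseudo, hidden_truth):
    """Compare pseudo-labels against withheld ground truth.

    Returns (overall_accuracy, {class: accuracy on samples of that class}).
    """
    per_class = defaultdict(lambda: [0, 0])  # class -> [correct, total]
    for p, t in zip(pseudo, hidden_truth):
        per_class[t][1] += 1
        per_class[t][0] += int(p == t)
    overall = sum(c for c, _ in per_class.values()) / len(hidden_truth)
    return overall, {cls: c / n for cls, (c, n) in per_class.items()}


# Labels predicted by the selected models vs. labels withheld from the pipeline.
pseudo = ["normal", "fault", "fault", "normal", "fault"]
hidden_truth = ["normal", "fault", "normal", "normal", "fault"]

overall, by_class = pseudo_label_report(pseudo, hidden_truth)
```

The per-class breakdown matters because pseudo-label errors concentrated in one fault type could hurt the CNN more than the overall figure suggests.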

Circularity Check

0 steps flagged

No significant circularity; empirical pipeline validated externally

full rationale

The paper describes a three-phase empirical method (STFT spectrograms, multi-SVM feature selection for pseudo-labeling unlabeled samples, then CNN training on the augmented set) whose performance claims rest on experiments on two external fault diagnosis datasets. No equations, derivations, or self-citations reduce the claimed gains to quantities defined by the method itself. The pseudo-label accuracy assumption is a validity risk but does not create circularity in the derivation chain. This is the most common honest finding for standard ML pipelines.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The method depends on the domain assumption that shallow SVMs trained on different feature subsets capture transferable diagnostic expertise even with scarce labels; no free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption: Shallow models trained with different hand-crafted features can reveal the latent prior knowledge and diagnostic expertise and have good generalization ability even with scarce labeled samples.
This premise is stated directly as the basis for transferring knowledge from SVMs to the CNN.

[1] [1]

A sparse auto encoder- based deep neural network approach for induction motor faults classiﬁcation

W.J. Sun, S.Y. Shao, R. Zhao, et al. , “A sparse auto encoder- based deep neural network approach for induction motor faults classiﬁcation”, Measurement, Vol.89, pp.171–178, 2016

work page 2016

[2] [2]

Real-time motor fault detection by 1D convolutional neural networks

T. Ince, S. Kiranyaz, L. Eren, et al. , “Real-time motor fault detection by 1D convolutional neural networks”, IEEE Trans- actions on Industrial Electronics , Vol.63, No.11, pp.7067–7075, 2016

work page 2016

[3] [3]

Construction of hierarchical diagnosis network based on deep learning and its application in the fault pattern recognition of rolling element bearings

M. Gan, C. Wang and C.A. Zhu, “Construction of hierarchical diagnosis network based on deep learning and its application in the fault pattern recognition of rolling element bearings”, Me- chanical Systems and Signal Processing , Vol.72–73, pp.92–104, 2016

work page 2016

[4] [4]

An enhanced bispectrum technique with auxiliary frequency injection for induction mo- tor health condition monitoring

D.Z. Li, W. Wang and F. Ismailm, “An enhanced bispectrum technique with auxiliary frequency injection for induction mo- tor health condition monitoring”, IEEE Transactions on In- strumentation & Measurement , Vol.64, No.10, pp.2679–2687, 2015

work page 2015

[5] [5]

Challenges in the indus- trial applications of fault diagnostic systems

S. Dash and V. Venkatasubramanian, “Challenges in the indus- trial applications of fault diagnostic systems”, Computers and Chemical Engineering, Vol.24, No.2–7, pp.785–791, 2000

work page 2000

[6] [6]

Support vector machine in machine condition monitoring and fault diagnosis

A. Widodo and B.S. Yang, “Support vector machine in machine condition monitoring and fault diagnosis”, Mechanical Systems and Signal Processing , Vol.21, No.6, pp.2560–2574, 2007

work page 2007

[7] [7]

A new approach to intelligent fault diagnosis of rotating machinery

Y.G. Lei, Z.J. He and Y.Y. Zi, “A new approach to intelligent fault diagnosis of rotating machinery”,Expert Systems with Ap- plications, Vol.35, No.4, pp.1593–1600, 2008

work page 2008

[8] [8]

Fault diagnosis for rotating machinery with scarce labeled samples: a Deep CNN method based on knowledge-transferring from shallow models

J. Zhang, D.Q. Zhang, M.Y. Yang, et al. , “Fault diagnosis for rotating machinery with scarce labeled samples: a Deep CNN method based on knowledge-transferring from shallow models”, International Conference on Control, Automation and Infor- mation Sciences, Hangzhou, China, pp.482–487, 2018

work page 2018

[9] [9]

Bearing fault detection via stator current noise cancellation and statistical control

W. Zhou, T.G. Habetler and R.G. Harley, “ Bearing fault detection via stator current noise cancellation and statistical control”,IEEE Transactions on Industrial Electronics , Vol.55, No.12, pp.4260–4269, 2008

work page 2008

[10] [10]

Fault diagnosis of rolling element bearing using time-domain features and neural networks

B. Sreejith, A.K. Verma and A. Srividya, “ Fault diagnosis of rolling element bearing using time-domain features and neural networks”,IEEE Region 10 and the Third International Con- ference on Industrial and Information Systems , Peradeniya,Sri Lanka, pp.1–6,2009

work page 2009

[11] [11]

Diesel engine fault di- agnosis using intrinsic time-scale decomposition and multistage Adaboost relevance vector machine

Y. Liu, J.H. Zhang, K.J. Qin, et al. , “Diesel engine fault di- agnosis using intrinsic time-scale decomposition and multistage Adaboost relevance vector machine”, Proceedings of the Insti- tution of Mechanical Engineers, Part C: Journal of Mechanical Engineering Science, Vol.232, No.5, pp.881–894, 2018

work page 2018

[12] [12]

Real-time fault diag- nosis for gas turbine generator systems using extreme learning machin

P.K. Wong, Z.X. Yang, C.M Vong, et al., “Real-time fault diag- nosis for gas turbine generator systems using extreme learning machin”, Neurocomputing, Vol.128, pp.249–257, 2014

work page 2014

[13] [13]

Gearbox fault iden- tiﬁcation and classiﬁcation with convolutional neural net- works

Z.Q. Chen, C. Li and R.V. Sanchez, “Gearbox fault iden- tiﬁcation and classiﬁcation with convolutional neural net- works”,Shock and Vibration , Vol.2015, Article ID 390134, 10 pages, 2015

work page 2015

[14] [14]

Multivariate empirical mode decomposition and its application to fault diagnosis of rolling bearing

Y. Lv, R. Yuan and G.B. Song, “Multivariate empirical mode decomposition and its application to fault diagnosis of rolling bearing”, Mechanical Systems and Signal Processing , Vol.81, pp.219–234, 2016

work page 2016

[15] [15]

Enhanced empirical wavelet transform based time-frequency analysis and its application to rolling bearing fault diagnosis

J.D. Zheng, H.Y. Pan, X.L. Qi, et al., “Enhanced empirical wavelet transform based time-frequency analysis and its application to rolling bearing fault diagnosis”, Acta Electronica Sinica, Vol.46, No.2, pp.358–364, 2018. (in Chinese)

work page 2018

[16] [16]

Rolling element bearing defect detection using the generalized synchrosqueezing transform guided by time–frequency ridge enhancement

C. Li, V. Sanchez, G. Zurita, et al., “Rolling element bearing defect detection using the generalized synchrosqueezing transform guided by time–frequency ridge enhancement”, ISA Transactions, Vol.60, pp.274–284, 2016

work page 2016

[17] [17]

Rolling bearing fault diagnosis under variable conditions using LMD-SVD and extreme learning machine

Y. Tian, J. Ma, C. Lu, et al., “Rolling bearing fault diagnosis under variable conditions using LMD-SVD and extreme learning machine”, Mechanism and Machine Theory, Vol.90, pp.175–186, 2015

work page 2015

[18] [18]

Support vector machines

M.A. Hearst, S.T. Dumais, E. Osuna, et al., “Support vector machines”, IEEE Intelligent Systems, Vol.13, No.4, pp.18–28, 1998

work page 1998

[19] [19]

A novel bearing fault diagnosis model integrated permutation entropy, ensemble empirical mode decomposition and optimized SVM

X.Y. Zhang, Y.T. Liang, J.Z. Zhou, et al., “A novel bearing fault diagnosis model integrated permutation entropy, ensemble empirical mode decomposition and optimized SVM”, Measurement, Vol.69, pp.164–179, 2015

work page 2015

[20] [20]

A new process industry fault diagnosis algorithm based on ensemble improved binary-tree SVM

A.N. Wang, M. Sha, L.M. Liu, et al., “A new process industry fault diagnosis algorithm based on ensemble improved binary-tree SVM”, Chinese Journal of Electronics, Vol.24, No.2, pp.258–262, 2015

work page 2015

[21] [21]

A decision-theoretic generalization of on-line learning and an application to boosting

Y. Freund and R.E. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting”, Journal of Computer and System Sciences, Vol.55, No.1, pp.119–139, 1997

work page 1997

[22] [22]

Boosting feature selection using information metric for classification

H.W. Liu, L. Liu and H.J. Zhang, “Boosting feature selection using information metric for classification”, Neurocomputing, Vol.73, No.1–3, pp.295–303, 2009

work page 2009

[23] [23]

Extreme learning machine: Theory and applications

G.B. Huang, Q.Y. Zhu and C.K. Siew, “Extreme learning machine: Theory and applications”, Neurocomputing, Vol.70, No.1–3, pp.489–501, 2006

work page 2006

[24] [24]

A fast learning algorithm for deep belief nets

G.E. Hinton, S. Osindero and Y.W. Teh, “A fast learning algorithm for deep belief nets”, Neural Computation, Vol.18, No.7, pp.1527–1554, 2006

work page 2006

[25] [25]

Rolling bearing fault diagnosis using an optimization deep belief network

H.D. Shao, H.K. Jiang, X. Zhang, et al., “Rolling bearing fault diagnosis using an optimization deep belief network”, Measurement Science and Technology, Vol.26, No.11, 2015

work page 2015

[26] [26]

Convolutional networks for images, speech, and time series

Y. LeCun and Y. Bengio, “Convolutional networks for images, speech, and time series”, The Handbook of Brain Theory and Neural Networks, 1995

work page 1995

[27] [27]

A novel separability objective function in CNN for feature extraction of SAR images

F. Gao, M. Wang, J. Wang, et al., “A novel separability objective function in CNN for feature extraction of SAR images”, Chinese Journal of Electronics, Vol.28, No.2, pp.423–429, 2019

work page 2019

[28] [28]

CNN feature boosted SeqSLAM for real-time loop closure detection

D.D. Bai, C.Q. Wang, B. Zhang, et al., “CNN feature boosted SeqSLAM for real-time loop closure detection”, Chinese Journal of Electronics, Vol.27, No.3, pp.488–499, 2018

work page 2018

[29] [29]

Research of facial beauty prediction based on deep convolutional features using double activation layer

J.Y. Gan, Y.K. Zhai, Y. Huang, et al., “Research of facial beauty prediction based on deep convolutional features using double activation layer”, Acta Electronica Sinica, Vol.47, No.3, pp.636–642, 2019. (in Chinese)

work page 2019

[30] [30]

Recent advances in deep learning for speech research at Microsoft

L. Deng, J.Y. Li, J.T. Huang, et al., “Recent advances in deep learning for speech research at Microsoft”, IEEE International Conference on Acoustics, Vancouver, British Columbia, Canada, pp.8604–8608, 2013

work page 2013

[31] [31]

ImageNet Classification with Deep Convolutional Neural Networks

A. Krizhevsky, I. Sutskever and G.E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks”, International Conference on Neural Information Processing Systems, Doha, Qatar, pp.1097–1105, 2012

work page 2012

[32] [32]

Faster R-CNN: towards real-time object detection with region proposal networks

S.Q. Ren, K.M. He, R. Girshick, et al., “Faster R-CNN: towards real-time object detection with region proposal networks”, International Conference on Neural Information Processing Systems, Kuching, Malaysia, pp.91–99, 2015

work page 2015

[33] [33]

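Deep neural networks: A promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data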

F. Jia, Y.G. Lei, J. Lin, et al., “Deep neural networks: A promising tool for fault characteristic mining and intelligent diagnosis of rotating machinery with massive data”, Mechanical Systems and Signal Processing, Vol.72–73, pp.303–315, 2016

work page 2016

[34] [34]

Fault diagnosis for rotating machinery using multiple sensors and convolutional neural networks

M. Xia, T. Li, L. Xu, et al., “Fault diagnosis for rotating machinery using multiple sensors and convolutional neural networks”, IEEE/ASME Transactions on Mechatronics, Vol.23, No.1, pp.101–110, 2017

work page 2017

[35] [35]

Automatic multi-fault recognition in TFDS based on convolutional neural network

J.H. Sun, Z.W. Xiao and Y.X. Xie, “Automatic multi-fault recognition in TFDS based on convolutional neural network”, Neurocomputing, Vol.222, pp.127–136, 2017

work page 2017

[36] [36]

Ultrasonic signal classification and imaging system for composite materials via deep convolutional neural networks

M. Meng, Y.J. Chua, E. Wouterson, et al., “Ultrasonic signal classification and imaging system for composite materials via deep convolutional neural networks”, Neurocomputing, Vol.257, pp.128–135, 2017

work page 2017

[37] [37]

Binary coding of speech spectrograms using a deep auto-encoder

L. Deng, M.L. Seltzer, D. Yu, et al., “Binary coding of speech spectrograms using a deep auto-encoder”, 11th Annual Conference of the International Speech Communication Association, Makuhari, Japan, 2010

work page 2010

[38] [38]

A shallow network with combined pooling for fast traffic sign recognition

J.M. Zhang, Q.Q. Huang, H.L. Wu, et al., “A shallow network with combined pooling for fast traffic sign recognition”, Information, Vol.8, No.2, pp.45, 2017

work page 2017

[39] [39]

How does batch normalization help optimization?

S. Santurkar, D. Tsipras, A. Ilyas, et al., “How does batch normalization help optimization?”, Advances in Neural Information Processing Systems, Montreal, Canada, pp.2483–2493, 2018

work page 2018

[40] [40]

Caffe: Convolutional architecture for fast feature embedding

Y.Q. Jia, E. Shelhamer, J. Donahue, et al., “Caffe: Convolutional architecture for fast feature embedding”, Proc. of the 22nd ACM International Conference on Multimedia, Orlando, Florida, USA, pp.675–678, 2014

work page 2014

[41] [41]

Visualizing data using t-SNE

L.V.D. Maaten and G. Hinton, “Visualizing data using t-SNE”, Journal of Machine Learning Research, Vol.9, No.Nov, pp.2579–2605, 2008

work page 2008

[42] [42]

A multimodal feature fusion-based deep learning method for online fault diagnosis of rotating machinery

F.N. Zhou, P. Hu, S. Yang, et al., “A multimodal feature fusion-based deep learning method for online fault diagnosis of rotating machinery”, Sensors, Vol.18, No.10, pp.3521, 2018

work page 2018

[43] [43]

Research on combined intelligent fault diagnostic method based on CELCD and MFVPMCD

H.Y. Pan, J.D. Zheng, Y. Yang, et al., “Research on combined intelligent fault diagnostic method based on CELCD and MFVPMCD”, Acta Electronica Sinica, Vol.45, No.3, pp.546–551, 2017. (in Chinese)

work page 2017