Remaining Useful Lifetime Prediction via Deep Domain Adaptation

Alp Akcay; Paulo R. de O. da Costa; Uzay Kaymak; Yingqian Zhang

arxiv: 1907.07480 · v1 · pith:ZUFQIVQEnew · submitted 2019-07-17 · 💻 cs.LG · stat.ML


Pith reviewed 2026-05-24 20:23 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords: remaining useful life · domain adaptation · LSTM · DANN · prognostics · distribution shift · time series · health management


The pith

Domain adversarial training on LSTM features produces RUL predictions that hold across different operating conditions without target labels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Sensor data for remaining useful lifetime prediction often comes from mismatched distributions due to varying operating conditions and fault modes. Models trained on one set of conditions degrade when applied to another, and they cannot simply be retrained when the new domain has no failure labels. The paper trains an LSTM on time-windowed sequences from a labeled source domain while using adversarial training to make the extracted features indistinguishable from those in an unlabeled target domain. A regressor then predicts RUL from these shared features. Experiments on datasets with shifted conditions show the adapted predictions are more reliable than those from a source-only model.

Core claim

A Domain Adversarial Neural Network combined with LSTM extracts features from time-windowed sensor data such that a domain classifier cannot distinguish source from target while an RUL regressor still performs well on the source; the resulting features support direct RUL regression on the target domain that contains only sensor readings and no failure labels.
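The time-windowing step mentioned in the claim can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the function names `sliding_windows` and `window_targets` are hypothetical, and labeling each window with the RUL at its last time step is an assumption made here for concreteness.

```python
from typing import List, Sequence


def sliding_windows(series: Sequence[Sequence[float]], window: int) -> List[List[List[float]]]:
    """Return every run of `window` consecutive time steps from one unit's sensor series.

    `series` is a list of time steps; each time step is a list of sensor readings.
    A series shorter than `window` yields no windows.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        [list(step) for step in series[start:start + window]]
        for start in range(len(series) - window + 1)
    ]


def window_targets(rul_per_step: Sequence[float], window: int) -> List[float]:
    """Assumed labeling rule: each window takes the RUL of its final time step.

    Only source-domain units have these labels; target-domain windows are unlabeled.
    """
    return list(rul_per_step[window - 1:])
```

Source windows would be paired with `window_targets`, while target windows would be passed to the feature extractor and domain classifier without any RUL label.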

What carries the argument

Domain Adversarial Neural Network (DANN) that pits an LSTM feature extractor against a domain classifier while jointly training an RUL regressor on the same features.
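The tug-of-war described here can be written down as a single scalar objective, following the standard DANN formulation: the feature extractor and regressor minimise it, while the domain classifier minimises the domain loss alone. The sketch below only evaluates that objective from given predictions. The function names, the choice of mean squared error for the RUL term, and the trade-off weight `lam` are assumptions for illustration, not details taken from the paper.

```python
import math
from typing import Sequence, Tuple


def mse(pred: Sequence[float], true: Sequence[float]) -> float:
    """Mean squared error of the RUL regressor on labeled source windows."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)


def binary_cross_entropy(prob: Sequence[float], label: Sequence[int], eps: float = 1e-12) -> float:
    """Domain-classifier loss; label 0 = source window, 1 = target window."""
    total = 0.0
    for p, y in zip(prob, label):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(prob)


def dann_objective(
    rul_pred_src: Sequence[float],
    rul_true_src: Sequence[float],
    dom_prob: Sequence[float],
    dom_label: Sequence[int],
    lam: float,
) -> Tuple[float, float, float]:
    """Return (E, L_rul, L_dom) with E = L_rul - lam * L_dom.

    The feature extractor and regressor descend on E, so they are rewarded
    when the domain classifier does badly; the classifier descends on L_dom.
    """
    l_rul = mse(rul_pred_src, rul_true_src)
    l_dom = binary_cross_entropy(dom_prob, dom_label)
    return l_rul - lam * l_dom, l_rul, l_dom
```

In a deep-learning framework the opposing updates are usually obtained with a gradient reversal layer between the feature extractor and the domain classifier, so that one backward pass serves both players.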

If this is right

RUL models can be deployed on equipment operating under new conditions without first collecting complete run-to-failure traces in those conditions.
The same source data can serve multiple target domains provided the sensor modalities remain comparable.
Temporal structure captured by the time-window LSTM is preserved while domain-specific statistics are suppressed.
Prediction reliability improves specifically when distribution shift arises from operating conditions or fault modes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same adversarial setup could be tested on other time-series regression tasks such as predicting equipment degradation rates or energy consumption under shifting regimes.
If the learned features prove stable, the approach reduces the cost of labeling new failure events in industrial fleets.
Combining the method with modest amounts of target-domain pseudo-labels might close the remaining gap to fully supervised performance.
Limits would appear if the target domain introduces entirely new sensor noise characteristics not present in the source.

Load-bearing premise

Adversarial training on time-windowed sequences will produce features that remain useful for RUL regression even after the domain classifier is fooled.

What would settle it

On a target dataset with known RUL values, check whether the mean absolute error of the adapted model is lower than that of a source-only LSTM by a margin larger than the variability seen across multiple random seeds.
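One way to operationalise this check is sketched below. It assumes the criterion "mean MAE gain exceeds the larger of the two per-model standard deviations across seeds", which is one reading of "margin larger than the variability" and not a test prescribed by the paper. The function names are hypothetical, and the numbers in the usage lines are made-up placeholders, not results.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def mae(pred: Sequence[float], true: Sequence[float]) -> float:
    """Mean absolute error of RUL predictions on the held-out target data."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(pred)


def adaptation_margin(
    source_only_maes: Sequence[float],
    adapted_maes: Sequence[float],
) -> Tuple[float, float, bool]:
    """Compare per-seed target MAEs of a source-only model and an adapted model.

    Returns (gain, noise, settled) where gain is the drop in mean MAE,
    noise is the larger sample standard deviation across seeds, and
    settled is True when the gain exceeds that noise level.
    """
    gain = mean(source_only_maes) - mean(adapted_maes)
    noise = max(stdev(source_only_maes), stdev(adapted_maes))
    return gain, noise, gain > noise


# Made-up per-seed MAEs for illustration only.
gain, noise, settled = adaptation_margin([30.0, 32.0, 31.0], [24.0, 25.0, 26.0])
```

A stricter variant would replace the threshold with a paired significance test over seeds; the simple comparison above is only meant to make the criterion concrete.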

Figures

Figures reproduced from arXiv: 1907.07480 by Alp Akcay, Paulo R. de O. da Costa, Uzay Kaymak, Yingqian Zhang.

**Figure 1.** LSTM memory cell [51]. In our proposed model, we use LSTM layers to extract temporal features contained in the previous time windows of size Tw before a RUL prediction. In an LSTM, the memory cell (

**Figure 2.** The proposed domain adaptation deep network architecture.

**Figure 3.** Normalised sensor values 100 time steps before a failure for each C-MAPSS dataset. The

**Figure 4.** Performance metrics plot. The Scoring performance metric overpenalises positive errors
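The asymmetry noted in the Figure 4 caption can be illustrated with the scoring function commonly used with the C-MAPSS data. The sketch assumes that this is the metric the caption refers to, with error defined as predicted minus true RUL and the customary constants 13 (early predictions) and 10 (late predictions); the function name `phm_score` is hypothetical.

```python
import math
from typing import Sequence


def phm_score(
    pred: Sequence[float],
    true: Sequence[float],
    a_early: float = 13.0,
    a_late: float = 10.0,
) -> float:
    """Asymmetric exponential score; lower is better.

    d = predicted RUL - true RUL. A positive d means the model believes the
    unit will last longer than it does, which is penalised more steeply.
    """
    total = 0.0
    for p, t in zip(pred, true):
        d = p - t
        if d < 0:
            total += math.exp(-d / a_early) - 1.0
        else:
            total += math.exp(d / a_late) - 1.0
    return total
```

With these constants, overestimating RUL by 10 cycles costs more than underestimating it by 10 cycles, which matches the caption's remark about positive errors.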

**Figure 5.** The training procedure of the LSTM-DANN. Next, the proposed LSTM-DANN architecture is defined including the number of LSTM and fully connected hidden layers, number of cells in each layer, learning rate and gradient update algorithm. We train using the Rectified Linear Unit (ReLU) as activation function.

**Figure 6.** RUL predictions of the TARGET-ONLY, SOURCE-ONLY and LSTM-DANN models

**Figure 7.** RUL predictions of the TARGET-ONLY, SOURCE-ONLY and LSTM-DANN-STD mod

Original abstract

In Prognostics and Health Management (PHM), sufficient prior observed degradation data is usually critical for Remaining Useful Lifetime (RUL) prediction. Most previous data-driven prediction methods assume that training (source) and testing (target) condition monitoring data have similar distributions. However, due to different operating conditions, fault modes, noise and equipment updates, distribution shift exists across different data domains. This shift reduces the performance of predictive models previously built for specific conditions when no observed run-to-failure data is available for retraining. To address this issue, this paper proposes a new data-driven approach for domain adaptation in prognostics using Long Short-Term Memory (LSTM) networks. We use a time window approach to extract temporal information from time-series data in a source domain with observed RUL values and a target domain containing only sensor information. We propose a Domain Adversarial Neural Network (DANN) approach to learn domain-invariant features that can be used to predict the RUL in the target domain. The experimental results show that the proposed method can provide more reliable RUL predictions under datasets with different operating conditions and fault modes. These results suggest that the proposed method offers a promising approach to performing domain adaptation in practical PHM applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

This is a direct DANN-plus-LSTM application to RUL that flags a real PHM problem but leaves the core assumption about aligned features preserving degradation rates untested.


Hi colleague, this paper takes the standard DANN setup from Ganin et al. and applies it to LSTM embeddings extracted from time-windowed sensor data for remaining useful life prediction. Source data has RUL labels; target data has only sensors. The feature extractor is trained to fool a domain classifier while a regressor learns RUL from the source only. They test on C-MAPSS subsets that differ in operating conditions and fault modes, and the abstract states the method gives more reliable target predictions.

That combination for the RUL task is not already in the cited prior work, so the application itself is new. The practical motivation is also on target: distribution shift really does limit model reuse in industrial settings, and anything that reduces the need for fresh run-to-failure data would be useful. The method follows the usual unsupervised domain-adaptation pipeline (labeled source, unlabeled target) without introducing circularity or invented quantities.

The soft spot is the one the stress-test note flags. Adversarial alignment on the LSTM embeddings pushes domain invariance, but RUL depends on how the temporal degradation trajectory maps to remaining life. Nothing in the architecture enforces that this mapping survives the shift, especially when fault modes change. Without target RUL labels or an auxiliary check that the regression signal is preserved, any reported gains rest on an assumption that has not been directly tested. The abstract supplies no numbers, baselines, or statistical detail, so it is impossible to judge from the summary whether the results actually move the needle or simply reflect implementation choices.

This is mainly for readers already working on sensor-based prognostics who need to handle condition shifts. It is incremental rather than foundational, so I would not cite it unless I needed a reference for exactly this setup. I would still send it for peer review; the application angle is concrete enough that referees could usefully pressure-test the experimental evidence and the invariance assumption.

Referee Report

2 major / 1 minor

Summary. The paper proposes a domain-adversarial LSTM architecture (DANN) that extracts fixed-length time windows from source-domain sensor sequences with RUL labels and target-domain sequences without labels, trains an LSTM feature extractor adversarially against a domain classifier while regressing RUL only on source labels, and claims that the resulting domain-invariant features yield more reliable RUL predictions on C-MAPSS subsets that differ in operating conditions and fault modes.

Significance. If the central claim holds, the work would provide a practical route to label-free transfer of RUL models across heterogeneous PHM datasets, a setting where run-to-failure data are often unavailable in the target domain.

major comments (2)

[Proposed DANN-LSTM method] The architecture description (time-window LSTM followed by domain classifier on embeddings and RUL regressor on source only) contains no mechanism that aligns degradation trajectories or rates across domains. Because the adversarial loss operates on static embeddings rather than on the mapping from trajectory to remaining life, a shift that alters how sensor evolution maps to RUL can remain after training; the reported gains therefore rest on an untested assumption that feature-level invariance is sufficient.
[Abstract / Experimental results] The abstract asserts that experiments demonstrate improved reliability yet supplies no quantitative metrics, baseline comparisons, dataset sizes, or statistical tests. Without these details the support for the central claim cannot be evaluated.

minor comments (1)

[Method] Notation for the time-window length, LSTM hidden size, and adversarial weighting hyper-parameter should be introduced once and used consistently.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below, indicating where revisions will be made to strengthen the manuscript.


Referee: [Proposed DANN-LSTM method] The architecture description (time-window LSTM followed by domain classifier on embeddings and RUL regressor on source only) contains no mechanism that aligns degradation trajectories or rates across domains. Because the adversarial loss operates on static embeddings rather than on the mapping from trajectory to remaining life, a shift that alters how sensor evolution maps to RUL can remain after training; the reported gains therefore rest on an untested assumption that feature-level invariance is sufficient.

Authors: The referee correctly notes that the method performs feature-level alignment via adversarial training on LSTM embeddings extracted from fixed time windows, without an explicit term for aligning degradation rates or trajectory mappings. Our design follows the standard DANN framework, where the LSTM captures temporal dynamics within each window and the adversarial objective encourages domain-invariant representations for the subsequent RUL regressor. We acknowledge that this rests on the assumption that such invariance suffices for the distribution shifts present in the C-MAPSS subsets (different operating conditions and fault modes). The empirical improvements reported in the experiments provide support for this assumption on the evaluated data, but we agree that a more explicit trajectory-alignment mechanism could be investigated. In revision we will add a paragraph in the method and discussion sections clarifying this modeling choice and its limitations.

revision: partial

Referee: [Abstract / Experimental results] The abstract asserts that experiments demonstrate improved reliability yet supplies no quantitative metrics, baseline comparisons, dataset sizes, or statistical tests. Without these details the support for the central claim cannot be evaluated.

Authors: We agree that the abstract should be more informative. The full manuscript contains quantitative results on the C-MAPSS datasets, including comparisons against non-adapted baselines and the proposed DANN-LSTM, but these details are not summarized in the abstract. We will revise the abstract to include key performance metrics (e.g., RMSE or score improvements), mention of the datasets and their sizes, and reference to the baselines used.

revision: yes

Circularity Check

0 steps flagged

Standard DANN+LSTM domain adaptation pipeline with no self-referential reductions


The paper describes a conventional unsupervised domain-adaptation setup: time-windowed sensor sequences are processed by an LSTM, a domain classifier is trained adversarially on the embeddings, and the regressor is supervised solely by source-domain RUL labels. No equations, loss terms, or claimed derivations in the abstract or description reduce the target-domain RUL predictions to quantities fitted on target labels, self-defined by the alignment objective, or imported via self-citation chains. The method is presented as an application of existing DANN techniques rather than a derivation whose outputs are tautological with its inputs. The reported performance therefore rests on empirical generalization rather than construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated beyond the standard assumption that sensor time series contain transferable temporal patterns and that adversarial feature alignment can bridge domain gaps.



Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages · 6 internal anchors

[1] V. Atamuradov, K. Medjaher, P. Dersin, B. Lamoureux, N. Zerhouni, Prognostics and health management for maintenance practitioners-review, implementation and tools evaluation, International Journal of Prognostics and Health Management 8 (2017) 1–31
[2] A. K. Jardine, D. Lin, D. Banjevic, A review on machinery diagnostics and prognostics implementing condition-based maintenance, Mechanical Systems and Signal Processing 20 (2006) 1483–1510
[3] N. Papakostas, P. Papachatzakis, V. Xanthakis, D. Mourtzis, G. Chryssolouris, An approach to operational aircraft maintenance planning, Decision Support Systems 48 (2010) 604–612
[4] A. Cubillo, S. Perinpanayagam, M. Esperon-Miguez, A review of physics-based models in prognostics: Application to gears and bearings of rotating machinery, 2016
[5] Y. Lei, N. Li, L. Guo, N. Li, T. Yan, J. Lin, Machinery health prognostics: A systematic review from data acquisition to RUL prediction, 2018
[6] X. S. Si, W. Wang, C. H. Hu, D. H. Zhou, Remaining useful life estimation - A review on the statistical data driven approaches, European Journal of Operational Research 213 (2011) 1–14
[7] S. Dong, T. Luo, Bearing degradation process prediction based on the PCA and optimized LS-SVM model, Measurement: Journal of the International Measurement Confederation 46 (2013) 3143–3152
[8] T. Benkedjouh, K. Medjaher, N. Zerhouni, S. Rechak, Remaining useful life estimation based on nonlinear feature reduction and support vector regression, Engineering Applications of Artificial Intelligence 26 (2013) 1751–1760
[9] S. Faghih-Roohi, S. Hajizadeh, A. Nunez, R. Babuska, B. De Schutter, Deep convolutional neural networks for detection of rail surface defects, in: Proceedings of the International Joint Conference on Neural Networks, volume 2016-Octob, pp. 2584–2589
[10] X. Li, Q. Ding, J. Q. Sun, Remaining useful life estimation in prognostics using deep convolution neural networks, Reliability Engineering and System Safety 172 (2018) 1–11
[11] A. Listou Ellefsen, E. Bjørlykhaug, V. Æsøy, S. Ushakov, H. Zhang, Remaining useful life predictions for turbofan engine degradation using semi-supervised deep architecture, Reliability Engineering & System Safety 183 (2019) 240–251
[12] S. Zheng, K. Ristovski, A. Farahat, C. Gupta, Long Short-Term Memory Network for Remaining Useful Life estimation, in: 2017 IEEE International Conference on Prognostics and Health Management, ICPHM 2017, pp. 88–95
[13] Y. Wu, M. Yuan, S. Dong, L. Lin, Y. Liu, Remaining useful life estimation of engineered systems using vanilla LSTM neural networks, Neurocomputing 275 (2018) 167–179
[14] Y. Hong, W. Q. Meeker, J. D. McCalley, et al., Prediction of remaining life of power transformers based on left truncated and right censored lifetime data, The Annals of Applied Statistics 3 (2009) 857–879
[15] S. J. Pan, Q. Yang, A Survey on Transfer Learning, IEEE Transactions on Knowledge and Data Engineering 22 (2010) 1345–1359
[16] J. Jiang, C. Zhai, Instance weighting for domain adaptation in nlp, in: Proceedings of the 45th annual meeting of the association of computational linguistics, pp. 264–271
[17] B. Fernando, A. Habrard, M. Sebban, T. Tuytelaars, Unsupervised visual domain adaptation using subspace alignment, in: Proceedings of the IEEE international conference on computer vision, pp. 2960–2967
[18] E. Tzeng, J. Hoffman, K. Saenko, T. Darrell, Adversarial discriminative domain adaptation, in: Computer Vision and Pattern Recognition (CVPR), volume 1, p. 4
[19] Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, M. Marchand, V. Lempitsky, Domain-adversarial training of neural networks, The Journal of Machine Learning Research 17 (2016) 2096–2030
[20] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural computation 9 (1997) 1735–1780
[21] P. J. Werbos, et al., Backpropagation through time: what it does and how to do it, Proceedings of the IEEE 78 (1990) 1550–1560
[22] A. Saxena, K. Goebel, D. Simon, N. Eklund, Damage propagation modeling for aircraft engine run-to-failure simulation, in: 2008 International Conference on Prognostics and Health Management, PHM 2008, pp. 1–9
[23] D. He, E. Bechhoefer, Development and validation of bearing diagnostic and prognostic tools using hums condition indicators, in: 2008 IEEE Aerospace Conference, pp. 1–8
[24] E. Zio, F. Di Maio, A data-driven fuzzy approach for predicting the remaining useful life in dynamic failure scenarios of a nuclear system, Reliability Engineering & System Safety 95 (2010) 49–57
[25] Z. Tian, An artificial neural network method for remaining useful life prediction of equipment subject to condition monitoring, Journal of Intelligent Manufacturing 23 (2012) 227–237
[26] R. Huang, L. Xi, X. Li, C. R. Liu, H. Qiu, J. Lee, Residual life predictions for ball bearings based on self-organizing map and back propagation neural network methods, Mechanical systems and signal processing 21 (2007) 193–207
[27] Y. Bengio, P. Simard, P. Frasconi, Learning long-term dependencies with gradient descent is difficult, IEEE transactions on neural networks 5 (1994) 157–166
[28] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, Y. Bengio, Learning phrase representations using rnn encoder-decoder for statistical machine translation, arXiv preprint arXiv:1406.1078 (2014)
[29] M. Yuan, Y. Wu, L. Lin, Fault diagnosis and remaining useful life estimation of aero engine using lstm neural network, in: 2016 IEEE International Conference on Aircraft Utility Systems (AUS), pp. 135–140
[30] Y. Wu, M. Yuan, S. Dong, L. Lin, Y. Liu, Remaining useful life estimation of engineered systems using vanilla lstm neural networks, Neurocomputing 275 (2018) 167–179
[31] M. Z. Hossain, F. A. Sohel, M. F. Shiratuddin, H. Laga, A comprehensive survey of deep learning for image captioning, CoRR abs/1810.04020 (2018)
[32] G. S. Babu, P. Zhao, X.-L. Li, Deep convolutional neural network based regression approach for estimation of remaining useful life, in: International conference on database systems for advanced applications, Springer, pp. 214–228
[33] J. Huang, A. Gretton, K. Borgwardt, B. Schölkopf, A. J. Smola, Correcting sample selection bias by unlabeled data, in: Advances in neural information processing systems, pp. 601–608
[34] S. J. Pan, I. W. Tsang, J. T. Kwok, Q. Yang, Domain adaptation via transfer component analysis, IEEE Transactions on Neural Networks 22 (2011) 199–210
[35] B. Sun, J. Feng, K. Saenko, Return of frustratingly easy domain adaptation, in: AAAI, volume 6, p. 8
[36] S. Ben-David, J. Blitzer, K. Crammer, A. Kulesza, F. Pereira, J. W. Vaughan, A theory of learning from different domains, Machine Learning 79 (2010) 151–175
[37] M. Long, Y. Cao, J. Wang, M. I. Jordan, Learning transferable features with deep adaptation networks, arXiv preprint arXiv:1502.02791 (2015)
[38] E. Tzeng, J. Hoffman, N. Zhang, K. Saenko, T. Darrell, Deep domain confusion: Maximizing for domain invariance, arXiv preprint arXiv:1412.3474 (2014)
[39] B. Sun, K. Saenko, Deep coral: Correlation alignment for deep domain adaptation, in: European Conference on Computer Vision, Springer, pp. 443–450
[40] H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, M. Marchand, Domain-adversarial neural networks, arXiv preprint arXiv:1412.4446 (2014)
[41] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial nets, in: Advances in neural information processing systems, pp. 2672–2680
[42] C. Cortes, M. Mohri, Domain adaptation and sample bias correction theory and algorithm for regression, Theoretical Computer Science 519 (2014) 103–126
[43] D. Lopez-Paz, J. M. Hernández-lobato, B. Schölkopf, Semi-supervised domain adaptation with non-parametric copulas, in: Advances in neural information processing systems, pp. 665–673
[44] R. Nikzad-Langerodi, W. Zellinger, E. Lughofer, S. Saminger-Platz, Domain-invariant partial-least-squares regression, Analytical chemistry 90 (2018) 6693–6701
[45] S. Purushotham, W. Carvalho, T. Nilanon, Y. Liu, Variational Adversarial Deep Domain Adaptation for Health Care Time Series Analysis, 29th Conference on Neural Information Processing Systems (2016)
[46] W. Lu, B. Liang, Y. Cheng, D. Meng, J. Yang, T. Zhang, Deep model based domain adaptation for fault diagnosis, IEEE Transactions on Industrial Electronics 64 (2017) 2296–2305
[47] W. Zhang, G. Peng, C. Li, Y. Chen, Z. Zhang, A new deep learning model for fault diagnosis with good anti-noise and domain adaptation ability on raw vibration signals, Sensors 17 (2017) 425
[48] X. Li, W. Zhang, Q. Ding, J.-Q. Sun, Multi-layer domain adaptation method for rolling bearing fault diagnosis, Signal Processing 157 (2019) 180–197
[49] J. Xie, L. Zhang, L. Duan, J. Wang, On cross-domain feature fusion in gearbox fault diagnosis under various operating conditions based on transfer component analysis, in: 2016 IEEE International Conference on Prognostics and Health Management (ICPHM), pp. 1–6
[50] X. Li, W. Zhang, Q. Ding, Cross-domain fault diagnosis of rolling element bearings using deep generative neural networks, IEEE Transactions on Industrial Electronics (2018)
[51] C. Olah, Understanding lstm networks, 2015
[52] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: a simple way to prevent neural networks from overfitting, The Journal of Machine Learning Research 15 (2014) 1929–1958
[53] F. O. Heimes, Recurrent neural networks for remaining useful life estimation, in: 2008 International Conference on Prognostics and Health Management, pp. 1–6
[54] T. Tieleman, G. Hinton, Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude, COURSERA: Neural networks for machine learning 4 (2012) 26–31
[55] F. Chollet, et al., Keras, 2015
[56] M. Abadi, et al., TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org
[57] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014)
[58] C. Zhang, P. Lim, A. Qin, K. C. Tan, Multiobjective deep belief networks ensemble for remaining useful life estimation in prognostics, IEEE transactions on neural networks and learning systems 28 (2017) 2306–2318

[1] [1]

Atamuradov, K

V. Atamuradov, K. Medjaher, P. Dersin, B. Lamoureux, N. Zerhouni, Prognostics and health manage- ment for maintenance practitioners-review, implementation and tools evaluation, International Journal of Prognostics and Health Management 8 (2017) 1–31

work page 2017

[2] [2]

A. K. Jardine, D. Lin, D. Banjevic, A review on machinery diagnostics and prognostics implementing condition-based maintenance, Mechanical Systems and Signal Processing 20 (2006) 1483–1510

work page 2006

[3] [3]

Papakostas, P

N. Papakostas, P. Papachatzakis, V. Xanthakis, D. Mourtzis, G. Chryssolouris, An approach to oper- ational aircraft maintenance planning, Decision Support Systems 48 (2010) 604–612

work page 2010

[4] [4]

Cubillo, S

A. Cubillo, S. Perinpanayagam, M. Esperon-Miguez, A review of physics-based models in prognostics: Application to gears and bearings of rotating machinery, 2016

work page 2016

[5] [5]

Y. Lei, N. Li, L. Guo, N. Li, T. Yan, J. Lin, Machinery health prognostics: A systematic review from data acquisition to RUL prediction, 2018

work page 2018

[6] [6]

X. S. Si, W. Wang, C. H. Hu, D. H. Zhou, Remaining useful life estimation - A review on the statistical data driven approaches, European Journal of Operational Research 213 (2011) 1–14

work page 2011

[7] [7]

S. Dong, T. Luo, Bearing degradation process prediction based on the PCA and optimized LS-SVM model, Measurement: Journal of the International Measurement Confederation 46 (2013) 3143–3152

work page 2013

[8] [8]

Benkedjouh, K

T. Benkedjouh, K. Medjaher, N. Zerhouni, S. Rechak, Remaining useful life estimation based on nonlin- ear feature reduction and support vector regression, Engineering Applications of Artiﬁcial Intelligence 26 (2013) 1751–1760

work page 2013

[9] [9]

Faghih-Roohi, S

S. Faghih-Roohi, S. Hajizadeh, A. Nunez, R. Babuska, B. De Schutter, Deep convolutional neural networks for detection of rail surface defects, in: Proceedings of the International Joint Conference on Neural Networks, volume 2016-Octob, pp. 2584–2589

work page 2016

[10] [10]

X. Li, Q. Ding, J. Q. Sun, Remaining useful life estimation in prognostics using deep convolution neural networks, Reliability Engineering and System Safety 172 (2018) 1–11

work page 2018

[11] [11]

Listou Ellefsen, E

A. Listou Ellefsen, E. Bjørlykhaug, V. Æsøy, S. Ushakov, H. Zhang, Remaining useful life predictions for turbofan engine degradation using semi-supervised deep architecture, Reliability Engineering & System Safety 183 (2019) 240–251

work page 2019

[12] [12]

Zheng, K

S. Zheng, K. Ristovski, A. Farahat, C. Gupta, Long Short-Term Memory Network for Remaining Useful Life estimation, in: 2017 IEEE International Conference on Prognostics and Health Management, ICPHM 2017, pp. 88–95

work page 2017

[13] [13]

Y. Wu, M. Yuan, S. Dong, L. Lin, Y. Liu, Neurocomputing Remaining useful life estimation of engineered systems using vanilla LSTM neural networks, Neurocomputing 275 (2018) 167–179

work page 2018

[14] [14]

Y. Hong, W. Q. Meeker, J. D. McCalley, et al., Prediction of remaining life of power transformers based on left truncated and right censored lifetime data, The Annals of Applied Statistics 3 (2009) 857–879

work page 2009

[15] [15]

S. J. Pan, Q. Yang, A Survey on Transfer Learning, IEEE Transactions on Knowledge and Data Engineering 22 (2010) 1345–1359. 28

work page 2010

[16] [16]

Jiang, C

J. Jiang, C. Zhai, Instance weighting for domain adaptation in nlp, in: Proceedings of the 45th annual meeting of the association of computational linguistics, pp. 264–271

work page

[17] [17]

Fernando, A

B. Fernando, A. Habrard, M. Sebban, T. Tuytelaars, Unsupervised visual domain adaptation using subspace alignment, in: Proceedings of the IEEE international conference on computer vision, pp. 2960–2967

work page

[18] [18]

Tzeng, J

E. Tzeng, J. Hoﬀman, K. Saenko, T. Darrell, Adversarial discriminative domain adaptation, in: Computer Vision and Pattern Recognition (CVPR), volume 1, p. 4

work page

[19] [19]

Ganin, E

Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, M. Marchand, V. Lem- pitsky, Domain-adversarial training of neural networks, The Journal of Machine Learning Research 17 (2016) 2096–2030

work page 2016

[20] [20]

Hochreiter, J

S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural computation 9 (1997) 1735–1780

work page 1997

[21] [21]

P. J. Werbos, et al., Backpropagation through time: what it does and how to do it, Proceedings of the IEEE 78 (1990) 1550–1560

work page 1990

[22] [22]

Saxena, K

A. Saxena, K. Goebel, D. Simon, N. Eklund, Damage propagation modeling for aircraft engine run-to-failure simulation, in: 2008 International Conference on Prognostics and Health Management, PHM 2008, pp. 1–9

[23]

D. He, E. Bechhoefer, Development and validation of bearing diagnostic and prognostic tools using hums condition indicators, in: 2008 IEEE Aerospace Conference, pp. 1–8

[24]

E. Zio, F. Di Maio, A data-driven fuzzy approach for predicting the remaining useful life in dynamic failure scenarios of a nuclear system, Reliability Engineering & System Safety 95 (2010) 49–57

[25]

Z. Tian, An artificial neural network method for remaining useful life prediction of equipment subject to condition monitoring, Journal of Intelligent Manufacturing 23 (2012) 227–237

[26]

R. Huang, L. Xi, X. Li, C. R. Liu, H. Qiu, J. Lee, Residual life predictions for ball bearings based on self-organizing map and back propagation neural network methods, Mechanical systems and signal processing 21 (2007) 193–207

[27]

Y. Bengio, P. Simard, P. Frasconi, Learning long-term dependencies with gradient descent is difficult, IEEE transactions on neural networks 5 (1994) 157–166

[28]

K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, Y. Bengio, Learning phrase representations using rnn encoder-decoder for statistical machine translation, arXiv preprint arXiv:1406.1078 (2014)

[29]

M. Yuan, Y. Wu, L. Lin, Fault diagnosis and remaining useful life estimation of aero engine using lstm neural network, in: 2016 IEEE International Conference on Aircraft Utility Systems (AUS), pp. 135–140

[30]

Y. Wu, M. Yuan, S. Dong, L. Lin, Y. Liu, Remaining useful life estimation of engineered systems using vanilla lstm neural networks, Neurocomputing 275 (2018) 167–179

[31]

M. Z. Hossain, F. A. Sohel, M. F. Shiratuddin, H. Laga, A comprehensive survey of deep learning for image captioning, CoRR abs/1810.04020 (2018)

[32]

G. S. Babu, P. Zhao, X.-L. Li, Deep convolutional neural network based regression approach for estimation of remaining useful life, in: International conference on database systems for advanced applications, Springer, pp. 214–228

[33]

J. Huang, A. Gretton, K. Borgwardt, B. Schölkopf, A. J. Smola, Correcting sample selection bias by unlabeled data, in: Advances in neural information processing systems, pp. 601–608

[34]

S. J. Pan, I. W. Tsang, J. T. Kwok, Q. Yang, Domain adaptation via transfer component analysis, IEEE Transactions on Neural Networks 22 (2011) 199–210

[35]

B. Sun, J. Feng, K. Saenko, Return of frustratingly easy domain adaptation, in: AAAI, volume 6, p. 8

[36]

S. Ben-David, J. Blitzer, K. Crammer, A. Kulesza, F. Pereira, J. W. Vaughan, A theory of learning from different domains, Machine Learning 79 (2010) 151–175

[37]

M. Long, Y. Cao, J. Wang, M. I. Jordan, Learning transferable features with deep adaptation networks, arXiv preprint arXiv:1502.02791 (2015)

[38]

E. Tzeng, J. Hoffman, N. Zhang, K. Saenko, T. Darrell, Deep domain confusion: Maximizing for domain invariance, arXiv preprint arXiv:1412.3474 (2014)

[39]

B. Sun, K. Saenko, Deep coral: Correlation alignment for deep domain adaptation, in: European Conference on Computer Vision, Springer, pp. 443–450

[40]

H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, M. Marchand, Domain-adversarial neural networks, arXiv preprint arXiv:1412.4446 (2014)

[41]

I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial nets, in: Advances in neural information processing systems, pp. 2672–2680

[42]

C. Cortes, M. Mohri, Domain adaptation and sample bias correction theory and algorithm for regression, Theoretical Computer Science 519 (2014) 103–126

[43]

D. Lopez-Paz, J. M. Hernández-Lobato, B. Schölkopf, Semi-supervised domain adaptation with non-parametric copulas, in: Advances in neural information processing systems, pp. 665–673

[44]

R. Nikzad-Langerodi, W. Zellinger, E. Lughofer, S. Saminger-Platz, Domain-invariant partial-least-squares regression, Analytical chemistry 90 (2018) 6693–6701

[45]

S. Purushotham, W. Carvalho, T. Nilanon, Y. Liu, Variational Adversarial Deep Domain Adaptation for Health Care Time Series Analysis, 29th Conference on Neural Information Processing Systems (2016)

[46]

W. Lu, B. Liang, Y. Cheng, D. Meng, J. Yang, T. Zhang, Deep model based domain adaptation for fault diagnosis, IEEE Transactions on Industrial Electronics 64 (2017) 2296–2305

[47]

W. Zhang, G. Peng, C. Li, Y. Chen, Z. Zhang, A new deep learning model for fault diagnosis with good anti-noise and domain adaptation ability on raw vibration signals, Sensors 17 (2017) 425

[48]

X. Li, W. Zhang, Q. Ding, J.-Q. Sun, Multi-layer domain adaptation method for rolling bearing fault diagnosis, Signal Processing 157 (2019) 180–197

[49]

J. Xie, L. Zhang, L. Duan, J. Wang, On cross-domain feature fusion in gearbox fault diagnosis under various operating conditions based on transfer component analysis, in: 2016 IEEE International Conference on Prognostics and Health Management (ICPHM), pp. 1–6

[50]

X. Li, W. Zhang, Q. Ding, Cross-domain fault diagnosis of rolling element bearings using deep generative neural networks, IEEE Transactions on Industrial Electronics (2018)

[51]

C. Olah, Understanding lstm networks, 2015

[52]

N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: a simple way to prevent neural networks from overfitting, The Journal of Machine Learning Research 15 (2014) 1929–1958

[53]

F. O. Heimes, Recurrent neural networks for remaining useful life estimation, in: 2008 International Conference on Prognostics and Health Management, pp. 1–6

[54]

T. Tieleman, G. Hinton, Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude, COURSERA: Neural networks for machine learning 4 (2012) 26–31

[55]

F. Chollet, et al., Keras, 2015

[56]

M. Abadi, et al., TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org

[57]

D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014)

[58]

C. Zhang, P. Lim, A. Qin, K. C. Tan, Multiobjective deep belief networks ensemble for remaining useful life estimation in prognostics, IEEE transactions on neural networks and learning systems 28 (2017) 2306–2318
