arxiv: 2605.31241 · v1 · pith:WLDUKGLFnew · submitted 2026-05-29 · 💻 cs.LG

Bifurcated Remaining Useful Life Prediction: A Hybrid Approach for Realistic Uncertainty Characterization

Pith reviewed 2026-06-28 23:03 UTC · model grok-4.3

classification 💻 cs.LG
keywords remaining useful life · hybrid model · uncertainty quantification · LSTM autoencoder · turbofan engine · Weibull distribution · probabilistic neural network · NASA C-MAPSS

The pith

A hybrid framework for remaining useful life prediction bifurcates engine operation into healthy and degraded regimes and uses continuous state probabilities to weight an ensemble of survival analysis and probabilistic neural networks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a method to estimate remaining useful life in turbofan engines that accounts for changing uncertainty patterns over an engine's life. It trains an LSTM autoencoder solely on early healthy data to detect the transition to degradation via reconstruction error. This error is turned into a probability that smoothly blends a Weibull-based predictor for the healthy phase with a neural network predictor for the degraded phase. The result is uncertainty estimates that are wide early on but narrow near failure, reflecting actual physical behavior. This matters for maintenance decisions because it avoids overconfident early predictions or underconfident late ones.

Core claim

The central discovery is a state-aware hybrid model that bifurcates the prediction task at the healthy-to-degraded transition, detected by an LSTM autoencoder, and combines Conditional Weibull Survival Analysis for the healthy regime with a Probabilistic Neural Network using Monte Carlo Dropout for the degraded regime, with the two outputs weighted by continuous probabilities derived from a calibrated sigmoid on the autoencoder error.

What carries the argument

LSTM autoencoder reconstruction error converted to continuous state probabilities via calibrated sigmoid, used to dynamically weight the ensemble of Conditional Weibull Survival Analysis and Probabilistic Neural Network with Monte Carlo Dropout.
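
The weighting mechanism described here can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the sigmoid parameters `midpoint` and `slope`, the example numbers, and the choice to combine the two spreads with the law of total variance for a two-component mixture are all assumptions, since the summary does not state how the paper combines the two uncertainty estimates.

```python
import numpy as np

def degraded_probability(recon_error, midpoint, slope):
    """Sigmoid mapping from reconstruction error to P(degraded).
    midpoint and slope are hypothetical calibration constants."""
    z = slope * (np.asarray(recon_error, dtype=float) - midpoint)
    return 1.0 / (1.0 + np.exp(-z))

def blend_predictions(p_degraded, mu_healthy, sd_healthy, mu_degraded, sd_degraded):
    """Treat the two regime models as a two-component mixture (an assumption):
    weighted mean, and spread from the law of total variance."""
    w_d = np.asarray(p_degraded, dtype=float)
    w_h = 1.0 - w_d
    mean = w_h * mu_healthy + w_d * mu_degraded
    var = (w_h * (sd_healthy ** 2 + (mu_healthy - mean) ** 2)
           + w_d * (sd_degraded ** 2 + (mu_degraded - mean) ** 2))
    return mean, np.sqrt(var)

# Hypothetical reconstruction errors: clearly healthy, borderline, clearly degraded.
p = degraded_probability([0.05, 0.20, 0.60], midpoint=0.20, slope=25.0)
mean, sd = blend_predictions(p, mu_healthy=180.0, sd_healthy=40.0,
                             mu_degraded=35.0, sd_degraded=6.0)
```

With these made-up inputs, the blended mean moves smoothly from the healthy-regime estimate toward the degraded-regime estimate as the reconstruction error grows, rather than switching abruptly.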

If this is right

  • Uncertainty bands become physically consistent, with high confidence near end-of-life and appropriate variance early in operation.
  • The method captures both aleatoric and epistemic uncertainties in the degraded regime.
  • Continuous probabilities avoid abrupt switches between models, providing smoother predictions.
  • Trained only on nominal data, the classifier remains robust without needing labeled degradation data.
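
To make the aleatoric/epistemic split concrete, the sketch below uses a decomposition commonly paired with Monte Carlo Dropout: each stochastic forward pass returns a predicted mean and variance, the average predicted variance is read as aleatoric, and the variance of the predicted means is read as epistemic. Whether the paper uses exactly this decomposition is an assumption, and the function name and numbers are illustrative only.

```python
import numpy as np

def decompose_uncertainty(pass_means, pass_variances):
    """pass_means / pass_variances: one entry per stochastic forward pass
    (dropout kept active at inference). Returns the predictive mean plus
    aleatoric and epistemic variance under the assumed decomposition."""
    pass_means = np.asarray(pass_means, dtype=float)
    pass_variances = np.asarray(pass_variances, dtype=float)
    epistemic = pass_means.var(axis=0)        # disagreement between passes
    aleatoric = pass_variances.mean(axis=0)   # average predicted noise
    return pass_means.mean(axis=0), aleatoric, epistemic

# Three hypothetical forward passes for a single input window.
mean_rul, aleatoric, epistemic = decompose_uncertainty([10.0, 12.0, 14.0],
                                                       [1.0, 2.0, 3.0])
total_sd = np.sqrt(aleatoric + epistemic)
```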

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the state classification works on other datasets, the same bifurcation could improve RUL estimates in batteries or bearings.
  • Calibrating the sigmoid on one dataset might require re-calibration for different operating conditions.
  • Extending the framework to multiple degradation stages could further refine the uncertainty characterization.

Load-bearing premise

The reconstruction error from an LSTM autoencoder trained only on data with RUL greater than 150 cycles can be reliably converted into state probabilities that correctly indicate when to switch from survival analysis to neural network predictions.
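
The premise can be made concrete with a small data-preparation sketch. It assumes a run-to-failure trajectory stored as an array of shape `(n_cycles, n_features)`, a RUL convention in which the last recorded cycle has RUL 0, and a window length of 30; none of these details are given in the summary, and `nominal_windows` and `window_mae` are hypothetical helper names.

```python
import numpy as np

def nominal_windows(sensors, rul_threshold=150, window=30):
    """Sliding windows drawn only from cycles with RUL > rul_threshold.
    sensors: (n_cycles, n_features) for one engine, in cycle order."""
    sensors = np.asarray(sensors, dtype=float)
    n_cycles = sensors.shape[0]
    rul = n_cycles - 1 - np.arange(n_cycles)   # assumed: last cycle has RUL 0
    healthy = sensors[rul > rul_threshold]     # contiguous early-life prefix
    n_windows = healthy.shape[0] - window + 1
    if n_windows <= 0:                         # engine too short to contribute
        return np.empty((0, window, sensors.shape[1]))
    return np.stack([healthy[i:i + window] for i in range(n_windows)])

def window_mae(original, reconstructed):
    """Per-window mean absolute reconstruction error."""
    return np.abs(original - reconstructed).mean(axis=(1, 2))

# Placeholder trajectory: 200 cycles, 3 sensor channels.
engine = np.arange(600.0).reshape(200, 3)
windows = nominal_windows(engine)
```

Under this convention, an engine that fails after 150 or fewer cycles contributes no training windows at all, which is one reason the threshold acts as a free parameter.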

What would settle it

Running the model on the NASA C-MAPSS test set and finding that the uncertainty intervals do not narrow significantly near actual failure times, or that early predictions show similar confidence to late ones, would disprove the claim of physically consistent uncertainty bands.

Figures

Figures reproduced from arXiv: 2605.31241 by Antonio Nappa, Arkaitz Artetxe, Basilio Sierra, Xabier Belaunzaran.

Figure 1: Illustration summarizing the proposed pipeline. The autoencoder is trained with strictly healthy instances, and […]
Figure 2: Distribution of MAE Loss data of the windows reconstructed with the autoencoder over the original windows […]
Figure 3: Distribution of the calibrated degradation probability for the complete training and validation sets of the […]
Figure 4: RUL prediction performed using Weibull Survival Analysis. Error bars represent the calculated standard […]
Figure 5: RUL prediction performed with the Probabilistic LSTM trained with only degraded instances. Error bars […]
Figure 6: Final weighted RUL predictions on the FD001 dataset. Error bars represent the calculated standard deviation […]
Original abstract

This study presents a novel hybrid prognostic framework for uncertainty-aware Remaining Useful Life (RUL) estimation in turbofan engines using the NASA C-MAPSS dataset. The framework employs a state-aware strategy that bifurcates the engine's operational lifespan into "healthy" and "degraded" regimes. An LSTM-based autoencoder, trained strictly on nominal data (RUL > 150 cycles), monitors reconstruction error to act as a robust state classifier. For the healthy regime, a Conditional Weibull Survival Analysis is used for Mean Residual Life estimation. For the degraded regime, a Probabilistic Neural Network with Monte Carlo Dropout captures both aleatoric and epistemic uncertainties. Rather than using rigid binary labels, a calibrated sigmoid function converts the autoencoder's output into continuous state probabilities, dynamically weighting the final ensemble prediction. The primary strength of this framework is its generation of physically consistent uncertainty bands, yielding high-confidence predictions near end-of-life while accurately reflecting the inherent variance of early operation, providing a robust tool for risk-informed maintenance.
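
For readers unfamiliar with Mean Residual Life, the textbook definition for a lifetime with survival function S is MRL(t) = (1/S(t)) ∫ S(u) du, integrated from t to infinity. The sketch below evaluates it numerically for a Weibull survival function S(u) = exp(-(u/scale)^shape). It illustrates the definition only: how the paper fits or conditions its Weibull model is not stated in the abstract, and the integration grid and horizon are arbitrary choices.

```python
import numpy as np

def weibull_mrl(t, shape, scale, n_grid=20001, horizon_scales=50.0):
    """Mean residual life E[T - t | T > t] for a Weibull(shape, scale) lifetime,
    computed with the trapezoidal rule on a truncated grid (numerical sketch)."""
    u = np.linspace(t, t + horizon_scales * scale, n_grid)
    # Conditional survival S(u) / S(t), written as one exponent to avoid 0 / 0.
    cond_surv = np.exp((t / scale) ** shape - (u / scale) ** shape)
    return float(np.sum(0.5 * (cond_surv[1:] + cond_surv[:-1]) * np.diff(u)))

# With shape > 1 (wear-out behaviour) the expected remaining life shrinks with age.
mrl_new = weibull_mrl(0.0, shape=2.0, scale=100.0)
mrl_aged = weibull_mrl(100.0, shape=2.0, scale=100.0)
```

Two sanity checks follow from standard Weibull properties: with shape 1 the distribution is exponential and the MRL equals the scale at every age, and at t = 0 the MRL equals the Weibull mean.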

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a hybrid prognostic framework for uncertainty-aware Remaining Useful Life (RUL) estimation on the NASA C-MAPSS turbofan dataset. It bifurcates engine lifespan into healthy and degraded regimes via an LSTM autoencoder (trained only on nominal data with RUL > 150 cycles) whose reconstruction error is mapped by a calibrated sigmoid to continuous state probabilities. These probabilities dynamically weight a Conditional Weibull model (healthy regime) and an MC-Dropout Probabilistic Neural Network (degraded regime) to produce ensemble RUL predictions whose uncertainty bands are claimed to be physically consistent—high-confidence near end-of-life and realistically broad early in operation.

Significance. If the dynamic weighting mechanism can be shown to deliver the claimed physically consistent uncertainty bands, the work would provide a useful addition to prognostics by combining regime-specific models with continuous probabilistic blending, potentially improving risk-informed maintenance decisions.

major comments (2)
  1. [Abstract and state-classification methodology] The central claim of physically consistent uncertainty bands rests on the calibrated sigmoid converting LSTM autoencoder reconstruction error into continuous probabilities p_healthy and p_degraded that weight the Conditional Weibull and MC-Dropout PNN outputs. No fitting procedure, reliability diagram, or validation against actual regime transitions is supplied to demonstrate that the resulting weights produce the advertised behavior rather than an arbitrary blend.
  2. [LSTM autoencoder description] The assumption that an LSTM autoencoder trained strictly on RUL > 150 cycles yields reconstruction errors that meaningfully weight the ensemble lacks any calibration details or empirical check that the error-to-probability mapping aligns with true healthy-to-degraded transitions.
minor comments (2)
  1. [Methodology] Clarify the exact functional form and parameter values of the calibrated sigmoid.
  2. [Experiments] Add quantitative comparison of uncertainty-band width and coverage against baselines across the full RUL range.
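
The comparison requested in the second minor comment could be computed along the following lines. This is a sketch under stated assumptions: the function name, the bin edges, and the example intervals are hypothetical, and the intervals are taken to be whatever lower and upper bounds a model reports for each test point.

```python
import numpy as np

def coverage_and_width_by_bin(true_rul, lower, upper, bin_edges):
    """For each true-RUL bin [lo, hi): count, empirical interval coverage,
    and mean interval width. Empty bins are skipped."""
    true_rul, lower, upper = (np.asarray(a, dtype=float)
                              for a in (true_rul, lower, upper))
    inside = (true_rul >= lower) & (true_rul <= upper)
    width = upper - lower
    rows = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (true_rul >= lo) & (true_rul < hi)
        if mask.any():
            rows.append((lo, hi, int(mask.sum()),
                         float(inside[mask].mean()), float(width[mask].mean())))
    return rows

# Made-up example: two late-life and two early-life predictions.
rows = coverage_and_width_by_bin(
    true_rul=[10, 20, 100, 140],
    lower=[5, 22, 60, 90],
    upper=[15, 30, 130, 200],
    bin_edges=[0, 50, 150],
)
```

Reporting coverage and width together matters because narrow bands near end-of-life are only a strength if they still contain the true RUL at roughly the nominal rate.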

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback, which highlights important aspects of our state-classification methodology that require greater transparency. We address each major comment below and will incorporate revisions to strengthen the manuscript.

Point-by-point responses
  1. Referee: [Abstract and state-classification methodology] The central claim of physically consistent uncertainty bands rests on the calibrated sigmoid converting LSTM autoencoder reconstruction error into continuous probabilities p_healthy and p_degraded that weight the Conditional Weibull and MC-Dropout PNN outputs. No fitting procedure, reliability diagram, or validation against actual regime transitions is supplied to demonstrate that the resulting weights produce the advertised behavior rather than an arbitrary blend.

    Authors: We agree that the manuscript does not currently supply the requested details on the sigmoid calibration. In the revised version, we will add a full description of the fitting procedure (including the data and optimization used to determine the sigmoid parameters), a reliability diagram evaluating the probability calibration, and empirical validation results (e.g., comparison of derived p_healthy against observed regime transitions on held-out C-MAPSS trajectories) to demonstrate that the weighting produces the claimed behavior. revision: yes

  2. Referee: [LSTM autoencoder description] The assumption that an LSTM autoencoder trained strictly on RUL > 150 cycles yields reconstruction errors that meaningfully weight the ensemble lacks any calibration details or empirical check that the error-to-probability mapping aligns with true healthy-to-degraded transitions.

    Authors: We acknowledge the need for additional empirical support here. The revised manuscript will expand the LSTM autoencoder subsection to include the calibration details of the error-to-probability mapping and quantitative or visual empirical checks confirming alignment between reconstruction error thresholds and actual healthy-to-degraded transitions observed in the dataset. revision: yes
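
A reliability diagram of the kind promised in the responses reduces to a small table: bin the predicted degradation probabilities, then compare the mean prediction in each bin with the observed fraction of degraded samples. The sketch below is one way to build it. It assumes binary regime labels are available, which is itself a modeling choice (for example, a proxy such as labeling RUL at or below 150 cycles as degraded), and the function name is hypothetical.

```python
import numpy as np

def reliability_table(pred_prob, is_degraded, n_bins=5):
    """Rows of (bin index, mean predicted probability, observed degraded
    frequency, count), plus the expected calibration error (ECE)."""
    pred_prob = np.asarray(pred_prob, dtype=float)
    is_degraded = np.asarray(is_degraded, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges only, so a probability of exactly 1.0 lands in the last bin.
    idx = np.clip(np.digitize(pred_prob, edges[1:-1]), 0, n_bins - 1)
    rows, ece = [], 0.0
    for b in range(n_bins):
        mask = idx == b
        if not mask.any():
            continue
        conf = pred_prob[mask].mean()
        freq = is_degraded[mask].mean()
        ece += mask.mean() * abs(conf - freq)   # weight by share of samples
        rows.append((b, float(conf), float(freq), int(mask.sum())))
    return rows, float(ece)

# Toy example: confident predictions that are slightly under-confident.
rows, ece = reliability_table([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1])
```

A well-calibrated mapping would show observed frequencies close to the mean predicted probability in every populated bin.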

Circularity Check

0 steps flagged

No significant circularity in the hybrid RUL framework

The described framework trains an LSTM autoencoder strictly on nominal data (RUL > 150), a Conditional Weibull model for the healthy regime, and an MC-Dropout PNN for the degraded regime as independent components. A calibrated sigmoid maps reconstruction error to continuous state probabilities for ensemble weighting. No equations, derivations, or self-citations are shown that reduce any output (e.g., uncertainty bands or weighted predictions) to fitted inputs by construction. The components rely on separate data-driven training rather than tautological definitions or load-bearing self-references, rendering the chain self-contained.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim depends on the autoencoder functioning as a reliable degradation detector and on the regime split being physically meaningful; free parameters include the 150-cycle threshold and sigmoid calibration constants, while axioms are standard assumptions about time-series reconstruction error signaling state changes.

free parameters (2)
  • RUL threshold for nominal training data
    Fixed at 150 cycles to select training data for the autoencoder; directly affects state classification boundary.
  • Sigmoid calibration parameters
    Used to map reconstruction error to continuous state probabilities; chosen to weight the ensemble dynamically.
axioms (1)
  • Domain assumption: reconstruction error from an LSTM autoencoder trained only on nominal data reliably indicates the transition from the healthy to the degraded regime.
    Invoked to justify the state classifier without additional validation details in the abstract.
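
One plausible way to set the sigmoid calibration constants listed in this ledger is a Platt-style fit: logistic regression of a binary regime label on the reconstruction error. The sketch below does this with plain gradient descent. It is an assumption rather than the paper's documented procedure; the labels, learning rate, iteration count, and synthetic error values are illustrative.

```python
import numpy as np

def fit_sigmoid(errors, is_degraded, lr=0.5, n_iter=2000):
    """Fit P(degraded | error) = sigmoid(slope * (error - midpoint)) by
    minimizing log loss with gradient descent on standardized errors.
    Assumes both classes are present and overlap, so the fit stays finite."""
    x = np.asarray(errors, dtype=float)
    y = np.asarray(is_degraded, dtype=float)
    mu, sd = x.mean(), x.std()
    z = (x - mu) / sd
    w, b = 0.0, 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(w * z + b)))
        w -= lr * np.mean((p - y) * z)
        b -= lr * np.mean(p - y)
    # Convert back from standardized units to the original error scale.
    return w / sd, mu - b * sd / w

# Synthetic, overlapping error ranges for the two regimes (made-up values).
errors = np.concatenate([np.linspace(0.05, 0.20, 50), np.linspace(0.15, 0.40, 50)])
labels = np.concatenate([np.zeros(50), np.ones(50)])
slope, midpoint = fit_sigmoid(errors, labels)
prob = lambda e: 1.0 / (1.0 + np.exp(-slope * (e - midpoint)))
```

The fitted `midpoint` is the reconstruction error at which the two regime models receive equal weight, and `slope` controls how quickly the weighting shifts from one model to the other.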

