
arxiv: 1907.11848 · v1 · pith:SNFVY4GGnew · submitted 2019-07-27 · 📡 eess.SP

Recurrent Neural Networks with Long Term Temporal Dependencies in Machine Tool Wear Diagnosis and Prognosis

Pith reviewed 2026-05-24 15:07 UTC · model grok-4.3

classification 📡 eess.SP
keywords LSTM · recurrent neural networks · tool wear · prognosis · machine condition monitoring · vibration signals · remaining useful life · milling

The pith

LSTM recurrent networks model tool wear from vibration signals better than simple RNNs for prediction and remaining life estimation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies long short-term memory (LSTM) networks to predict how cutting tools wear down during machining. It uses vibration data collected near the workpiece to model how the tool's condition changes over time and how that condition appears in the sensor readings. This avoids needing detailed physics equations for each type of wear. The LSTM retains information across longer stretches of the sequence than an ordinary recurrent network, which the paper reports yields more accurate forecasts of tool state one or two steps ahead, along with estimates of how much life remains in the tool. If the method works, factories could monitor tools without building custom mathematical models for every machine setup.

Core claim

An LSTM-based recurrent neural network can learn both the system transition function and the system observation function from vibration signals to predict cutting tool wear states and remaining useful life, outperforming a simple RNN in one-step and two-step ahead predictions on milling machine experiments.
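
In generic state-space notation, the two functions named in the claim can be written as follows. The symbols are an assumption made here for illustration; the abstract names the functions but does not define notation, and it does not say whether the transition also depends on cutting conditions.

```latex
\begin{aligned}
x_t &= f(x_{t-1}) + w_t && \text{(system transition: evolution of the wear state)} \\
y_t &= g(x_t) + v_t && \text{(system observation: vibration-derived measurement)}
\end{aligned}
```

Here $x_t$ is the tool wear state at step $t$, $y_t$ is the measurement derived from the vibration signal, and $w_t$, $v_t$ are noise terms. Under this reading, the LSTM is trained to approximate $f$ and $g$ from data instead of having them specified analytically.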

What carries the argument

LSTM architecture that maintains long-term dependencies in sequential vibration data to model tool degradation without analytic wear models.
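
A minimal NumPy sketch of one step of a standard LSTM cell may help show where the long-term memory comes from. This is the textbook cell, not the paper's implementation; the gate ordering inside `W` and all dimensions are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One step of a standard LSTM cell.

    x: input vector, shape (d,)
    h_prev, c_prev: previous hidden and cell state, shape (n,)
    W: stacked gate weights, shape (4n, d + n); b: stacked biases, shape (4n,)
    Assumed gate order in W and b: input, forget, output, candidate.
    """
    n = h_prev.shape[0]
    z = W @ np.concatenate([x, h_prev]) + b
    i = sigmoid(z[0:n])          # input gate: how much new content to write
    f = sigmoid(z[n:2 * n])      # forget gate: how much old cell state to keep
    o = sigmoid(z[2 * n:3 * n])  # output gate: how much of the cell to expose
    g = np.tanh(z[3 * n:4 * n])  # candidate cell content
    c = f * c_prev + i * g       # additive cell update
    h = o * np.tanh(c)
    return h, c
```

When the forget gate stays close to 1 and the input gate close to 0, the cell state `c` is carried forward almost unchanged, which is the mechanism that lets information about early wear persist across many time steps. A simple RNN has no such additive path and rewrites its whole state at every step.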

If this is right

  • One-step and two-step look ahead predictions of tool wear state become feasible from indirect vibration measurements.
  • Remaining useful life of cutting tool inserts can be estimated using a generative RNN approach.
  • The method eliminates the need for domain-specific analytic models required by HMMs, Kalman filters, and particle filters.
  • Performance improves over simple RNNs when long-term temporal dependencies are present in the data.
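
The look-ahead and RUL items above can be sketched as a rollout in which a learned transition function is fed its own predictions. In this sketch `toy_transition` is a hypothetical stand-in for a trained network, the wear state is a single number, and defining RUL as the number of steps until a wear threshold is reached is an assumption; the abstract does not state the failure criterion.

```python
def rollout(transition, state, steps):
    """Feed the model's own predictions back in to look `steps` ahead."""
    preds = []
    for _ in range(steps):
        state = transition(state)
        preds.append(state)
    return preds

def remaining_useful_life(transition, state, threshold, max_steps=10_000):
    """Count predicted steps until wear first reaches `threshold`.

    Returns 0 if the tool is already at or past the threshold, and None if
    the threshold is not reached within `max_steps`.
    """
    if state >= threshold:
        return 0
    for k in range(1, max_steps + 1):
        state = transition(state)
        if state >= threshold:
            return k
    return None

# Hypothetical stand-in for a trained model: wear grows by 0.25 per step.
def toy_transition(wear):
    return wear + 0.25

one_step, two_step = rollout(toy_transition, 0.5, 2)
rul = remaining_useful_life(toy_transition, 0.5, threshold=1.5)
print(one_step, two_step, rul)  # 0.75 1.0 4
```

Because each prediction becomes the next input, errors compound with the horizon, so two-step error is normally expected to exceed one-step error and RUL estimates are the most sensitive of the three.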

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Factories could integrate this into existing CNC machines with minimal sensor additions.
  • Similar LSTM modeling might apply to other degradation processes like bearing wear or battery health if vibration-like signals are available.
  • Online monitoring could reduce unplanned downtime by scheduling tool changes based on predicted RUL.
  • The approach might generalize to multi-sensor fusion if more signals are added.

Load-bearing premise

Vibration signals collected near the workpiece fixtures carry enough information about gradual tool wear to let the network learn accurate transition and observation functions.

What would settle it

If retraining the LSTM on new milling runs produces higher prediction error than a simple RNN or than a physics-based model on the same data, the claim that LSTM captures the dependencies better would not hold.
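
The comparison described above comes down to computing the same error metrics for every model on the same held-out runs. A minimal sketch, with hypothetical function names and placeholder numbers that are not results from the paper:

```python
import numpy as np

def rmse(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def mae(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))

def compare_models(y_true, preds_by_model):
    """Score every model against the same ground-truth wear values."""
    return {
        name: {"rmse": rmse(y_true, pred), "mae": mae(y_true, pred)}
        for name, pred in preds_by_model.items()
    }

# Placeholder inputs for illustration only.
scores = compare_models(
    [0.0, 0.0],
    {"model_a": [3.0, 4.0], "model_b": [1.0, 1.0]},
)
```

The claim would be in trouble if the LSTM's scores came out no better than the simple RNN's on runs that neither model saw during training.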

Original abstract

Data-driven approaches to automated machine condition monitoring are gaining popularity due to advancements made in sensing technologies and computing algorithms. This paper proposes the use of a deep learning model, based on Long Short-Term Memory (LSTM) architecture for a recurrent neural network (RNN) which captures long term dependencies for modeling sequential data. In the context of estimating cutting tool wear amounts, this LSTM based RNN approach utilizes a system transition and system observation function based on a minimally intrusive vibration sensor signal located near the workpiece fixtures. By applying an LSTM based RNN, the method helps to avoid building an analytic model for specific tool wear machine degradation, overcoming the assumptions made by Hidden Markov Models, Kalman filter, and Particle filter based approaches. The proposed approach is tested using experiments performed on a milling machine. We have demonstrated one-step and two-step look ahead cutting tool state prediction using online indirect measurements obtained from vibration signals. Additionally, the study also estimates remaining useful life (RUL) of a machine cutting tool insert through generative RNN. The experimental results show that our approach, applying the LSTM to model system observation and transition function is able to outperform the functions modeled with a simple RNN.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes an LSTM-based RNN to model system transition and observation functions from minimally intrusive vibration signals for milling tool wear diagnosis and prognosis. It claims this approach enables accurate one-step and two-step ahead state predictions plus RUL estimation, outperforms a baseline simple RNN, and avoids the analytic modeling assumptions of HMMs, Kalman filters, and particle filters. Experiments are performed on a milling machine.

Significance. If the empirical outperformance holds under proper validation, the work would demonstrate a practical data-driven alternative for tool wear monitoring that leverages long-term temporal dependencies without requiring domain-specific physics models. The use of a generative RNN for RUL estimation is a positive aspect, but the lack of reported quantitative metrics, dataset sizes, cross-validation procedures, or error bars in the provided description makes it hard to judge whether the central claim is well supported or reproducible.

major comments (2)
  1. [Abstract] The claim that the LSTM approach 'is able to outperform the functions modeled with a simple RNN' for one/two-step prediction and RUL estimation is presented without any numerical results, RMSE/MAE values, statistical significance tests, or dataset details; this prevents evaluation of whether the reported advantage is meaningful or merely due to hyperparameter tuning.
  2. [Experimental results] The experimental section (inferred from abstract) does not appear to report independent test-set performance, cross-validation strategy, or comparison against non-RNN baselines; without these the outperformance claim cannot be assessed for circularity or generalization beyond the training distribution.
minor comments (2)
  1. Notation for the system transition and observation functions should be defined explicitly with equations rather than described only in prose.
  2. Clarify the exact architecture (number of layers, hidden units, training procedure) and whether the same hyperparameter search was performed for both LSTM and simple RNN baselines.
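
One validation scheme that would address the second major comment is to hold out entire milling runs, so that no test sample comes from a run the model was trained on. The sketch below is one possible procedure, not the one used in the paper (the abstract does not describe it), and `run_ids` is a hypothetical per-sample label.

```python
def leave_one_run_out(run_ids):
    """Yield (held_out_run, train_idx, test_idx), holding out one whole run per fold."""
    for held in sorted(set(run_ids)):
        train_idx = [i for i, r in enumerate(run_ids) if r != held]
        test_idx = [i for i, r in enumerate(run_ids) if r == held]
        yield held, train_idx, test_idx

# Hypothetical labels: five samples drawn from three runs.
folds = list(leave_one_run_out(["a", "a", "b", "c", "c"]))
```

Splitting by run instead of shuffling individual samples matters because consecutive samples within a run are strongly correlated; a random split would let near-duplicates of the test data leak into training and inflate both models' scores.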

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed comments. We address each major comment point-by-point below and indicate revisions to the manuscript.

Point-by-point responses
  1. Referee: [Abstract] The claim that the LSTM approach 'is able to outperform the functions modeled with a simple RNN' for one/two-step prediction and RUL estimation is presented without any numerical results, RMSE/MAE values, statistical significance tests, or dataset details; this prevents evaluation of whether the reported advantage is meaningful or merely due to hyperparameter tuning.

    Authors: We agree that the abstract would be strengthened by including quantitative results. In the revised manuscript we will add specific RMSE and MAE values for the one- and two-step predictions and RUL estimates, along with a brief mention of dataset size, to allow readers to assess the magnitude of the reported outperformance over the simple RNN.
    Revision: yes

  2. Referee: [Experimental results] The experimental section (inferred from abstract) does not appear to report independent test-set performance, cross-validation strategy, or comparison against non-RNN baselines; without these the outperformance claim cannot be assessed for circularity or generalization beyond the training distribution.

    Authors: The experimental section describes the milling-machine data collection and the direct comparison against the simple RNN baseline. To improve clarity we will explicitly state that an independent test set was held out, describe the cross-validation procedure used, and note that the simple-RNN comparison serves as the primary data-driven baseline; we acknowledge that additional non-RNN machine-learning baselines are not included and will add a short discussion of this scope limitation.
    Revision: partial

Circularity Check

0 steps flagged

No significant circularity; empirical ML comparison is self-contained

Full rationale

The paper presents a data-driven LSTM model trained on vibration signals to predict tool wear and RUL, with an empirical claim of outperformance versus simple RNN on milling-machine experiments. No load-bearing mathematical derivation, self-definitional equations, fitted-input predictions, or self-citation chains appear in the abstract or described claims. Performance metrics are measured on experimental data and remain falsifiable on held-out sequences, satisfying the criteria for an independent result.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the assumption that vibration signals suffice for learning long-term dependencies and on the neural network's ability to model transition and observation functions; the model itself contains a large number of fitted parameters.

free parameters (1)
  • LSTM network weights, biases, and hyperparameters
    All network parameters are fitted to the experimental vibration data during training.
axioms (1)
  • Domain assumption: Vibration signals near workpiece fixtures contain the necessary information to capture long-term tool wear dynamics
    Invoked when the paper states that indirect measurements from a minimally intrusive sensor can replace analytic degradation models.
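
To give a sense of scale for the fitted parameters, the sketch below counts weights and biases for a single recurrent layer. The layer sizes in the example are hypothetical, since the abstract gives no architecture, and the formulas assume the standard cell definitions with one bias vector per gate (some frameworks store two).

```python
def simple_rnn_params(input_dim, hidden_dim):
    """Input weights + recurrent weights + bias for one simple RNN layer."""
    return hidden_dim * (input_dim + hidden_dim) + hidden_dim

def lstm_params(input_dim, hidden_dim):
    """An LSTM layer has four such blocks: input, forget, output, candidate."""
    return 4 * simple_rnn_params(input_dim, hidden_dim)

# Hypothetical sizes: 3 input features, 8 hidden units.
print(simple_rnn_params(3, 8), lstm_params(3, 8))  # 96 384
```

At equal hidden size an LSTM layer carries four times as many parameters as a simple RNN layer, so a fair comparison between the two should either match parameter budgets or tune both models with the same effort.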

pith-pipeline@v0.9.0 · 5732 in / 1298 out tokens · 24587 ms · 2026-05-24T15:07:55.746987+00:00 · methodology

