Explainable AI to Improve Machine Learning Reliability for Industrial Cyber-Physical Systems
Pith reviewed 2026-05-16 11:50 UTC · model grok-4.3
The pith
SHAP analysis of time-series decomposition reveals insufficient context, so increasing input window size improves ML performance for industrial CPS.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By applying SHAP values to the effects of time-series decomposition components on model predictions, the authors find evidence that the model lacked sufficient contextual information during training. Guided by these XAI findings, they increase the window size of the data instances and report improved model performance for this use-case.
What carries the argument
SHAP values computed on components from time-series data decomposition, used to diagnose insufficient contextual information and guide enlargement of the input window.
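A minimal sketch of the idea, not the authors' pipeline: it assumes an additive trend/seasonal/residual decomposition, treats each component as one "player", and computes exact Shapley values by enumerating coalitions instead of calling a SHAP library. The functions `decompose` and `shapley_over_components`, the zero baseline, and `toy_model` are all hypothetical stand-ins chosen for illustration.

```python
from itertools import combinations
from math import factorial, pi, sin


def decompose(series, period):
    """Simple additive decomposition: series = trend + seasonal + residual."""
    n = len(series)
    half = period // 2
    # Trend: centered moving average, with the window clipped at the edges.
    trend = []
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        trend.append(sum(series[lo:hi]) / (hi - lo))
    detrended = [y - tr for y, tr in zip(series, trend)]
    # Seasonal: mean detrended value at each phase of the period.
    phase_mean = []
    for p in range(period):
        vals = detrended[p::period]
        phase_mean.append(sum(vals) / len(vals))
    seasonal = [phase_mean[t % period] for t in range(n)]
    residual = [d - s for d, s in zip(detrended, seasonal)]
    return {"trend": trend, "seasonal": seasonal, "residual": residual}


def shapley_over_components(model, components, baseline):
    """Exact Shapley values with one player per decomposition component.

    A component outside the coalition is replaced by its baseline version.
    """
    names = list(components)
    k = len(names)
    n = len(components[names[0]])

    def value(coalition):
        window = [0.0] * n
        for name in names:
            src = components[name] if name in coalition else baseline[name]
            window = [w + x for w, x in zip(window, src)]
        return model(window)

    phi = {}
    for name in names:
        others = [m for m in names if m != name]
        total = 0.0
        for size in range(k):
            weight = factorial(size) * factorial(k - size - 1) / factorial(k)
            for subset in combinations(others, size):
                with_player = value(set(subset) | {name})
                without_player = value(set(subset))
                total += weight * (with_player - without_player)
        phi[name] = total
    return phi


def toy_model(window):
    """Hypothetical stand-in for a trained model: mean of the last 6 values."""
    return sum(window[-6:]) / 6


# Synthetic window (assumption): linear trend plus a period-12 oscillation.
series = [0.05 * t + sin(2 * pi * t / 12) for t in range(48)]
components = decompose(series, period=12)
baseline = {name: [0.0] * len(series) for name in components}
phi = shapley_over_components(toy_model, components, baseline)
```

With three components the enumeration covers only eight coalitions, so exact computation is cheap. In this setup, a component whose attribution stays near zero across many windows would be the kind of pattern the review describes as a hint that the window is too short to expose that component.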
If this is right
- Higher reliability for ML components in safety-critical industrial infrastructure.
- Improved generalization of predictions to data arriving after the training period.
- A repeatable workflow in which XAI findings directly dictate changes to input representation.
- Fewer instances of unexpected model behavior on new operating conditions in CPS environments.
Where Pith is reading between the lines
- The same XAI-guided window adjustment might apply to other sensor-driven time-series tasks where context spans longer periods than initially assumed.
- Larger windows could raise real-time inference latency or memory use, requiring trade-off analysis in deployed CPS.
- Re-running the SHAP analysis after the change would confirm whether the original diagnosis was complete or if new patterns emerge.
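The memory side of the trade-off noted above can be estimated before any retraining. The sketch below is a back-of-the-envelope calculation under stated assumptions: windows are fully materialized, values are 4-byte floats, and the helper name `window_cost` and the sample counts are hypothetical, not taken from the paper.

```python
def window_cost(n_samples, n_channels, window, stride=1, bytes_per_value=4):
    """Estimate dataset size when sliding windows are fully materialized."""
    # Number of complete windows that fit into the recording.
    n_windows = max(0, (n_samples - window) // stride + 1)
    # Values the model must read for a single prediction.
    values_per_window = window * n_channels
    return {
        "n_windows": n_windows,
        "values_per_window": values_per_window,
        "materialized_bytes": n_windows * values_per_window * bytes_per_value,
    }


# Hypothetical recording: 1000 samples from 3 sensor channels.
small = window_cost(n_samples=1000, n_channels=3, window=100)
large = window_cost(n_samples=1000, n_channels=3, window=200)
growth = large["materialized_bytes"] / small["materialized_bytes"]
```

Doubling the window doubles the per-prediction input, while the number of available training windows shrinks. Inference latency depends on the model architecture and hardware, so it has to be measured on the deployed system rather than derived from this estimate.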
Load-bearing premise
The SHAP patterns correctly identify insufficient contextual information as the root cause, and simply enlarging the input window will improve generalization on future data without introducing overfitting or latency problems.
What would settle it
Retraining the model on the same CPS data but with the larger window size and measuring accuracy on a truly future held-out test set that was never seen during the original XAI analysis.
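One way to set up such a test is a strictly chronological split, sketched below. The helpers `chronological_split` and `make_windows` are hypothetical, and the next-value forecasting target is an assumption made for illustration; the paper's actual task and data layout are not described here.

```python
def chronological_split(series, n_val, n_test):
    """Split by time order: oldest samples train, newest samples test."""
    train_end = len(series) - n_val - n_test
    val_end = len(series) - n_test
    return series[:train_end], series[train_end:val_end], series[val_end:]


def make_windows(series, window):
    """Build (input window, next value) pairs from one contiguous segment."""
    return [
        (series[i:i + window], series[i + window])
        for i in range(len(series) - window)
    ]


# Integer timestamps stand in for sensor readings so ordering is easy to see.
series = list(range(100))
train, val, test = chronological_split(series, n_val=20, n_test=20)

# Windows are built inside each segment, so none straddles a split boundary.
test_windows = make_windows(test, window=5)
```

Keeping the test segment untouched until after the window size has been chosen is what makes it a fair check. If the SHAP analysis or the window selection ever looked at those samples, a new future segment would be needed.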
Original abstract
Industrial Cyber-Physical Systems (CPS) are sensitive infrastructure from both safety and economics perspectives, making their reliability critically important. Machine Learning (ML), specifically deep learning, is increasingly integrated in industrial CPS, but the inherent complexity of ML models results in non-transparent operation. Rigorous evaluation is needed to prevent models from exhibiting unexpected behaviour on future, unseen data. Explainable AI (XAI) can be used to uncover model reasoning, allowing a more extensive analysis of behaviour. We apply XAI to improve predictive performance of ML models intended for an industrial CPS use-case. We analyse the effects of components from time-series data decomposition on model predictions using SHAP values. Through this method, we observe evidence on the lack of sufficient contextual information during model training. By increasing the window size of data instances, informed by the XAI findings for this use-case, we are able to improve model performance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that applying SHAP-based XAI analysis to time-series decomposition components in ML models for industrial CPS reveals insufficient contextual information in the training data. The authors then increase the input window size based on these XAI findings and report improved model performance on the use-case.
Significance. If supported by rigorous quantitative evidence, the approach of using XAI to diagnose and correct input-window deficiencies could aid reliability in safety-critical CPS applications. The idea of interpreting SHAP attributions on decomposed components to guide hyperparameter choices is potentially useful, but the manuscript currently provides no metrics, baselines, or validation to substantiate the performance gain.
major comments (3)
- [Abstract] Abstract: the claim that 'increasing the window size of data instances, informed by the XAI findings... we are able to improve model performance' supplies no quantitative metrics, error bars, baseline comparisons, model architecture details, or dataset description, leaving the central empirical result without verifiable support.
- [Results] Results section: no ablation is reported that compares the XAI-selected window size against other enlargements, nor are statistical significance tests, overfitting checks, or latency measurements on future unseen data provided.
- [Methodology] Methodology: the interpretation of SHAP patterns on decomposition components as direct causal evidence of 'lack of sufficient contextual information' is not accompanied by a test distinguishing correlation from causation in the presence of temporal autocorrelations; enlarging the window may introduce noise rather than improve generalization.
minor comments (1)
- Add explicit definitions for all time-series decomposition components and SHAP computation parameters to allow reproducibility.
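As an illustration of what such definitions could look like, the block below writes out an additive decomposition and the Shapley value over its components. Both are assumptions: the paper may use a different decomposition (its reference list includes short-time Fourier, wavelet, and empirical mode methods), and the symbols $T_t$, $S_t$, $R_t$, $N$, $C$, and $v$ are introduced here, not taken from the manuscript.

```latex
% Additive decomposition of the observed series y_t (assumed form)
y_t = T_t + S_t + R_t

% Shapley value of component i in the player set N = \{T, S, R\},
% where v(C) is the model output when only the components in C are present
\phi_i = \sum_{C \subseteq N \setminus \{i\}}
         \frac{|C|!\,\bigl(|N| - |C| - 1\bigr)!}{|N|!}
         \Bigl[ v\bigl(C \cup \{i\}\bigr) - v(C) \Bigr]

% Efficiency: attributions sum to the gap between full and empty coalitions
\sum_{i \in N} \phi_i = v(N) - v(\emptyset)
```

A reproducible description would also state how absent components are represented when evaluating $v(C)$ (for example a zero or mean baseline) and how many background samples are used.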
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. The comments highlight important gaps in empirical validation and methodological clarity that we will address through targeted revisions. Below we respond point-by-point to the major comments.
- Referee: [Abstract] Abstract: the claim that 'increasing the window size of data instances, informed by the XAI findings... we are able to improve model performance' supplies no quantitative metrics, error bars, baseline comparisons, model architecture details, or dataset description, leaving the central empirical result without verifiable support.
  Authors: We agree that the abstract currently lacks the quantitative details needed to substantiate the performance claim. In the revised manuscript we will expand the abstract to report specific metrics (e.g., F1-score improvement with standard deviation across runs), a brief baseline comparison, model architecture summary, and dataset characteristics, ensuring the central empirical result is verifiable from the abstract alone.
  Revision: yes
- Referee: [Results] Results section: no ablation is reported that compares the XAI-selected window size against other enlargements, nor are statistical significance tests, overfitting checks, or latency measurements on future unseen data provided.
  Authors: We accept that additional quantitative controls are required. The revised Results section will include an ablation comparing the XAI-selected window size against multiple alternative enlargements, paired statistical significance tests (e.g., t-tests with p-values), learning-curve analysis to assess overfitting, and inference latency measurements on held-out future data to confirm generalization.
  Revision: yes
- Referee: [Methodology] Methodology: the interpretation of SHAP patterns on decomposition components as direct causal evidence of 'lack of sufficient contextual information' is not accompanied by a test distinguishing correlation from causation in the presence of temporal autocorrelations; enlarging the window may introduce noise rather than improve generalization.
  Authors: The SHAP attributions on decomposed components provide consistent correlational patterns indicating that certain components receive low attribution under the original window. While we do not claim strict causation, the decomposition itself isolates trend, seasonal, and residual effects, reducing the impact of raw autocorrelation. We will revise the Methodology section to explicitly distinguish correlation from causation, add a limitations paragraph on this point, and include a sensitivity check that monitors validation performance across window sizes to detect potential noise introduction.
  Revision: partial
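The paired comparison promised in the rebuttal could take roughly the following shape. This is a sketch only: `paired_t_statistic` is a hypothetical helper, and the score lists are made-up placeholders that illustrate the input format, not results from the paper. It assumes both window sizes are trained with the same seeds and evaluated on the same held-out data, so the runs can be paired.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic for per-run score differences (b minus a).

    Returns the statistic and its degrees of freedom. Raises
    ZeroDivisionError when all differences are identical.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("runs must be paired one-to-one")
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# PLACEHOLDER scores (not from the paper): one entry per random seed.
original_window = [0.80, 0.82, 0.79, 0.81, 0.80]
enlarged_window = [0.84, 0.85, 0.83, 0.86, 0.84]

t_stat, dof = paired_t_statistic(original_window, enlarged_window)
```

The statistic is then compared against a Student t distribution with `dof` degrees of freedom to obtain a p-value. Repeating the comparison for each candidate window size gives the ablation the referee asks for, though testing many sizes calls for a multiple-comparison correction.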
Circularity Check
No significant circularity; empirical outcome stands on reported experiment
The paper reports an empirical sequence: SHAP analysis on time-series decomposition components is used to observe evidence of insufficient contextual information, after which the input window size is enlarged and a performance improvement is measured. No equations define the improvement in terms of the SHAP values themselves, no fitted parameter is renamed as a prediction, and no self-citation chain is invoked to justify the result by construction. The central claim therefore remains an independent experimental finding rather than a tautology or statistical artifact forced by the paper's own inputs.