
arxiv: 2507.21023 · v2 · pith:6LTEI5XS · submitted 2025-07-28 · 💻 cs.LG · eess.SP

On Using the Shapley Value for Anomaly Localization: A Statistical Investigation

Pith reviewed 2026-05-19 02:01 UTC · model grok-4.3

classification 💻 cs.LG eess.SP
keywords Shapley value · anomaly localization · sensor data · statistical test · independent observations · computational complexity

The pith

A single fixed term from the Shapley value localizes anomalies as accurately as the full calculation, at lower complexity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether the complete Shapley value is required to localize anomalies in sensor data systems. Using a controlled mathematical model of anomalies, the experiments find that one fixed term drawn from the Shapley formula produces an equivalent probability of error while cutting computational cost. A formal proof shows the result must hold for every set of independent observations. Readers would care because many sensor networks rely on quick, reliable anomaly detection, and this reduction removes a major practical barrier.

Core claim

The central claim is that a test based on a single fixed term in the Shapley value calculation achieves the same probability of error as a test using the entire Shapley value for anomaly localization, while requiring lower complexity. This holds in all cases tested experimentally, and a proof establishes it for every independent observation case. For dependent observations, the equivalence lacks a proof but was observed in tests.

What carries the argument

The single fixed term extracted from the Shapley value, which replaces the full sum over all coalitions when deciding which sensor produced the anomaly.
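
For reference, the full sum that the single term replaces is the standard Shapley value (Shapley, 1953) of player i in an n-player game. The symbols below are the usual textbook ones and are assumptions here: N is the set of sensors treated as players, v is a characteristic function that scores a coalition of sensors, and S ranges over coalitions that exclude sensor i. This page does not state which coalition the paper's fixed term uses, so none is singled out.

```latex
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(n - |S| - 1\bigr)!}{n!}\,
  \bigl[\, v(S \cup \{i\}) - v(S) \,\bigr],
\qquad n = |N| .
```

The sum has 2^(n-1) terms per sensor. A single-term test keeps one bracketed marginal contribution, for one fixed S, and discards the rest.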

If this is right

  • Anomaly localization tests become feasible on resource-limited sensor nodes without loss of detection reliability.
  • The same performance guarantee applies to any number of independent sensors as shown by the proof.
  • System designers can replace the full Shapley summation with the fixed term in code for independent data streams.
  • The savings grow with the number of sensors: the full Shapley value for one sensor sums over all 2^(n-1) coalitions of the other n-1 sensors, while the single-term test evaluates only one.
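
The substitution described in the list above can be sketched as follows. This is a minimal illustration, not the paper's method: the additive value function `v`, the per-sensor `scores`, and the choice of the empty coalition as the fixed term are all assumptions made here, since this page does not specify the paper's anomaly model or which term it fixes. For an additive `v`, every marginal contribution of a sensor is identical, so the single term and the full Shapley value coincide by construction.

```python
from itertools import combinations
from math import factorial


def shapley_value(i, n, v):
    """Exact Shapley value of player i in an n-player game with value function v."""
    others = [j for j in range(n) if j != i]
    total = 0.0
    for size in range(n):
        # Standard Shapley weight for coalitions of this size.
        weight = factorial(size) * factorial(n - size - 1) / factorial(n)
        for coalition in combinations(others, size):
            s = frozenset(coalition)
            total += weight * (v(s | {i}) - v(s))
    return total


def single_term(i, v, coalition=frozenset()):
    """One marginal contribution for a fixed coalition (assumed here: the empty set)."""
    return v(coalition | {i}) - v(coalition)


# Hypothetical per-sensor anomaly scores; sensor 1 is the obvious outlier.
scores = [0.3, 2.5, 0.1, 0.7]
n = len(scores)


def v(s):
    # Assumed additive value function: a coalition's value is the sum of its scores.
    return sum(scores[j] for j in s)


full = [shapley_value(i, n, v) for i in range(n)]
single = [single_term(i, v) for i in range(n)]

# Each test declares the sensor with the largest attribution to be anomalous.
full_pick = max(range(n), key=full.__getitem__)
single_pick = max(range(n), key=single.__getitem__)
print(full_pick, single_pick)
```

Both tests pick sensor 1 in this toy case. The full computation evaluates `v` on every coalition of the other sensors, while `single_term` needs two evaluations per sensor.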

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If real sensor streams often behave as independent, the simplified test could be deployed in industrial monitoring without further changes.
  • Dependent observations such as spatially correlated measurements may still require the full calculation or a different fixed term.
  • The result invites direct comparison against other anomaly scoring methods on the same controlled model to measure relative gains.

Load-bearing premise

The mathematical anomaly model used for experiments and the proof accurately captures the statistical behavior of real sensor anomalies.

What would settle it

An independent observation case where the single-term test produces a measurably higher error rate than the full Shapley value test would disprove the claimed equivalence.
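
A comparison of this kind could be set up as a small Monte Carlo harness. The sketch below is an assumption-laden toy, not the paper's experiment: the Gaussian mean-shift anomaly, the energy-based value function `v`, the empty coalition as the fixed term, and all parameter names are hypothetical. Swapping in a different `v` or a dependent observation model is where a counterexample would have to come from.

```python
import random
from itertools import combinations
from math import factorial


def shapley_values(n, v):
    """Exact Shapley value of every player, summing over all coalitions."""
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                total += weight * (v(s | {i}) - v(s))
        values.append(total)
    return values


def single_terms(n, v, coalition=frozenset()):
    """One marginal contribution per player for a fixed coalition (assumed: empty)."""
    return [v(coalition | {i}) - v(coalition) for i in range(n)]


def argmax(xs):
    return max(range(len(xs)), key=xs.__getitem__)


def error_rates(n=4, shift=2.0, trials=500, seed=0):
    """Estimate localization error rates of the full and single-term tests."""
    rng = random.Random(seed)
    full_errors = single_errors = 0
    for _ in range(trials):
        anomalous = rng.randrange(n)
        # Independent unit-variance Gaussian readings; only one sensor is shifted.
        x = [rng.gauss(shift if j == anomalous else 0.0, 1.0) for j in range(n)]

        def v(s, x=x):
            # Hypothetical additive score: total energy of the coalition's readings.
            return sum(x[j] ** 2 for j in s)

        full_errors += argmax(shapley_values(n, v)) != anomalous
        single_errors += argmax(single_terms(n, v)) != anomalous
    return full_errors / trials, single_errors / trials


full_rate, single_rate = error_rates()
print(f"full Shapley error rate: {full_rate:.3f}")
print(f"single-term error rate:  {single_rate:.3f}")
```

With this additive `v` the two rates should agree, because both tests rank sensors by the same quantity. A measurable gap under an independent observation model would be the kind of evidence that contradicts the claimed equivalence.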

Original abstract

Recent publications have suggested using the Shapley value for anomaly localization for sensor data systems. Using a reasonable mathematical anomaly model for full control, experiments indicate that using a single fixed term in the Shapley value calculation achieves a lower complexity anomaly localization test, with the same probability of error, as a test using the Shapley value for all cases tested. A proof demonstrates these conclusions must be true for all independent observation cases. For dependent observation cases, no proof is available.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper investigates the application of the Shapley value to anomaly localization in sensor data systems. Using a mathematical anomaly model for controlled experiments, it finds that a single fixed term in the Shapley value calculation provides an anomaly localization test with lower complexity and the same probability of error as using the full Shapley value. A proof establishes this for all independent observation cases, while no proof is available for dependent cases.

Significance. Should the equivalence hold as claimed, this work could meaningfully advance practical anomaly localization by offering a computationally simpler alternative without loss in error probability for independent data. The mathematical proof for the independent case is a notable strength, providing theoretical backing beyond the experimental results. This may encourage further exploration of simplified Shapley-based methods in machine learning applications for anomaly detection.

minor comments (2)
  1. The abstract references 'recent publications' without providing specific citations; including them would help contextualize the contribution within the literature.
  2. The abstract describes the anomaly model as 'reasonable' and mentions 'all cases tested' but provides no details on the model definition, the fixed term, or the tested cases; the full manuscript should supply these for verification and replication.

Simulated Author's Rebuttal

1 response · 1 unresolved

We thank the referee for their review of our manuscript. We are pleased that the report recognizes the potential practical value of our results for anomaly localization and identifies the mathematical proof for independent observations as a strength. We respond below to the summary provided in the report.

Point-by-point responses
  1. Referee: The paper investigates the application of the Shapley value to anomaly localization in sensor data systems. Using a mathematical anomaly model for controlled experiments, it finds that a single fixed term in the Shapley value calculation provides an anomaly localization test with lower complexity and the same probability of error as using the full Shapley value. A proof establishes this for all independent observation cases, while no proof is available for dependent cases.

    Authors: This is an accurate summary of the manuscript. The equivalence result for error probability is established both empirically across tested cases and via a formal proof that holds for all independent observation scenarios. The abstract and main text already note explicitly that no general proof is available for the dependent case, where we rely on experimental validation instead.

    Revision: no

standing simulated objections not resolved
  • We currently have no general proof for the dependent observation cases and are unable to provide one.

Circularity Check

0 steps flagged

No significant circularity identified

Full rationale

The abstract describes experiments under a controlled mathematical anomaly model and a proof establishing equivalence for all independent observation cases. No equations, self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citations are present in the available text. The central result is presented as derived via explicit proof rather than by construction from inputs or prior self-referential claims.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work relies on a domain assumption that the chosen mathematical anomaly model is reasonable for controlled experiments. No free parameters or new invented entities are described in the abstract.

axioms (1)
  • domain assumption: A reasonable mathematical anomaly model exists that allows full experimental control.
    Explicitly invoked in the abstract as the basis for the experiments and conclusions.

pith-pipeline@v0.9.0 · 5574 in / 1254 out tokens · 25407 ms · 2026-05-19T02:01:06.564962+00:00 · methodology

