arxiv: 2606.28065 · v1 · pith:DB64NVPKnew · submitted 2026-06-26 · 💻 cs.LG · cs.AI

OperatorSHAP: Fast and Accurate Shapley Value Estimation for Neural Operators

Pith reviewed 2026-06-29 05:06 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords Shapley values · neural operators · explainable AI · attribution methods · function space · grid-agnostic · Aumann-Shapley values

The pith

OperatorSHAP trains grid-agnostic explainers that deliver Shapley attributions for neural operators and match discrete values across resolutions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces OperatorSHAP to make Shapley value attributions practical for neural operators that process data on irregular grids and geometries. Existing amortized methods like FastSHAP require homogeneous inputs and cannot handle the variable resolutions common in physical modeling. OperatorSHAP supplies both a training procedure and a theoretical link to Aumann-Shapley values in function space. The resulting attributions stay consistent with standard discrete Shapley values when resolution changes and can be applied to new grid sizes without retraining the explainer.

Core claim

OperatorSHAP is a grid-agnostic attribution method and training procedure for neural operators that establishes a theoretical framework for attributions in function space connecting to Aumann-Shapley values; the method produces explanations that remain consistent with state-of-the-art discrete Shapley values across resolutions and transfer across grid sizes without retraining.

What carries the argument

OperatorSHAP, the grid-agnostic training procedure that extends FastSHAP-style amortized explainers from homogeneous inputs to neural operators defined on function spaces.
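As background for what "FastSHAP-style" means here: amortized explainers of this family are usually described as learning to minimize a Shapley-kernel weighted least-squares objective, the same objective that KernelSHAP solves per instance. The sketch below is not the paper's training procedure. It is a small NumPy illustration, on a hypothetical random 4-player game (`table`, `value`, and both function names are assumptions made for this example), showing that solving that weighted least-squares problem with the efficiency constraint recovers the exact Shapley values.

```python
import itertools
import math
import numpy as np

def exact_shapley(value, n):
    """Exact Shapley values by enumerating all coalitions (small n only)."""
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            coeff = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
            for S in itertools.combinations(others, r):
                S = frozenset(S)
                phi[i] += coeff * (value(S | {i}) - value(S))
    return phi

def kernel_least_squares_shapley(value, n):
    """Shapley-kernel weighted least squares with the efficiency constraint."""
    empty, full = frozenset(), frozenset(range(n))
    rows, targets, weights = [], [], []
    for r in range(1, n):  # proper, non-empty coalitions only
        w = (n - 1) / (math.comb(n, r) * r * (n - r))  # Shapley kernel weight
        for S in itertools.combinations(range(n), r):
            z = np.zeros(n)
            z[list(S)] = 1.0
            rows.append(z)
            targets.append(value(frozenset(S)) - value(empty))
            weights.append(w)
    A, b, W = np.array(rows), np.array(targets), np.diag(weights)
    total = value(full) - value(empty)
    # KKT system for: min (A phi - b)^T W (A phi - b)  subject to  sum(phi) = total
    kkt = np.zeros((n + 1, n + 1))
    kkt[:n, :n] = 2 * A.T @ W @ A
    kkt[:n, n] = 1.0
    kkt[n, :n] = 1.0
    rhs = np.concatenate([2 * A.T @ W @ b, [total]])
    return np.linalg.solve(kkt, rhs)[:n]

# Hypothetical toy game: one random payoff per coalition of 4 players.
rng = np.random.default_rng(0)
n = 4
table = {frozenset(S): rng.normal()
         for r in range(n + 1) for S in itertools.combinations(range(n), r)}
value = table.__getitem__

phi_exact = exact_shapley(value, n)
phi_wls = kernel_least_squares_shapley(value, n)
```

An amortized explainer replaces the per-instance solve with a network trained on sampled coalitions. The summary above does not say how OperatorSHAP adapts this objective to variable grids, so that part is left out of the sketch.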

If this is right

  • Attributions remain consistent when the neural operator is queried at different input resolutions.
  • The trained explainer can be reused on new grid sizes or geometries without additional training.
  • Attributions inherit desirable properties from the Aumann-Shapley framework once the function-space connection is in place.
  • Practical explanation of safety-critical predictions becomes feasible for models that ingest data from irregular physical domains.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same training idea might be applied to other attribution techniques that currently assume fixed input dimensions.
  • Function-space interpretability could become a standard requirement when designing new neural operators for physics.
  • Empirical checks on real-world simulation data with highly irregular meshes would test whether the consistency holds outside synthetic benchmarks.

Load-bearing premise

A theoretical framework for attributions in function space can be established that connects to Aumann-Shapley values and supports grid-agnostic training plus consistency across resolutions.
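For orientation, the two classical objects this premise links can be written as follows. These are the textbook definitions under the usual regularity conditions (a continuously differentiable f with f(0) = 0 and a vector μ of non-atomic measures on the player space I); the paper's own function-space construction is not reproduced in this summary and may differ in its details.

```latex
% Discrete Shapley value of player i in a game v on N = {1, ..., n}:
\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}}
    \frac{|S|!\,(n-|S|-1)!}{n!}\,\bigl[v(S \cup \{i\}) - v(S)\bigr]

% Aumann-Shapley diagonal formula for a non-atomic game v = f \circ \mu,
% evaluated on a measurable coalition A \subseteq I:
(\varphi v)(A) = \int_0^1 \nabla f\bigl(t\,\mu(I)\bigr) \cdot \mu(A)\,\mathrm{d}t
```

Both satisfy efficiency: the discrete values sum to v(N) − v(∅), and the non-atomic value of the whole player space equals f(μ(I)).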

What would settle it

Direct comparison showing that OperatorSHAP attributions diverge from discrete Shapley values computed on the same neural operator when the input grid size or resolution is altered.
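A minimal sketch of what such a resolution check could look like, on a toy problem where everything is known in closed form. The functional F(u) = (∫ w·u dx)², the zero baseline for masked cells, and the names `functional`, `exact_cell_shapley`, and `density_error` are all assumptions made for this illustration; none of them come from the paper. For this F, the per-cell Shapley value divided by the cell width approaches the continuum density (∫ w·u) · w(x) · u(x) as the grid is refined.

```python
import itertools
import math
import numpy as np

def functional(u, w, h):
    """Toy nonlinear functional F(u) = (integral of w*u)^2, via the midpoint rule."""
    return (np.sum(w * u) * h) ** 2

def exact_cell_shapley(u, w, h):
    """Exact Shapley values of grid cells; cells outside a coalition use a zero baseline."""
    n = len(u)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            coeff = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
            for S in itertools.combinations(others, r):
                mask = np.zeros(n)
                mask[list(S)] = 1.0
                without_i = functional(u * mask, w, h)
                mask[i] = 1.0
                with_i = functional(u * mask, w, h)
                phi[i] += coeff * (with_i - without_i)
    return phi

density_error = {}
for n in (5, 10):
    h = 1.0 / n
    x = (np.arange(n) + 0.5) * h  # cell midpoints of a uniform grid on [0, 1]
    u, w = np.sin(np.pi * x), 1.0 + x
    discrete_density = exact_cell_shapley(u, w, h) / h  # attribution per unit length
    # Continuum density for this toy F: (integral of w*u) * w(x) * u(x),
    # where the integral of (1 + x) * sin(pi * x) over [0, 1] equals 3 / pi.
    continuum_density = (3.0 / np.pi) * w * u
    density_error[n] = np.max(np.abs(discrete_density - continuum_density))
```

In this toy setting the gap shrinks as the grid is refined from 5 to 10 cells. A falsifying result for the paper's claim would be the opposite pattern on an actual neural operator: explainer outputs and discrete Shapley values that drift apart as the grid changes.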

Figures

Figures reproduced from arXiv: 2606.28065 by Eyke Hüllermeier, Felix Czaja, Joshua Stiller, Santo M. A. R. Thies.

Figure 1: Left: Table with an overview of current methods. Right: Two different samples from the MeshGraphNets cylinder-flow dataset [30]. Top shows a heatmap of airflow velocity strengths, with a grid of sensor points, and the cylinder object obstructing the airflow. The bottom shows a different sample’s attribution map from OperatorSHAP. The explained spatial point is marked with a red star. The varying cylinder p…
Figure 2: Explanations and input data of the 2D heat-equation. The left column shows input …
Figure 3: Comparison of classical Shapley values (blue bars) and Aumann–Shapley density (orange …
Figure 4: Runtime of attribution methods vs. ground truth (for …
Figure 5: Normalized mean squared error (NMSE) of attribution methods vs. ground truth (for …
Figure 6: Amortization visualization: We depict the total number of necessary backbone evaluations …
Original abstract

Understanding model predictions is essential for physical applications, where outputs often inform safety-critical decisions, such as structural load assessment, weather warnings, and clinical diagnosis. Shapley values satisfy many desirable properties as an attribution method, but their computational cost during inference hinders their practical use. Current amortized explainers, such as FastSHAP, are limited to homogeneous inputs, which is problematic for physical applications where data often comes from irregular grids and geometries. We introduce OperatorSHAP, a grid-agnostic attribution method and training procedure that allows us to train FastSHAP-like explainers for neural operators. We establish a theoretical framework for attributions in function space, connecting to Aumann-Shapley values. We further show that OperatorSHAP's explanations are consistent with state-of-the-art discrete Shapley values across resolutions and transfer across grid sizes without retraining.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces OperatorSHAP, a grid-agnostic attribution method and training procedure for estimating Shapley values on neural operators. It establishes a theoretical framework for attributions in function space that connects to Aumann-Shapley values, and claims that the resulting explanations are consistent with state-of-the-art discrete Shapley values across resolutions while transferring across grid sizes without retraining.

Significance. If the theoretical framework is rigorous and the empirical consistency/transfer results hold, the work would be significant for enabling fast, practical attributions in physical modeling applications that use neural operators on irregular grids and geometries, where standard amortized explainers like FastSHAP do not apply.

major comments (2)
  1. [Theoretical framework section] Theoretical framework (connecting to Aumann-Shapley values): this link is load-bearing for the central claims of grid-agnostic training and resolution-consistent explanations. The measure-theoretic or continuity assumptions required for the connection must be stated explicitly and verified to hold for the neural operators and irregular grids used; otherwise the consistency and transfer results risk being empirical observations rather than consequences of the framework.
  2. [Experiments / results section] Consistency claim with discrete Shapley values: the paper must detail how the baseline discrete Shapley values are computed for the comparison (especially at higher resolutions where exact computation is intractable) and report quantitative metrics (e.g., correlation or error bounds) to substantiate the 'consistent across resolutions' statement.
minor comments (1)
  1. [Abstract] Abstract: specify which particular state-of-the-art discrete Shapley method is used for the consistency comparison.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We address each major point below and will incorporate revisions to strengthen the theoretical exposition and empirical validation.

  1. Referee: [Theoretical framework section] Theoretical framework (connecting to Aumann-Shapley values): this link is load-bearing for the central claims of grid-agnostic training and resolution-consistent explanations. The measure-theoretic or continuity assumptions required for the connection must be stated explicitly and verified to hold for the neural operators and irregular grids used; otherwise the consistency and transfer results risk being empirical observations rather than consequences of the framework.

    Authors: We agree that explicit statement of the assumptions is necessary to make the connection rigorous. In the revised manuscript we will add a dedicated paragraph in Section 3 listing the measure-theoretic requirements (continuity of the neural operator in the L² topology, measurability of the value function, and integrability of the marginal contributions) and will verify that these conditions are satisfied by the Fourier Neural Operator and DeepONet architectures on the irregular grids used in our experiments, both analytically for the linear case and empirically via Lipschitz-constant estimates for the nonlinear operators. revision: yes

  2. Referee: [Experiments / results section] Consistency claim with discrete Shapley values: the paper must detail how the baseline discrete Shapley values are computed for the comparison (especially at higher resolutions where exact computation is intractable) and report quantitative metrics (e.g., correlation or error bounds) to substantiate the 'consistent across resolutions' statement.

    Authors: We will expand the experimental section to specify the exact procedure used for the discrete baselines: at low resolutions we employed exact enumeration where feasible; at higher resolutions we used Monte-Carlo sampling with 10,000 coalitions per instance (following the standard sampling estimator for Shapley values) together with the same background distribution as OperatorSHAP. In addition, we will report Pearson correlation coefficients and mean absolute deviation between OperatorSHAP attributions and the sampled discrete values across all tested resolutions, together with 95% confidence intervals obtained from 50 independent runs. revision: yes
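The baseline-and-metrics procedure described in this response can be sketched as follows. This is a hedged illustration rather than the authors' code: it uses the permutation-sampling estimator on a hypothetical toy game v(S) = (Σ_{i∈S} a_i)², chosen because its exact Shapley values are a_i · Σ_j a_j and no enumeration is needed. The sampling budget, the vector `a`, and the function names are assumptions made for this example.

```python
import numpy as np

def permutation_shapley(value, n, num_permutations, rng):
    """Monte Carlo Shapley estimate: average marginal contributions over random orderings."""
    phi = np.zeros(n)
    for _ in range(num_permutations):
        order = rng.permutation(n)
        mask = np.zeros(n, dtype=bool)
        previous = value(mask)
        for i in order:
            mask[i] = True
            current = value(mask)
            phi[i] += current - previous
            previous = current
    return phi / num_permutations

rng = np.random.default_rng(0)
a = rng.uniform(0.5, 1.5, size=8)  # hypothetical per-feature contributions

def value(mask):
    """Toy game v(S) = (sum of a_i for i in S)^2; exact Shapley values are a_i * sum(a)."""
    return float(a[mask].sum()) ** 2

phi_exact = a * a.sum()
phi_mc = permutation_shapley(value, len(a), num_permutations=2000, rng=rng)

# Agreement metrics of the kind named in the response.
pearson_r = np.corrcoef(phi_exact, phi_mc)[0, 1]
mean_abs_dev = np.mean(np.abs(phi_exact - phi_mc))
```

Each sampled ordering telescopes to v(N) − v(∅), so the estimate satisfies efficiency exactly even when individual attributions are still noisy. Repeating the run with different seeds would give the spread needed for the confidence intervals the response mentions.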

Circularity Check

0 steps flagged

No circularity; framework presented as independent extension of Aumann-Shapley values.

The provided abstract and description contain no equations, no fitted parameters renamed as predictions, and no self-citations that bear the central load. The theoretical framework is described as newly established and connected to the external Aumann-Shapley concept; consistency and transfer results are stated as empirical consequences rather than definitions. No derivation step reduces to its own inputs by construction, so the paper's claims remain self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no information on free parameters, axioms, or invented entities used in the method.

Reference graph

Works this paper leans on

41 extracted references · 16 canonical work pages · 3 internal anchors

  1. [1] Robert J. Aumann and Lloyd S. Shapley. Values of non-atomic games. Princeton University Press, December 2015. ISBN 978-1-4008-6708-0. doi: 10.1515/9781400867080. URL https://www.degruyterbrill.com/document/doi/10.1515/9781400867080/html
  2. [2] Lukas Biewald. Experiment tracking with Weights and Biases, 2020. URL https://www.wandb.com/
  3. [3] Landon Butler, Abhineet Agarwal, Justin Singh Kang, Yigit Efe Erginbas, Bin Yu, and Kannan Ramchandran. Proxy-SPEX: Sample-efficient interpretability via sparse feature interactions in LLMs. In The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025
  4. [4] Javier Castro, Daniel Gómez, and Juan Tejada. Polynomial calculation of the Shapley value based on sampling. Computers & Operations Research, 36(5):1726–1730, 2009. doi: 10.1016/j.cor.2008.04.004
  5. [5] Ian Connick Covert, Chanwoo Kim, and Su-In Lee. Learning to estimate Shapley values with vision transformers. In The Eleventh International Conference on Learning Representations (ICLR), 2023. URL https://openreview.net/forum?id=5ktFNz_pJLK
  6. [7] James Enouen and Yan Liu. InstaSHAP: Interpretable additive models explain Shapley values instantly. In The Thirteenth International Conference on Learning Representations (ICLR), 2025. URL https://openreview.net/forum?id=ky7vVlBQBY
  7. [8] Katsushige Fujimoto, Ivan Kojadinovic, and Jean-Luc Marichal. Axiomatic characterizations of probabilistic and cardinal-probabilistic interaction indices. Games and Economic Behavior, 55(1):72–99, 2006. ISSN 0899-8256. doi: 10.1016/j.geb.2005.03.002. URL https://www.sciencedirect.com/science/article/pii/S0899825605000278
  8. [9] Fabian Fumagalli, Maximilian Muschalik, Patrick Kolpaczki, Eyke Hüllermeier, and Barbara Hammer. SHAP-IQ: Unified approximation of any-order Shapley interactions. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), pages 11515–11551, 2023
  9. [10] Fabian Fumagalli, Maximilian Muschalik, Patrick Kolpaczki, Eyke Hüllermeier, and Barbara Hammer. KernelSHAP-IQ: Weighted least square optimization for Shapley interactions. In Proceedings of the International Conference on Machine Learning (ICML), pages 14308–14342, 2024
  10. [12] Fabian Fumagalli, R. Teal Witter, and Christopher Musco. PolySHAP: Extending KernelSHAP with interaction-informed polynomial regression. arXiv preprint arXiv:2601.18608, 2026. doi: 10.48550/ARXIV.2601.18608
  11. [13] Gaurav Gupta, Xiongye Xiao, and Paul Bogdan. Multiwavelet-based operator learning for differential equations. In Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021 (NeurIPS), pages 24048–24062, 2021. URL https://proceedings.neurips.cc/paper/2021/hash/c9e5c2b59d98488fe1070e744041ea0e-Abst...
  12. [14] Neil Jethani, Mukund Sudarshan, Ian Covert, Su-In Lee, and Rajesh Ranganath. FastSHAP: Real-time Shapley value estimation. In International Conference on Learning Representations (ICLR), March 2022. doi: 10.48550/arXiv.2107.07436. URL http://arxiv.org/abs/2107.07436. arXiv:2107.07436 [stat]
  13. [15] Patrick Kolpaczki, Viktor Bengs, Maximilian Muschalik, and Eyke Hüllermeier. Approximating the Shapley value without marginal contributions. In Thirty-Eighth AAAI Conference on Artificial Intelligence, AAAI 2024, Thirty-Sixth Conference on Innovative Applications of Artificial Intelligence, IAAI 2024, Fourteenth Symposium on Educational Advances in Artific...
  14. [16] doi: 10.1609/AAAI.V38I12.29225. URL https://doi.org/10.1609/aaai.v38i12.29225
  15. [17] Patrick Kolpaczki, Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, and Eyke Hüllermeier. SVARM-IQ: Efficient approximation of any-order Shapley interactions through stratification. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 3520–3528, 2024
  16. [18] Jean Kossaifi, Nikola Kovachki, Zongyi Li, David Pitt, Miguel Liu-Schiaffini, Robert Joseph George, Boris Bonev, Kamyar Azizzadenesheli, Julius Berner, Valentin Duruisseaux, and Anima Anandkumar. A library for learning Neural Operators. arXiv preprint arXiv:2412.10354, 2025
  17. [19] Nikola Kovachki, Zongyi Li, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Neural Operator: Learning maps between function spaces with applications to PDEs. Journal of Machine Learning Research, 24(89):1–97, 2023. URL http://jmlr.org/papers/v24/21-1524.html
  18. [20] Zongyi Li, Nikola Borislavov Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew M. Stuart, and Anima Anandkumar. Fourier Neural Operator for parametric partial differential equations. In 9th International Conference on Learning Representations (ICLR), 2021. URL https://openreview.net/forum?id=c8P9NQVtmnO
  19. [21] Zongyi Li, Nikola B. Kovachki, Christopher B. Choy, Boyi Li, Jean Kossaifi, Shourya Prakash Otta, Mohammad Amin Nabian, Maximilian Stadler, Christian Hundt, Kamyar Azizzadenesheli, and Animashree Anandkumar. Geometry-informed Neural Operator for large-scale 3D PDEs. In Advances in Neural Information Processing Systems 36: Annual Conference on Neural Info...
  20. [22] Shuyang Liu, Zixuan Chen, Ge Shi, Ji Wang, Changjie Fan, Yu Xiong, Runze Wu, Yujing Hu, Ze Ji, and Yang Gao. A new baseline assumption of Integated Gradients based on Shaply value, May 2024. URL http://arxiv.org/abs/2310.04821. arXiv:2310.04821 [cs]
  21. [23] Miguel Liu-Schiaffini, Julius Berner, Boris Bonev, Thorsten Kurth, Kamyar Azizzadenesheli, and Anima Anandkumar. Neural Operators with localized integral and differential kernels. In Forty-First International Conference on Machine Learning (ICML), pages 32576–32594, 2024. URL https://proceedings.mlr.press/v235/liu-schiaffini24a.html
  22. [24] Scott M Lundberg and Su-In Lee. A unified approach to interpreting model predictions. In I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems (NeurIPS), volume 30. Curran Associates, Inc., 2017. URL https://proceedings.neurips.cc/paper_files/paper/2017/fil...
  23. [25] Scott M. Lundberg, Gabriel G. Erion, Hugh Chen, Alex J. DeGrave, Jordan M. Prutkin, Bala Nair, Ronit Katz, Jonathan Himmelfarb, Nisha Bansal, and Su-In Lee. From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2(1):56–67, 2020. doi: 10.1038/s42256-019-0138-9
  24. [26] Majid Mohammadi, Siu Lun Chau, and Krikamol Muandet. Computing exact Shapley values in polynomial time for product-kernel methods. arXiv preprint arXiv:2505.16516, 2025
  25. [27] Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, and Eyke Hüllermeier. Beyond TreeSHAP: Efficient computation of any-order Shapley interactions for tree ensembles. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 14388–14396, 2024. doi: 10.1609/aaai.v38i13.29352
  26. [28] Maximilian Muschalik, Fabian Fumagalli, Paolo Frazzetto, Janine Strotherm, Luca Hermes, Alessandro Sperduti, Eyke Hüllermeier, and Barbara Hammer. Exact computation of any-order Shapley interactions for graph neural networks. In Proceedings of the International Conference on Learning Representations (ICLR), 2025
  27. [29] Christopher Musco and R. Teal Witter. Provably accurate Shapley value estimation via leverage score sampling. In Y. Yue, A. Garg, N. Peng, F. Sha, and R. Yu, editors, International Conference on Learning Representations (ICLR), volume 2025, pages 64532–64559, 2025. URL https://proceedings.iclr.cc/paper_files/paper/2025/file/a224ff18cc99a71751aa2b791186...
  28. [30] Abraham Neyman. Chapter 56 values of games with infinitely many players. In Handbook of Game Theory with Economic Applications, volume 3, pages 2121–2167. Elsevier, 2002. ISBN 978-0-444-89428-1. doi: 10.1016/S1574-0005(02)03019-9. URL https://linkinghub.elsevier.com/retrieve/pii/S1574000502030199
  29. [31] Tobias Pfaff, Meire Fortunato, Alvaro Sanchez-Gonzalez, and Peter W. Battaglia. Learning mesh-based simulation with graph networks. In 9th International Conference on Learning Representations (ICLR),
  30. [32] URL https://openreview.net/forum?id=roNqYL0_XP
  31. [33] Thomas Runst and Winfried Sickel. Sobolev spaces of fractional order, Nemytskij operators, and nonlinear partial differential equations. Number 3 in De Gruyter series in nonlinear analysis and applications. Walter de Gruyter, Berlin; New York, 1996. ISBN 978-3-11-015113-8
  32. [34] Lloyd S. Shapley. A value for n-person games. In Alvin E. Roth, editor, The Shapley Value, pages 31–40. Cambridge University Press, 1 edition, October 1988. ISBN 978-0-521-36177-4 978-0-521-02133-3 978-0-511-52844-6. doi: 10.1017/CBO9780511528446.003. URL https://www.cambridge.org/core/product/identifier/CBO9780511528446A008/type/book_part
  33. [35] Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for deep networks. In Doina Precup and Yee Whye Teh, editors, Proceedings of the 34th International Conference on Machine Learning (ICML), volume 70 of Proceedings of Machine Learning Research, pages 3319–3328. PMLR, August
  34. [36] URL https://proceedings.mlr.press/v70/sundararajan17a.html
  35. [37] Che-Ping Tsai, Chih-Kuan Yeh, and Pradeep Ravikumar. Faith-Shap: The faithful Shapley interaction index. Journal of Machine Learning Research, 24:94:1–94:42, 2023. URL https://jmlr.org/papers/v24/22-0202.html
  36. [38] Jiachen T. Wang and Ruoxi Jia. Data Banzhaf: A robust data valuation framework for machine learning. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 6388–6421, 2023
  37. [39] Jiachen T. Wang, Prateek Mittal, and Ruoxi Jia. Efficient data Shapley for weighted nearest neighbor algorithms. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 2557–2565, 2024
  38. [40] R. Teal Witter, Yurong Liu, and Christopher Musco. Regression-adjusted Monte Carlo estimators for Shapley values and probabilistic values, January 2026. URL http://arxiv.org/abs/2506.11849. arXiv:2506.11849 [cs]
  39. [41] Omry Yadan. Hydra - A framework for elegantly configuring complex applications, 2019. URL https://github.com/facebookresearch/hydra
  40. [42] Peng Yu, Chao Xu, Albert Bifet, and Jesse Read. Linear TreeShap. arXiv preprint arXiv:2209.08192, 2022. doi: 10.48550/ARXIV.2209.08192. URL https://doi.org/10.48550/arXiv.2209.08192
  41. [43] Artjom Zern, Klaus Broelemann, and Gjergji Kasneci. Interventional SHAP values and interaction values for piecewise linear regression trees. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 11164–11173, 2023