Proxy-Based Approximation of Shapley and Banzhaf Interactions

Eyke Hüllermeier; Fabian Fumagalli; Hubert Baniecki; Maximilian Muschalik; R. Teal Witter; Santo M. A. R. Thies

arxiv: 2605.22738 · v2 · pith:YDSIFA3Mnew · submitted 2026-05-21 · 💻 cs.LG · cs.AI · stat.ML

Santo M. A. R. Thies, Hubert Baniecki, R. Teal Witter, Eyke Hüllermeier, Maximilian Muschalik, Fabian Fumagalli

Pith reviewed 2026-05-25 06:23 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · stat.ML

keywords Shapley interactions · Banzhaf interactions · proxy models · residual correction · TreeSHAP · interaction approximation · explainable AI

The pith

ProxySHAP combines tree-based proxy models with residual correction to approximate Shapley and Banzhaf interactions more accurately than prior estimators.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces ProxySHAP to estimate higher-order Shapley and Banzhaf interactions, a task in which current estimators trade speed against accuracy. It pairs the sample efficiency of tree-based proxy models with residual correction to reach consistency. A polynomial-time generalization of interventional TreeSHAP is derived to obtain exact interaction indices for tree ensembles without exponential tree-depth costs. The Maximum Sample Reuse strategy is analyzed to identify when it corrects proxy bias without its variance growing exponentially with interaction size. Benchmarking shows ProxySHAP records the lowest error across small- and large-budget regimes and outperforms ProxySPEX and KernelSHAP-IQ on downstream tasks.

Core claim

ProxySHAP reconciles the high sample efficiency of tree-based proxy models with a principled path to consistency via residual correction. It derives a polynomial-time generalization of interventional TreeSHAP to compute exact interaction indices for tree ensembles, bypassing exponential tree-depth dependencies, and formally characterizes conditions under which Maximum Sample Reuse corrects proxy bias without exponential variance scaling.
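The two-phase idea in the core claim can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration rather than the paper's implementation: the proxy is a single depth-1 regression stump instead of a tree ensemble, the target is the first-order Banzhaf value instead of higher-order interactions, and the residual estimator is the uniform-sampling form of sample reuse (mean value of sampled coalitions containing a player minus mean value of those without it). The function names `fit_stump`, `msr_banzhaf`, and `proxy_adjusted_banzhaf` are hypothetical.

```python
import itertools
import random


def all_coalitions(n):
    """Enumerate every coalition of players 0..n-1 as frozensets."""
    return [frozenset(c) for r in range(n + 1)
            for c in itertools.combinations(range(n), r)]


def fit_stump(samples, n):
    """Fit a depth-1 regression tree on (coalition, value) pairs.

    Splits on the player whose presence gives the lowest squared error.
    Returns (player, value_if_absent, value_if_present).
    """
    best = None
    for j in range(n):
        present = [v for T, v in samples if j in T]
        absent = [v for T, v in samples if j not in T]
        if not present or not absent:
            continue  # player j cannot split these samples
        b = sum(present) / len(present)
        a = sum(absent) / len(absent)
        sse = (sum((v - b) ** 2 for v in present)
               + sum((v - a) ** 2 for v in absent))
        if best is None or sse < best[0]:
            best = (sse, j, a, b)
    if best is None:  # degenerate sample: fall back to a constant proxy
        mean = sum(v for _, v in samples) / len(samples)
        return None, mean, mean
    return best[1], best[2], best[3]


def msr_banzhaf(samples, n):
    """Sample-reuse Banzhaf estimate: every sample updates every player."""
    estimates = []
    for i in range(n):
        with_i = [v for T, v in samples if i in T]
        without_i = [v for T, v in samples if i not in T]
        if not with_i or not without_i:
            estimates.append(0.0)  # no information for this player
        else:
            estimates.append(sum(with_i) / len(with_i)
                             - sum(without_i) / len(without_i))
    return estimates


def proxy_adjusted_banzhaf(game, coalitions, n):
    """Phase 1: fit the proxy. Phase 2: exact proxy values plus residual estimate."""
    samples = [(T, game(T)) for T in coalitions]
    j, a, b = fit_stump(samples, n)
    # Exact Banzhaf values of a stump: only the split player is nonzero (b - a).
    proxy_values = [0.0] * n
    if j is not None:
        proxy_values[j] = b - a
    # Residual game nu - nu_hat, evaluated on the same sampled coalitions.
    residuals = [(T, v - (b if j in T else a)) for T, v in samples]
    correction = msr_banzhaf(residuals, n)
    return [p + c for p, c in zip(proxy_values, correction)]


if __name__ == "__main__":
    n = 6
    game = lambda T: 3.0 * (0 in T) + 1.0 * (1 in T and 2 in T)
    rng = random.Random(0)
    sampled = [frozenset(i for i in range(n) if rng.random() < 0.5)
               for _ in range(256)]
    print(proxy_adjusted_banzhaf(game, sampled, n))
```

Because the residual estimator is linear in the game, the adjusted estimate equals the proxy's exact values plus a sampling estimate of whatever the proxy missed. When every coalition is enumerated, this sketch recovers the exact Banzhaf values regardless of how poor the stump is.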

What carries the argument

ProxySHAP, which pairs tree-based proxy models with residual adjustment via Maximum Sample Reuse (MSR) to correct bias while preserving efficiency.

Load-bearing premise

The residual adjustment strategy corrects proxy bias without its variance scaling exponentially with interaction size under the conditions analyzed in the paper.
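To see why interaction size matters for this premise, here is a minimal sketch of a sample-reuse estimate for a pairwise Banzhaf interaction. It assumes uniformly sampled coalitions and the grouped-means form of sample reuse; the paper's MSR estimator and its weighting may differ, and the helper names are hypothetical. Each sample falls into one of four groups for a pair (both players present, only the first, only the second, neither), so an order-k interaction splits the same budget into 2^k groups. That is only an intuition for noisier higher-order estimates, not the paper's variance analysis.

```python
import itertools


def enumerate_samples(game, n):
    """All (coalition, value) pairs for players 0..n-1 (feasible only for small n)."""
    return [(frozenset(c), game(frozenset(c)))
            for r in range(n + 1)
            for c in itertools.combinations(range(n), r)]


def msr_pairwise_banzhaf(samples, i, j):
    """Estimate the Banzhaf interaction of the pair {i, j} from sampled coalitions.

    Uses mean(both) - mean(only i) - mean(only j) + mean(neither).
    Returns None if any of the four groups is empty.
    """
    groups = {(True, True): [], (True, False): [],
              (False, True): [], (False, False): []}
    for T, v in samples:
        groups[(i in T, j in T)].append(v)
    if any(len(g) == 0 for g in groups.values()):
        return None  # estimate undefined in this sketch
    mean = {key: sum(g) / len(g) for key, g in groups.items()}
    return (mean[(True, True)] - mean[(True, False)]
            - mean[(False, True)] + mean[(False, False)])


if __name__ == "__main__":
    # Toy game: players 0 and 1 interact, player 2 contributes additively.
    game = lambda T: 2.0 * (0 in T and 1 in T) + 1.0 * (2 in T)
    samples = enumerate_samples(game, 4)
    print(msr_pairwise_banzhaf(samples, 0, 1))
    print(msr_pairwise_banzhaf(samples, 0, 2))
```

With full enumeration the four group means reproduce the exact pairwise Banzhaf interaction; with a sampled budget, each group mean rests on roughly a quarter of the samples.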

What would settle it

A test showing that ProxySHAP fails to achieve the lowest error versus ProxySPEX and KernelSHAP-IQ on models with thousands of features in either small- or large-budget regimes would disprove the superiority claim.

Figures

Figures reproduced from arXiv: 2605.22738 by Eyke Hüllermeier, Fabian Fumagalli, Hubert Baniecki, Maximilian Muschalik, R. Teal Witter, Santo M. A. R. Thies.

**Figure 1.** Left: A ProxySHAP explanation of the SigLIP-2 model using only 2048 model calls. Right: In Phase 1, we fit a regression proxy model using sampled binary coalitions and game values. In Phase 2, we extract proxy interactions and, when appropriate, adjust them using residual estimates. … sum of its marginal contributions across all possible subsets: $\phi^p_i(\nu) := \sum_{T \subseteq N \setminus \{i\}} p_t(n)\, \Delta_i \nu(T)$ (1), where $\Delta_i \nu(T) := \nu(T \cup$ …

**Figure 2.** Runtime improvement of extracting interactions using our Algorithm 2 over Fourier-based extraction. Per-dataset speedups and the effect of tree depth on approximation quality are shown in Figures 13 and 14. Proposition 3.2 shows that the interactions of the tree proxy can be computed exactly by aggregating leaf-wise contributions. In particular, for a fixed interaction $S$, extraction requires only a singl…

**Figure 3.** Comparison of ProxySHAP with and without MSR adjustment, measured by the MSE ratio. While MSR improves Shapley value approximation, it can degrade higher-order interaction estimates, as its variance scales as $n^{k-1}/|T|$ for interactions of order $k$ (Theorem 3.3). This motivates the adjusted ProxySHAP estimator $\hat{\phi}^{\mathrm{ProxySHAP}}_S(\nu; T) = \phi^p_S(\hat{\nu}_T) + \hat{\phi}^{\mathrm{MSR}}_S(\nu - \hat{\nu}_T; T)$. While MSR often improves singleton …

**Figure 4.** Approximation quality (Relative MSE) for Shapley interactions of ProxySHAP across different configurations and state-of-the-art baselines. Additional results for Shapley and Banzhaf interactions on all 47 datasets can be found in …

**Figure 5.** Relative MSE for pairwise Shapley interaction approximation of ProxySHAP with HPO (top) and for large n (bottom). Further results in …

**Figure 6.** Area between the insertion/deletion curves (AID) for explaining two CLIP ViT variants on the MS COCO dataset with ProxySHAP, ProxySPEX, and the FIxLIP baseline. Setup. Similarly to Baniecki et al. [2], we analyze the CLIP model in two vision transformer variants: ViT-32 and ViT-16. We explain a sample of 200 image–text pairs from the MS COCO dataset [33], which contain around 10–30 text tokens per input…

**Figure 7.** Empirical variance scaling with sampling budget $|T|$ for interaction orders $|S| = k$. For each order $k$, the plot shows the mean, minimum, and maximum empirical variance over all subsets $S$ of size $k$. The black curve denotes the theoretical bound shape, namely proportional to $\|\nu\|_\infty^2 \log(n)/|T|$ for $k = 1$ and to $\|\nu\|_\infty^2\, n^{k-1}/|T|$ for $k > 1$. Since big-O bounds are defined only up to a multiplicative constant, the…

**Figure 8.** Faithfulness $R^2$ for explaining CLIP (ViT-16) on the MS COCO dataset with ProxySHAP, ProxySPEX, and the FIxLIP baseline.

**Figure 9.** Ablation on approximating the cross-modal FIxLIP estimator. Faithfulness $R^2$ for explaining two CLIP variants on MS COCO with ProxySHAP and the FIxLIP baseline. Motivated by this observation, we evaluate ProxySHAP (XGBoost+HPO-Informed) as an alternative default proxy for large-scale games. The results show that this configuration improves approximation quality in low-budget regimes and for games with many …

**Figure 10.** Approximation quality of two different XGBoost defaults. We show that using 2000 trees with a maximum depth of 3 improves estimation quality in low- to medium-budget regimes. … low-budget regimes and for games with many players, where the standard XGBoost default may be insufficient to capture the relevant interaction structure. We evaluate runtime by translating model evaluations into wall-cloc…

**Figure 11.** Approximation quality as a function of runtime for second- and third-order interaction estimation across different per-evaluation cost regimes.

**Figure 12.** Ablations of sampling weights and residual approximators for ProxySHAP. We further investigate the effect of the residual approximator and sampling weights used in the adjustment step. Specifically, we compare SHAP-IQ [15] and KernelSHAP-IQ [16] as model-agnostic residual approximators. We also compare leverage weights, as used in LeverageSHAP [45], with KernelSHAP-IQ weights [16]. As underlying games, …

**Figure 13.** Approximation quality (Relative MSE) of ProxySHAP and ProxySPEX using different maximum tree depth options across small, medium, and large player domains. Our method relies on the ability to efficiently extract exact cardinal-probabilistic interaction indices from the underlying tree-based model. We extend interventional TreeSHAP by Zern et al. [65] to extract the exact cardinal-probabilistic interaction …

**Figure 14.** Speedup of interventional extraction compared to Fourier extraction for extracting all interactions of order 1, 2, and 3 across different datasets.

**Figure 15.** Predicted versus ground-truth normalized interaction values for different approximation methods and sampling budgets. Each point represents one interaction value from one dataset and one benchmark run; points closer to the diagonal indicate better agreement with the exact interaction values. Columns compare ProxySHAP, ProxySPEX, SHAPIQ, PermutationSamplingSII, and KernelSHAPIQ, while rows correspond to in…

**Figure 16.** ProxySHAP with disjoint coalition sets for proxy fitting and residual adjustment. … drawn as the multiplicative interval $[\bar{r}/s,\ \bar{r} \cdot s]$, corresponding to one standard deviation in log-space. Hence, values below one indicate that adjustment improves approximation quality, whereas values above one indicate that it deteriorates. For interaction indices, however, the effect of adjustment is more nuanced. While …

**Figure 17.** Selection of representative approximation curves for SII and BII at second- and third-order interactions.

**Figure 18.** Winnermap comparing the best performing method for each dataset and budget for SII orders 2 and 3. Note that the HPO-Informed variants are considered only for datasets with more than 1000 features in this overview.

**Figure 19.** Winnermap comparing the best performing method for each dataset and budget for BII orders 2 and 3. Note that the HPO-Informed variants are considered only for datasets with more than 1000 features in this overview.
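The bound shape quoted in the captions of Figures 3 and 7 can be turned into a small arithmetic sketch. It evaluates only the stated proportionalities (log(n)/|T| for order 1 and n^(k-1)/|T| for higher orders, scaled by the squared sup-norm of the game) with the hidden constant set to one. The function names, the natural logarithm, and the unit constant are assumptions for illustration, so the outputs are relative shapes and not actual variances.

```python
import math


def variance_bound_shape(n, k, budget, sup_norm=1.0):
    """Shape of the variance bound for order-k interactions, up to a constant.

    n: number of players, k: interaction order, budget: number of sampled
    coalitions |T|, sup_norm: largest absolute game value.
    """
    if k < 1 or budget <= 0 or n < 2:
        raise ValueError("need k >= 1, budget > 0, n >= 2")
    if k == 1:
        return sup_norm ** 2 * math.log(n) / budget
    return sup_norm ** 2 * n ** (k - 1) / budget


def budget_for_target(n, k, target, sup_norm=1.0):
    """Budget at which the bound shape equals `target` (constants ignored)."""
    return variance_bound_shape(n, k, 1, sup_norm) / target


if __name__ == "__main__":
    n, budget = 100, 2048
    for k in (1, 2, 3):
        print(k, variance_bound_shape(n, k, budget))
```

Under this shape, moving from order 2 to order 3 with n = 100 players multiplies the bound by 100 at a fixed budget, which is consistent with the caption's remark that MSR can degrade higher-order estimates even when it helps for Shapley values.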

Original abstract

Shapley and Banzhaf interactions capture the complex dynamics inherent in modern machine learning applications. However, current estimators for these higher-order interactions trade off between speed and accuracy. To overcome this limitation, we introduce ProxySHAP. ProxySHAP reconciles the high sample efficiency of tree-based proxy models with a principled path to consistency via residual correction. On a theoretical level, we derive a polynomial-time generalization of interventional TreeSHAP to compute exact interaction indices for tree ensembles, successfully bypassing exponential tree-depth dependencies in prior methods. Furthermore, we formally analyze the residual adjustment strategy, characterizing the specific conditions under which Maximum Sample Reuse (MSR) corrects proxy bias without its variance scaling exponentially with interaction size. Extensive benchmarking demonstrates that ProxySHAP sets a new state-of-the-art standard for approximation quality, including in large-scale applications with thousands of features. By achieving the lowest error in both small- and large-budget regimes, ProxySHAP significantly outperforms the prior best estimators ProxySPEX and KernelSHAP-IQ, while also delivering superior performance on downstream explainability tasks.
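For readers who want the quantities in the abstract made concrete, the following brute-force sketch computes Shapley and Banzhaf values and their interaction generalizations for a small game. It assumes the standard discrete-derivative form with the weights commonly given for the Shapley and Banzhaf interaction indices; whether these match every index variant used in the paper is an assumption, and the function names are hypothetical. The cost is exponential in the number of players, which is exactly what approximation methods are meant to avoid.

```python
from itertools import combinations
from math import factorial


def subsets(players):
    """Yield every subset of `players` as a frozenset."""
    players = list(players)
    for r in range(len(players) + 1):
        for c in combinations(players, r):
            yield frozenset(c)


def discrete_derivative(game, S, T):
    """Sum over L subset of S of (-1)^(|S|-|L|) * game(T union L)."""
    return sum((-1) ** (len(S) - len(L)) * game(T | L) for L in subsets(S))


def interaction_index(game, n, S, kind="shapley"):
    """Exact interaction index of S for players 0..n-1 (exponential time).

    With |S| = 1 this reduces to the Shapley or Banzhaf value.
    """
    S = frozenset(S)
    s = len(S)
    rest = [p for p in range(n) if p not in S]
    total = 0.0
    for T in subsets(rest):
        t = len(T)
        if kind == "shapley":
            weight = factorial(n - t - s) * factorial(t) / factorial(n - s + 1)
        elif kind == "banzhaf":
            weight = 1.0 / 2 ** (n - s)
        else:
            raise ValueError("kind must be 'shapley' or 'banzhaf'")
        total += weight * discrete_derivative(game, S, T)
    return total


if __name__ == "__main__":
    # Toy game: players 0 and 1 only pay off together, player 2 is additive.
    game = lambda T: 1.0 * (0 in T and 1 in T) + 2.0 * (2 in T)
    print([interaction_index(game, 4, {i}) for i in range(4)])
    print(interaction_index(game, 4, {0, 1}))
    print(interaction_index(game, 4, {0, 1}, kind="banzhaf"))
```

In the toy game the pairwise index isolates the joint payoff of players 0 and 1, while the first-order values split that payoff between them.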

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

ProxySHAP adds a polynomial-time TreeSHAP extension for exact interaction indices plus a residual correction that targets bias without exponential variance growth.

The core advance is a polynomial-time generalization of interventional TreeSHAP that computes exact Shapley and Banzhaf interaction indices on tree ensembles, sidestepping the depth-dependent exponential cost of earlier approaches. The authors build ProxySHAP on top of it: a tree proxy is fitted for sample efficiency, and Maximum Sample Reuse is then applied to correct the resulting bias, with a formal characterization of when that correction keeps variance from scaling exponentially with interaction order. That combination is new relative to ProxySPEX and KernelSHAP-IQ. The paper reports that the method reaches lower approximation error than these prior estimators in both low- and high-budget regimes and scales to thousands of features while improving downstream explanation tasks. If the experiments hold, this is a direct practical step for anyone estimating higher-order interactions with tree-based proxies. The main soft spot is that the variance-control guarantee is conditioned on specific properties of the residual and the data; those conditions need to be verified as realistic rather than overly narrow. The abstract gives no equations or proof sketches, so the referee will have to check whether the derivation is tight and whether the benchmarks include proper error bars and dataset variety. No circularity or self-referential fitting appears in the claims. This work is for the interpretability subgroup that already uses TreeSHAP-style methods and wants better interaction estimates without giving up speed. A reader focused on tree ensembles will find the scaling claims and the new exact method worth examining. It deserves peer review because the stated contributions are concrete and falsifiable.

Referee Report

0 major / 3 minor

Summary. The manuscript introduces ProxySHAP for approximating Shapley and Banzhaf interaction indices. It derives a polynomial-time exact interventional TreeSHAP generalization for tree ensembles that avoids exponential depth dependence, analyzes residual correction via Maximum Sample Reuse (MSR), and formally characterizes conditions under which MSR corrects proxy bias without exponential variance growth in interaction order. Extensive experiments are reported in which ProxySHAP achieves the lowest approximation error across small- and large-budget regimes and outperforms ProxySPEX and KernelSHAP-IQ, with additional gains on downstream explainability tasks, including settings with thousands of features.

Significance. If the polynomial-time exact proxy derivation and the variance-control analysis hold, the work supplies a practically useful advance in higher-order interaction estimation for tree-based models. The combination of an exact, efficient proxy with a theoretically grounded residual correction is a clear technical contribution, and the reported benchmarking, if reproducible, would support adoption in large-scale XAI applications.

minor comments (3)

[§4.2] §4.2 (MSR algorithm): a short pseudocode block would make the sample-reuse logic and the exact bias-correction step easier to verify.
[Table 2] Table 2 and Figure 4: standard deviations or error bars across the reported runs are missing; their addition would strengthen the SOTA claim.
[§6] The paper would benefit from an explicit limitations paragraph addressing applicability outside tree ensembles and the sensitivity of MSR to the proxy-model quality.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of ProxySHAP, recognition of the polynomial-time exact interventional TreeSHAP generalization, and the variance-control analysis for MSR, as well as the recommendation for minor revision. No major comments appear in the report.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper derives a new polynomial-time generalization of interventional TreeSHAP for exact interaction indices on tree ensembles and provides a formal analysis of the Maximum Sample Reuse residual correction under explicitly stated conditions. These steps are presented as independent theoretical contributions rather than reductions to fitted parameters or prior self-citations. Benchmarking results are empirical comparisons against external baselines (ProxySPEX, KernelSHAP-IQ) and do not rely on any internal redefinition or self-referential prediction. The derivation chain remains self-contained with no load-bearing steps that collapse to the paper's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies insufficient detail to enumerate free parameters, axioms, or invented entities; the central claims rest on unstated modeling assumptions about tree ensembles and the validity of the residual-correction analysis.

pith-pipeline@v0.9.0 · 5748 in / 1237 out tokens · 40184 ms · 2026-05-25T06:23:41.557344+00:00 · methodology

Reference graph

Works this paper leans on

76 extracted references · 76 canonical work pages · 4 internal anchors

[1] Hubert Baniecki, Giuseppe Casalicchio, Bernd Bischl, and Przemyslaw Biecek. Efficient and Accurate Explanation Estimation with Distribution Compression. In Proceedings of the International Conference on Learning Representations (ICLR), 2025.

[2] Hubert Baniecki, Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, Eyke Hüllermeier, and Przemyslaw Biecek. Explaining Similarity in Vision-Language Encoders with Weighted Banzhaf Interactions. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.

[3] John F Banzhaf III. Weighted voting doesn’t work: A mathematical analysis. Rutgers Law Review, 19:317, 1964.

[4] Landon Butler, Abhineet Agarwal, Justin Singh Kang, Yigit Efe Erginbas, Bin Yu, and Kannan Ramchandran. Proxy-SPEX: Sample-efficient interpretability via sparse feature interactions in LLMs. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.

[5] Yu-Xuan Cai, Hai-Yan Chen, Ya-Jing Qu, Wen-Hao Zhao, Mei-Ying Wang, Ying Chen, and Jin Ma. Improved vertical distribution prediction of soil vocs contamination in site-scale utilizing ensemble machine learning approach integrated with molecular descriptors. Journal of Hazardous Materials, page 139452, 2025.

[6] Javier Castro, Daniel Gómez, and Juan Tejada. Polynomial calculation of the Shapley value based on sampling. Computers & Operations Research, 36(5):1726–1730, 2009. doi: 10.1016/j.cor.2008.04.004.

[7] Javier Castro, Daniel Gómez, Elisenda Molina, and Juan Tejada. Improving polynomial estimation of the Shapley value by stratified random sampling with optimum allocation. Computers & Operations Research, 82:180–188, 2017. doi: 10.1016/j.cor.2017.01.019.

[8] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 785–794. ACM, 2016. doi: 10.1145/2939672.2939785.

[9] Ian Covert and Su-In Lee. Improving KernelSHAP: Practical Shapley Value Estimation Using Linear Regression. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 3457–3465, 2021.
[10] Ian Connick Covert, Chanwoo Kim, Su-In Lee, James Zou, and Tatsunori Hashimoto. Stochastic amortization: A unified approach to accelerate feature and data attribution. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2024.

[11] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In Proceedings of the International Conference on Learning Representation... (2021)

[12] James Enouen and Yan Liu. InstaSHAP: Interpretable additive models explain shapley values instantly. In Proceedings of the International Conference on Learning Representations (ICLR), 2025.

[13] Nick Erickson, Lennart Purucker, Andrej Tschalzev, David Holzmüller, Prateek Mutalik Desai, David Salinas, and Frank Hutter. Tabarena: A living benchmark for machine learning on tabular data. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2026.

[14] Katsushige Fujimoto, Ivan Kojadinovic, and Jean-Luc Marichal. Axiomatic characterizations of probabilistic and cardinal-probabilistic interaction indices. Games and Economic Behavior, 55(1):72–99, 2006. doi: 10.1016/j.geb.2005.03.002.

[15] Fabian Fumagalli, Maximilian Muschalik, Patrick Kolpaczki, Eyke Hüllermeier, and Barbara Hammer. SHAP-IQ: Unified Approximation of any-order Shapley Interactions. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), pages 11515–11551, 2023.

[16] Fabian Fumagalli, Maximilian Muschalik, Patrick Kolpaczki, Eyke Hüllermeier, and Barbara Hammer. KernelSHAP-IQ: Weighted Least Square Optimization for Shapley Interactions. In Proceedings of the International Conference on Machine Learning (ICML), pages 14308–14342, 2024.

[17] Fabian Fumagalli, Maximilian Muschalik, Eyke Hüllermeier, Barbara Hammer, and Julia Herbinger. Unifying Feature-Based Explanations with Functional ANOVA and Cooperative Game Theory. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 5140–5148, 2025.

[18] Ali Gorji, Andisheh Amrollahi, and Andreas Krause. SHAP values via sparse fourier representation. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2025.

[19] Michel Grabisch and Marc Roubens. An axiomatic approach to the concept of interaction among players in cooperative games. International Journal of Game Theory, 28(4):547–565. doi: 10.1007/s001820050125.
[21] Léo Grinsztajn, Klemens Flöge, Oscar Key, Felix Birkel, Philipp Jund, Brendan Roof, Benjamin Jäger, Dominik Safaric, Simone Alessi, Adrian Hayler, Mihir Manium, Rosen Yu, Felix Jablonski, Shi Bin Hoo, Anurag Garg, Jake Robertson, Magnus Bühler, Vladyslav Moroshan, Lennart Purucker, Clara Cornu, Lilly Charlotte Wehrhahn, Alessandro Bonetto, Bernhard Schö... TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models. arXiv, 2025.

[22] Naofumi Hama, Masayoshi Mase, and Art B. Owen. Deletion and insertion tests in regression models. Journal of Machine Learning Research, 24:290:1–290:38, 2023.

[23] Wassily Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58:13–30, 1963.

[24] Noah Hollmann, Samuel Müller, Lennart Purucker, Arjun Krishnakumar, Max Körfer, Shi Bin Hoo, Robin Tibor Schirrmeister, and Frank Hutter. Accurate predictions on small data with a tabular foundation model. Nature, 637(8045):319–326, 2025. doi: 10.1038/s41586-024-08328-6.

[25] Neil Jethani, Mukund Sudarshan, Ian Connick Covert, Su-In Lee, and Rajesh Ranganath. FastSHAP: Real-Time Shapley Value Estimation. In Proceedings of the International Conference on Learning Representations (ICLR), 2022.

[26] Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nick Hynes, Nezihe Merve Gürel, Bo Li, Ce Zhang, Dawn Song, and Costas J. Spanos. Towards efficient data valuation based on the shapley value. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 1167–1176, 2019.

[27] Peng Jin, Hao Li, Li Yuan, Shuicheng Yan, and Jie Chen. Hierarchical Banzhaf interaction for general video-language representation learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(3):2125–2139, 2025.

[28] Justin Singh Kang, Landon Butler, Abhineet Agarwal, Yigit Efe Erginbas, Ramtin Pedarsani, Bin Yu, and Kannan Ramchandran. SPEX: Scaling feature interaction explanations for LLMs. In Proceedings of the Conference on Machine Learning (ICML), pages 28878–28903, 2025.

[29] Jeroen Kazius, Ross McGuire, and Roberta Bursi. Derivation and validation of toxicophores for mutagenicity prediction. Journal of Medicinal Chemistry, 48(1):312–320, 2005. doi: 10.1021/jm040835a.
[30] Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), pages 3146–3154, 2017.

[31] Patrick Kolpaczki, Viktor Bengs, Maximilian Muschalik, and Eyke Hüllermeier. Approximating the shapley value without marginal contributions. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 13246–13255, 2024.

[32] Patrick Kolpaczki, Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, and Eyke Hüllermeier. SVARM-IQ: efficient approximation of any-order shapley interactions through stratification. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 3520–3528, 2024.

[33] Quentin Lhoest, Albert Villanova del Moral, Patrick von Platen, Thomas Wolf, Mario Šaško, Yacine Jernite, Abhishek Thakur, Lewis Tunstall, Suraj Patil, Mariama Drame, Julien Chaumond, Julien Plu, Joe Davison, Simon Brandeis, Victor Sanh, Teven Le Scao, Kevin Canwen Xu, Nicolas Patry, Steven Liu, Angelina McMillan-Major, Philipp Schmid, Sylvain Gugger, Nat... Datasets: A community library for natural language processing. (2021)

[34] Tsung-Yi Lin, Michael Maire, Serge J. Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft COCO: common objects in context. In European Conference on Computer Vision (ECCV), volume 8693, pages 740–755, 2014.

[35] Marius Lindauer, Katharina Eggensperger, Matthias Feurer, André Biedenkapp, Difan Deng, Carolin Benjamins, Tim Ruhkopf, René Sass, and Frank Hutter. Smac3: A versatile bayesian optimization package for hyperparameter optimization. Journal of Machine Learning Research, 23(54):1–9, 2022. URL http://jmlr.org/papers/v23/21-0888.html

[36] Tianran Liu, Nicky Evans, Kangyu Ji, Ronaldo Lee, Aaron Zhu, Vinn Nguyen, James Serdy, Elizabeth M Wall, Yongli Lu, Florian A Formica, et al. Disentangling environmental effects on perovskite solar cell performance via interpretable machine learning. ACS Energy Letters, 11:1609–1617, 2026.

[37] Scott M. Lundberg and Su-In Lee. A Unified Approach to Interpreting Model Predictions. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), pages 4765–4774, 2017.

[38] Scott M. Lundberg, Gabriel G. Erion, Hugh Chen, Alex J. DeGrave, Jordan M. Prutkin, Bala Nair, Ronit Katz, Jonathan Himmelfarb, Nisha Bansal, and Su-In Lee. From local explanations to global understanding with explainable AI for trees. Nature Machine Intelligence, 2(1):56–67. doi: 10.1038/s42256-019-0138-9.
[40]

Maas, Raymond E

Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y . Ng, and Christopher Potts. Learning word vectors for sentiment analysis. InProceedings of the Association for Computational Linguistics: Human Language Technologies (HLT), pages 142–150, 2011

work page 2011
[41]

Axiomatic characterizations of generalized values.Discrete Applied Mathematics, 155(1):26–43, 2007

Jean-Luc Marichal, Ivan Kojadinovic, and Katsushige Fujimoto. Axiomatic characterizations of generalized values.Discrete Applied Mathematics, 155(1):26–43, 2007. doi: 10.1016/J.DAM. 2006.05.002. 12

work page doi:10.1016/j.dam 2007
[42]

Amortized Linear-time Exact Shapley Value for Product-Kernel Methods

Majid Mohammadi, Siu Lun Chau, and Krikamol Muandet. Computing exact Shapley values in polynomial time for product-kernel methods.arXiv preprint, arXiv:2505.16516, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[43]

General pitfalls of model-agnostic interpretation methods for machine learning models

Christoph Molnar, Gunnar König, Julia Herbinger, Timo Freiesleben, Susanne Dandl, Chris- tian A Scholbeck, Giuseppe Casalicchio, Moritz Grosse-Wentrup, and Bernd Bischl. General pitfalls of model-agnostic interpretation methods for machine learning models. InxxAI-Beyond Explainable AI: International Workshop, Held in Conjunction with ICML 2020, pages 39–68, 2022

work page 2020
[44]

shapiq: Shapley Interactions for Machine Learning

Maximilian Muschalik, Hubert Baniecki, Fabian Fumagalli, Patrick Kolpaczki, Barbara Ham- mer, and Eyke Hüllermeier. shapiq: Shapley Interactions for Machine Learning. InProceedings of Advances in Neural Information Processing Systems (NeurIPS), pages 130324–130357, 2024

work page 2024
[45] Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, and Eyke Hüllermeier. Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions for Tree Ensembles. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 14388–14396, 2024. doi: 10.1609/aaai.v38i13.29352

[47] Maximilian Muschalik, Fabian Fumagalli, Paolo Frazzetto, Janine Strotherm, Luca Hermes, Alessandro Sperduti, Eyke Hüllermeier, and Barbara Hammer. Exact Computation of Any-Order Shapley Interactions for Graph Neural Networks. In Proceedings of the International Conference on Learning Representations (ICLR), 2025.

[48] Christopher Musco and R. Teal Witter. Provably Accurate Shapley Value Estimation via Leverage Score Sampling. In Proceedings of the International Conference on Learning Representations (ICLR), 2025.

[49] Alexander Nadel and Ron Wettenstein. From decision trees to boolean logic: A fast and unified SHAP algorithm. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 24476–24485, 2026. doi: 10.1609/aaai.v40i29.39630

[50] Lars H. B. Olsen, Ingrid K. Glad, Martin Jullum, and Kjersti Aas. Using Shapley values and variational autoencoders to explain predictive models with dependent mixed features. Journal of Machine Learning Research, 23(213):1–51, 2022.

[51] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In Proceedings of the International Conference on Machine Learning (ICML), pages 8748–8763, 2021.

[52] Benedek Rozemberczki, Lauren Watson, Péter Bayer, Hao-Tsung Yang, Oliver Kiss, Sebastian Nilsson, and Rik Sarkar. The Shapley value in machine learning. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), pages 5572–5579, 2022.

[53] Benjamin Sanchez-Lengeling, Jennifer Wei, Brian Lee, Emily Reif, Peter Wang, Wesley Qian, Kevin McCloskey, Lucy Colwell, and Alexander Wiltschko. Evaluating attribution for graph neural networks. In The Thirty-third Annual Conference on Neural Information Processing Systems, volume 33, pages 5898–5910, 2020.

[54] Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. CoRR, abs/1910.01108, 2019.

[55] Meghdut Sengupta, Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, Eyke Hüllermeier, Debanjan Ghosh, and Henning Wachsmuth. Investigating the impact of conceptual metaphors on LLM-based NLI through Shapley interactions. In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 17393–17403, 2025.

[56] L. S. Shapley. A Value for n-Person Games. In Contributions to the Theory of Games (AM-28), Volume II, pages 307–318. Princeton University Press, 1953.

[57] Maximilian Spliethöver, Tim Knebler, Fabian Fumagalli, Maximilian Muschalik, Barbara Hammer, Eyke Hüllermeier, and Henning Wachsmuth. Adaptive prompting: Ad-hoc prompt composition for social bias detection. In Proceedings of the Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics, 2025.

[58] Erik Strumbelj and Igor Kononenko. An Efficient Explanation of Individual Classifications using Game Theory. Journal of Machine Learning Research, 11:1–18, 2010.

[59] Erik Strumbelj and Igor Kononenko. Explaining prediction models and individual predictions with feature contributions. Knowledge and Information Systems, 41(3):647–665, 2014. doi: 10.1007/s10115-013-0679-x

[60] Mukund Sundararajan, Kedar Dhamdhere, and Ashish Agarwal. The Shapley Taylor Interaction Index. In Proceedings of the International Conference on Machine Learning (ICML), pages 9259–9268, 2020.

[61] Che-Ping Tsai, Chih-Kuan Yeh, and Pradeep Ravikumar. Faith-Shap: The Faithful Shapley Interaction Index. Journal of Machine Learning Research, 24(94):1–42, 2023.

[62] Michael Tschannen, Alexey Gritsenko, Xiao Wang, Muhammad Ferjad Naeem, Ibrahim Alabdulmohsin, Nikhil Parthasarathy, Talfan Evans, Lucas Beyer, Ye Xia, Basil Mustafa, et al. SigLIP 2: Multilingual vision-language encoders with improved semantic understanding, localization, and dense features. arXiv preprint arXiv:2502.14786, 2025.

[63] Jiachen T. Wang and Ruoxi Jia. Data Banzhaf: A robust data valuation framework for machine learning. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 6388–6421, 2023.

[64] Jiachen T. Wang, Prateek Mittal, and Ruoxi Jia. Efficient data Shapley for weighted nearest neighbor algorithms. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 2557–2565, 2024.

[65] Marcel Wever, Maximilian Muschalik, Fabian Fumagalli, and Marius Lindauer. HyperSHAP: Shapley Values and Interactions for Hyperparameter Importance. In AAAI, 2026.

[66] R. Teal Witter, Yurong Liu, and Christopher Musco. Regression-adjusted Monte Carlo estimators for Shapley values and probabilistic values. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2025. URL https://openreview.net/forum?id=Qabko39AS5

[67] Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? In Proceedings of the International Conference on Learning Representations (ICLR), 2019. URL https://openreview.net/forum?id=ryGs6iA5Km

[69] Artjom Zern, Klaus Broelemann, and Gjergji Kasneci. Interventional SHAP values and interaction values for piecewise linear regression trees. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 11164–11173, 2023.

[70] Chenyang Zhao, Kun Wang, Janet H. Hsiao, and Antoni B. Chan. Grad-ECLIP: Gradient-based visual and textual explanations for CLIP. In Proceedings of the International Conference on Machine Learning (ICML), 2024.

Appendix for “Proxy-based Approximation of Shapley and Banzhaf Interactions”

A Proofs
A.1 Proof of Proposition 3.2 ...

[71] as model-agnostic residual approximators. We also compare leverage weights, as used in LeverageSHAP [45], with KernelSHAP-IQ weights [16]. As underlying games, we use VIT4BY4PATCHES, BIKESHARINGLOCALXAI, CALIFORNIAHOUSINGLOCALXAI, CORRGROUPS60LOCALXAI, and COMMUNITIESANDCRIMELOCALXAI; details on these datasets are provided in Section C.1. For each...

[72] Sampling and evaluation. Coalitions $\mathcal{T} \subseteq 2^N$ are sampled and evaluated, yielding the dataset $\mathcal{D} = \{(T, \nu(T))\}_{T \in \mathcal{T}}$.

[73] Proxy fitting. A gradient-boosted tree model, by default LightGBM, is fitted on $\mathcal{D}$ by minimizing the mean squared error.

[74] Fourier extraction and truncation. Fourier coefficients are extracted from the fitted tree proxy. ProxySPEX then keeps a minimal subset $\mathcal{C}^\star \subseteq \mathcal{F}$ of coefficients that explains at least 95% of the total squared Fourier mass, $\mathcal{C}^\star = \arg\min_{\mathcal{C} \subseteq \mathcal{F}} |\mathcal{C}| \quad \text{s.t.} \quad \frac{\sum_{F \in \mathcal{C}} F^2}{\sum_{F \in \mathcal{F}} F^2} \geq 0.95$, where $\mathcal{F}$ denotes the set of Fourier coefficients extracted from the tree.

[75] Adjustment. Given the truncated coefficient set $\mathcal{C}^\star$, ProxySPEX applies a refinement step to improve the extracted Fourier coefficients. It constructs a design matrix $X \in \{-1,+1\}^{|\mathcal{T}| \times |\mathcal{C}^\star|}$ with entries $X_{i,j} = (-1)^{|T_i \cap C_j|}$, and solves the regularized regression problem $F^\star = \arg\min_{F \in \mathbb{R}^{|\mathcal{C}^\star|}} \|\nu - XF\|_2^2 + \lambda \|F\|_2^2$. The truncation step is essential for making ...

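The truncation and adjustment steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the ProxySPEX implementation: the function names (`all_coalitions`, `truncate_by_mass`, `design_matrix`, `ridge_fit`), the dictionary mapping coefficient supports to values, and the small Gaussian-elimination solver are all hypothetical choices made for this example. Selecting coefficients greedily by squared magnitude is one way to obtain a minimum-size subset that reaches the 95% mass threshold.

```python
from itertools import combinations


def all_coalitions(n):
    """Enumerate every subset of {0, ..., n-1} as a frozenset (toy-sized n only)."""
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]


def truncate_by_mass(coeffs, mass=0.95):
    """Keep the fewest coefficients whose squared values reach `mass` of the total."""
    total = sum(v * v for v in coeffs.values())
    kept, acc = {}, 0.0
    # Taking coefficients in order of decreasing squared magnitude
    # yields a subset of minimum cardinality for the threshold.
    for support, value in sorted(coeffs.items(), key=lambda kv: -kv[1] ** 2):
        if acc >= mass * total:
            break
        kept[support] = value
        acc += value * value
    return kept


def design_matrix(coalitions, supports):
    """Entries X[i][j] = (-1)^{|T_i ∩ C_j|}, as in the adjustment step."""
    return [[(-1) ** len(T & C) for C in supports] for T in coalitions]


def ridge_fit(X, y, lam):
    """Solve min_F ||y - X F||^2 + lam ||F||^2 via the normal equations (lam > 0)."""
    m = len(X[0])
    A = [[sum(row[a] * row[b] for row in X) + (lam if a == b else 0.0)
          for b in range(m)] for a in range(m)]
    rhs = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(m)]
    # Gaussian elimination with partial pivoting on (X^T X + lam I) F = X^T y.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            rhs[r] -= f * rhs[col]
    F = [0.0] * m
    for r in reversed(range(m)):
        F[r] = (rhs[r] - sum(A[r][c] * F[c] for c in range(r + 1, m))) / A[r][r]
    return F


# Toy usage: four hypothetical coefficients on three players.
coeffs = {frozenset(): 3.0, frozenset({0}): 2.0,
          frozenset({0, 1}): 1.0, frozenset({1}): 0.5}
kept = truncate_by_mass(coeffs)
supports = list(kept)
X = design_matrix(all_coalitions(3), supports)
values = [sum(kept[C] * x for C, x in zip(supports, row)) for row in X]
refined = ridge_fit(X, values, lam=1e-3)
```

In this toy setting all coalitions are enumerated, so the columns of the design matrix are orthogonal and the ridge solution is simply the kept coefficients shrunk by a constant factor. With a sampled subset of coalitions, as in the procedure described above, the columns are generally not orthogonal and the regression does real work.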
[76] Institutional review board (IRB) approvals or equivalent for research with human subjects

Question: Does the paper describe potential risks incurred by study participants, whether such risks were disclosed to the subjects, and whether Institutional Review Board (IRB) approvals (or an equivalent approval/review based on the requirements of your country or ...

Guidelines:
• The answer [N/A] means that the paper does not involve crowdsourcing nor research with human subjects.

[1] [1]

Efficient and Accurate Explanation Estimation with Distribution Compression

Hubert Baniecki, Giuseppe Casalicchio, Bernd Bischl, and Przemyslaw Biecek. Efficient and Accurate Explanation Estimation with Distribution Compression. InProceedings of the International Conference on Learning Representations (ICLR), 2025

work page 2025

[2] [2]

Explaining Similarity in Vision-Language Encoders with Weighted Banzhaf Interactions

Hubert Baniecki, Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, Eyke Hüller- meier, and Przemyslaw Biecek. Explaining Similarity in Vision-Language Encoders with Weighted Banzhaf Interactions. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025

work page 2025

[3] [3]

Weighted voting doesn’t work: A mathematical analysis.Rutgers Law Review, 19:317, 1964

John F Banzhaf III. Weighted voting doesn’t work: A mathematical analysis.Rutgers Law Review, 19:317, 1964

work page 1964

[4] [4]

Proxy-SPEX: Sample-efficient interpretability via sparse feature interactions in LLMs

Landon Butler, Abhineet Agarwal, Justin Singh Kang, Yigit Efe Erginbas, Bin Yu, and Kannan Ramchandran. Proxy-SPEX: Sample-efficient interpretability via sparse feature interactions in LLMs. InThe Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025

work page 2025

[5] [5]

Yu-Xuan Cai, Hai-Yan Chen, Ya-Jing Qu, Wen-Hao Zhao, Mei-Ying Wang, Ying Chen, and Jin Ma. Improved vertical distribution prediction of soil vocs contamination in site-scale utilizing ensemble machine learning approach integrated with molecular descriptors.Journal of Hazardous Materials, page 139452, 2025

work page 2025

[6] [6]

Masset, R

Javier Castro, Daniel Gómez, and Juan Tejada. Polynomial calculation of the Shapley value based on sampling.Computers & Operations Research, 36(5):1726–1730, 2009. doi: 10.1016/j. cor.2008.04.004

work page doi:10.1016/j 2009

[7] [7]

Improving polynomial estima- tion of the Shapley value by stratified random sampling with optimum allocation.Computers & Operations Research, 82:180–188, 2017

Javier Castro, Daniel Gómez, Elisenda Molina, and Juan Tejada. Improving polynomial estima- tion of the Shapley value by stratified random sampling with optimum allocation.Computers & Operations Research, 82:180–188, 2017. doi: 10.1016/j.cor.2017.01.019

work page doi:10.1016/j.cor.2017.01.019 2017

[8] [8]

Chen and C

Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. InProceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 785–794. ACM, 2016. doi: 10.1145/2939672.2939785

work page doi:10.1145/2939672.2939785 2016

[9] [9]

Improving KernelSHAP: Practical Shapley Value Estimation Using Linear Regression

Ian Covert and Su-In Lee. Improving KernelSHAP: Practical Shapley Value Estimation Using Linear Regression. InProceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 3457–3465, 2021

work page 2021

[10] [10]

Stochastic amortization: A unified approach to accelerate feature and data attribution

Ian Connick Covert, Chanwoo Kim, Su-In Lee, James Zou, and Tatsunori Hashimoto. Stochastic amortization: A unified approach to accelerate feature and data attribution. InProceedings of Advances in Neural Information Processing Systems (NeurIPS), 2024

work page 2024

[11] [11]

An image is worth 16x16 words: Transformers for image recognition at scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. InProceedings of the International Conference on Learning Representation...

work page 2021

[12] [12]

InstaSHAP: Interpretable additive models explain shapley values instantly

James Enouen and Yan Liu. InstaSHAP: Interpretable additive models explain shapley values instantly. InProceedings of the International Conference on Learning Representations (ICLR), 2025

work page 2025

[13] [13]

Tabarena: A living benchmark for machine learning on tabular data

Nick Erickson, Lennart Purucker, Andrej Tschalzev, David Holzmüller, Prateek Mutalik Desai, David Salinas, and Frank Hutter. Tabarena: A living benchmark for machine learning on tabular data. InProceedings of Advances in Neural Information Processing Systems (NeurIPS), 2026

work page 2026

[14] [14]

Axiomatic characterizations of probabilistic and cardinal-probabilistic interaction indices.Games and Economic Behavior, 55(1):72–99, 2006

Katsushige Fujimoto, Ivan Kojadinovic, and Jean-Luc Marichal. Axiomatic characterizations of probabilistic and cardinal-probabilistic interaction indices.Games and Economic Behavior, 55(1):72–99, 2006. doi: 10.1016/j.geb.2005.03.002

work page doi:10.1016/j.geb.2005.03.002 2006

[15] [15]

SHAP-IQ: Unified Approximation of any-order Shapley Interactions

Fabian Fumagalli, Maximilian Muschalik, Patrick Kolpaczki, Eyke Hüllermeier, and Barbara Hammer. SHAP-IQ: Unified Approximation of any-order Shapley Interactions. InProceedings of Advances in Neural Information Processing Systems (NeurIPS), pages 11515–11551, 2023

work page 2023

[16] [16]

KernelSHAP-IQ: Weighted Least Square Optimization for Shapley Interactions

Fabian Fumagalli, Maximilian Muschalik, Patrick Kolpaczki, Eyke Hüllermeier, and Barbara Hammer. KernelSHAP-IQ: Weighted Least Square Optimization for Shapley Interactions. In Proceedings of the International Conference on Machine Learning (ICML), pages 14308–14342, 2024

work page 2024

[17] [17]

Unifying Feature-Based Explanations with Functional ANOV A and Cooperative Game Theory

Fabian Fumagalli, Maximilian Muschalik, Eyke Hüllermeier, Barbara Hammer, and Julia Herbinger. Unifying Feature-Based Explanations with Functional ANOV A and Cooperative Game Theory. InProceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 5140–5148, 2025

work page 2025

[18] [18]

SHAP values via sparse fourier repre- sentation

Ali Gorji, Andisheh Amrollahi, and Andreas Krause. SHAP values via sparse fourier repre- sentation. InProceedings of Advances in Neural Information Processing Systems (NeurIPS), 2025

work page 2025

[19] [19]

An axiomatic approach to the concept of interaction among players in cooperative games.International Journal of Game Theory, 28(4):547–565,

Michel Grabisch and Marc Roubens. An axiomatic approach to the concept of interaction among players in cooperative games.International Journal of Game Theory, 28(4):547–565,

work page

[20] [20]

doi: 10.1007/s001820050125

work page doi:10.1007/s001820050125

[21] [21]

TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models

Léo Grinsztajn, Klemens Flöge, Oscar Key, Felix Birkel, Philipp Jund, Brendan Roof, Benjamin Jäger, Dominik Safaric, Simone Alessi, Adrian Hayler, Mihir Manium, Rosen Yu, Felix Jablon- ski, Shi Bin Hoo, Anurag Garg, Jake Robertson, Magnus Bühler, Vladyslav Moroshan, Lennart Purucker, Clara Cornu, Lilly Charlotte Wehrhahn, Alessandro Bonetto, Bernhard Schö...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv 2025

[22] [22]

Naofumi Hama, Masayoshi Mase, and Art B. Owen. Deletion and insertion tests in regression models.Journal of Machine Learning Research, 24:290:1–290:38, 2023

work page 2023

[23] [23]

Probability inequalities for sums of bounded random variables.Journal of the American Statistical Association, 58:13–30, 1963

Wassily Hoeffding. Probability inequalities for sums of bounded random variables.Journal of the American Statistical Association, 58:13–30, 1963

work page 1963

[24] [24]

Accurate predictions on small data with a tab- ular foundation model.Nature, 637(8045):319–326, 2025

Noah Hollmann, Samuel Müller, Lennart Purucker, Arjun Krishnakumar, Max Körfer, Shi Bin Hoo, Robin Tibor Schirrmeister, and Frank Hutter. Accurate predictions on small data with a tab- ular foundation model.Nature, 637(8045):319–326, 2025. doi: 10.1038/s41586-024-08328-6

work page doi:10.1038/s41586-024-08328-6 2025

[25] [25]

Fast- SHAP: Real-Time Shapley Value Estimation

Neil Jethani, Mukund Sudarshan, Ian Connick Covert, Su-In Lee, and Rajesh Ranganath. Fast- SHAP: Real-Time Shapley Value Estimation. InProceedings of the International Conference on Learning Representations (ICLR), 2022

work page 2022

[26] [26]

Ruoxi Jia, David Dao, Boxin Wang, Frances Ann Hubis, Nick Hynes, Nezihe Merve Gürel, Bo Li, Ce Zhang, Dawn Song, and Costas J. Spanos. Towards efficient data valuation based on the shapley value. InProceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 1167–1176, 2019

work page 2019

[27] [27]

Hierarchical Banzhaf interaction for general video-language representation learning.IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(3):2125–2139, 2025

Peng Jin, Hao Li, Li Yuan, Shuicheng Yan, and Jie Chen. Hierarchical Banzhaf interaction for general video-language representation learning.IEEE Transactions on Pattern Analysis and Machine Intelligence, 47(3):2125–2139, 2025. 11

work page 2025

[28] [28]

SPEX: Scaling feature interaction explanations for LLMs

Justin Singh Kang, Landon Butler, Abhineet Agarwal, Yigit Efe Erginbas, Ramtin Pedarsani, Bin Yu, and Kannan Ramchandran. SPEX: Scaling feature interaction explanations for LLMs. InProceedings of the Conference on Machine Learning (ICML), pages 28878–28903, 2025

work page 2025

[29] [29]

Derivation and validation of toxicophores for mutagenicity prediction.Journal of Medicinal Chemistry, 48(1):312–320, 2005

Jeroen Kazius, Ross McGuire, and Roberta Bursi. Derivation and validation of toxicophores for mutagenicity prediction.Journal of Medicinal Chemistry, 48(1):312–320, 2005. doi: 10.1021/jm040835a

work page doi:10.1021/jm040835a 2005

[30] [30]

LightGBM: A Highly Efficient Gradient Boosting Decision Tree

Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. InProceedings of Advances in Neural Information Processing Systems (NeurIPS), pages 3146–3154, 2017

work page 2017

[31] [31]

Approximating the shapley value without marginal contributions

Patrick Kolpaczki, Viktor Bengs, Maximilian Muschalik, and Eyke Hüllermeier. Approximating the shapley value without marginal contributions. InProceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 13246–13255, 2024

work page 2024

[32] [32]

SV ARM-IQ: efficient approximation of any-order shapley interactions through stratification

Patrick Kolpaczki, Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, and Eyke Hüllermeier. SV ARM-IQ: efficient approximation of any-order shapley interactions through stratification. InProceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 3520–3528, 2024

work page 2024

[33] [33]

Datasets: A community library for natural language processing

Quentin Lhoest, Albert Villanova del Moral, Patrick von Platen, Thomas Wolf, Mario Šaško, Yacine Jernite, Abhishek Thakur, Lewis Tunstall, Suraj Patil, Mariama Drame, Julien Chaumond, Julien Plu, Joe Davison, Simon Brandeis, Victor Sanh, Teven Le Scao, Kevin Canwen Xu, Nicolas Patry, Steven Liu, Angelina McMillan-Major, Philipp Schmid, Sylvain Gugger, Nat...

work page 2021

[34] [34]

Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C

Tsung-Yi Lin, Michael Maire, Serge J. Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft COCO: common objects in context. In European Conference on Computer Vision ECCV, volume 8693, pages 740–755, 2014

work page 2014

[35] [35]

Smac3: A versatile bayesian optimization package for hyperparameter optimization.Journal of Machine Learning Research, 23(54):1–9, 2022

Marius Lindauer, Katharina Eggensperger, Matthias Feurer, André Biedenkapp, Difan Deng, Carolin Benjamins, Tim Ruhkopf, René Sass, and Frank Hutter. Smac3: A versatile bayesian optimization package for hyperparameter optimization.Journal of Machine Learning Research, 23(54):1–9, 2022. URLhttp://jmlr.org/papers/v23/21-0888.html

work page 2022

[36] [36]

Disentangling environmental effects on perovskite solar cell performance via interpretable machine learning.ACS Energy Letters, 11: 1609–1617, 2026

Tianran Liu, Nicky Evans, Kangyu Ji, Ronaldo Lee, Aaron Zhu, Vinn Nguyen, James Serdy, Elizabeth M Wall, Yongli Lu, Florian A Formica, et al. Disentangling environmental effects on perovskite solar cell performance via interpretable machine learning.ACS Energy Letters, 11: 1609–1617, 2026

work page 2026

[37] [37]

Lundberg and Su-In Lee

Scott M. Lundberg and Su-In Lee. A Unified Approach to Interpreting Model Predictions. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), pages 4765– 4774, 2017

work page 2017

[38] [38]

Lundberg, Gabriel G

Scott M. Lundberg, Gabriel G. Erion, Hugh Chen, Alex J. DeGrave, Jordan M. Prutkin, Bala Nair, Ronit Katz, Jonathan Himmelfarb, Nisha Bansal, and Su-In Lee. From local explanations to global understanding with explainable AI for trees.Nature Machine Intelligence, 2(1):56–67,

work page

[39] [39]

doi: 10.1038/s42256-019-0138-9

work page doi:10.1038/s42256-019-0138-9

[40] [40]

Maas, Raymond E

Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y . Ng, and Christopher Potts. Learning word vectors for sentiment analysis. InProceedings of the Association for Computational Linguistics: Human Language Technologies (HLT), pages 142–150, 2011

work page 2011

[41] [41]

Axiomatic characterizations of generalized values.Discrete Applied Mathematics, 155(1):26–43, 2007

Jean-Luc Marichal, Ivan Kojadinovic, and Katsushige Fujimoto. Axiomatic characterizations of generalized values.Discrete Applied Mathematics, 155(1):26–43, 2007. doi: 10.1016/J.DAM. 2006.05.002. 12

work page doi:10.1016/j.dam 2007

[42] [42]

Amortized Linear-time Exact Shapley Value for Product-Kernel Methods

Majid Mohammadi, Siu Lun Chau, and Krikamol Muandet. Computing exact Shapley values in polynomial time for product-kernel methods.arXiv preprint, arXiv:2505.16516, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[43] [43]

General pitfalls of model-agnostic interpretation methods for machine learning models

Christoph Molnar, Gunnar König, Julia Herbinger, Timo Freiesleben, Susanne Dandl, Chris- tian A Scholbeck, Giuseppe Casalicchio, Moritz Grosse-Wentrup, and Bernd Bischl. General pitfalls of model-agnostic interpretation methods for machine learning models. InxxAI-Beyond Explainable AI: International Workshop, Held in Conjunction with ICML 2020, pages 39–68, 2022

work page 2020

[44] [44]

shapiq: Shapley Interactions for Machine Learning

Maximilian Muschalik, Hubert Baniecki, Fabian Fumagalli, Patrick Kolpaczki, Barbara Ham- mer, and Eyke Hüllermeier. shapiq: Shapley Interactions for Machine Learning. InProceedings of Advances in Neural Information Processing Systems (NeurIPS), pages 130324–130357, 2024

work page 2024

[45] [45]

Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions for Tree Ensembles

Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, and Eyke Hüllermeier. Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions for Tree Ensembles. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 14388–14396,

work page

[46] [46]

doi: 10.1609/aaai.v38i13.29352

work page doi:10.1609/aaai.v38i13.29352

[47] [47]

Exact Computation of Any- Order Shapley Interactions for Graph Neural Networks

Maximilian Muschalik, Fabian Fumagalli, Paolo Frazzetto, Janine Strotherm, Luca Hermes, Alessandro Sperduti, Eyke Hüllermeier, and Barbara Hammer. Exact Computation of Any- Order Shapley Interactions for Graph Neural Networks. InProceedings of the Conference on Learning Representations (ICLR), 2025

work page 2025

[48] [48]

Teal Witter

Christopher Musco and R. Teal Witter. Provably Accurate Shapley Value Estimation via Leverage Score Sampling. InProceedings of the International Conference on Learning Repre- sentations (ICLR), 2025

work page 2025

[49] [49]

From decision trees to boolean logic: A fast and unified SHAP algorithm

Alexander Nadel and Ron Wettenstein. From decision trees to boolean logic: A fast and unified SHAP algorithm. InProceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 24476–24485, 2026. doi: 10.1609/AAAI.V40I29.39630

work page doi:10.1609/aaai.v40i29.39630 2026

[50] [50]

Lars H. B. Olsen, Ingrid K. Glad, Martin Jullum, and Kjersti Aas. Using Shapley values and variational autoencoders to explain predictive models with dependent mixed features.Journal of Machine Learning Research, 23(213):1–51, 2022

work page 2022

[51] [51]

Learning transferable visual models from natural language supervision

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agar- wal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. InProceed- ings of the International Conference on Machine Learning ICML, pages 8748–8763, 2021

work page 2021

[52] [52]

The shapley value in machine learning

Benedek Rozemberczki, Lauren Watson, Péter Bayer, Hao-Tsung Yang, Oliver Kiss, Sebastian Nilsson, and Rik Sarkar. The shapley value in machine learning. InProceedings of International Joint Conference on Artificial Intelligence (IJCAI), pages 5572–5579, 2022

work page 2022

[53] [53]

Evaluating attribution for graph neural networks

Benjamin Sanchez-Lengeling, Jennifer Wei, Brian Lee, Emily Reif, Peter Wang, Wesley Qian, Kevin McCloskey, Lucy Colwell, and Alexander Wiltschko. Evaluating attribution for graph neural networks. InThe Thirty-third Annual Conference on Neural Information Processing Systems, volume 33, pages 5898–5910, 2020

work page 2020

[54] [54]

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Victor Sanh, Lysandre Debut, Julien Chaumond, and Thomas Wolf. Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter.CoRR, abs/1910.01108, 2019

work page internal anchor Pith review Pith/arXiv arXiv 1910

[55] [55]

Investigating the impact of conceptual metaphors on LLM-based NLI through shapley interactions

Meghdut Sengupta, Maximilian Muschalik, Fabian Fumagalli, Barbara Hammer, Eyke Hüller- meier, Debanjan Ghosh, and Henning Wachsmuth. Investigating the impact of conceptual metaphors on LLM-based NLI through shapley interactions. InFindings of the Association for Computational Linguistics: EMNLP 2025, pages 17393–17403, 2025

work page 2025

[56] [56]

L. S. Shapley. A Value for n-Person Games. InContributions to the Theory of Games (AM-28), Volume II, pages 307–318. Princeton University Press, 1953

work page 1953

[57] [57]

Adaptive prompting: Ad-hoc prompt composition for social bias detection

Maximilian Spliethöver, Tim Knebler, Fabian Fumagalli, Maximilian Muschalik, Barbara Hammer, Eyke Hüllermeier, and Henning Wachsmuth. Adaptive prompting: Ad-hoc prompt composition for social bias detection. InProceedings of the Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics, 2025. 13

work page 2025

[58] [58]

An Efficient Explanation of Individual Classifications using Game Theory.Journal of Machine Learning Research, 11:1–18, 2010

Erik Strumbelj and Igor Kononenko. An Efficient Explanation of Individual Classifications using Game Theory.Journal of Machine Learning Research, 11:1–18, 2010

work page 2010

[59] [59]

Explaining prediction models and individual predictions with feature contributions.Knowledge and Information Systems, 41(3):647–665, 2014

Erik Strumbelj and Igor Kononenko. Explaining prediction models and individual predictions with feature contributions.Knowledge and Information Systems, 41(3):647–665, 2014. doi: 10.1007/s10115-013-0679-x

work page doi:10.1007/s10115-013-0679-x 2014

[60] [60]

The Shapley Taylor Interaction Index

Mukund Sundararajan, Kedar Dhamdhere, and Ashish Agarwal. The Shapley Taylor Interaction Index. InProceedings of the International Conference on Machine Learning (ICML), pages 9259–9268, 2020

work page 2020

[61] [61]

Faith-Shap: The Faithful Shapley Interaction Index.Journal of Machine Learning Research, 24(94):1–42, 2023

Che-Ping Tsai, Chih-Kuan Yeh, and Pradeep Ravikumar. Faith-Shap: The Faithful Shapley Interaction Index.Journal of Machine Learning Research, 24(94):1–42, 2023

work page 2023

[62] [62]

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Michael Tschannen, Alexey Gritsenko, Xiao Wang, Muhammad Ferjad Naeem, Ibrahim Al- abdulmohsin, Nikhil Parthasarathy, Talfan Evans, Lucas Beyer, Ye Xia, Basil Mustafa, et al. SigLIP 2: Multilingual vision-language encoders with improved semantic understanding, local- ization, and dense features.arXiv preprint arXiv:2502.14786, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025

[63]

Jiachen T. Wang and Ruoxi Jia. Data Banzhaf: A robust data valuation framework for machine learning. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 6388–6421, 2023.

[64]

Jiachen T. Wang, Prateek Mittal, and Ruoxi Jia. Efficient data Shapley for weighted nearest neighbor algorithms. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), pages 2557–2565, 2024.

[65]

Marcel Wever, Maximilian Muschalik, Fabian Fumagalli, and Marius Lindauer. HyperSHAP: Shapley Values and Interactions for Hyperparameter Importance. In AAAI, 2026.

[66]

R. Teal Witter, Yurong Liu, and Christopher Musco. Regression-adjusted Monte Carlo estimators for Shapley values and probabilistic values. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS), 2025. URL https://openreview.net/forum?id=Qabko39AS5

[67]

Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. How powerful are graph neural networks? In Proceedings of the International Conference on Learning Representations (ICLR). URL https://openreview.net/forum?id=ryGs6iA5Km

[69]

Artjom Zern, Klaus Broelemann, and Gjergji Kasneci. Interventional SHAP values and interaction values for piecewise linear regression trees. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), pages 11164–11173, 2023.

[70]

Chenyang Zhao, Kun Wang, Janet H. Hsiao, and Antoni B. Chan. Grad-ECLIP: Gradient-based visual and textual explanations for CLIP. In Proceedings of the International Conference on Machine Learning (ICML), 2024.

Appendix for “Proxy-based Approximation of Shapley and Banzhaf Interactions”

A Proofs
A.1 Proof of Proposition 3.2
...

[71]

as model-agnostic residual approximators. We also compare leverage weights, as used in LeverageSHAP [45], with KernelSHAP-IQ weights [16]. As underlying games, we use VIT4BY4PATCHES, BIKESHARINGLOCALXAI, CALIFORNIAHOUSINGLOCALXAI, CORRGROUPS60LOCALXAI, and COMMUNITIESANDCRIMELOCALXAI; details on these datasets are provided in Section C.1. For each...

[72]

Sampling and evaluation. Coalitions $\mathcal{T} \subseteq 2^N$ are sampled and evaluated, yielding the dataset $\mathcal{D} = \{(T, \nu(T))\}_{T \in \mathcal{T}}$.

[73]

Proxy fitting. A gradient-boosted tree model, by default LightGBM, is fitted on $\mathcal{D}$ by minimizing the mean squared error.

[74]

Fourier extraction and truncation. Fourier coefficients are extracted from the fitted tree proxy. ProxySPEX then keeps a minimal subset $\mathcal{C}^\star \subseteq \mathcal{F}$ of coefficients that explains at least 95% of the total squared Fourier mass,

$$\mathcal{C}^\star = \arg\min_{\mathcal{C} \subseteq \mathcal{F}} |\mathcal{C}| \quad \text{s.t.} \quad \frac{\sum_{F \in \mathcal{C}} F^2}{\sum_{F \in \mathcal{F}} F^2} \ge 0.95,$$

where $\mathcal{F}$ denotes the set of Fourier coefficients extracted from the tree.
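The truncation rule above can be illustrated with a minimal sketch. It assumes the coefficients are already available as a dictionary keyed by feature subsets. The function name `truncate_by_energy` and the toy values are hypothetical and not taken from ProxySPEX. Because the objective only counts terms, keeping the largest squared coefficients first yields a minimal subset.

```python
def truncate_by_energy(coeffs, threshold=0.95):
    """Return a smallest set of coefficients whose squared mass reaches `threshold`.

    coeffs: dict mapping a frozenset of feature indices to a Fourier coefficient.
    (Hypothetical container; the source does not specify a data structure.)
    """
    total = sum(c * c for c in coeffs.values())
    if total == 0.0:
        return {}
    # Rank by squared magnitude so the threshold is reached with the fewest terms.
    ranked = sorted(coeffs.items(), key=lambda kv: kv[1] ** 2, reverse=True)
    kept, mass = {}, 0.0
    for subset, value in ranked:
        kept[subset] = value
        mass += value * value
        if mass >= threshold * total:
            break
    return kept


# Toy coefficients (illustrative values only).
toy = {
    frozenset(): 3.0,
    frozenset({0}): 2.0,
    frozenset({1}): 0.5,
    frozenset({0, 1}): -1.0,
}
kept = truncate_by_energy(toy)
```

In this toy case the coefficient on `{1}` carries under 2% of the squared mass, so it is the only term dropped.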

[75]

Limitations

Adjustment. Given the truncated coefficient set $\mathcal{C}^\star$, ProxySPEX applies a refinement step to improve the extracted Fourier coefficients. It constructs a design matrix $X \in \{-1,+1\}^{|\mathcal{T}| \times |\mathcal{C}^\star|}$ with entries $X_{i,j} = (-1)^{|T_i \cap C_j|}$, and solves the regularized regression problem

$$F^\star = \arg\min_{F \in \mathbb{R}^{|\mathcal{C}^\star|}} \lVert \nu - XF \rVert_2^2 + \lambda \lVert F \rVert_2^2.$$

The truncation step is essential for making ...
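As a rough illustration of the adjustment step, the sketch below builds the sign design matrix $X_{i,j} = (-1)^{|T_i \cap C_j|}$ and solves the ridge problem through its normal equations $(X^\top X + \lambda I)F = X^\top \nu$. It is written in plain Python for self-containment and is not the ProxySPEX implementation. The helper names (`design_matrix`, `solve_linear`, `ridge_refit`), the value of `lam`, and the toy game are assumptions made for this example, and `lam` must be positive so that the system is non-singular.

```python
def design_matrix(coalitions, supports):
    """X[i][j] = (-1)^{|T_i ∩ C_j|} for coalitions T_i and coefficient supports C_j."""
    return [[-1.0 if len(T & C) % 2 else 1.0 for C in supports] for T in coalitions]


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def ridge_refit(coalitions, values, supports, lam=1e-3):
    """Minimise ||v - X F||^2 + lam ||F||^2 via (X^T X + lam I) F = X^T v (lam > 0)."""
    X = design_matrix(coalitions, supports)
    k = len(supports)
    gram = [[sum(row[a] * row[b] for row in X) + (lam if a == b else 0.0)
             for b in range(k)] for a in range(k)]
    rhs = [sum(row[a] * v for row, v in zip(X, values)) for a in range(k)]
    return solve_linear(gram, rhs)


# Toy game on two players whose values are generated by three known coefficients.
coalitions = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]
supports = [frozenset(), frozenset({0}), frozenset({0, 1})]
F_true = [1.0, 0.5, -0.25]
X = design_matrix(coalitions, supports)
values = [sum(x * f for x, f in zip(row, F_true)) for row in X]
F_hat = ridge_refit(coalitions, values, supports, lam=1e-9)
```

With all four coalitions observed the columns of `X` are orthogonal, so a very small `lam` recovers the generating coefficients up to a negligible shrinkage. With fewer sampled coalitions the regularizer is what keeps the system well-posed.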

[76]

Guidelines:
• The answer [N/A] means that the paper does not involve crowdsourcing nor research with human subjects

Institutional review board (IRB) approvals or equivalent for research with human subjects

Question: Does the paper describe potential risks incurred by study participants, whether such risks were disclosed to the subjects, and whether Institutional Review Board (IRB) approvals (or an equivalent approval/review based on the requirements of your country or ...