Shapley Value Approximation Based on k-Additive Games

Eyke H\"ullermeier; Guilherme Dean Pelegrina; Patrick Kolpaczki

arxiv: 2502.04763 · v2 · submitted 2025-02-07 · 💻 cs.GT · cs.LG

Shapley Value Approximation Based on k-Additive Games

Guilherme Dean Pelegrina , Patrick Kolpaczki , Eyke H\"ullermeier This is my paper

Pith reviewed 2026-05-23 04:25 UTC · model grok-4.3

classification 💻 cs.GT cs.LG

keywords Shapley valueapproximationk-additive gamescooperative gamesfair divisionfeature attribution

0 comments

The pith

Fitting a k-additive surrogate game yields its exact Shapley values as estimates for the original game.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents SVAkADD, a method that approximates Shapley values by first fitting a simpler k-additive game to the original cooperative game. Because the value function of a k-additive game depends only on coalitions of size at most k, its Shapley values admit an exact closed-form expression that can be computed efficiently. These exact surrogate values are then taken directly as the approximation for the original game. The approach targets the exponential cost of exact Shapley computation in settings such as feature attribution or data valuation. Empirical comparisons with existing approximation techniques are reported to assess performance.

Core claim

The central claim is that a k-additive surrogate game fitted to the original game permits the exact Shapley values of the surrogate to serve as estimates for the Shapley values of the original game, because k-additivity makes the exact values of the surrogate computable in polynomial time.

What carries the argument

The k-additive surrogate game, a cooperative game whose characteristic function is fully determined by interactions among at most k agents and therefore admits a closed-form expression for its Shapley values.

If this is right

The method replaces sampling-based estimates with exact values computed from the surrogate.
Computational cost grows polynomially rather than exponentially in the number of agents once k is fixed.
The same surrogate fitting step can be reused across multiple queries on the same game.
The technique applies directly to both cooperative-game fair division and to feature or data-point attribution in machine learning models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Varying k offers an explicit accuracy-computation trade-off that could be tuned per problem instance.
The fitting step itself may be combined with existing sampling or regression techniques to improve surrogate quality.
Because the surrogate is itself a valid game, its exact values satisfy the same axioms that Shapley values satisfy, which may be useful when axiomatic properties matter more than numerical fidelity to the original game.

Load-bearing premise

The Shapley values of the fitted k-additive surrogate will be close enough to those of the original game for the intended applications.

What would settle it

A set of test games where the difference between the surrogate Shapley values and the true Shapley values remains large even after the fitting procedure for moderate values of k.

Figures

Figures reproduced from arXiv: 2502.04763 by Eyke H\"ullermeier, Guilherme Dean Pelegrina, Patrick Kolpaczki.

**Figure 1.** Figure 1: The sampled coalition values ν(A1), . . . , ν(AT ) from the given game (N, ν) are used to fit a k-additive surrogate game (N, νk) in polynomial time. The Shapley values ϕ k 1 , . . . , ϕk n of (N, νk) are obtained immediately from its k-additive representation. Since νk approximates ν, these serve as estimates of the true Shapley values ϕ1, . . . , ϕn of (N, ν). to the obtained evaluations. However, we can… view at source ↗

**Figure 2.** Figure 2: MSE of SVAkADD averaged over 100 repetitions in dependence of available budget T for different additivity degrees k. Datasets stem from various explanation types: global (a)-(c), local (d)-(f), and unsupervised (g)-(i) with differing player numbers n. for classification and the mean squared error for regression on a test set. For each evaluated coalition a random forest is retrained on a training set. We e… view at source ↗

**Figure 3.** Figure 3: MSE of SVAkADD and competing methods averaged over 100 repetitions in dependence of available budget T. Datasets stem from various explanation types: global (a)-(c), local (d)-(f), and unsupervised (g)-(i) with differing player numbers n. behavior for low k, specifically k = 1, by the model’s inability to achieve a good fit due to missing flexibility. As a result, the convergence to the exact Shapley valu… view at source ↗

**Figure 4.** Figure 4: MSE of SVAkADD and competing methods averaged over 100 repetitions in dependence of available sample budget T. Datasets stem from various explanation types (i) global (first row), (ii) local (second row), and unsupervised (third row) with differing player numbers n. 20 [PITH_FULL_IMAGE:figures/full_fig_p020_4.png] view at source ↗

read the original abstract

The Shapley value is the prevalent solution for fair division problems in which a payout is to be divided among multiple agents. By adopting a game-theoretic view, the idea of fair division and the Shapley value can also be used in machine learning to quantify the individual contribution of features or data points to the performance of a predictive model. Despite its popularity and axiomatic justification, the Shapley value suffers from a computational complexity that scales exponentially with the number of entities involved, and hence requires approximation methods for its reliable estimation. We propose SVA$k_{\text{ADD}}$, a novel approximation method that fits a $k$-additive surrogate game. By taking advantage of $k$-additivity, we are able to elicit the exact Shapley values of the surrogate game and then use these values as estimates for the original fair division problem. The efficacy of our method is evaluated empirically and compared to competing methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes SVAkADD, a Shapley value approximation method that constructs a k-additive surrogate game by fitting it to the original value function v, then computes the exact Shapley values of the surrogate (which are tractable due to k-additivity) and uses them as estimates for the original game. Efficacy is assessed via empirical comparisons against existing approximation techniques on synthetic and real-world instances.

Significance. If the empirical results are robust, the method offers a structurally motivated alternative to sampling-based or Monte Carlo approximations by exploiting the closed-form Shapley formula available for k-additive games. This could be useful in machine-learning settings where the number of features or data points is moderate but exact computation remains prohibitive, provided the fitting step remains efficient and the approximation error is controlled.

major comments (2)

[§3] §3 (Fitting procedure): the optimization problem used to obtain the k-additive coefficients of the surrogate is load-bearing for both correctness and complexity claims; the manuscript must state the precise objective (e.g., least-squares over coalitions, with or without regularization) and its algorithmic complexity, because without this the claim that the overall procedure is polynomial-time cannot be verified.
[§5] §5 (Empirical evaluation): the reported error metrics are not accompanied by an ablation on the hyper-parameter k or on the number of coalitions used in fitting; without these controls it is impossible to assess whether the reported superiority over baselines is stable or an artifact of particular k choices.

minor comments (2)

[Abstract / §2] Notation for the surrogate game should be introduced once and used consistently (the abstract writes k-ADD while the body appears to use different subscripting).
[§5] Table captions and axis labels in the experimental figures should explicitly state the value of k employed for each SVAkADD run.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and the recommendation for minor revision. We address each major comment below.

read point-by-point responses

Referee: [§3] §3 (Fitting procedure): the optimization problem used to obtain the k-additive coefficients of the surrogate is load-bearing for both correctness and complexity claims; the manuscript must state the precise objective (e.g., least-squares over coalitions, with or without regularization) and its algorithmic complexity, because without this the claim that the overall procedure is polynomial-time cannot be verified.

Authors: We agree that additional detail is needed. The fitting procedure minimizes the squared error between the k-additive surrogate and the original value function v over a collection of coalitions (least-squares objective with no regularization). The number of variables equals the number of k-additive terms, which is O(n^k) and therefore polynomial for fixed k. The resulting linear system is solved via standard methods whose complexity is polynomial in the number of variables and the number of sampled coalitions. In the revision we will insert the exact objective function, the closed-form expression for the coefficients when possible, and the explicit polynomial-time bound into §3. revision: yes
Referee: [§5] §5 (Empirical evaluation): the reported error metrics are not accompanied by an ablation on the hyper-parameter k or on the number of coalitions used in fitting; without these controls it is impossible to assess whether the reported superiority over baselines is stable or an artifact of particular k choices.

Authors: We accept that the current experiments lack these controls. The revised manuscript will add two new figures: one showing approximation error as a function of k (for k = 1 … 4) on the same benchmark instances, and one showing error versus the number of coalitions supplied to the fitting routine. These ablations will confirm that the reported gains remain stable across reasonable choices of k and sample size. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper proposes a constructive approximation method: fit a k-additive surrogate game to the original value function, then compute its exact Shapley values (tractable due to k-additivity) and use them as estimates. This does not reduce by construction to the original Shapley values or any fitted input renamed as prediction. No self-citation is load-bearing for the central claim, no uniqueness theorem is imported from the authors, and no ansatz is smuggled via citation. The method is evaluated empirically against competitors, making the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The fitting procedure for the surrogate game is implied but not detailed, so no ledger entries can be extracted.

pith-pipeline@v0.9.0 · 5692 in / 1058 out tokens · 20057 ms · 2026-05-23T04:25:42.690713+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Shapley Regression for Rare Disease Diagnosis Support: a case study on APDS
cs.LG 2026-05 unverdicted novelty 5.0

Shapley regression replaces the linear predictor in logistic regression with a k-additive cooperative game to detect APDS and other rare diseases from symptom data while remaining transparent.

Reference graph

Works this paper leans on

14 extracted references · 14 canonical work pages · cited by 1 Pith paper · 1 internal anchor

[1]

Cai, W., Kordabad, A

doi: 10.3390/app13042038. Cai, W., Kordabad, A. B., and Gros, S. Energy manage- ment in residential microgrid using model predictive control-based reinforcement learning and Shapley value. Engineering Applications of Artificial Intelligence, 119 (January):105793,

work page doi:10.3390/app13042038
[2]

doi: 10.1016/j.engappai.2022. 105793. Castro, J., G´omez, D., and Tejada, J. Polynomial calculation of the shapley value based on sampling. Computers & Operations Research, 36(5):1726–1730,

work page doi:10.1016/j.engappai.2022 2022
[3]

B., Dror, G., and Ruppin, E

Cohen, S. B., Dror, G., and Ruppin, E. Feature selection via coalitional game theory. Neural Comput., 19(7):1939– 1961,

work page 1939
[4]

and Rastegar, M

Ebrahimi, M. and Rastegar, M. Towards an interpretable data-driven switch placement model in electric power dis- tribution systems: An explainable artificial intelligence- based approach. Engineering Applications of Artifi- cial Intelligence, 129(March 2022):107637,

work page 2022
[5]

Fiestras-Janeiro, M

doi: 10.1016/j.engappai.2023.107637. Fiestras-Janeiro, M. G., Garc ´ıa-Jurado, I., Meca, A., and Mosquera, M. A. Cooperative game theory and inventory management. European Journal of Operational Research, 210:459–466,

work page doi:10.1016/j.engappai.2023.107637 2023
[6]

Fumagalli, F., Muschalik, M., Kolpaczki, P., H¨ullermeier, E., and Hammer, B

doi: 10.1016/j.ejor.2010.06.025. Fumagalli, F., Muschalik, M., Kolpaczki, P., H¨ullermeier, E., and Hammer, B. SHAP-IQ: unified approximation of any- order shapley interactions. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS),

work page doi:10.1016/j.ejor.2010.06.025 2010
[7]

Bounding the Estimation Error of Sampling-based Shapley Value Approximation

Maleki, S., Tran-Thanh, L., Hines, G., Rahwan, T., and Rogers, A. Bounding the estimation error of sampling- based shapley value approximation with/without stratify- ing. CoRR, abs/1306.4265,

work page internal anchor Pith review Pith/arXiv arXiv
[8]

Marc´ılio, W. E. and Eler, D. M. From explanations to feature selection: assessing shap values as feature selection mech- anism. In 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI) , pp. 340–347,

work page 2020
[9]

Mitchell, R., Cooper, J., Frank, E., and Holmes, G

doi: 10.1109/SIBGRAPI51738.2020.00053. Mitchell, R., Cooper, J., Frank, E., and Holmes, G. Sam- pling permutations for shapley value estimation. Journal of Machine Learning Research, 23(43):1–46,

work page doi:10.1109/sibgrapi51738.2020.00053 2020
[10]

F., Hussain, O

Nimmy, S. F., Hussain, O. K., Chakrabortty, R. K., Hussain, F. K., and Saberi, M. Interpreting the antecedents of 10 Shapley Value Approximation Based on k-Additive Games a predicted output by capturing the interdependencies among the system features and their evolution over time. Engineering Applications of Artificial Intelligence, 117 (November 2022):105596,

work page 2022
[11]

2022.105596

doi: 10.1016/j.engappai. 2022.105596. Okhrati, R. and Lipani, A. A multilinear sampling algo- rithm to estimate shapley values. In 25th International Conference on Pattern Recognition ICPR, pp. 7992–7999,

work page doi:10.1016/j.engappai 2022
[12]

Pelegrina, G

doi: 10.1109/TAI.2024.3365082. Pelegrina, G. D., Duarte, L. T., Grabisch, M., and Romano, J. M. T. The multilinear model in multicriteria decision making: The case of 2-additive capacities and contribu- tions to parameter identification. European Journal of Operational Research, 282,

work page doi:10.1109/tai.2024.3365082 2024
[13]

D., Duarte, L

Pelegrina, G. D., Duarte, L. T., and Grabisch, M. A k- additive choquet integral-based approach to approximate the SHAP values for local interpretability in machine learning. Artificial Intelligence, 325:104014, 2023a. Pelegrina, G. D., Duarte, L. T., and Grabisch, M. In- terpreting the contribution of sensors in blind source extraction by means of Shaple...

work page doi:10.1109/lsp.2023.3295759 2023
[14]

, nas random variables X1,

proposed to view the features 1, . . . , nas random variables X1, . . . , Xn such that the datapoints are realizations of their joint distribution. Next, the worth of a coalition S is given by their total correlation ν(S) = X i∈S H(Xi) − H(S) where H(Xi) denotes the Shannon entropy of Xi and H(S) the contained random variables joint Shannon entropy. The u...

work page 2009

[1] [1]

Cai, W., Kordabad, A

doi: 10.3390/app13042038. Cai, W., Kordabad, A. B., and Gros, S. Energy manage- ment in residential microgrid using model predictive control-based reinforcement learning and Shapley value. Engineering Applications of Artificial Intelligence, 119 (January):105793,

work page doi:10.3390/app13042038

[2] [2]

doi: 10.1016/j.engappai.2022. 105793. Castro, J., G´omez, D., and Tejada, J. Polynomial calculation of the shapley value based on sampling. Computers & Operations Research, 36(5):1726–1730,

work page doi:10.1016/j.engappai.2022 2022

[3] [3]

B., Dror, G., and Ruppin, E

Cohen, S. B., Dror, G., and Ruppin, E. Feature selection via coalitional game theory. Neural Comput., 19(7):1939– 1961,

work page 1939

[4] [4]

and Rastegar, M

Ebrahimi, M. and Rastegar, M. Towards an interpretable data-driven switch placement model in electric power dis- tribution systems: An explainable artificial intelligence- based approach. Engineering Applications of Artifi- cial Intelligence, 129(March 2022):107637,

work page 2022

[5] [5]

Fiestras-Janeiro, M

doi: 10.1016/j.engappai.2023.107637. Fiestras-Janeiro, M. G., Garc ´ıa-Jurado, I., Meca, A., and Mosquera, M. A. Cooperative game theory and inventory management. European Journal of Operational Research, 210:459–466,

work page doi:10.1016/j.engappai.2023.107637 2023

[6] [6]

Fumagalli, F., Muschalik, M., Kolpaczki, P., H¨ullermeier, E., and Hammer, B

doi: 10.1016/j.ejor.2010.06.025. Fumagalli, F., Muschalik, M., Kolpaczki, P., H¨ullermeier, E., and Hammer, B. SHAP-IQ: unified approximation of any- order shapley interactions. In Proceedings of Advances in Neural Information Processing Systems (NeurIPS),

work page doi:10.1016/j.ejor.2010.06.025 2010

[7] [7]

Bounding the Estimation Error of Sampling-based Shapley Value Approximation

Maleki, S., Tran-Thanh, L., Hines, G., Rahwan, T., and Rogers, A. Bounding the estimation error of sampling- based shapley value approximation with/without stratify- ing. CoRR, abs/1306.4265,

work page internal anchor Pith review Pith/arXiv arXiv

[8] [8]

Marc´ılio, W. E. and Eler, D. M. From explanations to feature selection: assessing shap values as feature selection mech- anism. In 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI) , pp. 340–347,

work page 2020

[9] [9]

Mitchell, R., Cooper, J., Frank, E., and Holmes, G

doi: 10.1109/SIBGRAPI51738.2020.00053. Mitchell, R., Cooper, J., Frank, E., and Holmes, G. Sam- pling permutations for shapley value estimation. Journal of Machine Learning Research, 23(43):1–46,

work page doi:10.1109/sibgrapi51738.2020.00053 2020

[10] [10]

F., Hussain, O

Nimmy, S. F., Hussain, O. K., Chakrabortty, R. K., Hussain, F. K., and Saberi, M. Interpreting the antecedents of 10 Shapley Value Approximation Based on k-Additive Games a predicted output by capturing the interdependencies among the system features and their evolution over time. Engineering Applications of Artificial Intelligence, 117 (November 2022):105596,

work page 2022

[11] [11]

2022.105596

doi: 10.1016/j.engappai. 2022.105596. Okhrati, R. and Lipani, A. A multilinear sampling algo- rithm to estimate shapley values. In 25th International Conference on Pattern Recognition ICPR, pp. 7992–7999,

work page doi:10.1016/j.engappai 2022

[12] [12]

Pelegrina, G

doi: 10.1109/TAI.2024.3365082. Pelegrina, G. D., Duarte, L. T., Grabisch, M., and Romano, J. M. T. The multilinear model in multicriteria decision making: The case of 2-additive capacities and contribu- tions to parameter identification. European Journal of Operational Research, 282,

work page doi:10.1109/tai.2024.3365082 2024

[13] [13]

D., Duarte, L

Pelegrina, G. D., Duarte, L. T., and Grabisch, M. A k- additive choquet integral-based approach to approximate the SHAP values for local interpretability in machine learning. Artificial Intelligence, 325:104014, 2023a. Pelegrina, G. D., Duarte, L. T., and Grabisch, M. In- terpreting the contribution of sensors in blind source extraction by means of Shaple...

work page doi:10.1109/lsp.2023.3295759 2023

[14] [14]

, nas random variables X1,

proposed to view the features 1, . . . , nas random variables X1, . . . , Xn such that the datapoints are realizations of their joint distribution. Next, the worth of a coalition S is given by their total correlation ν(S) = X i∈S H(Xi) − H(S) where H(Xi) denotes the Shannon entropy of Xi and H(S) the contained random variables joint Shannon entropy. The u...

work page 2009