Dynamic Shapley Computation

Hsi-Wen Chen; Jian Pei; Ming-Syan Chen; Xuan Yang

REVIEW 2 major objections 2 minor 1 cited by

Reviewed by Pith at T0; open to challenge.

T0 means a machine referee read the full paper against a public rubric. The mark states how deep the mechanical check went, never who wrote it. the ladder, T0–T4 →

Challenge this review Re-run · record.json Download PDF Read on arXiv ↗

T0 review · grok-4.3

D-Shap maintains Shapley values in a player-by-task matrix to enable fast local updates for changing tasks and players.

2026-05-21 06:34 UTC pith:LBSYPSSQ

load-bearing objection D-Shap reframes Shapley computation as player-by-task matrix maintenance and uses locality for cheap updates, but the whole thing rests on assumptions about sparse task dependencies and valuation similarity that lack error bounds. the 2 major comments →

arxiv 2605.20620 v1 pith:LBSYPSSQ submitted 2026-05-20 cs.LG cs.DBcs.GT

Dynamic Shapley Computation

Xuan Yang , Hsi-Wen Chen , Ming-Syan Chen , Jian Pei This is my paper

classification cs.LG cs.DBcs.GT

keywords Shapley valuedata valuationdynamic computationmatrix maintenanceutility localitycoalition localitymachine learningtraining data

verification ladder T0 review T1 audit T2 compute T3 formal T4 reserved

The pith

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to make Shapley-based data valuation feasible in dynamic environments where tasks and training data change frequently. It does this by representing values in a matrix and using the fact that tasks involve few players and similar tasks have similar values to update only small portions. Full recomputation is avoided for new tasks through interpolation and for new players through block updates. This matters because existing methods require complete recalculation each time, which is too slow for practical use in evolving machine learning systems. The result is millisecond task updates and up to a thousand-fold reduction in player update costs with comparable accuracy.

Core claim

By representing Shapley values as a player-by-task matrix and exploiting utility locality and coalition locality, D-Shap performs new task valuations via structure-aware interpolation and confines player-induced updates to affected local matrix blocks, while self-valuation constructs the initial matrix from training data using subset reuse and coverage-aware anchor selection.

What carries the argument

The player-by-task matrix that stores contributions and the exploitation of utility locality and coalition locality to support structured matrix maintenance through interpolation and local block updates.

Load-bearing premise

The method relies on each task depending on only a small subset of training players and similar tasks producing similar valuations so that local updates and interpolation do not lose much accuracy.

What would settle it

Measuring the change in Shapley values for a new task across many players outside its small subset, or checking if interpolation accuracy drops below competitive levels when task similarity is low, would test if the locality assumptions hold in practice.

Watch this falsifier — get emailed when new claim-graph text bears on it.

If this is right

Task updates complete in milliseconds instead of full recomputation.
Player updates reduce computational cost by up to three orders of magnitude.
Valuation quality stays competitive with full recomputation across various models.
Self-valuation allows starting without predefined evaluation tasks using scalable subset reuse.
Updates remain confined to small affected portions of the matrix.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar locality-based maintenance could apply to other data contribution measures in dynamic settings.
The approach might support continuous learning pipelines where data valuation needs to adapt in real time.
Extending the matrix structure could handle simultaneous changes in both tasks and players more seamlessly.
Applications in distributed training could use this to dynamically adjust data contributions without heavy overhead.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes D-Shap, a dynamic Shapley computation framework for data valuation. It models Shapley values as a player-by-task matrix and exploits utility locality (tasks depend on small player subsets) and coalition locality (similar tasks yield similar valuations) to support efficient updates: new-task valuations via structure-aware interpolation and new-player updates via local matrix blocks. Self-valuation is introduced to build the initial matrix from training data using subset reuse and coverage-aware anchors. Experiments across models claim millisecond-scale task updates, up to three orders of magnitude reduction in player-update cost, and valuation quality competitive with full recomputation.

Significance. If the locality assumptions hold with bounded error, the matrix-maintenance perspective could make principled Shapley valuation practical in evolving ML pipelines (e.g., continual learning or data markets). The explicit separation of task and player updates via structured locality is a constructive algorithmic contribution.

major comments (2)

[§3] §3 (D-Shap framework): The central efficiency and quality claims rest on utility locality and coalition locality enabling accurate interpolation and local block updates, yet the manuscript provides no explicit error bounds, sensitivity analysis, or characterization of how task similarity is quantified or when the 'small subset' assumption breaks. This leaves the 'competitive with full recomputation' guarantee dependent on unverified empirical behavior rather than controlled approximation guarantees.
[§5] §5 (Experiments): The abstract asserts 'valuation quality competitive with full recomputation' and 'up to three orders of magnitude' speedups, but without reported quantitative metrics, error bars, baseline definitions, or ablation on locality strength, the empirical support for the load-bearing quality claim cannot be assessed from the provided description.

minor comments (2)

[Abstract] Abstract: 'diverse models' is stated without naming the models, datasets, or exact locality metrics used, which would aid reproducibility.
[§3] Notation: The transition from exact Shapley values to the interpolated matrix entries could be clarified with a small illustrative example early in the method section.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important opportunities to strengthen the theoretical grounding and empirical presentation of D-Shap. We address each major comment below and describe the revisions we will incorporate.

read point-by-point responses

Referee: [§3] §3 (D-Shap framework): The central efficiency and quality claims rest on utility locality and coalition locality enabling accurate interpolation and local block updates, yet the manuscript provides no explicit error bounds, sensitivity analysis, or characterization of how task similarity is quantified or when the 'small subset' assumption breaks. This leaves the 'competitive with full recomputation' guarantee dependent on unverified empirical behavior rather than controlled approximation guarantees.

Authors: We agree that the absence of explicit error bounds leaves the approximation claims reliant on empirical validation. In the revised manuscript we will add a dedicated subsection to §3 that derives approximation error bounds for the structure-aware interpolation under the stated locality assumptions. Task similarity will be formalized via a kernel on task feature vectors, and we will characterize the regime in which the small-subset assumption yields bounded error. A sensitivity analysis with respect to subset size and locality strength will also be included. revision: yes
Referee: [§5] §5 (Experiments): The abstract asserts 'valuation quality competitive with full recomputation' and 'up to three orders of magnitude' speedups, but without reported quantitative metrics, error bars, baseline definitions, or ablation on locality strength, the empirical support for the load-bearing quality claim cannot be assessed from the provided description.

Authors: Section 5 of the full manuscript already contains tables reporting mean absolute error versus full recomputation, wall-clock times with standard deviations over repeated runs, and explicit baseline definitions. Nevertheless, we acknowledge that an ablation isolating the contribution of locality strength is missing. We will add this ablation study, varying subset size and task-similarity threshold, and will ensure all quantitative metrics, error bars, and baseline specifications are summarized in the abstract and results section of the revision. revision: yes

Circularity Check

0 steps flagged

No circularity: algorithmic framework rests on stated locality assumptions and empirical validation

full rationale

The derivation presents Shapley values as a player-by-task matrix and introduces D-Shap updates via structure-aware interpolation and local block modifications, justified by the explicit assumptions of utility locality (small player subsets per task) and coalition locality (similar tasks yield similar valuations). These assumptions are invoked to enable efficient maintenance rather than derived from the method itself. Performance claims (millisecond task updates, up to 1000x player-update cost reduction, competitive quality) are positioned as experimental outcomes against full recomputation baselines, with no equations or steps that rename fitted parameters as predictions or reduce results to self-citations by construction. The initial self-valuation step is described as a construction technique using subset reuse, not a circular re-derivation of the target values.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on the existence of utility locality and coalition locality in real tasks and on the ability of structure-aware interpolation to preserve valuation quality; no free parameters or new entities are explicitly introduced in the abstract.

axioms (1)

domain assumption Tasks depend on small subsets of players and similar tasks produce similar Shapley vectors (utility and coalition locality).
Invoked to justify local block updates and interpolation; location: abstract paragraph describing the two locality properties.

pith-pipeline@v0.9.0 · 5756 in / 1308 out tokens · 41309 ms · 2026-05-21T06:34:26.675551+00:00 · methodology

0 comments

read the original abstract

Shapley-based data valuation provides a principled way to quantify the contribution of training data, but its high computational cost makes it impractical in dynamic settings where tasks and training players evolve. Existing methods treat Shapley computation as a one-shot process and collapse contributions into aggregated scores, preventing reuse and requiring recomputation under any change. We introduce a new perspective that represents Shapley values as a player-by-task matrix and formulates dynamic valuation as a structured matrix maintenance problem. We exploit the fact that each task depends on a small subset of training players and that similar tasks yield similar valuations, leading to utility locality and coalition locality. Based on these insights, we propose D-Shap, a dynamic valuation framework that enables efficient updates by modifying only a small portion of the matrix: new task valuations are inferred via structure-aware interpolation, while updates induced by new players are confined to affected local matrix blocks. To eliminate the need for pre-specified evaluation tasks, we introduce self-valuation, which constructs the initial matrix directly from training data, supported by scalable subset reuse and coverage-aware anchor selection. Experiments across diverse models show that D-Shap performs task updates in milliseconds and reduces the cost of player updates by up to three orders of magnitude, while achieving valuation quality competitive with full recomputation.

Figures

Figures reproduced from arXiv: 2605.20620 by Hsi-Wen Chen, Jian Pei, Ming-Syan Chen, Xuan Yang.

**Figure 1.** Figure 1: Overview of D-Shap. (a) D-Shap represents Shapley values as a player-by-task matrix Φ, where Φi,j = ϕ(zi , tj ) denotes player zi’s contribution to task tj . (b) Task-incremental update (Sec. 3.3): a new task t ′ is valued by interpolating nearby columns (green) under the model-induced distance dΓ. (c) Player-incremental update (Sec. 3.4): a new player z ′ affects only columns whose local games involve it … view at source ↗

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We exploit the fact that each task depends on a small subset of training players and that similar tasks yield similar valuations, leading to utility locality and coalition locality... new task valuations are inferred via structure-aware interpolation, while updates induced by new players are confined to affected local matrix blocks.
IndisputableMonolith/Cost/FunctionalEquation washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Proposition 1 (Utility locality). Under Assumption 1... ∥ϕ(t)−ϕ(t′)∥∞ ≤ 2LΓ dΓ(t,t′). Theorem 1 (Interpolation error)... ≤ 2LΓ ε.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Framework for Graph-Conditioned Hierarchical Shapley Attribution in Patent Valuation
cs.GT 2026-06 unverdicted novelty 6.0

PatentXAI approximates Shapley profit attributions for patents by limiting coalitions to Markov blankets in a knowledge graph and uses hierarchical allocation, with synthetic scaling experiments showing approximation ...

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages · cited by 1 Pith paper

[1]

Adebayo, M

J. Adebayo, M. Muelly, I. Liccardi, and B. Kim. Debugging tests for model explanations.arXiv preprint arXiv:2011.05429, 2020

work page arXiv 2011
[2]

Agarwal, M

A. Agarwal, M. Dahleh, and T. Sarkar. A marketplace for data: An algorithmic solution. In Proceedings of the 2019 ACM Conference on Economics and Computation, pages 701–726, 2019

work page 2019
[3]

S. Basu, P. Pope, and S. Feizi. Influence functions in deep learning are fragile.arXiv preprint arXiv:2006.14651, 2020

work page arXiv 2006
[4]

Bhardwaj, H

E. Bhardwaj, H. Gujral, S. Wu, C. Zogheib, T. Maharaj, and C. Becker. The state of data curation at neurips: An assessment of dataset development practices in the datasets and benchmarks track.Advances in Neural Information Processing Systems, 37:53626–53648, 2024

work page 2024
[5]

Bousquet and A

O. Bousquet and A. Elisseeff. Stability and generalization.Journal of machine learning research, 2(Mar):499–526, 2002

work page 2002
[6]

Breiman, J

L. Breiman, J. Friedman, R. A. Olshen, and C. J. Stone.Classification and Regression Trees. Wadsworth International Group, 1984

work page 1984
[7]

Castro, D

J. Castro, D. Gómez, and J. Tejada. Polynomial calculation of the shapley value based on sampling.Computers & operations research, 36(5):1726–1730, 2009

work page 2009
[8]

Cortes and V

C. Cortes and V . Vapnik. Support-vector networks.Machine learning, 20(3):273–297, 1995

work page 1995
[9]

Covert, C

I. Covert, C. Kim, S.-I. Lee, J. Zou, and T. Hashimoto. Stochastic amortization: A unified approach to accelerate feature and data attribution.Advances in Neural Information Processing Systems, 37:4374–4423, 2024

work page 2024
[10]

De Lange, R

M. De Lange, R. Aljundi, M. Masana, S. Parisot, X. Jia, A. Leonardis, G. Slabaugh, and T. Tuytelaars. A continual learning survey: Defying forgetting in classification tasks.IEEE transactions on pattern analysis and machine intelligence, 44(7):3366–3385, 2021

work page 2021
[11]

Deng and C

X. Deng and C. H. Papadimitriou. On the complexity of cooperative solution concepts.Mathe- matics of operations research, 19(2):257–266, 1994

work page 1994
[12]

S. Dudani. The distance-weighted k-nearest-neighbor rule.IEEE Transactions on Systems, Man, and Cybernetics, SMC-6(4):325–327, 1976

work page 1976
[13]

Duval and F

A. Duval and F. D. Malliaros. Graphsvx: Shapley value explanations for graph neural networks. InJoint European conference on machine learning and knowledge discovery in databases, pages 302–318. Springer, 2021

work page 2021
[14]

R. A. Fisher. The use of multiple measurements in taxonomic problems.Annals of Eugenics, 7(2):179–188, 1936

work page 1936
[15]

Ghorbani, M

A. Ghorbani, M. Kim, and J. Zou. A distributional framework for data valuation. InInternational Conference on Machine Learning, pages 3535–3544. PMLR, 2020

work page 2020
[16]

Ghorbani and J

A. Ghorbani and J. Zou. Data shapley: Equitable valuation of data for machine learning. In International conference on machine learning, pages 2242–2251. PMLR, 2019

work page 2019
[17]

Ghorbani, J

A. Ghorbani, J. Zou, and A. Esteva. Data shapley valuation for efficient batch active learning. In2022 56th asilomar conference on signals, systems, and computers, pages 1456–1462. IEEE, 2022

work page 2022
[18]

T. F. Gonzalez. Clustering to minimize the maximum intercluster distance.Theoretical computer science, 38:293–306, 1985

work page 1985
[19]

Hamilton, Z

W. Hamilton, Z. Ying, and J. Leskovec. Inductive representation learning on large graphs. Advances in neural information processing systems, 30, 2017. 10

work page 2017
[20]

Jethani, M

N. Jethani, M. Sudarshan, I. C. Covert, S.-I. Lee, and R. Ranganath. Fastshap: Real-time shapley value estimation. InInternational conference on learning representations, 2021

work page 2021
[21]

R. Jia, D. Dao, B. Wang, F. A. Hubis, N. M. Gurel, B. Li, C. Zhang, C. J. Spanos, and D. Song. Efficient task-specific data valuation for nearest neighbor algorithms.arXiv preprint arXiv:1908.08619, 2019

work page arXiv 1908
[22]

R. Jia, D. Dao, B. Wang, F. A. Hubis, N. Hynes, N. M. Gürel, B. Li, C. Zhang, D. Song, and C. J. Spanos. Towards efficient data valuation based on the shapley value. InThe 22nd international conference on artificial intelligence and statistics, pages 1167–1176. PMLR, 2019

work page 2019
[23]

Jiang, W

K. Jiang, W. Liang, J. Y . Zou, and Y . Kwon. Opendataval: a unified benchmark for data valuation.Advances in Neural Information Processing Systems, 36:28624–28647, 2023

work page 2023
[24]

T. N. Kipf and M. Welling. Semi-supervised classification with graph convolutional networks. InInternational Conference on Learning Representations (ICLR), 2017

work page 2017
[25]

P. W. Koh and P. Liang. Understanding black-box predictions via influence functions. In International conference on machine learning, pages 1885–1894. PMLR, 2017

work page 2017
[26]

LeCun, L

Y . LeCun, L. Bottou, Y . Bengio, and P. Haffner. Gradient-based learning applied to document recognition.Proceedings of the IEEE, 86(11):2278–2324, 1998

work page 1998
[27]

J. Liu, J. Lou, J. Liu, L. Xiong, J. Pei, and J. Sun. Dealer: An end-to-end model marketplace with differential privacy.Proceedings of the VLDB Endowment, 14(6), 2021

work page 2021
[28]

Z. Liu, Y . Chen, H. Yu, Y . Liu, and L. Cui. Gtg-shapley: Efficient and accurate participant contribution evaluation in federated learning.ACM Transactions on intelligent Systems and Technology (TIST), 13(4):1–21, 2022

work page 2022
[29]

S. M. Lundberg, G. Erion, H. Chen, A. DeGrave, J. M. Prutkin, B. Nair, R. Katz, J. Himmelfarb, N. Bansal, and S.-I. Lee. From local explanations to global understanding with explainable ai for trees.Nature machine intelligence, 2(1):56–67, 2020

work page 2020
[30]

S. M. Lundberg and S.-I. Lee. A unified approach to interpreting model predictions.Advances in neural information processing systems, 30, 2017

work page 2017
[31]

Mairal, J

J. Mairal, J. Ponce, G. Sapiro, A. Zisserman, and F. Bach. Supervised dictionary learning. Advances in neural information processing systems, 21, 2008

work page 2008
[32]

A. K. McCallum, K. Nigam, J. Rennie, and K. Seymore. Automating the construction of internet portals with machine learning. InInformation Retrieval, volume 3, pages 127–163. Springer, 2000

work page 2000
[33]

Mirzasoleiman, J

B. Mirzasoleiman, J. Bilmes, and J. Leskovec. Coresets for data-efficient training of machine learning models. InInternational Conference on Machine Learning, pages 6950–6960. PMLR, 2020

work page 2020
[34]

M. J. Osborne and A. Rubinstein.A course in game theory. MIT press, 1994

work page 1994
[35]

G. I. Parisi, R. Kemker, J. L. Part, C. Kanan, and S. Wermter. Continual lifelong learning with neural networks: A review.Neural networks, 113:54–71, 2019

work page 2019
[36]

S. M. Park, K. Georgiev, A. Ilyas, G. Leclerc, and A. M ˛ adry. Trak: attributing model behavior at scale. InProceedings of the 40th International Conference on Machine Learning, pages 27074–27113, 2023

work page 2023
[37]

Shalev-Shwartz, O

S. Shalev-Shwartz, O. Shamir, N. Srebro, and K. Sridharan. Learnability, stability and uniform convergence.The Journal of Machine Learning Research, 11:2635–2670, 2010

work page 2010
[38]

N. Z. Shapiro and L. S. Shapley. Values of large games, i: A limit theorem.Mathematics of Operations Research, 3(1):1–9, 1978

work page 1978
[39]

L. S. Shapley et al. A value for n-person games. 1953. 11

work page 1953
[40]

W. N. Street, W. H. Wolberg, and O. L. Mangasarian. Nuclear feature extraction for breast tumor diagnosis.IS&T/SPIE 1993 International Symposium on Electronic Imaging: Science and Technology, 1905:861–870, 1993

work page 1993
[41]

H. Sun, Y . Xiong, R. Wu, X. Cai, C. Fan, L. Zhang, and X.-Y . Li. Fast-datashapley: Neural modeling for training data valuation. InProceedings of the Nineteenth ACM International Conference on Web Search and Data Mining, pages 607–617, 2026

work page 2026
[42]

Q. Sun, J. Zhang, J. Liu, L. Xiong, J. Pei, and K. Ren. Shapley value approximation based on complementary contribution.IEEE Transactions on Knowledge and Data Engineering, 2024

work page 2024
[43]

J. T. Wang and R. Jia. Data banzhaf: A robust data valuation framework for machine learning. InInternational conference on artificial intelligence and statistics, pages 6388–6421. PMLR, 2023

work page 2023
[44]

J. T. Wang, P. Mittal, D. Song, and R. Jia. Data shapley in one training run. InThe Thirteenth International Conference on Learning Representations

work page
[45]

H. Xia, J. Zhang, Q. Sun, J. Liu, K. Ren, L. Xiong, and J. Pei. Computing shapley values for dynamic data.IEEE Transactions on Knowledge and Data Engineering, 37(6):3253–3271, 2025

work page 2025
[46]

K. Xu, C. Li, Y . Tian, T. Sonobe, K.-i. Kawarabayashi, and S. Jegelka. Representation learning on graphs with jumping knowledge networks. InInternational conference on machine learning, pages 5453–5462. pmlr, 2018

work page 2018
[47]

Yang, H.-W

X. Yang, H.-W. Chen, M.-S. Chen, and J. Pei. Local shapley: Model-induced locality and optimal reuse in data valuation.arXiv preprint arXiv:2603.03672, 2026

work page arXiv 2026
[48]

C.-K. Yeh, J. Kim, I. E.-H. Yen, and P. K. Ravikumar. Representer point selection for explaining deep neural networks.Advances in neural information processing systems, 31, 2018

work page 2018
[49]

H. Yuan, H. Yu, J. Wang, K. Li, and S. Ji. On explainability of graph neural networks via subgraph explorations. InInternational conference on machine learning, pages 12241–12252. PMLR, 2021

work page 2021
[50]

Zhang, H

J. Zhang, H. Xia, Q. Sun, J. Liu, L. Xiong, J. Pei, and K. Ren. Dynamic shapley value computation. In2023 IEEE 39th International Conference on Data Engineering (ICDE), pages 639–652. IEEE, 2023. 12 A Notations Table 6: Summary of notation. Notation Description D={z i}n i=1 Training players, also called players T={t j}m j=1 Evaluation tasks zi Thei-th tra...

work page 2023

[1] [1]

Adebayo, M

J. Adebayo, M. Muelly, I. Liccardi, and B. Kim. Debugging tests for model explanations.arXiv preprint arXiv:2011.05429, 2020

work page arXiv 2011

[2] [2]

Agarwal, M

A. Agarwal, M. Dahleh, and T. Sarkar. A marketplace for data: An algorithmic solution. In Proceedings of the 2019 ACM Conference on Economics and Computation, pages 701–726, 2019

work page 2019

[3] [3]

S. Basu, P. Pope, and S. Feizi. Influence functions in deep learning are fragile.arXiv preprint arXiv:2006.14651, 2020

work page arXiv 2006

[4] [4]

Bhardwaj, H

E. Bhardwaj, H. Gujral, S. Wu, C. Zogheib, T. Maharaj, and C. Becker. The state of data curation at neurips: An assessment of dataset development practices in the datasets and benchmarks track.Advances in Neural Information Processing Systems, 37:53626–53648, 2024

work page 2024

[5] [5]

Bousquet and A

O. Bousquet and A. Elisseeff. Stability and generalization.Journal of machine learning research, 2(Mar):499–526, 2002

work page 2002

[6] [6]

Breiman, J

L. Breiman, J. Friedman, R. A. Olshen, and C. J. Stone.Classification and Regression Trees. Wadsworth International Group, 1984

work page 1984

[7] [7]

Castro, D

J. Castro, D. Gómez, and J. Tejada. Polynomial calculation of the shapley value based on sampling.Computers & operations research, 36(5):1726–1730, 2009

work page 2009

[8] [8]

Cortes and V

C. Cortes and V . Vapnik. Support-vector networks.Machine learning, 20(3):273–297, 1995

work page 1995

[9] [9]

Covert, C

I. Covert, C. Kim, S.-I. Lee, J. Zou, and T. Hashimoto. Stochastic amortization: A unified approach to accelerate feature and data attribution.Advances in Neural Information Processing Systems, 37:4374–4423, 2024

work page 2024

[10] [10]

De Lange, R

M. De Lange, R. Aljundi, M. Masana, S. Parisot, X. Jia, A. Leonardis, G. Slabaugh, and T. Tuytelaars. A continual learning survey: Defying forgetting in classification tasks.IEEE transactions on pattern analysis and machine intelligence, 44(7):3366–3385, 2021

work page 2021

[11] [11]

Deng and C

X. Deng and C. H. Papadimitriou. On the complexity of cooperative solution concepts.Mathe- matics of operations research, 19(2):257–266, 1994

work page 1994

[12] [12]

S. Dudani. The distance-weighted k-nearest-neighbor rule.IEEE Transactions on Systems, Man, and Cybernetics, SMC-6(4):325–327, 1976

work page 1976

[13] [13]

Duval and F

A. Duval and F. D. Malliaros. Graphsvx: Shapley value explanations for graph neural networks. InJoint European conference on machine learning and knowledge discovery in databases, pages 302–318. Springer, 2021

work page 2021

[14] [14]

R. A. Fisher. The use of multiple measurements in taxonomic problems.Annals of Eugenics, 7(2):179–188, 1936

work page 1936

[15] [15]

Ghorbani, M

A. Ghorbani, M. Kim, and J. Zou. A distributional framework for data valuation. InInternational Conference on Machine Learning, pages 3535–3544. PMLR, 2020

work page 2020

[16] [16]

Ghorbani and J

A. Ghorbani and J. Zou. Data shapley: Equitable valuation of data for machine learning. In International conference on machine learning, pages 2242–2251. PMLR, 2019

work page 2019

[17] [17]

Ghorbani, J

A. Ghorbani, J. Zou, and A. Esteva. Data shapley valuation for efficient batch active learning. In2022 56th asilomar conference on signals, systems, and computers, pages 1456–1462. IEEE, 2022

work page 2022

[18] [18]

T. F. Gonzalez. Clustering to minimize the maximum intercluster distance.Theoretical computer science, 38:293–306, 1985

work page 1985

[19] [19]

Hamilton, Z

W. Hamilton, Z. Ying, and J. Leskovec. Inductive representation learning on large graphs. Advances in neural information processing systems, 30, 2017. 10

work page 2017

[20] [20]

Jethani, M

N. Jethani, M. Sudarshan, I. C. Covert, S.-I. Lee, and R. Ranganath. Fastshap: Real-time shapley value estimation. InInternational conference on learning representations, 2021

work page 2021

[21] [21]

R. Jia, D. Dao, B. Wang, F. A. Hubis, N. M. Gurel, B. Li, C. Zhang, C. J. Spanos, and D. Song. Efficient task-specific data valuation for nearest neighbor algorithms.arXiv preprint arXiv:1908.08619, 2019

work page arXiv 1908

[22] [22]

R. Jia, D. Dao, B. Wang, F. A. Hubis, N. Hynes, N. M. Gürel, B. Li, C. Zhang, D. Song, and C. J. Spanos. Towards efficient data valuation based on the shapley value. InThe 22nd international conference on artificial intelligence and statistics, pages 1167–1176. PMLR, 2019

work page 2019

[23] [23]

Jiang, W

K. Jiang, W. Liang, J. Y . Zou, and Y . Kwon. Opendataval: a unified benchmark for data valuation.Advances in Neural Information Processing Systems, 36:28624–28647, 2023

work page 2023

[24] [24]

T. N. Kipf and M. Welling. Semi-supervised classification with graph convolutional networks. InInternational Conference on Learning Representations (ICLR), 2017

work page 2017

[25] [25]

P. W. Koh and P. Liang. Understanding black-box predictions via influence functions. In International conference on machine learning, pages 1885–1894. PMLR, 2017

work page 2017

[26] [26]

LeCun, L

Y . LeCun, L. Bottou, Y . Bengio, and P. Haffner. Gradient-based learning applied to document recognition.Proceedings of the IEEE, 86(11):2278–2324, 1998

work page 1998

[27] [27]

J. Liu, J. Lou, J. Liu, L. Xiong, J. Pei, and J. Sun. Dealer: An end-to-end model marketplace with differential privacy.Proceedings of the VLDB Endowment, 14(6), 2021

work page 2021

[28] [28]

Z. Liu, Y . Chen, H. Yu, Y . Liu, and L. Cui. Gtg-shapley: Efficient and accurate participant contribution evaluation in federated learning.ACM Transactions on intelligent Systems and Technology (TIST), 13(4):1–21, 2022

work page 2022

[29] [29]

S. M. Lundberg, G. Erion, H. Chen, A. DeGrave, J. M. Prutkin, B. Nair, R. Katz, J. Himmelfarb, N. Bansal, and S.-I. Lee. From local explanations to global understanding with explainable ai for trees.Nature machine intelligence, 2(1):56–67, 2020

work page 2020

[30] [30]

S. M. Lundberg and S.-I. Lee. A unified approach to interpreting model predictions.Advances in neural information processing systems, 30, 2017

work page 2017

[31] [31]

Mairal, J

J. Mairal, J. Ponce, G. Sapiro, A. Zisserman, and F. Bach. Supervised dictionary learning. Advances in neural information processing systems, 21, 2008

work page 2008

[32] [32]

A. K. McCallum, K. Nigam, J. Rennie, and K. Seymore. Automating the construction of internet portals with machine learning. InInformation Retrieval, volume 3, pages 127–163. Springer, 2000

work page 2000

[33] [33]

Mirzasoleiman, J

B. Mirzasoleiman, J. Bilmes, and J. Leskovec. Coresets for data-efficient training of machine learning models. InInternational Conference on Machine Learning, pages 6950–6960. PMLR, 2020

work page 2020

[34] [34]

M. J. Osborne and A. Rubinstein.A course in game theory. MIT press, 1994

work page 1994

[35] [35]

G. I. Parisi, R. Kemker, J. L. Part, C. Kanan, and S. Wermter. Continual lifelong learning with neural networks: A review.Neural networks, 113:54–71, 2019

work page 2019

[36] [36]

S. M. Park, K. Georgiev, A. Ilyas, G. Leclerc, and A. M ˛ adry. Trak: attributing model behavior at scale. InProceedings of the 40th International Conference on Machine Learning, pages 27074–27113, 2023

work page 2023

[37] [37]

Shalev-Shwartz, O

S. Shalev-Shwartz, O. Shamir, N. Srebro, and K. Sridharan. Learnability, stability and uniform convergence.The Journal of Machine Learning Research, 11:2635–2670, 2010

work page 2010

[38] [38]

N. Z. Shapiro and L. S. Shapley. Values of large games, i: A limit theorem.Mathematics of Operations Research, 3(1):1–9, 1978

work page 1978

[39] [39]

L. S. Shapley et al. A value for n-person games. 1953. 11

work page 1953

[40] [40]

W. N. Street, W. H. Wolberg, and O. L. Mangasarian. Nuclear feature extraction for breast tumor diagnosis.IS&T/SPIE 1993 International Symposium on Electronic Imaging: Science and Technology, 1905:861–870, 1993

work page 1993

[41] [41]

H. Sun, Y . Xiong, R. Wu, X. Cai, C. Fan, L. Zhang, and X.-Y . Li. Fast-datashapley: Neural modeling for training data valuation. InProceedings of the Nineteenth ACM International Conference on Web Search and Data Mining, pages 607–617, 2026

work page 2026

[42] [42]

Q. Sun, J. Zhang, J. Liu, L. Xiong, J. Pei, and K. Ren. Shapley value approximation based on complementary contribution.IEEE Transactions on Knowledge and Data Engineering, 2024

work page 2024

[43] [43]

J. T. Wang and R. Jia. Data banzhaf: A robust data valuation framework for machine learning. InInternational conference on artificial intelligence and statistics, pages 6388–6421. PMLR, 2023

work page 2023

[44] [44]

J. T. Wang, P. Mittal, D. Song, and R. Jia. Data shapley in one training run. InThe Thirteenth International Conference on Learning Representations

work page

[45] [45]

H. Xia, J. Zhang, Q. Sun, J. Liu, K. Ren, L. Xiong, and J. Pei. Computing shapley values for dynamic data.IEEE Transactions on Knowledge and Data Engineering, 37(6):3253–3271, 2025

work page 2025

[46] [46]

K. Xu, C. Li, Y . Tian, T. Sonobe, K.-i. Kawarabayashi, and S. Jegelka. Representation learning on graphs with jumping knowledge networks. InInternational conference on machine learning, pages 5453–5462. pmlr, 2018

work page 2018

[47] [47]

Yang, H.-W

X. Yang, H.-W. Chen, M.-S. Chen, and J. Pei. Local shapley: Model-induced locality and optimal reuse in data valuation.arXiv preprint arXiv:2603.03672, 2026

work page arXiv 2026

[48] [48]

C.-K. Yeh, J. Kim, I. E.-H. Yen, and P. K. Ravikumar. Representer point selection for explaining deep neural networks.Advances in neural information processing systems, 31, 2018

work page 2018

[49] [49]

H. Yuan, H. Yu, J. Wang, K. Li, and S. Ji. On explainability of graph neural networks via subgraph explorations. InInternational conference on machine learning, pages 12241–12252. PMLR, 2021

work page 2021

[50] [50]

Zhang, H

J. Zhang, H. Xia, Q. Sun, J. Liu, L. Xiong, J. Pei, and K. Ren. Dynamic shapley value computation. In2023 IEEE 39th International Conference on Data Engineering (ICDE), pages 639–652. IEEE, 2023. 12 A Notations Table 6: Summary of notation. Notation Description D={z i}n i=1 Training players, also called players T={t j}m j=1 Evaluation tasks zi Thei-th tra...

work page 2023