Dynamic Shapley Computation
Pith reviewed 2026-05-21 06:34 UTC · model grok-4.3
The pith
D-Shap maintains Shapley values in a player-by-task matrix to enable fast local updates for changing tasks and players.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By representing Shapley values as a player-by-task matrix and exploiting utility locality and coalition locality, D-Shap performs new task valuations via structure-aware interpolation and confines player-induced updates to affected local matrix blocks, while self-valuation constructs the initial matrix from training data using subset reuse and coverage-aware anchor selection.
What carries the argument
The player-by-task matrix that stores contributions and the exploitation of utility locality and coalition locality to support structured matrix maintenance through interpolation and local block updates.
If this is right
- Task updates complete in milliseconds instead of full recomputation.
- Player updates reduce computational cost by up to three orders of magnitude.
- Valuation quality stays competitive with full recomputation across various models.
- Self-valuation allows starting without predefined evaluation tasks using scalable subset reuse.
- Updates remain confined to small affected portions of the matrix.
Where Pith is reading between the lines
- Similar locality-based maintenance could apply to other data contribution measures in dynamic settings.
- The approach might support continuous learning pipelines where data valuation needs to adapt in real time.
- Extending the matrix structure could handle simultaneous changes in both tasks and players more seamlessly.
- Applications in distributed training could use this to dynamically adjust data contributions without heavy overhead.
Load-bearing premise
The method relies on each task depending on only a small subset of training players and similar tasks producing similar valuations so that local updates and interpolation do not lose much accuracy.
What would settle it
Measuring the change in Shapley values for a new task across many players outside its small subset, or checking if interpolation accuracy drops below competitive levels when task similarity is low, would test if the locality assumptions hold in practice.
Figures
read the original abstract
Shapley-based data valuation provides a principled way to quantify the contribution of training data, but its high computational cost makes it impractical in dynamic settings where tasks and training players evolve. Existing methods treat Shapley computation as a one-shot process and collapse contributions into aggregated scores, preventing reuse and requiring recomputation under any change. We introduce a new perspective that represents Shapley values as a player-by-task matrix and formulates dynamic valuation as a structured matrix maintenance problem. We exploit the fact that each task depends on a small subset of training players and that similar tasks yield similar valuations, leading to utility locality and coalition locality. Based on these insights, we propose D-Shap, a dynamic valuation framework that enables efficient updates by modifying only a small portion of the matrix: new task valuations are inferred via structure-aware interpolation, while updates induced by new players are confined to affected local matrix blocks. To eliminate the need for pre-specified evaluation tasks, we introduce self-valuation, which constructs the initial matrix directly from training data, supported by scalable subset reuse and coverage-aware anchor selection. Experiments across diverse models show that D-Shap performs task updates in milliseconds and reduces the cost of player updates by up to three orders of magnitude, while achieving valuation quality competitive with full recomputation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes D-Shap, a dynamic Shapley computation framework for data valuation. It models Shapley values as a player-by-task matrix and exploits utility locality (tasks depend on small player subsets) and coalition locality (similar tasks yield similar valuations) to support efficient updates: new-task valuations via structure-aware interpolation and new-player updates via local matrix blocks. Self-valuation is introduced to build the initial matrix from training data using subset reuse and coverage-aware anchors. Experiments across models claim millisecond-scale task updates, up to three orders of magnitude reduction in player-update cost, and valuation quality competitive with full recomputation.
Significance. If the locality assumptions hold with bounded error, the matrix-maintenance perspective could make principled Shapley valuation practical in evolving ML pipelines (e.g., continual learning or data markets). The explicit separation of task and player updates via structured locality is a constructive algorithmic contribution.
major comments (2)
- [§3] §3 (D-Shap framework): The central efficiency and quality claims rest on utility locality and coalition locality enabling accurate interpolation and local block updates, yet the manuscript provides no explicit error bounds, sensitivity analysis, or characterization of how task similarity is quantified or when the 'small subset' assumption breaks. This leaves the 'competitive with full recomputation' guarantee dependent on unverified empirical behavior rather than controlled approximation guarantees.
- [§5] §5 (Experiments): The abstract asserts 'valuation quality competitive with full recomputation' and 'up to three orders of magnitude' speedups, but without reported quantitative metrics, error bars, baseline definitions, or ablation on locality strength, the empirical support for the load-bearing quality claim cannot be assessed from the provided description.
minor comments (2)
- [Abstract] Abstract: 'diverse models' is stated without naming the models, datasets, or exact locality metrics used, which would aid reproducibility.
- [§3] Notation: The transition from exact Shapley values to the interpolated matrix entries could be clarified with a small illustrative example early in the method section.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The comments highlight important opportunities to strengthen the theoretical grounding and empirical presentation of D-Shap. We address each major comment below and describe the revisions we will incorporate.
read point-by-point responses
-
Referee: [§3] §3 (D-Shap framework): The central efficiency and quality claims rest on utility locality and coalition locality enabling accurate interpolation and local block updates, yet the manuscript provides no explicit error bounds, sensitivity analysis, or characterization of how task similarity is quantified or when the 'small subset' assumption breaks. This leaves the 'competitive with full recomputation' guarantee dependent on unverified empirical behavior rather than controlled approximation guarantees.
Authors: We agree that the absence of explicit error bounds leaves the approximation claims reliant on empirical validation. In the revised manuscript we will add a dedicated subsection to §3 that derives approximation error bounds for the structure-aware interpolation under the stated locality assumptions. Task similarity will be formalized via a kernel on task feature vectors, and we will characterize the regime in which the small-subset assumption yields bounded error. A sensitivity analysis with respect to subset size and locality strength will also be included. revision: yes
-
Referee: [§5] §5 (Experiments): The abstract asserts 'valuation quality competitive with full recomputation' and 'up to three orders of magnitude' speedups, but without reported quantitative metrics, error bars, baseline definitions, or ablation on locality strength, the empirical support for the load-bearing quality claim cannot be assessed from the provided description.
Authors: Section 5 of the full manuscript already contains tables reporting mean absolute error versus full recomputation, wall-clock times with standard deviations over repeated runs, and explicit baseline definitions. Nevertheless, we acknowledge that an ablation isolating the contribution of locality strength is missing. We will add this ablation study, varying subset size and task-similarity threshold, and will ensure all quantitative metrics, error bars, and baseline specifications are summarized in the abstract and results section of the revision. revision: yes
Circularity Check
No circularity: algorithmic framework rests on stated locality assumptions and empirical validation
full rationale
The derivation presents Shapley values as a player-by-task matrix and introduces D-Shap updates via structure-aware interpolation and local block modifications, justified by the explicit assumptions of utility locality (small player subsets per task) and coalition locality (similar tasks yield similar valuations). These assumptions are invoked to enable efficient maintenance rather than derived from the method itself. Performance claims (millisecond task updates, up to 1000x player-update cost reduction, competitive quality) are positioned as experimental outcomes against full recomputation baselines, with no equations or steps that rename fitted parameters as predictions or reduce results to self-citations by construction. The initial self-valuation step is described as a construction technique using subset reuse, not a circular re-derivation of the target values.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Tasks depend on small subsets of players and similar tasks produce similar Shapley vectors (utility and coalition locality).
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinctionreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We exploit the fact that each task depends on a small subset of training players and that similar tasks yield similar valuations, leading to utility locality and coalition locality... new task valuations are inferred via structure-aware interpolation, while updates induced by new players are confined to affected local matrix blocks.
-
IndisputableMonolith/Cost/FunctionalEquationwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Proposition 1 (Utility locality). Under Assumption 1... ∥ϕ(t)−ϕ(t′)∥∞ ≤ 2LΓ dΓ(t,t′). Theorem 1 (Interpolation error)... ≤ 2LΓ ε.
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
J. Adebayo, M. Muelly, I. Liccardi, and B. Kim. Debugging tests for model explanations.arXiv preprint arXiv:2011.05429, 2020
-
[2]
A. Agarwal, M. Dahleh, and T. Sarkar. A marketplace for data: An algorithmic solution. In Proceedings of the 2019 ACM Conference on Economics and Computation, pages 701–726, 2019
work page 2019
- [3]
-
[4]
E. Bhardwaj, H. Gujral, S. Wu, C. Zogheib, T. Maharaj, and C. Becker. The state of data curation at neurips: An assessment of dataset development practices in the datasets and benchmarks track.Advances in Neural Information Processing Systems, 37:53626–53648, 2024
work page 2024
-
[5]
O. Bousquet and A. Elisseeff. Stability and generalization.Journal of machine learning research, 2(Mar):499–526, 2002
work page 2002
-
[6]
L. Breiman, J. Friedman, R. A. Olshen, and C. J. Stone.Classification and Regression Trees. Wadsworth International Group, 1984
work page 1984
- [7]
-
[8]
C. Cortes and V . Vapnik. Support-vector networks.Machine learning, 20(3):273–297, 1995
work page 1995
- [9]
-
[10]
M. De Lange, R. Aljundi, M. Masana, S. Parisot, X. Jia, A. Leonardis, G. Slabaugh, and T. Tuytelaars. A continual learning survey: Defying forgetting in classification tasks.IEEE transactions on pattern analysis and machine intelligence, 44(7):3366–3385, 2021
work page 2021
-
[11]
X. Deng and C. H. Papadimitriou. On the complexity of cooperative solution concepts.Mathe- matics of operations research, 19(2):257–266, 1994
work page 1994
-
[12]
S. Dudani. The distance-weighted k-nearest-neighbor rule.IEEE Transactions on Systems, Man, and Cybernetics, SMC-6(4):325–327, 1976
work page 1976
-
[13]
A. Duval and F. D. Malliaros. Graphsvx: Shapley value explanations for graph neural networks. InJoint European conference on machine learning and knowledge discovery in databases, pages 302–318. Springer, 2021
work page 2021
-
[14]
R. A. Fisher. The use of multiple measurements in taxonomic problems.Annals of Eugenics, 7(2):179–188, 1936
work page 1936
-
[15]
A. Ghorbani, M. Kim, and J. Zou. A distributional framework for data valuation. InInternational Conference on Machine Learning, pages 3535–3544. PMLR, 2020
work page 2020
-
[16]
A. Ghorbani and J. Zou. Data shapley: Equitable valuation of data for machine learning. In International conference on machine learning, pages 2242–2251. PMLR, 2019
work page 2019
-
[17]
A. Ghorbani, J. Zou, and A. Esteva. Data shapley valuation for efficient batch active learning. In2022 56th asilomar conference on signals, systems, and computers, pages 1456–1462. IEEE, 2022
work page 2022
-
[18]
T. F. Gonzalez. Clustering to minimize the maximum intercluster distance.Theoretical computer science, 38:293–306, 1985
work page 1985
-
[19]
W. Hamilton, Z. Ying, and J. Leskovec. Inductive representation learning on large graphs. Advances in neural information processing systems, 30, 2017. 10
work page 2017
-
[20]
N. Jethani, M. Sudarshan, I. C. Covert, S.-I. Lee, and R. Ranganath. Fastshap: Real-time shapley value estimation. InInternational conference on learning representations, 2021
work page 2021
- [21]
-
[22]
R. Jia, D. Dao, B. Wang, F. A. Hubis, N. Hynes, N. M. Gürel, B. Li, C. Zhang, D. Song, and C. J. Spanos. Towards efficient data valuation based on the shapley value. InThe 22nd international conference on artificial intelligence and statistics, pages 1167–1176. PMLR, 2019
work page 2019
- [23]
-
[24]
T. N. Kipf and M. Welling. Semi-supervised classification with graph convolutional networks. InInternational Conference on Learning Representations (ICLR), 2017
work page 2017
-
[25]
P. W. Koh and P. Liang. Understanding black-box predictions via influence functions. In International conference on machine learning, pages 1885–1894. PMLR, 2017
work page 2017
- [26]
-
[27]
J. Liu, J. Lou, J. Liu, L. Xiong, J. Pei, and J. Sun. Dealer: An end-to-end model marketplace with differential privacy.Proceedings of the VLDB Endowment, 14(6), 2021
work page 2021
-
[28]
Z. Liu, Y . Chen, H. Yu, Y . Liu, and L. Cui. Gtg-shapley: Efficient and accurate participant contribution evaluation in federated learning.ACM Transactions on intelligent Systems and Technology (TIST), 13(4):1–21, 2022
work page 2022
-
[29]
S. M. Lundberg, G. Erion, H. Chen, A. DeGrave, J. M. Prutkin, B. Nair, R. Katz, J. Himmelfarb, N. Bansal, and S.-I. Lee. From local explanations to global understanding with explainable ai for trees.Nature machine intelligence, 2(1):56–67, 2020
work page 2020
-
[30]
S. M. Lundberg and S.-I. Lee. A unified approach to interpreting model predictions.Advances in neural information processing systems, 30, 2017
work page 2017
- [31]
-
[32]
A. K. McCallum, K. Nigam, J. Rennie, and K. Seymore. Automating the construction of internet portals with machine learning. InInformation Retrieval, volume 3, pages 127–163. Springer, 2000
work page 2000
-
[33]
B. Mirzasoleiman, J. Bilmes, and J. Leskovec. Coresets for data-efficient training of machine learning models. InInternational Conference on Machine Learning, pages 6950–6960. PMLR, 2020
work page 2020
-
[34]
M. J. Osborne and A. Rubinstein.A course in game theory. MIT press, 1994
work page 1994
-
[35]
G. I. Parisi, R. Kemker, J. L. Part, C. Kanan, and S. Wermter. Continual lifelong learning with neural networks: A review.Neural networks, 113:54–71, 2019
work page 2019
-
[36]
S. M. Park, K. Georgiev, A. Ilyas, G. Leclerc, and A. M ˛ adry. Trak: attributing model behavior at scale. InProceedings of the 40th International Conference on Machine Learning, pages 27074–27113, 2023
work page 2023
-
[37]
S. Shalev-Shwartz, O. Shamir, N. Srebro, and K. Sridharan. Learnability, stability and uniform convergence.The Journal of Machine Learning Research, 11:2635–2670, 2010
work page 2010
-
[38]
N. Z. Shapiro and L. S. Shapley. Values of large games, i: A limit theorem.Mathematics of Operations Research, 3(1):1–9, 1978
work page 1978
-
[39]
L. S. Shapley et al. A value for n-person games. 1953. 11
work page 1953
-
[40]
W. N. Street, W. H. Wolberg, and O. L. Mangasarian. Nuclear feature extraction for breast tumor diagnosis.IS&T/SPIE 1993 International Symposium on Electronic Imaging: Science and Technology, 1905:861–870, 1993
work page 1993
-
[41]
H. Sun, Y . Xiong, R. Wu, X. Cai, C. Fan, L. Zhang, and X.-Y . Li. Fast-datashapley: Neural modeling for training data valuation. InProceedings of the Nineteenth ACM International Conference on Web Search and Data Mining, pages 607–617, 2026
work page 2026
-
[42]
Q. Sun, J. Zhang, J. Liu, L. Xiong, J. Pei, and K. Ren. Shapley value approximation based on complementary contribution.IEEE Transactions on Knowledge and Data Engineering, 2024
work page 2024
-
[43]
J. T. Wang and R. Jia. Data banzhaf: A robust data valuation framework for machine learning. InInternational conference on artificial intelligence and statistics, pages 6388–6421. PMLR, 2023
work page 2023
-
[44]
J. T. Wang, P. Mittal, D. Song, and R. Jia. Data shapley in one training run. InThe Thirteenth International Conference on Learning Representations
-
[45]
H. Xia, J. Zhang, Q. Sun, J. Liu, K. Ren, L. Xiong, and J. Pei. Computing shapley values for dynamic data.IEEE Transactions on Knowledge and Data Engineering, 37(6):3253–3271, 2025
work page 2025
-
[46]
K. Xu, C. Li, Y . Tian, T. Sonobe, K.-i. Kawarabayashi, and S. Jegelka. Representation learning on graphs with jumping knowledge networks. InInternational conference on machine learning, pages 5453–5462. pmlr, 2018
work page 2018
-
[47]
X. Yang, H.-W. Chen, M.-S. Chen, and J. Pei. Local shapley: Model-induced locality and optimal reuse in data valuation.arXiv preprint arXiv:2603.03672, 2026
-
[48]
C.-K. Yeh, J. Kim, I. E.-H. Yen, and P. K. Ravikumar. Representer point selection for explaining deep neural networks.Advances in neural information processing systems, 31, 2018
work page 2018
-
[49]
H. Yuan, H. Yu, J. Wang, K. Li, and S. Ji. On explainability of graph neural networks via subgraph explorations. InInternational conference on machine learning, pages 12241–12252. PMLR, 2021
work page 2021
-
[50]
J. Zhang, H. Xia, Q. Sun, J. Liu, L. Xiong, J. Pei, and K. Ren. Dynamic shapley value computation. In2023 IEEE 39th International Conference on Data Engineering (ICDE), pages 639–652. IEEE, 2023. 12 A Notations Table 6: Summary of notation. Notation Description D={z i}n i=1 Training players, also called players T={t j}m j=1 Evaluation tasks zi Thei-th tra...
work page 2023
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.