ShapShift: Explaining Model Prediction Shifts with Subgroup Conditional Shapley Values
Pith reviewed 2026-05-10 16:09 UTC · model grok-4.3
The pith
ShapShift attributes model prediction shifts to changes in conditional probabilities of subgroups defined by decision trees.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ShapShift decomposes a model's prediction shift into contributions from changes in the conditional probabilities of interpretable subgroups. For a single decision tree the attributions are exact and follow directly from probability changes at each split node. For tree ensembles the method selects the single tree that best explains the shift and accounts for the remaining effects separately. For arbitrary models it grows surrogate trees with a custom objective function that defines the subgroups, then applies the same conditional Shapley attribution.
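The single-tree case can be illustrated with a small sketch. It assumes the "players" are the split nodes, and that a coalition's value is the tree's mean prediction when only those nodes use their target conditional probabilities. The toy tree, node names (`n0`, `n1`), and all numbers are hypothetical, and the brute-force permutation average stands in for whatever exact algorithm the paper uses.

```python
from itertools import permutations

# Hypothetical toy tree. Each leaf is (path, value); a path is a list of
# (split_node, direction) pairs, with direction "L" or "R".
LEAVES = [
    ([("n0", "L"), ("n1", "L")], 0.9),
    ([("n0", "L"), ("n1", "R")], 0.4),
    ([("n0", "R")], 0.1),
]

# Assumed P(go left | reached node) under the source and target distributions.
P_SOURCE = {"n0": 0.5, "n1": 0.5}
P_TARGET = {"n0": 0.8, "n1": 0.3}


def mean_prediction(p_left):
    """Mean tree output: sum over leaves of P(leaf) * leaf value."""
    total = 0.0
    for path, value in LEAVES:
        prob = 1.0
        for node, direction in path:
            prob *= p_left[node] if direction == "L" else 1.0 - p_left[node]
        total += prob * value
    return total


def value_fn(switched):
    """Mean prediction when only the nodes in `switched` use target probabilities."""
    p_left = {n: (P_TARGET[n] if n in switched else P_SOURCE[n]) for n in P_SOURCE}
    return mean_prediction(p_left)


def shapley(players):
    """Exact Shapley values by averaging marginal contributions over all orderings."""
    phi = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        switched = set()
        for p in order:
            before = value_fn(switched)
            switched.add(p)
            phi[p] += value_fn(switched) - before
    return {p: total / len(orders) for p, total in phi.items()}


phi = shapley(list(P_SOURCE))
total_shift = mean_prediction(P_TARGET) - mean_prediction(P_SOURCE)
```

By the efficiency axiom the entries of `phi` sum to `total_shift`, which is what makes the single-tree attribution complete. Enumerating orderings is factorial in the number of split nodes, so this form is only usable on very small trees.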
What carries the argument
Subgroup conditional Shapley values that attribute the total prediction shift to changes in conditional probabilities along paths defined by decision-tree splits.
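One way to write this down, using notation that is assumed here rather than taken from the paper: let $v_\ell$ be the value of leaf $\ell$, let $\mathrm{path}(\ell)$ be the (node, direction) pairs on the way to that leaf, and let $p^{P}_{n}(d)$ be the probability under distribution $P$ of taking direction $d$ at node $n$ given that $n$ is reached. The tree's mean prediction and the shift between a source and a target distribution are then:

```latex
\bar f_{P} \;=\; \sum_{\ell} v_\ell \prod_{(n,d)\,\in\,\mathrm{path}(\ell)} p^{P}_{n}(d),
\qquad
\Delta \;=\; \bar f_{\mathrm{target}} - \bar f_{\mathrm{source}} .
```

A natural coalition value, again stated as an assumption, switches only the nodes in a set $S$ to their target conditional probabilities:

```latex
v(S) \;=\; \sum_{\ell} v_\ell \prod_{(n,d)\,\in\,\mathrm{path}(\ell)}
\begin{cases}
p^{\mathrm{target}}_{n}(d) & \text{if } n \in S,\\[2pt]
p^{\mathrm{source}}_{n}(d) & \text{otherwise,}
\end{cases}
\qquad\text{so that}\qquad v(N) - v(\emptyset) \;=\; \Delta .
```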
If this is right
- Model monitors can identify the specific subgroups whose probability changes drive a shift in average predictions.
- Explanations remain interpretable even when the underlying model is a large ensemble or a neural network.
- Approximation methods keep the attributions practical for routine monitoring despite the cost of exact computation.
- The same subgroup decomposition can be applied across different model classes without retraining the original model.
Where Pith is reading between the lines
- If tree-defined subgroups consistently explain shifts, practitioners could use them as early-warning indicators for data drift before the shift becomes large.
- The surrogate-tree construction suggests that any model whose behavior can be approximated by partitions might admit similar shift explanations without direct access to its internals.
- The method implicitly treats the tree structure as a sufficient statistic for the relevant aspects of the input distribution.
Load-bearing premise
That subgroups defined by decision tree structures capture the dominant drivers of the observed prediction shift and that selecting one tree plus residual handling suffices for ensembles.
What would settle it
An observed prediction shift where adjusting the conditional probabilities of the attributed subgroups fails to reproduce most of the measured change in average model output.
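A rough version of this check can be sketched as follows. It assumes each example has already been assigned a subgroup label (for instance a leaf id) and a model prediction. It then moves the subgroup probabilities to their target values, holds each subgroup's source mean prediction fixed, and reports what fraction of the observed shift is reproduced. The function name and the fallback for subgroups unseen in the source sample are illustrative choices, not the paper's procedure.

```python
from collections import defaultdict


def subgroup_explained_fraction(src_groups, src_preds, tgt_groups, tgt_preds):
    """Fraction of the observed mean-prediction shift reproduced by changing
    subgroup probabilities alone (source per-subgroup means held fixed)."""

    def mean(values):
        return sum(values) / len(values)

    # Mean source prediction inside each subgroup.
    preds_by_group = defaultdict(list)
    for group, pred in zip(src_groups, src_preds):
        preds_by_group[group].append(pred)
    src_mean_by_group = {g: mean(v) for g, v in preds_by_group.items()}

    # Target subgroup probabilities, estimated as empirical frequencies.
    tgt_counts = defaultdict(int)
    for group in tgt_groups:
        tgt_counts[group] += 1
    n_tgt = len(tgt_groups)

    overall_src = mean(src_preds)
    # Assumption: a subgroup never seen in the source falls back to the
    # overall source mean, since it has no source-side estimate.
    reweighted = sum(
        (count / n_tgt) * src_mean_by_group.get(group, overall_src)
        for group, count in tgt_counts.items()
    )

    observed = mean(tgt_preds) - overall_src
    if observed == 0:
        raise ValueError("no observed shift to explain")
    return (reweighted - overall_src) / observed
```

A value near 1 means the subgroup probability changes reproduce the shift. A value near 0 with a clearly non-zero observed shift is the kind of outcome that would count against the premise.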
Original abstract
Changes in input distribution can induce shifts in the average predictions of machine learning models. Such prediction shifts may impact downstream business outcomes (e.g. a bank's loan approval rate), so understanding their causes can be crucial. We propose ShapShift: a Shapley value method for attributing prediction shifts to changes in the conditional probabilities of interpretable subgroups of data, where these subgroups are defined by the structure of decision trees. We initially apply this method to single decision trees, providing exact explanations based on conditional probability changes at split nodes. Next, we extend it to tree ensembles by selecting the most explanatory tree and accounting for residual effects. Finally, we propose a model-agnostic variant using surrogate trees grown with a novel objective function, allowing application to models like neural networks. While exact computation can be intensive, approximation techniques enable practical application. We show that ShapShift provides simple, faithful, and near-complete explanations of prediction shifts across model classes, aiding model monitoring in dynamic environments.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes ShapShift, a Shapley-value-based method for attributing shifts in average model predictions to changes in conditional probabilities over interpretable subgroups defined by decision-tree structure. For single trees it claims exact computation from split-node probability changes; for ensembles it selects the single most explanatory tree and handles residuals separately; for black-box models it grows surrogate trees under a novel objective. The central claim is that the resulting attributions are simple, faithful, and near-complete across model classes and therefore useful for monitoring prediction shifts in dynamic environments.
Significance. If the near-complete claim can be substantiated, the method would offer a practical, axiom-grounded tool for diagnosing distribution-shift effects on deployed models, which is relevant to high-stakes monitoring tasks. The grounding in standard Shapley axioms and the tree-structured subgroup definition are clear strengths; however, the absence of residual bounds or quantitative validation for the ensemble case limits the immediate impact.
major comments (2)
- [ensemble extension] The ensemble extension (described after the single-tree case) selects one tree and attributes the remainder to a residual term. The manuscript provides neither an analytic bound on the residual fraction nor empirical measurements of its magnitude across the reported experiments. Because the central claim of 'near-complete' explanations across model classes rests on this residual being negligible, the lack of such quantification is load-bearing.
- [surrogate-tree variant] The surrogate-tree construction for black-box models relies on a novel objective function whose derivation is only sketched. Without an explicit statement of the objective, a proof that the resulting attributions remain faithful to the original model's conditional-probability shifts, or ablation results showing sensitivity to the surrogate, the model-agnostic claim cannot be evaluated.
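The residual quantification requested in the first major comment could be reported along these lines. The sketch assumes that, for each candidate tree, the amount of the ensemble's shift explained by that tree's subgroups has already been computed. The selection rule used here (smallest absolute residual) is a guess, since the paper's criterion is not stated in this review.

```python
def residual_report(total_shift, explained_by_tree):
    """Pick the tree whose subgroups leave the smallest residual and report it.

    total_shift: observed shift in the ensemble's mean prediction.
    explained_by_tree: dict mapping a tree id to the shift its subgroups explain.
    Returns (best_tree_id, residual, residual as a percentage of the total shift).
    """
    if total_shift == 0:
        raise ValueError("no shift to explain")
    # Assumed criterion: minimise the absolute unexplained remainder.
    best = min(explained_by_tree, key=lambda t: abs(total_shift - explained_by_tree[t]))
    residual = total_shift - explained_by_tree[best]
    return best, residual, 100.0 * abs(residual) / abs(total_shift)
```

Reporting the percentage for every ensemble experiment would let readers judge directly how much of the near-complete claim rests on the residual being small.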
minor comments (2)
- [abstract] The abstract states that 'exact computation can be intensive' and that 'approximation techniques enable practical application,' yet no concrete approximation algorithm, complexity analysis, or reference to the relevant section is supplied.
- [methods] Notation for the conditional Shapley values and the precise definition of 'subgroup' should be introduced with a single, self-contained equation early in the methods section to improve readability.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed review. The comments highlight important areas for strengthening the validation of the ensemble and surrogate extensions, which we address below. We plan to incorporate the suggested clarifications and additional analyses in a revised manuscript.
- Referee: The ensemble extension (described after the single-tree case) selects one tree and attributes the remainder to a residual term. The manuscript provides neither an analytic bound on the residual fraction nor empirical measurements of its magnitude across the reported experiments. Because the central claim of 'near-complete' explanations across model classes rests on this residual being negligible, the lack of such quantification is load-bearing.
Authors: We agree that explicit quantification of the residual is necessary to support the near-complete claim. Deriving a general analytic bound is difficult because the residual depends on the specific ensemble structure, data distribution, and tree selection heuristic. However, we will add empirical measurements of the residual fraction (as a percentage of the total shift) for all ensemble experiments in the revised manuscript. These measurements will be reported alongside the existing results to demonstrate that the residual is typically small in practice. We will also expand the description of the tree-selection criterion to clarify how it minimizes the residual.
Revision: partial
- Referee: The surrogate-tree construction for black-box models relies on a novel objective function whose derivation is only sketched. Without an explicit statement of the objective, a proof that the resulting attributions remain faithful to the original model's conditional-probability shifts, or ablation results showing sensitivity to the surrogate, the model-agnostic claim cannot be evaluated.
Authors: We acknowledge that the surrogate-tree section requires a more complete presentation. In the revision we will state the novel objective function in explicit mathematical form, including the precise optimization criterion used to grow the surrogate. We will add a short theoretical argument showing that the conditional Shapley values computed on the surrogate remain faithful to the original model's subgroup probability shifts under the chosen objective, and we will include ablation experiments that vary the surrogate hyperparameters and report the resulting attribution stability. These changes will allow readers to evaluate the model-agnostic extension directly.
Revision: yes
Circularity Check
No circularity: derivation builds on standard Shapley axioms and independent extensions
The paper defines ShapShift by applying Shapley values to attribute shifts in average predictions to changes in conditional subgroup probabilities, where subgroups are induced by decision tree splits. For single trees this yields exact attributions at nodes, following directly from the value function without redefining the target shift in terms of the attributions themselves. The ensemble extension (selecting one tree plus residuals) and surrogate-tree variant (with novel objective) are presented as practical constructions motivated by computational needs, not as outputs forced by fitting or self-referential definitions. No load-bearing self-citations, uniqueness theorems imported from prior author work, or ansatzes smuggled via citation appear in the derivation chain. The method remains self-contained against external Shapley axioms and tree structure, with no step reducing by construction to its own inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math: Shapley value axioms (efficiency, symmetry, dummy player, additivity)
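For reference, the standard Shapley value that these axioms characterise, for a player set $N$ and a value function $v$, together with the efficiency property that makes attributions sum to the total effect:

```latex
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
\frac{|S|!\,\bigl(|N| - |S| - 1\bigr)!}{|N|!}\,
\bigl( v(S \cup \{i\}) - v(S) \bigr),
\qquad
\sum_{i \in N} \phi_i(v) \;=\; v(N) - v(\emptyset).
```

The sum ranges over all $2^{|N|-1}$ coalitions that exclude player $i$, which is why exact computation becomes expensive as the number of players grows and approximation becomes attractive.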