Explaining the "Why": A Unified Framework for the Additive Attribution of Changes in Arbitrary Measures
Pith reviewed 2026-05-07 12:36 UTC · model grok-4.3
The pith
A classification of measures by mathematical structure yields a spectrum of attribution algorithms from approximations to exact closed-form solutions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that classifying measures based on their mathematical structure enables a spectrum of algorithms—from general approximations to exact, closed-form solutions—that offer a principled trade-off between generality and performance. This is shown through simulations confirming numerical accuracy and generality for non-additive measures, a Simpson's Paradox case study demonstrating unique interpretability, and experiments where the framework significantly outperforms existing root cause analysis systems.
What carries the argument
The argument rests on classifying measures by their mathematical structure. Within the cooperative-game reframing, a measure's class determines which attribution algorithm applies, from general approximations to exact closed-form solutions.
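One common way to write such a game down, offered here as an assumption and not as the paper's own notation, is the Shapley value over a substitution game. The symbols N (players, e.g. dimension values or measure components), m (the aggregated measure), and x^{(0)}, x^{(1)} (inputs in the reference and current periods) are introduced only for illustration; the summary does not confirm which solution concept the authors use.

```latex
% Assumed notation, not taken from the paper.
% v(S): change in the measure when only the players in S move to current-period values.
\[
v(S) = m\bigl(x^{(1)}_{S},\, x^{(0)}_{N \setminus S}\bigr) - m\bigl(x^{(0)}\bigr),
\qquad S \subseteq N
\]
% Shapley value of player i: weighted average of its marginal contributions.
\[
\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}}
\frac{|S|!\,(|N| - |S| - 1)!}{|N|!}\,
\bigl(v(S \cup \{i\}) - v(S)\bigr)
\]
% Efficiency: the attributions add up to the observed change.
\[
\sum_{i \in N} \phi_i(v) = v(N) - v(\emptyset)
= m\bigl(x^{(1)}\bigr) - m\bigl(x^{(0)}\bigr)
\]
```

Under this reading, "additive attribution" corresponds to the efficiency property in the last line, and the classification would decide whether the sum in the second line can be collapsed to a closed form or must be approximated.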
If this is right
- Exact closed-form attribution becomes available for measures whose structure matches the classification criteria.
- Non-additive measures receive consistent approximate attributions that preserve the cooperative game properties.
- The same framework applies uniformly across data dimensions and measure compositions without custom per-measure adjustments.
- Root cause analysis systems built on this approach achieve higher accuracy than prior methods on both synthetic and real tasks.
- Interpretability improves in paradoxical cases such as Simpson's Paradox because attributions respect the full game structure.
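The first consequence can be illustrated on the simplest class. The sketch below assumes the Shapley value as the solution concept (not confirmed by the summary) and uses hypothetical regional revenue figures; it checks that for a SUM measure the brute-force Shapley attribution of each region coincides with that region's own change, so no enumeration is needed.

```python
from itertools import permutations

def shapley_exact(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            grown = coalition | {p}
            totals[p] += value(grown) - value(coalition)
            coalition = grown
    return {p: totals[p] / len(orderings) for p in players}

# Hypothetical revenue per region in a reference and a current period.
before = {"north": 120.0, "south": 80.0, "west": 50.0}
after = {"north": 150.0, "south": 70.0, "west": 65.0}

def total_revenue_change(coalition):
    """v(S): change in SUM(revenue) when only regions in S take current values."""
    return sum(after[r] - before[r] for r in coalition)

# Brute force over all 3! orderings versus the closed form for an additive measure.
brute_force = shapley_exact(list(before), total_revenue_change)
closed_form = {r: after[r] - before[r] for r in before}

for region in before:
    print(region, brute_force[region], closed_form[region])
```

The enumeration costs factorial time in the number of players, while the closed form is a single subtraction per region; that gap is the kind of generality-versus-performance trade-off the summary describes.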
Where Pith is reading between the lines
- The classification may suggest analogous structure-based breakdowns for attribution problems in model interpretability or causal analysis.
- Implementers could benchmark the exact solutions against sampling methods on common measures such as sums, ratios, and products to quantify speed gains.
- The cooperative game view could be extended to dynamic settings where measures evolve over time, treating successive snapshots as sequential games.
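The benchmarking idea in the second bullet can be sketched as follows. This is a minimal illustration under stated assumptions: the three-factor product measure, the factor names, and the figures are hypothetical, and permutation sampling stands in for whatever approximation the paper actually uses.

```python
import random
from itertools import permutations
from math import prod

# Hypothetical factors of a product measure: revenue = visitors * conversion * order_value.
before = {"visitors": 1000.0, "conversion": 0.020, "order_value": 40.0}
after = {"visitors": 1200.0, "conversion": 0.018, "order_value": 45.0}
factors = list(before)

def value(coalition):
    """Change in the product when only factors in `coalition` take current values."""
    mixed = prod(after[f] if f in coalition else before[f] for f in factors)
    return mixed - prod(before.values())

def marginals(order):
    """Marginal contribution of each factor along one ordering."""
    out, coalition = {}, set()
    for f in order:
        prev = value(coalition)
        coalition.add(f)
        out[f] = value(coalition) - prev
    return out

def average(orders):
    """Average the per-ordering marginal contributions."""
    totals = {f: 0.0 for f in factors}
    count = 0
    for order in orders:
        for f, m in marginals(order).items():
            totals[f] += m
        count += 1
    return {f: totals[f] / count for f in factors}

# Exact: all 3! orderings. Sampled: 500 random orderings with a fixed seed.
exact = average(permutations(factors))
rng = random.Random(0)
sampled = average(rng.sample(factors, len(factors)) for _ in range(500))

total_change = prod(after.values()) - prod(before.values())
print("total change:", total_change)
print("exact:", exact)
print("sampled:", sampled)
```

Both estimators sum to the total change, because every ordering telescopes to it; what sampling gives up is the accuracy of the individual shares, which is the quantity a benchmark of this kind would track against runtime.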
Load-bearing premise
The framework assumes that reframing attribution as a cooperative game over arbitrary measures yields a holistic and rigorous solution that existing methods lack, and that the structure-based classification reliably delivers both generality and performance.
What would settle it
Run the exact closed-form algorithm on a measure the classification assigns to that category and observe whether the attributed contributions sum exactly to the observed change in the aggregated measure on held-out simulated data.
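A check of this kind might look like the sketch below. It assumes a two-factor product measure and the standard two-player Shapley closed form, which is a textbook result and not a formula quoted from the paper; the function name `product_attribution` and the simulated ranges are hypothetical.

```python
import random

def product_attribution(x0, y0, x1, y1):
    """Two-player Shapley split of the change in x*y between two periods.

    Each factor's change is weighted by the average of the other factor's
    reference and current values.
    """
    phi_x = (x1 - x0) * (y0 + y1) / 2.0
    phi_y = (y1 - y0) * (x0 + x1) / 2.0
    return phi_x, phi_y

# Simulated data: 1000 random (reference, current) pairs with a fixed seed.
rng = random.Random(42)
max_gap = 0.0
for _ in range(1000):
    x0, y0, x1, y1 = (rng.uniform(1.0, 100.0) for _ in range(4))
    phi_x, phi_y = product_attribution(x0, y0, x1, y1)
    observed = x1 * y1 - x0 * y0
    max_gap = max(max_gap, abs(phi_x + phi_y - observed))

print("largest additivity gap:", max_gap)
```

A gap at the level of floating-point rounding supports the additivity claim for this class only. It says nothing about whether the classification assigns real measures to the right class, which would need a separate test.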
Original abstract
Explaining why aggregated measures change is a critical challenge in data analytics that existing systems struggle to address. While current attribution methods exist, they lack a unified solution that is simultaneously general for arbitrary measures, holistic across both data dimensions and measure composition, and rigorous in its interpretability. To bridge this gap, we introduce a principled framework that reframes attribution through the powerful lens of cooperative game theory. Our key contribution is a classification of measures based on their mathematical structure, which enables a spectrum of algorithms (from general approximations to exact, closed-form solutions) that offer a principled trade-off between generality and performance. We demonstrate our framework's superiority through a multi-faceted evaluation: simulations first confirm its numerical accuracy and then its generality for non-additive measures; a case study on Simpson's Paradox showcases its unique interpretability; and a final experiment proves its practical utility by significantly outperforming existing root cause analysis systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a unified framework for attributing changes in arbitrary aggregated measures by reframing the attribution problem as a cooperative game. Its central contribution is a classification of measures according to their mathematical structure, which supports a spectrum of algorithms ranging from general approximations to exact closed-form solutions and provides a principled generality-performance trade-off. The framework is evaluated via simulations confirming numerical accuracy and applicability to non-additive measures, a Simpson's Paradox case study demonstrating interpretability, and experiments showing outperformance relative to existing root cause analysis systems.
Significance. If the central claims hold, the work would offer a significant advance in explainable data analytics by supplying a general, holistic, and rigorous method for measure attribution that extends beyond additive cases and unifies disparate approaches through game-theoretic modeling. The structure-based classification and resulting algorithmic spectrum could enable practical improvements in root cause analysis while maintaining interpretability guarantees. Strengths include the explicit handling of non-additive measures and the multi-faceted evaluation design.
major comments (1)
- §3 (Framework and value function construction): The definition of the characteristic function v(S) — the measure change attributable to coalition S — is not uniquely determined for arbitrary (especially non-additive) measures. Different choices of baseline, marginal contribution ordering, or interaction encoding produce different games and thus different attributions for the same data. The classification of measures addresses computational tractability after v is fixed but does not resolve this prior modeling choice, so the claims of a 'unified,' 'holistic,' and 'rigorous' solution remain conditional on an unstated canonical construction of v that may not exist without additional assumptions.
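The referee's point about ordering can be made concrete with a small ratio measure. The sketch below uses hypothetical conversion and visit counts and treats the numerator and denominator as the two players, which is an assumption made for illustration; it shows that two substitution orderings and their Shapley average each sum to the same total change while assigning different shares.

```python
def ratio(n, d):
    """Ratio measure, e.g. conversions per visit."""
    return n / d

# Hypothetical conversions (numerator) and visits (denominator) in two periods.
n0, d0 = 50.0, 1000.0
n1, d1 = 66.0, 1100.0

total = ratio(n1, d1) - ratio(n0, d0)

# Ordering A: move the numerator to its current value first, then the denominator.
num_first = {
    "numerator": ratio(n1, d0) - ratio(n0, d0),
    "denominator": ratio(n1, d1) - ratio(n1, d0),
}

# Ordering B: move the denominator first, then the numerator.
den_first = {
    "denominator": ratio(n0, d1) - ratio(n0, d0),
    "numerator": ratio(n1, d1) - ratio(n0, d1),
}

# Shapley average of the two orderings (two players, so two orderings in total).
shapley = {k: (num_first[k] + den_first[k]) / 2 for k in num_first}

for name, attribution in [("A", num_first), ("B", den_first), ("Shapley", shapley)]:
    print(name, attribution, "sum:", sum(attribution.values()))
```

All three attributions are additive, so additivity alone cannot single one out; some further axiom or convention has to, which is the modeling choice the comment asks the authors to state.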
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The comment raises an important point about the construction of the value function, which we address below by clarifying the role of our measure classification in guiding this choice. We have revised the paper to strengthen the presentation of this aspect.
Point-by-point responses
Referee: §3 (Framework and value function construction): The definition of the characteristic function v(S) — the measure change attributable to coalition S — is not uniquely determined for arbitrary (especially non-additive) measures. Different choices of baseline, marginal contribution ordering, or interaction encoding produce different games and thus different attributions for the same data. The classification of measures addresses computational tractability after v is fixed but does not resolve this prior modeling choice, so the claims of a 'unified,' 'holistic,' and 'rigorous' solution remain conditional on an unstated canonical construction of v that may not exist without additional assumptions.
Authors: We agree that constructing v(S) involves modeling decisions, especially for non-additive measures where interactions must be encoded. Section 3 of the manuscript defines v(S) explicitly as the attributable change in the target measure for coalition S relative to a fixed baseline (typically the reference period or population), with the precise formulation determined by the measure's mathematical structure per our classification. For additive measures, v(S) reduces to the sum of marginal contributions; for non-additive cases (e.g., ratios or products), it incorporates higher-order terms via the structure-specific encoding detailed in §3.2–3.4. This classification therefore does more than address tractability: it supplies the canonical construction of v(S) for each class, ensuring the resulting game yields additive attributions that are unique within the chosen structure. Different baselines or orderings are possible in principle, but our framework restricts them to structure-preserving choices that maintain the additivity guarantee. To address the concern directly, we have expanded §3 with a new paragraph and example table illustrating how the classification dictates the v(S) definition, thereby reinforcing the unified and rigorous character of the approach.
Revision: yes
Circularity Check
No circularity: new classification and game-theoretic reframing are self-contained
The paper introduces a classification of measures by mathematical structure to support a spectrum of attribution algorithms derived from cooperative game theory. No equations, fitting procedures, or self-citations are presented that reduce any claimed result to its own inputs by construction. The central reframing and classification steps are presented as novel contributions rather than tautological renamings or fitted predictions, with evaluations (simulations, Simpson's Paradox case, and root-cause comparisons) serving as external checks rather than internal re-derivations.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Measures admit a classification by mathematical structure that enables both exact and approximate attribution algorithms.