Global Aggregations of Local Explanations for Black Box models
Pith reviewed 2026-05-25 01:42 UTC · model grok-4.3
The pith
Aggregations of local explanations yield reliable global insights into black-box models when chosen appropriately.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Global Aggregations of Local Explanations (GALE) combine local explanations into statements about how features affect a black-box model's predictions at scale. The central claim is that the choice of aggregation determines whether those statements can be trusted: LIME's built-in global importance does not reliably represent the model's global behavior, while the proposed aggregations represent it better and additionally identify distinguishing features.
What carries the argument
Global Aggregations of Local Explanations (GALE), the process of combining multiple local explanations to derive global statements on feature influence.
If this is right
- Different aggregation choices produce measurably different global importance rankings.
- LIME's global importance score does not reliably track the black-box model's overall decision process.
- Aggregated explanations can surface features that distinguish between prediction classes.
- Global model understanding becomes possible without building a single global surrogate model.
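The first point in the list above can be illustrated with a toy example. This is a minimal sketch, not the paper's method: the supplied text does not define the GALE operators (the simulated rebuttal only mentions mean and median variants), so the four aggregations below, the function names `aggregate` and `ranking`, and the weight matrix `W` are all illustrative assumptions. The `sqrt_abs_sum` variant follows the square-root-of-summed-absolute-weights form that LIME's global importance is usually described with; treat that as an assumption too.

```python
import numpy as np

def aggregate(W, how):
    """Aggregate local explanation weights into one score per feature.

    W has shape (n_instances, n_features); 0 means the feature did not
    appear in that instance's explanation. All operators are assumptions.
    """
    W = np.asarray(W, dtype=float)
    if how == "sqrt_abs_sum":      # assumed form of LIME's global importance
        return np.sqrt(np.abs(W).sum(axis=0))
    if how == "mean_abs":          # average magnitude, sign ignored
        return np.abs(W).mean(axis=0)
    if how == "mean_signed":       # average effect, opposing signs cancel
        return W.mean(axis=0)
    if how == "median_abs":        # robust to a few large local weights
        return np.median(np.abs(W), axis=0)
    raise ValueError(f"unknown aggregation: {how}")

def ranking(scores):
    """Feature indices ordered from most to least important by |score|."""
    order = np.argsort(-np.abs(scores), kind="stable")
    return [int(i) for i in order]

# Hypothetical local weights for 4 instances and 3 features:
# feature 0 is always present with a small weight, feature 1 appears once
# with a large weight, feature 2 is large but flips sign between instances.
W = np.array([
    [0.1, 0.0,  0.5],
    [0.1, 0.0, -0.5],
    [0.1, 0.9,  0.5],
    [0.1, 0.0, -0.5],
])

for how in ("sqrt_abs_sum", "mean_abs", "mean_signed", "median_abs"):
    print(how, ranking(aggregate(W, how)))
```

On this toy matrix the signed mean ranks feature 2 last because its effects cancel, while the magnitude-based aggregations rank it first, and the median demotes the rarely used feature 1. Which behavior is preferable depends on the question being asked of the model.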
Where Pith is reading between the lines
- The same aggregation logic could be applied to local explanations from methods other than LIME.
- In regulated domains, these aggregations might support auditing by producing stable global feature lists that can be checked against domain knowledge.
- Performance of the aggregations may vary with model type or data distribution, suggesting targeted validation on new domains.
Load-bearing premise
Local explanations such as those produced by LIME are faithful enough to the underlying black-box model that their aggregates accurately reflect global feature effects.
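One way to probe this premise is to measure how well a local linear surrogate fits the black box around a single instance. The sketch below is a simplified stand-in for LIME, not LIME itself: Gaussian perturbations, an exponential distance kernel, and a weighted least-squares fit are all assumptions, as are the function name `local_surrogate_r2` and its parameters.

```python
import numpy as np

def local_surrogate_r2(predict, x, n_samples=500, scale=0.5,
                       kernel_width=1.0, seed=0):
    """Weighted R^2 of a linear surrogate fitted around instance x.

    predict maps an (n, d) array to an (n,) array of model outputs.
    Values near 1 suggest the local explanation is faithful near x;
    values near 0 suggest its weights say little about the model.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Sample perturbed points around x (assumed Gaussian neighborhood).
    Z = x + rng.normal(0.0, scale, size=(n_samples, x.size))
    y = predict(Z)
    # Weight samples by proximity to x (assumed exponential kernel).
    d2 = ((Z - x) ** 2).sum(axis=1)
    w = np.exp(-d2 / kernel_width ** 2)
    # Weighted least squares with an intercept column.
    A = np.hstack([Z, np.ones((n_samples, 1))])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    resid = y - A @ coef
    y_bar = np.average(y, weights=w)
    ss_res = (w * resid ** 2).sum()
    ss_tot = (w * (y - y_bar) ** 2).sum()
    return float(1.0 - ss_res / ss_tot)

# A linear model is explained almost perfectly by a linear surrogate.
linear = lambda Z: Z @ np.array([2.0, -1.0]) + 3.0
# A strongly oscillating model is not, so its local weights are unreliable.
wiggly = lambda Z: np.sin(5 * Z[:, 0]) * np.sin(5 * Z[:, 1])

print(local_surrogate_r2(linear, [0.0, 0.0]))
print(local_surrogate_r2(wiggly, [0.0, 0.0]))
```

If many instances show low local fit, any aggregation of their weights inherits that unreliability, which is the concern the referee report raises.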
What would settle it
A test that measures whether the ranked feature importances from a given aggregation match the actual change in model output when those features are systematically altered across a large held-out set of instances.
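Such a test could be sketched as follows. This is one possible reading of "systematically altered", not a procedure from the paper: each feature column is permuted across the held-out instances, the mean absolute change in model output is recorded, and the result is compared with an aggregated importance vector by Spearman rank correlation. The names `perturbation_effect` and `spearman`, the toy model, and the importance vectors are hypothetical.

```python
import numpy as np

def perturbation_effect(predict, X, seed=0):
    """Mean absolute output change when each feature is permuted in turn."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    base = predict(X)
    effects = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(X[:, j])   # alter only feature j
        effects[j] = np.abs(predict(Xp) - base).mean()
    return effects

def spearman(a, b):
    """Spearman rank correlation, without tie correction (a simplification)."""
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return float(np.corrcoef(ra, rb)[0, 1])

# Toy black box: feature 0 matters most, feature 2 not at all.
model = lambda X: 3.0 * X[:, 0] + 1.0 * X[:, 1] + 0.0 * X[:, 2]
X_heldout = np.random.default_rng(1).normal(size=(200, 3))
effects = perturbation_effect(model, X_heldout)

# Hypothetical aggregated importances from two different aggregations.
good_aggregate = np.array([0.9, 0.3, 0.0])
bad_aggregate = np.array([0.0, 0.3, 0.9])
print(spearman(good_aggregate, effects))
print(spearman(bad_aggregate, effects))
```

A correlation near 1 would support an aggregation and a correlation near 0 or below would count against it. Permutation is only one choice of perturbation, and correlated features can distort it.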
Original abstract
The decision-making process of many state-of-the-art machine learning models is inherently inscrutable to the extent that it is impossible for a human to interpret the model directly: they are black box models. This has led to a call for research on explaining black box models, for which there are two main approaches. Global explanations that aim to explain a model's decision making process in general, and local explanations that aim to explain a single prediction. Since it remains challenging to establish fidelity to black box models in globally interpretable approximations, much attention is put on local explanations. However, whether local explanations are able to reliably represent the black box model and provide useful insights remains an open question. We present Global Aggregations of Local Explanations (GALE) with the objective to provide insights in a model's global decision making process. Overall, our results reveal that the choice of aggregation matters. We find that the global importance introduced by Local Interpretable Model-agnostic Explanations (LIME) does not reliably represent the model's global behavior. Our proposed aggregations are better able to represent how features affect the model's predictions, and to provide global insights by identifying distinguishing features.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Global Aggregations of Local Explanations (GALE) to derive global insights into black-box model behavior by aggregating local explanations (e.g., from LIME). It claims that the choice of aggregation matters, that LIME's global importance measure does not reliably represent the model's global behavior, and that the proposed aggregations better capture how features affect predictions and identify distinguishing features.
Significance. If the empirical claims hold, the work would be moderately significant for the XAI community: it directly tests whether local explanations can be aggregated into trustworthy global statements and supplies a concrete counter-example to LIME's global importance. The observation that aggregation choice is consequential is useful, but the absence of any equations, fidelity metrics, datasets, or statistical tests in the supplied text prevents evaluation of whether the superiority result is robust.
major comments (2)
- [Abstract] Abstract: the central claim that 'our proposed aggregations are better able to represent how features affect the model's predictions' is presented without any description of the aggregation operators, the fidelity metric used to compare them to LIME global importance, the datasets, or the statistical tests. This absence makes the superiority result impossible to verify and is load-bearing for the paper's main contribution.
- [Abstract] Abstract: the weakest assumption noted by the reader—that local explanations (LIME) are sufficiently faithful for aggregation to yield accurate global statements—is not addressed or tested in the provided text, leaving open whether any aggregation method can overcome unfaithful local explanations.
Simulated Author's Rebuttal
We thank the referee for the feedback. We respond point-by-point to the major comments and indicate planned revisions to the manuscript.
-
Referee: [Abstract] Abstract: the central claim that 'our proposed aggregations are better able to represent how features affect the model's predictions' is presented without any description of the aggregation operators, the fidelity metric used to compare them to LIME global importance, the datasets, or the statistical tests. This absence makes the superiority result impossible to verify and is load-bearing for the paper's main contribution.
Authors: The abstract is space-constrained, but the full manuscript defines the aggregation operators (including mean and median variants), specifies the fidelity metric used to compare against LIME global importance, lists the datasets, and reports the statistical tests. To make the central claim verifiable from the abstract itself, we will revise the abstract to include a concise description of the evaluation setup. revision: yes
-
Referee: [Abstract] Abstract: the weakest assumption noted by the reader—that local explanations (LIME) are sufficiently faithful for aggregation to yield accurate global statements—is not addressed or tested in the provided text, leaving open whether any aggregation method can overcome unfaithful local explanations.
Authors: The manuscript already states that whether local explanations reliably represent the black-box model remains an open question. Our contribution focuses on the effect of aggregation choice given local explanations from LIME. We agree that an explicit discussion of the faithfulness assumption would strengthen the paper and will add a dedicated paragraph in the discussion section addressing this limitation and its implications for the results. revision: yes
Circularity Check
No significant circularity identified
The abstract and available text present empirical claims about GALE aggregations outperforming LIME's global importance without any equations, parameter-fitting steps, or derivation chain. No self-citations, ansatzes, or uniqueness theorems are invoked in the provided content. The comparison to LIME relies on external prior work (Ribeiro et al.) rather than on a self-referential reduction. Because the visible text contains no step that reduces a result to its own inputs by construction, the circularity score is 0; the paper's central claim remains independent of its own fitted values in the visible sections.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Local explanations from methods such as LIME are faithful enough to the black-box model that their aggregation yields reliable global feature effects.
Reference graph
Works this paper leans on
-
[1]
Sebastian Bach, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, and Wojciech Samek. 2015. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS one 10, 7 (2015), e0130140
-
[2]
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in neural information processing systems. 2672–2680.
(Running header of the reviewed paper: SIGIR '19, July 21–25, 2019, Paris, France. Ilse van der Linden, Hinda Haned, and Evangelos Kanoulas.)
-
[3]
Klaus Greff, Rupesh K Srivastava, Jan Koutník, Bas R Steunebrink, and Jürgen Schmidhuber. 2017. LSTM: A search space odyssey. IEEE transactions on neural networks and learning systems 28, 10 (2017), 2222–2232
-
[4]
Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. 2018. A survey of methods for explaining black box models. ACM Computing Surveys (CSUR) 51, 5 (2018), 93
-
[5]
Dimitrios Kotzias, Misha Denil, Nando De Freitas, and Padhraic Smyth. 2015. From group to individual labels using deep features. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 597–606
-
[6]
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097–1105
-
[7]
Scott Lundberg and Su-In Lee. 2016. An unexpected unity among methods for interpreting model predictions. arXiv preprint arXiv:1611.07478 (2016)
-
[8]
Scott Lundberg and Su-In Lee. 2017. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems. 4768–4777
-
[9]
Laurens van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. Journal of machine learning research 9, Nov (2008), 2579–2605
-
[10]
Brent Mittelstadt, Chris Russell, and Sandra Wachter. 2018. Explaining explanations in AI. arXiv preprint arXiv:1811.01439 (2018)
-
[11]
W James Murdoch, Chandan Singh, Karl Kumbier, Reza Abbasi-Asl, and Bin Yu. 2019. Interpretable machine learning: definitions, methods, and applications. arXiv preprint arXiv:1901.04592 (2019)
-
[13]
Bo Pang and Lillian Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends® in Information Retrieval 2, 1–2 (2008), 1–135
-
[14]
Jeffrey Pennington, Richard Socher, and Christopher D Manning. 2014. GloVe: Global vectors for word representation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2014)
-
[15]
Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. Model-agnostic interpretability of machine learning. arXiv preprint arXiv:1606.05386 (2016)
-
[16]
Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. Why should i trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1135–1144
-
[17]
Wojciech Samek, Alexander Binder, Grégoire Montavon, Sebastian Lapuschkin, and Klaus-Robert Müller. 2017. Evaluating the visualization of what a deep neural network has learned. IEEE transactions on neural networks and learning systems 28, 11 (2017), 2660–2673
-
[18]
Andrew D Selbst and Julia Powles. 2017. Meaningful information and the right to explanation. International Data Privacy Law 7, 4 (2017), 233–242
-
[19]
Avanti Shrikumar, Peyton Greenside, and Anshul Kundaje. 2017. Learning important features through propagating activation differences. In Proceedings of the 34th International Conference on Machine Learning-Volume 70. JMLR.org, 3145–3153
-
[20]
Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15, 1 (2014), 1929–1958
-
[21]
Maartje ter Hoeve, Anne Schuth, Daan Odijk, and Maarten de Rijke. 2018. Faithfully explaining rankings in a news recommender system. arXiv preprint arXiv:1805.05447 (2018)