Global Aggregations of Local Explanations for Black Box models

Evangelos Kanoulas; Hinda Haned; Ilse van der Linden

arXiv: 1907.03039 · v1 · pith:VWEBJHNTnew · submitted 2019-07-05 · 💻 cs.IR · cs.AI · cs.LG

Pith reviewed 2026-05-25 01:42 UTC · model grok-4.3

classification 💻 cs.IR cs.AI cs.LG

keywords black-box models, local explanations, global explanations, LIME, model interpretability, feature importance, aggregations, explainable AI

The pith

Aggregating local explanations yields reliable global insights into a black-box model when the aggregation method is chosen appropriately.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Global Aggregations of Local Explanations (GALE) to turn local explanations into statements about a model's overall behavior. It demonstrates that different ways of combining local outputs produce different global pictures, and that LIME's standard global importance score fails to match the model's actual predictions across instances. The proposed aggregations more accurately track how individual features shift predictions and surface features that separate one outcome from another.

Core claim

Global Aggregations of Local Explanations (GALE) shows that the choice of aggregation determines whether local explanations can be combined into trustworthy statements about how features affect a black-box model's predictions at scale. LIME's built-in global importance does not reliably represent this global behavior, whereas the proposed aggregations do, and they additionally identify distinguishing features.

What carries the argument

Global Aggregations of Local Explanations (GALE), the process of combining multiple local explanations to derive global statements on feature influence.
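To make the aggregation step concrete, here is a minimal sketch rather than the paper's implementation. It assumes each local explanation is a mapping from feature name to local weight, and it compares two aggregations: the square root of summed absolute weights (the form Ribeiro et al. describe for LIME's global importance, as far as we understand it) and an average taken only over the instances in which a feature occurs (an assumed variant, since the text above does not define GALE's operators). The toy data and all function names are hypothetical.

```python
import math
from collections import defaultdict


def sqrt_sum_importance(explanations):
    """Square root of the summed absolute local weights per feature."""
    totals = defaultdict(float)
    for weights in explanations:
        for feature, w in weights.items():
            totals[feature] += abs(w)
    return {f: math.sqrt(t) for f, t in totals.items()}


def average_importance(explanations):
    """Mean absolute local weight, averaged over instances containing the feature."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for weights in explanations:
        for feature, w in weights.items():
            totals[feature] += abs(w)
            counts[feature] += 1
    return {f: totals[f] / counts[f] for f in totals}


def ranking(scores):
    """Feature names ordered from most to least important."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical data: "the" occurs in all ten instances with a small weight,
# while "orbit" occurs once with a large weight.
local_explanations = [{"the": 0.1} for _ in range(9)] + [{"the": 0.1, "orbit": 0.9}]

print(ranking(sqrt_sum_importance(local_explanations)))  # ['the', 'orbit']
print(ranking(average_importance(local_explanations)))   # ['orbit', 'the']
```

On this toy input the summed form rewards a frequent feature with small weights, while the per-occurrence average rewards a rare feature with a large weight, so the two rankings disagree.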

If this is right

Different aggregation choices produce measurably different global importance rankings.
LIME's global importance score does not reliably track the black-box model's overall decision process.
Aggregated explanations can surface features that distinguish between prediction classes.
Global model understanding becomes possible without building a single global surrogate model.
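The third point, surfacing features that distinguish between classes, can be illustrated with a short sketch. This is an assumption about how such a score might be built, not the definition used in the paper: a feature's per-class importances are normalized into a distribution, and a low Shannon entropy is read as a sign that the feature matters mainly for one class. The class names, feature names, and values are hypothetical.

```python
import math


def normalized_entropy(values):
    """Shannon entropy of the normalized values, scaled to the range [0, 1]."""
    if len(values) < 2:
        raise ValueError("need at least two classes")
    total = sum(values)
    if total == 0:
        return 1.0  # no importance anywhere: treat as maximally uninformative
    probs = [v / total for v in values if v > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(len(values))


def distinctiveness(per_class_importance):
    """1 minus normalized entropy: close to 1 when importance sits in one class."""
    return {
        feature: 1.0 - normalized_entropy(list(by_class.values()))
        for feature, by_class in per_class_importance.items()
    }


# Hypothetical aggregated importances per class (non-negative values).
per_class_importance = {
    "the": {"class_a": 0.30, "class_b": 0.30},
    "orbit": {"class_a": 0.60, "class_b": 0.00},
}

scores = distinctiveness(per_class_importance)
print(scores)
```

A feature spread evenly over the classes scores near 0, and a feature whose importance is concentrated in a single class scores near 1. Such a score could be multiplied with an importance value to rank features that are both influential and class-specific.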

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same aggregation logic could be applied to local explanations from methods other than LIME.
In regulated domains, these aggregations might support auditing by producing stable global feature lists that can be checked against domain knowledge.
Performance of the aggregations may vary with model type or data distribution, suggesting targeted validation on new domains.

Load-bearing premise

Local explanations such as those produced by LIME are faithful enough to the underlying black-box model that their aggregates accurately reflect global feature effects.

What would settle it

A test that measures whether the ranked feature importances from a given aggregation match the actual change in model output when those features are systematically altered across a large held-out set of instances.
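A test of this kind could be sketched as follows. This is a minimal illustration under stated assumptions, not a protocol from the paper: the "black box" is a stand-in linear scorer, altering a feature means setting it to a baseline of 0, and agreement is measured with a Spearman rank correlation that assumes no tied values. The model weights, held-out instances, and aggregated importances are all hypothetical.

```python
def predict(model_weights, x):
    """Stand-in black box: a linear scorer over named features."""
    return sum(model_weights[f] * x[f] for f in model_weights)


def ablation_effects(model_weights, instances, baseline=0.0):
    """Mean absolute change in output when each feature is set to the baseline."""
    effects = {}
    for feature in model_weights:
        deltas = []
        for x in instances:
            ablated = dict(x, **{feature: baseline})
            deltas.append(abs(predict(model_weights, x) - predict(model_weights, ablated)))
        effects[feature] = sum(deltas) / len(deltas)
    return effects


def ranks(values):
    """Rank positions (1 = smallest); assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(xs, ys):
    """Spearman rank correlation for tie-free data."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


# Hypothetical model, held-out instances, and aggregated importances.
model_weights = {"a": 2.0, "b": -1.0, "c": 0.1}
held_out = [
    {"a": 1.0, "b": 1.0, "c": 1.0},
    {"a": 0.5, "b": 2.0, "c": 1.0},
    {"a": 1.0, "b": 0.0, "c": 3.0},
]
aggregated = {"a": 0.8, "b": 0.5, "c": 0.05}

effects = ablation_effects(model_weights, held_out)
features = list(model_weights)
agreement = spearman([aggregated[f] for f in features], [effects[f] for f in features])
print(agreement)  # 1.0 for this toy setup
```

A correlation near 1 would indicate that the aggregation orders features the same way the model's outputs respond to their removal; a value near 0 or below would count against that aggregation.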

Figures

Figures reproduced from arXiv: 1907.03039 by Evangelos Kanoulas, Hinda Haned, Ilse van der Linden.

**Figure 2.** Quantitative evaluation of GALE on the multiclass

**Figure 3.** Top 10 features per class according to global LIME

**Figure 5.** Top 10 features per class according to homogeneity

Original abstract

The decision-making process of many state-of-the-art machine learning models is inherently inscrutable to the extent that it is impossible for a human to interpret the model directly: they are black box models. This has led to a call for research on explaining black box models, for which there are two main approaches. Global explanations that aim to explain a model's decision making process in general, and local explanations that aim to explain a single prediction. Since it remains challenging to establish fidelity to black box models in globally interpretable approximations, much attention is put on local explanations. However, whether local explanations are able to reliably represent the black box model and provide useful insights remains an open question. We present Global Aggregations of Local Explanations (GALE) with the objective to provide insights in a model's global decision making process. Overall, our results reveal that the choice of aggregation matters. We find that the global importance introduced by Local Interpretable Model-agnostic Explanations (LIME) does not reliably represent the model's global behavior. Our proposed aggregations are better able to represent how features affect the model's predictions, and to provide global insights by identifying distinguishing features.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper introduces Global Aggregations of Local Explanations (GALE) to derive global insights into black-box model behavior by aggregating local explanations (e.g., from LIME). It claims that the choice of aggregation matters, that LIME's global importance measure does not reliably represent the model's global behavior, and that the proposed aggregations better capture how features affect predictions and identify distinguishing features.

Significance. If the empirical claims hold, the work would be moderately significant for the XAI community: it directly tests whether local explanations can be aggregated into trustworthy global statements and supplies a concrete counter-example to LIME's global importance. The observation that aggregation choice is consequential is useful, but the absence of any equations, fidelity metrics, datasets, or statistical tests in the supplied text prevents evaluation of whether the superiority result is robust.

major comments (2)

[Abstract] The central claim that 'our proposed aggregations are better able to represent how features affect the model's predictions' is presented without any description of the aggregation operators, the fidelity metric used to compare them to LIME global importance, the datasets, or the statistical tests. This absence makes the superiority result impossible to verify and is load-bearing for the paper's main contribution.
[Abstract] The weakest assumption noted by the reader, that local explanations (LIME) are sufficiently faithful for aggregation to yield accurate global statements, is not addressed or tested in the provided text, leaving open whether any aggregation method can overcome unfaithful local explanations.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the feedback. We respond point-by-point to the major comments and indicate planned revisions to the manuscript.

Referee: [Abstract] The central claim that 'our proposed aggregations are better able to represent how features affect the model's predictions' is presented without any description of the aggregation operators, the fidelity metric used to compare them to LIME global importance, the datasets, or the statistical tests. This absence makes the superiority result impossible to verify and is load-bearing for the paper's main contribution.

Authors: The abstract is space-constrained, but the full manuscript defines the aggregation operators (including mean and median variants), specifies the fidelity metric used to compare against LIME global importance, lists the datasets, and reports the statistical tests. To make the central claim verifiable from the abstract itself, we will revise the abstract to include a concise description of the evaluation setup.
revision: yes

Referee: [Abstract] The weakest assumption noted by the reader, that local explanations (LIME) are sufficiently faithful for aggregation to yield accurate global statements, is not addressed or tested in the provided text, leaving open whether any aggregation method can overcome unfaithful local explanations.

Authors: The manuscript already states that whether local explanations reliably represent the black-box model remains an open question. Our contribution focuses on the effect of aggregation choice given local explanations from LIME. We agree that an explicit discussion of the faithfulness assumption would strengthen the paper and will add a dedicated paragraph in the discussion section addressing this limitation and its implications for the results.
revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

The abstract and available text present empirical claims that GALE aggregations outperform LIME's global importance, without any equations, parameter-fitting steps, or derivation chain. The provided content invokes no self-citations, ansatzes, or uniqueness theorems. The comparison to LIME relies on external prior work (Ribeiro et al.) rather than on a self-referential reduction. Since no step can be quoted that reduces to its own inputs by construction, the circularity score is 0; in the visible sections, the paper's central claim remains independent of its own fitted values.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on the domain assumption that local explanations remain faithful enough for aggregation; no free parameters or invented entities are mentioned in the abstract.

axioms (1)

Domain assumption: Local explanations from methods such as LIME are faithful enough to the black-box model that their aggregation yields reliable global feature effects.
This premise is required for any claim that the aggregated quantities represent the model's actual global behavior.

Reference graph

Works this paper leans on

21 extracted references · 21 canonical work pages · 4 internal anchors

SIGIR ’19, July 21–25, 2019, Paris, France. Ilse van der Linden, Hinda Haned, and Evangelos Kanoulas.

[1] Sebastian Bach, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, and Wojciech Samek. 2015. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS one 10, 7 (2015), e0130140.

[2] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. In Advances in neural information processing systems. 2672–2680.

[3] Klaus Greff, Rupesh K Srivastava, Jan Koutník, Bas R Steunebrink, and Jürgen Schmidhuber. 2017. LSTM: A search space odyssey. IEEE transactions on neural networks and learning systems 28, 10 (2017), 2222–2232.

[4] Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. 2018. A survey of methods for explaining black box models. ACM Computing Surveys (CSUR) 51, 5 (2018), 93.

[5] Dimitrios Kotzias, Misha Denil, Nando De Freitas, and Padhraic Smyth. 2015. From group to individual labels using deep features. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 597–606.

[6] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097–1105.

[7] Scott Lundberg and Su-In Lee. 2016. An unexpected unity among methods for interpreting model predictions. arXiv preprint arXiv:1611.07478 (2016).

[8] Scott Lundberg and Su-In Lee. 2017. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems. 4768–4777.

[9] Laurens van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. Journal of machine learning research 9, Nov (2008), 2579–2605.

[10] Brent Mittelstadt, Chris Russell, and Sandra Wachter. 2018. Explaining explanations in ai. arXiv preprint arXiv:1811.01439 (2018).

[11, 12] W James Murdoch, Chandan Singh, Karl Kumbier, Reza Abbasi-Asl, and Bin Yu. 2019. Interpretable machine learning: definitions, methods, and applications. arXiv preprint arXiv:1901.04592 (2019).

[13] Bo Pang and Lillian Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends® in Information Retrieval 2, 1–2 (2008), 1–135.

[14] Jeffrey Pennington, Richard Socher, and Christopher D Manning. 2018. Glove: Global vectors for word representation. 2014. Proceedings of the Empirical Methods in Natural Language Processing (EMNLP 2014) (2018).

[15] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. Model-agnostic interpretability of machine learning. arXiv preprint arXiv:1606.05386 (2016).

[16] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. 2016. Why should i trust you?: Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1135–1144.

[17] Wojciech Samek, Alexander Binder, Grégoire Montavon, Sebastian Lapuschkin, and Klaus-Robert Müller. 2017. Evaluating the visualization of what a deep neural network has learned. IEEE transactions on neural networks and learning systems 28, 11 (2017), 2660–2673.

[18] Andrew D Selbst and Julia Powles. 2017. Meaningful information and the right to explanation. International Data Privacy Law 7, 4 (2017), 233–242.

[19] Avanti Shrikumar, Peyton Greenside, and Anshul Kundaje. 2017. Learning important features through propagating activation differences. In Proceedings of the 34th International Conference on Machine Learning-Volume 70. JMLR.org, 3145–3153.

[20] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15, 1 (2014), 1929–1958.

[21] Maartje ter Hoeve, Anne Schuth, Daan Odijk, and Maarten de Rijke. 2018. Faithfully explaining rankings in a news recommender system. arXiv preprint arXiv:1805.05447 (2018).
