
arxiv: 2405.18921 · v3 · submitted 2024-05-29 · 💻 cs.LG

GLANCE: Global Actions in a Nutshell for Counterfactual Explainability

Pith reviewed 2026-05-24 00:56 UTC · model grok-4.3

classification 💻 cs.LG
keywords: global counterfactual explanations · agglomerative clustering · recourse actions · trade-off balancing · machine learning explainability · effectiveness · cost · interpretability

The pith

GLANCE uses a novel agglomerative approach on feature and action spaces to generate global counterfactual explanations that balance effectiveness, cost, and number of actions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces GLANCE to address the challenge of providing global counterfactual explanations for machine learning models. These explanations take the form of actions that offer recourse to large groups of individuals. The method aims to maximize the number of people helped while keeping actions low-cost and few in number for better interpretability. By clustering data points while considering both their features and the counterfactual actions they would require, GLANCE seeks to align with the model's decision structure. This is important because it could make explanations more practical for real-world applications where individual explanations are not feasible.

Core claim

GLANCE is a versatile algorithm that employs a novel agglomerative approach, jointly considering both the feature space and the space of counterfactual actions, thereby accounting for the distribution of points in a way that aligns with the model's structure. This design enables the careful balancing of the trade-offs among effectiveness, cost, and the number of actions, with the size objective functioning as a tunable parameter. Extensive experiments show it achieves greater robustness and performance than existing methods across datasets and models.

What carries the argument

agglomerative approach that jointly considers the feature space and the space of counterfactual actions
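
As a rough illustration of what merging in a joint space could look like, the sketch below greedily merges clusters of affected instances until `s` clusters remain. It is not the paper's algorithm: the distance (sum of Euclidean centroid distances in feature space and in action space), the `weight` parameter, the use of the mean action as a cluster's candidate action, and the names `joint_agglomerate`, `points`, and `actions` are all assumptions made for this example. It also assumes each instance already comes with an individual counterfactual action.

```python
import math
from itertools import combinations


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def joint_agglomerate(points, actions, s, weight=1.0):
    """Merge singleton clusters until s remain (hypothetical joint distance).

    points[i]  : feature vector of affected instance i
    actions[i] : an individual counterfactual action for instance i (assumed given)
    weight     : assumed knob trading off feature distance against action distance
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > s:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            ca, cb = clusters[a], clusters[b]
            d_feat = math.dist(mean([points[i] for i in ca]),
                               mean([points[i] for i in cb]))
            d_act = math.dist(mean([actions[i] for i in ca]),
                              mean([actions[i] for i in cb]))
            d = d_feat + weight * d_act
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        # a < b is guaranteed by combinations, so deleting b keeps index a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy data: two groups that are close in features AND need similar actions.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
actions = [[1.0, 0.0], [1.0, 0.0], [0.0, 2.0], [0.0, 2.0]]

clusters = joint_agglomerate(points, actions, s=2)
# One candidate global action per cluster: here simply the mean member action.
cluster_actions = [mean([actions[i] for i in c]) for c in clusters]
```

In this toy setup `s` plays the role of the tunable size objective: it caps the number of clusters and therefore the number of candidate actions.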

Load-bearing premise

Jointly considering both the feature space and the counterfactual-action space will account for the distribution of points in a way that aligns with the model's structure and produces the desired trade-off balance.

What would settle it

Comparative experiments on standard datasets and models where GLANCE does not achieve higher effectiveness at comparable or lower cost with fewer actions than baselines.
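
One way to make this criterion checkable is a dominance test over the three objectives. The sketch below is an interpretation, not a procedure taken from the paper: it assumes each method's result is summarized as (effectiveness, cost, number of actions), and the `Solution` type, the method names, and all numbers are hypothetical.

```python
from typing import NamedTuple


class Solution(NamedTuple):
    name: str
    eff: float   # fraction of affected individuals given recourse (higher is better)
    cost: float  # average recourse cost (lower is better)
    size: int    # number of global actions (lower is better)


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = a.eff >= b.eff and a.cost <= b.cost and a.size <= b.size
    strictly_better = a.eff > b.eff or a.cost < b.cost or a.size < b.size
    return no_worse and strictly_better


# Hypothetical numbers, for illustration only.
method_a = Solution("method_a", eff=0.90, cost=1.2, size=4)
method_b = Solution("method_b", eff=0.85, cost=1.5, size=4)
method_c = Solution("method_c", eff=0.95, cost=2.0, size=4)

a_beats_b = dominates(method_a, method_b)  # better eff and cost, same size
a_beats_c = dominates(method_a, method_c)  # a has lower eff, so no dominance
c_beats_a = dominates(method_c, method_a)  # c has higher cost, so no dominance
```

Pairs where neither side dominates (such as `method_a` and `method_c` here) are the trade-off cases that a simple dominance check cannot settle.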

Figures

Figures reproduced from arXiv: 2405.18921 by Dimitrios Gunopulos, Dimitrios Rontogiannis, Dimitrios Tomaras, Dimitris Fotakis, Dimitris Sacharidis, Eleni Psaroudaki, Giorgos Giannopoulos, Ioannis Emiris, Kleopatra Markou, Konstantinos Tsopelas, Loukas Kavouras, Nikolaos Theologitis.

Figure 1: A toy example depicting two negative instances
Figure 2: Comparison of GCEs on the COMPAS dataset using an …
Figure 3: Intuition behind clustering approaches. (a) First, …
Figure 4: Comparison of effectiveness (eff) and average recourse cost (avc, normalized with the maximum cost achieved in each dataset/model combination) for the solution of s-GCE with s = 4. Standard deviations are represented by error bars. The red horizontal lines represent the eff > 80% threshold for evaluating the practicality of the solutions.
Figure 5: Comparison of effectiveness (eff) and average recourse cost (avc, normalized with the maximum cost achieved in each dataset/model combination) for the solution of s-GCE with s = 8. Standard deviations are represented by error bars. The red horizontal lines represent the eff > 80% threshold for evaluating the practicality of the solution …
Original abstract

The widespread deployment of machine learning systems in critical real-world decision-making applications has highlighted the urgent need for counterfactual explainability methods that operate effectively. Global counterfactual explanations, expressed as actions to offer recourse, aim to provide succinct explanations and insights applicable to large population subgroups. High effectiveness, measured by the fraction of the population that is provided recourse, ensures that the actions benefit as many individuals as possible. Keeping the cost of actions low ensures the proposed recourse actions remain practical and actionable. Limiting the number of actions that provide global counterfactuals is essential to maximizing interpretability. The primary challenge, therefore, is to balance these trade-offs--maximizing effectiveness, minimizing cost, while maintaining a small number of actions. We introduce $\texttt{GLANCE}$, a versatile and adaptive algorithm that employs a novel agglomerative approach, jointly considering both the feature space and the space of counterfactual actions, thereby accounting for the distribution of points in a way that aligns with the model's structure. This design enables the careful balancing of the trade-offs among the three key objectives, with the size objective functioning as a tunable parameter to keep the actions few and easy to interpret. Our extensive experimental evaluation demonstrates that $\texttt{GLANCE}$ consistently shows greater robustness and performance compared to existing methods across various datasets and models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper introduces GLANCE, a versatile and adaptive algorithm for global counterfactual explanations that employs a novel agglomerative approach jointly considering the feature space and the space of counterfactual actions. This design accounts for the distribution of points in a model-aligned way to balance three objectives: maximizing effectiveness (fraction of the population provided recourse), minimizing action cost, and limiting the number of actions (with size as a tunable parameter for interpretability). Extensive experiments are claimed to show greater robustness and performance compared to existing methods across various datasets and models.
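
The first two objectives in this summary can be made concrete with a small sketch. It assumes that actions are additive shifts on numeric features, that cost is the L1 norm of the action, that each individual takes the cheapest action that flips their prediction, and that average cost is taken over individuals who obtain recourse. None of these choices is specified on this page; `toy_model`, `evaluate_actions`, and the data are hypothetical.

```python
def apply_action(x, action):
    """Apply an additive action to a feature vector (assumed action form)."""
    return [xi + ai for xi, ai in zip(x, action)]


def l1_cost(action):
    """L1 norm of the action (an assumed cost function)."""
    return sum(abs(a) for a in action)


def evaluate_actions(model, affected, actions):
    """Return (effectiveness, average recourse cost) for a set of global actions.

    effectiveness: fraction of affected individuals for whom at least one
                   action yields the favorable prediction (model(...) == 1).
    average cost : mean, over individuals with recourse, of their cheapest
                   successful action; inf if nobody obtains recourse.
    """
    costs = []
    for x in affected:
        valid = [l1_cost(a) for a in actions if model(apply_action(x, a)) == 1]
        if valid:
            costs.append(min(valid))
    eff = len(costs) / len(affected)
    avc = sum(costs) / len(costs) if costs else float("inf")
    return eff, avc


def toy_model(x):
    """Hypothetical classifier: favorable outcome when the features sum to >= 10."""
    return 1 if x[0] + x[1] >= 10 else 0


affected = [[4.0, 4.0], [2.0, 5.0], [0.0, 1.0], [5.0, 4.0]]
actions = [[2.0, 0.0], [0.0, 3.0]]  # size = 2 global actions

eff, avc = evaluate_actions(toy_model, affected, actions)
# Three of the four individuals obtain recourse, so eff is 0.75;
# their cheapest successful actions cost 2, 3 and 2, so avc is 7/3.
```

The third objective, size, is simply `len(actions)` in this setup.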

Significance. If the empirical claims hold, the work could contribute a practical method for generating succinct, actionable global recourse in deployed ML systems. The joint-space agglomerative construction is presented as enabling better trade-off control than prior approaches; reproducible code or parameter-free derivations are not mentioned.

minor comments (2)
  1. [Abstract] The abstract states that the method 'accounts for the distribution of points in a way that aligns with the model's structure,' but does not specify the precise mechanism (e.g., distance metric, linkage criterion, or how model predictions enter the joint space); this should be clarified with pseudocode or equations in §3 or §4.
  2. [Abstract] The claim of 'consistently shows greater robustness and performance' requires explicit definition of the three metrics and the statistical tests used; tables comparing effectiveness, cost, and action count against baselines should include standard deviations or p-values.
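
A minimal sketch of the kind of reporting the second comment asks for, assuming results come from repeated runs (for example, folds or seeds). The run values are hypothetical and are not taken from the paper; the 80% cut-off mirrors the eff > 80% practicality threshold mentioned in the Figure 4 caption.

```python
import statistics


def summarize(runs):
    """Return (mean, sample standard deviation) over repeated runs."""
    return statistics.mean(runs), statistics.stdev(runs)


# Hypothetical effectiveness and normalized cost over three runs.
eff_runs = [0.80, 0.90, 0.85]
cost_runs = [0.30, 0.34, 0.32]

eff_mean, eff_std = summarize(eff_runs)
cost_mean, cost_std = summarize(cost_runs)

# Practicality check against the eff > 80% threshold.
practical = eff_mean > 0.80
```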

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their review and for recognizing the potential of GLANCE's joint-space agglomerative approach. We address the concerns about reproducibility and empirical validation below.

  1. Referee: reproducible code or parameter-free derivations are not mentioned.

    Authors: We agree that reproducibility is important. The revised manuscript will include a public code repository link and a dedicated subsection detailing hyperparameter selection, default values, and sensitivity analysis to improve accessibility and reproducibility.
    revision: yes

  2. Referee: If the empirical claims hold, the work could contribute a practical method for generating succinct, actionable global recourse.

    Authors: Sections 4 and 5 present results across multiple datasets and models showing consistent improvements in the effectiveness-cost-number trade-off. We maintain that the reported experiments support the claims; additional ablation studies can be added if the referee identifies specific gaps.
    revision: no

Circularity Check

0 steps flagged

No significant circularity


The paper introduces GLANCE as a new agglomerative algorithm operating on the joint feature and action space. No equations, fitted parameters, or self-citations appear in the abstract or description that reduce any claimed result to its own inputs by construction. The central claim is an empirical statement of improved trade-off balance, which rests on the independent algorithmic procedure rather than any definitional loop or renamed fit. This is the common case of a self-contained algorithmic contribution.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

Abstract-only review; no free parameters, axioms, or invented entities beyond the algorithm name itself are specified.

invented entities (1)
  • GLANCE algorithm (no independent evidence)
    purpose: To generate global counterfactual actions balancing effectiveness, cost and cardinality
    The paper introduces this named method as its central contribution.

pith-pipeline@v0.9.0 · 5818 in / 1086 out tokens · 25378 ms · 2026-05-24T00:56:11.708763+00:00 · methodology

