Alike Parts: A Feature-Informed Approach to Local and Global Prototype Explanations
Pith reviewed 2026-05-22 09:06 UTC · model grok-4.3
The pith
Integrating feature importance into prototype explanations adds local and global granularity without reducing surrogate fidelity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors propose alike parts as a local explanation technique that leverages feature importance to highlight key shared feature subsets between a classified instance and its nearest prototype. They also augment the global prototype selection objective with a term based on feature importance to encourage diversity in the feature attributions of the prototypes. Experiments demonstrate that this augmented selection maintains or increases the prediction fidelity of the surrogate model on six benchmark datasets, suggesting that feature diversity does not compromise model fidelity.
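A minimal sketch of the augmented global selection, assuming the cost has the form quoted elsewhere on this page, f(P) = Σ min (d + β·fi): each instance is charged for its cheapest prototype, where the cost is a feature-space distance plus β times a feature-importance term. Reading fi as a Euclidean distance between importance vectors, the greedy search, and all names (`euclid`, `objective`, `greedy_select`, the toy data) are illustrative assumptions, not the authors' implementation.

```python
import math

def euclid(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def objective(X, A, proto_idx, beta):
    """Sum over instances of the cheapest prototype cost d + beta * fi.

    X: feature vectors; A: per-instance feature-importance vectors (assumed given).
    fi is ASSUMED here to be the distance between importance vectors.
    """
    total = 0.0
    for x, a in zip(X, A):
        total += min(euclid(x, X[j]) + beta * euclid(a, A[j]) for j in proto_idx)
    return total

def greedy_select(X, A, k, beta):
    """Greedily add the prototype that lowers the objective the most (a heuristic)."""
    chosen = []
    for _ in range(k):
        candidates = [j for j in range(len(X)) if j not in chosen]
        best = min(candidates, key=lambda j: objective(X, A, chosen + [j], beta))
        chosen.append(best)
    return chosen

# Hypothetical toy data: two clusters with different importance profiles.
X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
A = [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.0, 1.0)]
prototypes = greedy_select(X, A, k=2, beta=0.5)
print(prototypes, objective(X, A, prototypes, beta=0.5))
```

Setting `beta=0.0` recovers a purely distance-based selection, which is the natural baseline to compare against under these assumptions.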
What carries the argument
Alike parts, a local method that uses feature importance scores to highlight the most relevant shared feature subsets between an instance and its prototype, together with an augmented global prototype selection objective that adds a feature importance term to promote diversity in attributions.
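The local step can be illustrated with a short sketch, assuming features are numeric and scaled to [0, 1], that a feature counts as "shared" when instance and prototype differ by at most a tolerance, and that per-feature importance scores are already available from some attribution method. The function name `alike_parts`, the tolerance rule, the `top_k` cut-off, and the feature names are hypothetical choices made for illustration only.

```python
def alike_parts(x, prototype, importance, tol=0.1, top_k=2):
    """Return the most important features on which x and its prototype agree.

    x, prototype: dicts mapping feature name -> scaled value.
    importance: dict mapping feature name -> importance score (sign ignored).
    """
    # Features where the instance and the prototype are close (assumed rule).
    shared = [f for f in x if abs(x[f] - prototype[f]) <= tol]
    # Rank the shared features by absolute importance, highest first.
    shared.sort(key=lambda f: abs(importance[f]), reverse=True)
    return shared[:top_k]

# Hypothetical instance, nearest prototype, and importance scores.
x = {"age": 0.50, "bmi": 0.80, "glucose": 0.90, "bp": 0.20}
prototype = {"age": 0.55, "bmi": 0.30, "glucose": 0.95, "bp": 0.25}
importance = {"age": 0.10, "bmi": 0.40, "glucose": 0.60, "bp": -0.30}

print(alike_parts(x, prototype, importance))  # ['glucose', 'bp']
```

Here `bmi` is important but not shared, so it is excluded; `age` is shared but ranks below the cut-off.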
If this is right
- Local explanations now guide attention to specific important features shared with the prototype rather than the whole instance.
- Global prototype sets can cover a broader range of feature-based reasons for model decisions.
- Surrogate model fidelity stays the same or rises even after the diversity-promoting change.
- Feature diversity among prototypes does not trade off against explanation reliability.
- Users obtain more granular, feature-level insight into why an instance matches a prototype.
Where Pith is reading between the lines
- The same importance-driven selection logic could be added to other example-based explanation methods to increase their feature specificity.
- Domains that need justifications at the feature level, such as medical or financial decisions, may find the local alike-parts highlighting especially practical.
- Testing the approach on models with strong feature interactions would show whether the diversity term remains beneficial when features are not independent.
Load-bearing premise
Feature importance scores can be integrated into local subset highlighting and the global prototype selection objective without introducing post-hoc biases or requiring dataset-specific tuning that affects the reported fidelity gains.
What would settle it
Re-running the augmented prototype selection on a new benchmark dataset and finding a statistically significant drop in surrogate prediction fidelity relative to the standard selection would falsify the claim that fidelity is maintained or increased.
Original abstract
Prototype-based explanations offer an intuitive, example-based approach to support the interpretability of machine learning black box classifiers but often lack feature-level granularity. We introduce a framework that integrates feature importance at two levels to address this gap. First, for local explanations, we propose alike parts: a method that uses feature importance scores to highlight the most relevant, shared feature subsets between a classified instance and its nearest prototype, guiding user attention. Second, we augment the global prototype selection objective function with a feature importance term to actively promote diversity in the feature attributions of the selected prototypes. Experiments on six benchmark datasets show that this augmented selection process maintains or, in some cases, increases the prediction fidelity of the surrogate model, suggesting that feature diversity does not compromise model fidelity.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a framework called 'Alike Parts' for enhancing prototype-based explanations of black-box classifiers. It integrates feature importance scores in two ways: (1) locally, by highlighting the most relevant shared feature subsets ('alike parts') between a query instance and its nearest prototype; (2) globally, by augmenting the prototype selection objective with an additional feature-importance term that encourages diversity in the attributions of the selected prototypes. Experiments across six benchmark datasets are reported to show that the augmented selection maintains or increases surrogate-model prediction fidelity relative to the unaugmented baseline, leading to the suggestion that feature diversity need not compromise fidelity.
Significance. If the empirical results are robust, the work addresses a genuine gap in prototype explanations by adding feature-level granularity to both local highlighting and global selection. The multi-dataset evaluation is a positive feature. However, the central claim that 'feature diversity does not compromise model fidelity' rests on the behavior of the augmented objective; without clear evidence that the balancing weight is held fixed or chosen independently of the reported outcome, the result remains conditional rather than general.
major comments (2)
- [Global prototype selection objective] Global prototype selection objective (likely §3.2 or Eq. (3)–(5)): the manuscript must explicitly state the value or selection procedure for the hyperparameter that balances the original fidelity term against the new feature-importance diversity term. If this weight is tuned independently per dataset (or via validation that favors the reported fidelity), the observed maintenance of fidelity is consistent with an artifact of per-dataset optimization rather than evidence that diversity is harmless in general. An ablation across a fixed schedule of weights on all six datasets is required to support the claim.
- [Experimental results] Experimental results (likely §4 and Table 2): the claim that fidelity is 'maintained or, in some cases, increased' is load-bearing for the paper’s central suggestion. The current description supplies no information on the precise fidelity metric, the baseline prototype selector, statistical tests, or variance across runs. Without these controls, it is impossible to judge whether the reported gains are reliable or whether they depend on the same weighting choice raised above.
minor comments (2)
- [Local explanation method] Clarify whether the same feature-importance scores are used without modification for both the local 'alike parts' highlighting and the global objective, or whether any post-processing is applied.
- [Related work] Add a short paragraph contrasting the approach with prior prototype methods (e.g., ProtoDash, MMD-critic) that already incorporate some form of diversity or importance weighting.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments. We address each major comment below and indicate the changes we will incorporate in the revised manuscript.
- Referee: [Global prototype selection objective] Global prototype selection objective (likely §3.2 or Eq. (3)–(5)): the manuscript must explicitly state the value or selection procedure for the hyperparameter that balances the original fidelity term against the new feature-importance diversity term. If this weight is tuned independently per dataset (or via validation that favors the reported fidelity), the observed maintenance of fidelity is consistent with an artifact of per-dataset optimization rather than evidence that diversity is harmless in general. An ablation across a fixed schedule of weights on all six datasets is required to support the claim.
Authors: We agree that the balancing hyperparameter requires explicit documentation. In the submitted manuscript the weight λ was fixed at 0.5 for every dataset; this value was chosen via a small preliminary grid search on the first dataset only and then held constant for all subsequent experiments. We will add a clear statement of this procedure to Section 3.2. In addition, we will perform the requested ablation by re-running the global selection with a fixed schedule of weights (λ ∈ {0.0, 0.25, 0.5, 0.75, 1.0}) on all six benchmarks and will report the resulting fidelity curves in a new appendix table. These additions directly address the concern that the reported fidelity maintenance might be an artifact of per-dataset tuning.
Revision: yes
- Referee: [Experimental results] Experimental results (likely §4 and Table 2): the claim that fidelity is 'maintained or, in some cases, increased' is load-bearing for the paper’s central suggestion. The current description supplies no information on the precise fidelity metric, the baseline prototype selector, statistical tests, or variance across runs. Without these controls, it is impossible to judge whether the reported gains are reliable or whether they depend on the same weighting choice raised above.
Authors: We accept that the experimental reporting is currently underspecified. The fidelity metric is the test-set accuracy of a surrogate decision-tree model in reproducing the black-box classifier’s predictions. The baseline is the original prototype-selection objective without the feature-diversity term. We will revise Section 4 and Table 2 to state these definitions explicitly, to report mean fidelity ± one standard deviation over ten independent runs (different random seeds for prototype initialization and data splits), and to include Wilcoxon signed-rank tests comparing the augmented and baseline selectors. These controls will allow readers to evaluate both reliability and dependence on the weighting choice.
Revision: yes
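The fidelity metric described in the rebuttal can be sketched as follows, assuming fidelity is the fraction of test instances on which the surrogate reproduces the black-box prediction, and that the spread is reported as a sample standard deviation over runs. The function names and the numbers are hypothetical and stand in for results the page does not report.

```python
from statistics import mean, stdev

def fidelity(black_box_preds, surrogate_preds):
    """Fraction of instances where the surrogate matches the black-box label."""
    if not black_box_preds or len(black_box_preds) != len(surrogate_preds):
        raise ValueError("prediction lists must be non-empty and equally long")
    agree = sum(b == s for b, s in zip(black_box_preds, surrogate_preds))
    return agree / len(black_box_preds)

def summarize(fidelities):
    """Mean and sample standard deviation of fidelity across independent runs."""
    return mean(fidelities), stdev(fidelities)

# Hypothetical predictions on one test split.
print(fidelity([1, 0, 1, 1, 0], [1, 0, 0, 1, 0]))  # 0.8

# Hypothetical per-run fidelities for one selector (not results from the paper).
m, s = summarize([0.8, 0.9, 1.0])
print(f"{m:.2f} ± {s:.2f}")
```

For the paired comparison between the augmented and baseline selectors, per-run fidelities from the same seeds could be passed to a signed-rank test such as `scipy.stats.wilcoxon`; that step is left out here to keep the sketch dependency-free.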
Circularity Check
No significant circularity; empirical claims rest on independent experiments
The paper introduces a framework for integrating feature importance into prototype-based explanations via 'alike parts' local highlighting and an augmented global selection objective. Its central claim—that the augmented process maintains or increases surrogate fidelity across six benchmarks—is presented as an empirical outcome rather than a derivation. No equations, fitted parameters, or self-citations reduce the reported results to quantities defined by the method itself; the experiments serve as external validation against benchmark datasets. The derivation chain is therefore self-contained and does not exhibit self-definitional, fitted-input, or self-citation load-bearing circularity.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel. Relation between the paper passage and the cited Recognition theorem: unclear. Paper passage: "augment the global prototype selection objective function with a feature importance term ... f(P) = Σ min (d(x_i, p_j) + β·fi(x_i, p_j))"
- IndisputableMonolith/Foundation/DimensionForcing.lean, alexander_duality_circle_linking. Relation between the paper passage and the cited Recognition theorem: unclear. Paper passage: "Experiments on six benchmark datasets show that this augmented selection process maintains or increases prediction fidelity"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.