DLIME: A Deterministic Local Interpretable Model-Agnostic Explanations Approach for Computer-Aided Diagnosis Systems
Pith reviewed 2026-05-25 17:08 UTC · model grok-4.3
The pith
DLIME replaces LIME's random perturbations with hierarchical clustering and KNN to produce stable explanations for black-box models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DLIME produces explanations by applying agglomerative hierarchical clustering to the entire training set, selecting the relevant cluster for a given instance via KNN, and training a linear model on the selected cluster to obtain feature attributions. This deterministic neighborhood selection yields more stable explanations than standard LIME, as measured by higher Jaccard similarity across multiple runs on the same instances from three medical datasets.
What carries the argument
Agglomerative hierarchical clustering followed by KNN cluster selection, which supplies a fixed local neighborhood for the linear surrogate instead of random perturbation.
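The three steps named above can be sketched in a few lines of scikit-learn. This is a minimal illustration and not the authors' implementation: the function name `dlime_explain`, the choice of two clusters, one nearest neighbor, default Ward linkage, the synthetic data, and the stand-in black box are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier


def dlime_explain(X_train, black_box_predict, x, n_clusters=2, n_neighbors=1):
    """Return linear-surrogate coefficients for instance x (illustrative sketch)."""
    # Step 1: agglomerative hierarchical clustering over the whole training set.
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(X_train)
    # Step 2: KNN trained on the cluster labels picks the cluster for x.
    knn = KNeighborsClassifier(n_neighbors=n_neighbors).fit(X_train, labels)
    cluster_id = knn.predict(x.reshape(1, -1))[0]
    # Step 3: fit a linear model to the black-box outputs on that cluster only.
    members = X_train[labels == cluster_id]
    surrogate = LinearRegression().fit(members, black_box_predict(members))
    return surrogate.coef_


# Toy usage: synthetic data and a hypothetical black box, not a medical dataset.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 4))
black_box_predict = lambda X: 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - X[:, 1])))
x = X_train[0]

first = dlime_explain(X_train, black_box_predict, x)
second = dlime_explain(X_train, black_box_predict, x)
```

Because no step draws random samples, `first` and `second` are computed from identical inputs, which is the property the stability claim rests on.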
If this is right
- Explanations for any given instance remain identical across repeated queries.
- Feature attributions exhibit greater consistency than those produced by LIME on the same medical data.
- The approach can be applied to any existing training set without requiring additional random sampling steps.
- Linear models fitted on the selected cluster serve as stable local approximations to the original black-box model.
Where Pith is reading between the lines
- The deterministic clustering step could be reused across multiple instances that fall in the same cluster, reducing computation for batches of explanations.
- If the hierarchical clusters align with regions where the black-box model behaves linearly, DLIME might also improve the fidelity of the surrogate beyond mere stability.
- The method could be tested on non-medical tabular datasets to check whether the stability gain generalizes when data distributions differ from the medical cases examined.
Load-bearing premise
The cluster chosen by KNN after hierarchical clustering on the full training set is local enough for the linear model to approximate the black-box behavior and representative enough to avoid systematic bias in the attributions.
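One way to probe this premise is to score how closely the linear surrogate tracks the black box on the selected cluster. The sketch below uses the coefficient of determination (R²) as that score; the choice of R² as the fidelity measure, the helper name `r_squared`, and the sample values are assumptions for illustration and are not taken from the paper.

```python
def r_squared(black_box_outputs, surrogate_outputs):
    """R^2 of surrogate predictions against black-box outputs on one cluster."""
    mean = sum(black_box_outputs) / len(black_box_outputs)
    ss_tot = sum((y - mean) ** 2 for y in black_box_outputs)
    if ss_tot == 0:
        raise ValueError("black-box outputs are constant on this cluster")
    ss_res = sum((y - s) ** 2 for y, s in zip(black_box_outputs, surrogate_outputs))
    return 1.0 - ss_res / ss_tot


# Hypothetical black-box probabilities for four cluster members.
black_box = [0.1, 0.4, 0.6, 0.9]

# A surrogate that reproduces the black box exactly scores 1.
perfect = r_squared(black_box, black_box)

# A surrogate that always predicts the mean scores 0.
mean_value = sum(black_box) / len(black_box)
baseline = r_squared(black_box, [mean_value] * len(black_box))
```

A score near 1 on the selected cluster would support the premise for that instance; a low score would suggest the cluster is too broad for a linear fit.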
What would settle it
Multiple runs of DLIME and LIME on identical medical instances where the Jaccard similarity of DLIME explanations is not higher than that of LIME explanations.
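The comparison described above can be made concrete with a small Jaccard computation over the feature sets returned by repeated runs. This is a sketch under assumptions: averaging over all pairs of runs is one plausible aggregation (the page does not say how the paper aggregates), and the feature names and run outputs are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two feature sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(explanations):
    """Average Jaccard similarity over every pair of repeated explanations."""
    pairs = list(combinations(explanations, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical top-3 feature sets from three repeated runs on one instance.
deterministic_runs = [{"age", "bmi", "glucose"}] * 3
randomized_runs = [
    {"age", "bmi", "glucose"},
    {"age", "bmi", "insulin"},
    {"age", "glucose", "insulin"},
]

stable_score = mean_pairwise_jaccard(deterministic_runs)   # 1.0
unstable_score = mean_pairwise_jaccard(randomized_runs)    # 0.5
```

A deterministic method returns the same set every time and scores 1.0; the claim would be undermined if LIME matched that score on the same instances.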
Original abstract
Local Interpretable Model-Agnostic Explanations (LIME) is a popular technique used to increase the interpretability and explainability of black box Machine Learning (ML) algorithms. LIME typically generates an explanation for a single prediction by any ML model by learning a simpler interpretable model (e.g. linear classifier) around the prediction through generating simulated data around the instance by random perturbation, and obtaining feature importance through applying some form of feature selection. While LIME and similar local algorithms have gained popularity due to their simplicity, the random perturbation and feature selection methods result in "instability" in the generated explanations, where for the same prediction, different explanations can be generated. This is a critical issue that can prevent deployment of LIME in a Computer-Aided Diagnosis (CAD) system, where stability is of utmost importance to earn the trust of medical professionals. In this paper, we propose a deterministic version of LIME. Instead of random perturbation, we utilize agglomerative Hierarchical Clustering (HC) to group the training data together and K-Nearest Neighbour (KNN) to select the relevant cluster of the new instance that is being explained. After finding the relevant cluster, a linear model is trained over the selected cluster to generate the explanations. Experimental results on three different medical datasets show the superiority for Deterministic Local Interpretable Model-Agnostic Explanations (DLIME), where we quantitatively determine the stability of DLIME compared to LIME utilizing the Jaccard similarity among multiple generated explanations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes DLIME, a deterministic variant of LIME for explaining black-box ML predictions in CAD systems. It replaces LIME's random perturbation sampling with agglomerative hierarchical clustering on the full training set followed by KNN to select a cluster, then fits a linear surrogate on that cluster. The central claim is that this yields more stable explanations than LIME, quantitatively demonstrated via higher Jaccard similarity among repeated explanations on three medical datasets.
Significance. If the selected clusters remain sufficiently local and the linear surrogates retain comparable fidelity, DLIME would address a practical barrier to deploying explanation methods in medicine by removing randomness while preserving locality. The deterministic construction and use of real training points (rather than synthetic perturbations) are strengths that could be valuable if locality is verified.
major comments (2)
- [Method] Method section (description of HC+KNN procedure): agglomerative clustering is performed on the entire training set with no distance threshold or locality constraint, after which KNN assigns the instance to one cluster. Nothing enforces that the chosen cluster lies inside a small ball around the instance or that its points lie on the same side of the black-box decision boundary. This directly risks the linear model reflecting global rather than local behavior, so higher Jaccard similarity would not establish superiority as a local explanation method.
- [Experiments] Experiments section (stability evaluation): Jaccard similarity is reported as the stability metric, but no accompanying analysis (e.g., average cluster diameter, distance from instance to cluster centroid, or fidelity comparison) is provided to confirm that the DLIME neighborhoods are local. Without such checks, the quantitative superiority claim cannot be interpreted as evidence of improved local explanations.
minor comments (1)
- [Abstract] The three medical datasets are referred to only generically; naming them and providing basic statistics (size, dimensionality, class balance) would improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We respond point-by-point to the major comments below.
- Referee: [Method] Method section (description of HC+KNN procedure): agglomerative clustering is performed on the entire training set with no distance threshold or locality constraint, after which KNN assigns the instance to one cluster. Nothing enforces that the chosen cluster lies inside a small ball around the instance or that its points lie on the same side of the black-box decision boundary. This directly risks the linear model reflecting global rather than local behavior, so higher Jaccard similarity would not establish superiority as a local explanation method.
  Authors: The referee correctly notes that agglomerative clustering occurs on the full training set without an explicit distance threshold or boundary-side constraint. KNN selects the cluster containing the nearest points to the explained instance, which the method relies upon for locality. We acknowledge this does not strictly guarantee a small ball or same-side points and therefore does not preclude global behavior in some cases. We will revise the method section to clarify this design choice and add an explicit limitations paragraph discussing the risk.
  Revision: yes
- Referee: [Experiments] Experiments section (stability evaluation): Jaccard similarity is reported as the stability metric, but no accompanying analysis (e.g., average cluster diameter, distance from instance to cluster centroid, or fidelity comparison) is provided to confirm that the DLIME neighborhoods are local. Without such checks, the quantitative superiority claim cannot be interpreted as evidence of improved local explanations.
  Authors: We agree that metrics such as average cluster diameter, instance-to-centroid distance, and fidelity would help substantiate that the selected clusters function as local neighborhoods. The submitted experiments focus solely on stability via Jaccard similarity. In revision we will add these locality diagnostics computed on the three medical datasets to allow readers to assess whether the stability gains occur within demonstrably local regions.
  Revision: yes
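The two geometric diagnostics named in this exchange, cluster diameter and instance-to-centroid distance, are simple to compute. The sketch below assumes Euclidean distance and defines diameter as the largest pairwise distance within the cluster; both choices, the helper names, and the toy points are assumptions for illustration.

```python
from itertools import combinations
from math import dist


def cluster_diameter(points):
    """Largest Euclidean distance between any two points in the cluster."""
    return max((dist(p, q) for p, q in combinations(points, 2)), default=0.0)


def centroid(points):
    """Coordinate-wise mean of the cluster members."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def distance_to_centroid(x, points):
    """Euclidean distance from the explained instance to the cluster centroid."""
    return dist(x, centroid(points))


# Hypothetical 2-D cluster forming a 3-by-4 rectangle, with x at its center.
cluster = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
x = (1.5, 2.0)

diameter = cluster_diameter(cluster)        # 5.0, the rectangle's diagonal
offset = distance_to_centroid(x, cluster)   # 0.0, x sits on the centroid
```

Reporting these values next to the Jaccard scores would let a reader judge whether a stable explanation also comes from a tight neighborhood around the instance.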
Circularity Check
No significant circularity detected
The paper introduces DLIME by replacing LIME's random perturbation sampling with agglomerative hierarchical clustering on the full training set followed by KNN cluster assignment for a query instance, then fitting a linear surrogate on the selected cluster. Stability is assessed via an external Jaccard similarity metric computed across multiple generated explanations for both methods. No step reduces a claimed result to a quantity defined in terms of itself, a fitted parameter renamed as a prediction, or a self-citation chain; the determinism is an explicit procedural change whose consistency effect is measured independently rather than assumed by construction. The derivation remains self-contained against the stated experimental comparison.