pith. machine review for the scientific record.

arxiv: 2604.02532 · v1 · submitted 2026-04-02 · 💻 cs.CV · cs.AI · cs.LG


Feature Attribution Stability Suite: How Stable Are Post-Hoc Attributions?


Pith reviewed 2026-05-13 21:25 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · cs.LG
keywords feature attribution · stability evaluation · post-hoc explanations · image perturbations · Grad-CAM · prediction invariance

The pith

Feature attribution methods produce inconsistent explanations under geometric image changes even when the model prediction stays the same.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the Feature Attribution Stability Suite to test how stable post-hoc attribution methods remain when images receive small, realistic alterations. It applies prediction-invariance filtering so that only cases keeping the original prediction are scored, then measures stability with structural similarity, rank correlation, and top-k feature overlap. Tests across Integrated Gradients, GradientSHAP, Grad-CAM, and LIME on ImageNet, MS COCO, and CIFAR-10 show that geometric perturbations create far more instability than photometric or compression changes. Grad-CAM records the highest stability scores under these controlled conditions. Without the prediction filter, up to 99 percent of evaluated pairs involve changed outputs, which inflates apparent fragility.
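A minimal sketch of that prediction-invariance filter, assuming the criterion is top-1 agreement (the simulated rebuttal below describes it that way); the function name and tensor layout are illustrative, not the paper's code:

```python
import torch

@torch.no_grad()
def prediction_invariant_mask(model, originals, perturbed):
    """Boolean mask over (original, perturbed) pairs whose top-1 class agrees.

    originals, perturbed: (N, C, H, W) float tensors, identically ordered.
    Only pairs where the mask is True would go on to stability scoring.
    """
    model.eval()
    top1_orig = model(originals).argmax(dim=1)
    top1_pert = model(perturbed).argmax(dim=1)
    return top1_orig == top1_pert
```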

Core claim

FASS shows that attribution stability depends critically on perturbation family and on conditioning evaluations to preserve the original prediction. Geometric perturbations expose substantially greater attribution instability than photometric changes. Among the four methods tested, Grad-CAM achieves the highest stability across all three datasets and multiple architectures.

What carries the argument

The FASS benchmark, which enforces prediction-invariance filtering before scoring attribution stability via structural similarity, rank correlation, and top-k Jaccard overlap.
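A sketch of that three-metric decomposition on a pair of 2-D attribution maps, using off-the-shelf implementations (scikit-image's SSIM, SciPy's Spearman); the normalization and the default choice of k are assumptions, not values taken from the paper:

```python
import numpy as np
from scipy.stats import spearmanr
from skimage.metrics import structural_similarity as ssim

def stability_scores(attr_a, attr_b, k_frac=0.1):
    """Compare two attribution maps (2-D arrays over the same image).

    Returns (structural similarity, Spearman rank correlation, top-k Jaccard).
    k_frac: fraction of pixels treated as 'top' features (illustrative default).
    """
    a = attr_a.astype(np.float64)
    b = attr_b.astype(np.float64)
    drange = max(a.max() - a.min(), b.max() - b.min()) or 1.0
    s = ssim(a, b, data_range=drange)         # structural similarity
    rho, _ = spearmanr(a.ravel(), b.ravel())  # rank correlation over all pixels
    k = max(1, int(k_frac * a.size))
    top_a = set(np.argsort(a.ravel())[-k:])   # indices of the k largest values
    top_b = set(np.argsort(b.ravel())[-k:])
    jaccard = len(top_a & top_b) / len(top_a | top_b)  # top-k overlap
    return s, rho, jaccard
```

A stable method keeps all three near 1.0 for a prediction-preserving perturbation; the decomposition matters because a map can keep its global ranking (high Spearman) while its spatial structure degrades (low SSIM).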

If this is right

  • Geometric perturbations should be included in any robustness assessment of vision explanations.
  • Grad-CAM shows more consistent attributions than Integrated Gradients, GradientSHAP, or LIME under the tested conditions.
  • Stability numbers drop sharply when evaluations are not restricted to prediction-preserving cases.
  • Single-scalar stability scores miss important differences captured by the three-metric decomposition.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Safety-critical systems using these attributions may require additional checks such as cross-method agreement before acting on an explanation.
  • The filtering approach could be adapted to other modalities like text or audio to test whether similar instability patterns appear.
  • Adding domain-specific perturbations such as sensor noise or weather effects would make the suite closer to actual deployment conditions.

Load-bearing premise

The chosen geometric, photometric, and compression perturbations adequately represent the input variations that occur in safety-critical vision deployments.

What would settle it

Re-running the same protocol on a fresh collection of real camera-captured images or on perturbations outside the original families and checking whether Grad-CAM still ranks highest in stability.

Figures

Figures reproduced from arXiv: 2604.02532 by Jugal Gajjar, Kamalasankari Subramaniakuppusamy.

Figure 1. FASS evaluation pipeline. Each input image is paired with its perturbed counterpart. Only prediction-invariant pairs proceed.

Figure 2. Attribution stability barplots across all three datasets.

Figure 1 (appendix). IG stability on CIFAR-10.

B.2 GradientSHAP (CIFAR-10)
Architecture   Rot.    Trans.†  Bright.†  Noise   JPEG†
ResNet-50      0.423   0.501    0.425     0.393   0.468
DenseNet-121   0.445   0.491    0.479     0.384   0.520
ConvNeXt-T     0.457   0.536    0.563     0.441   0.583
ViT-B/16       0.442   0.517    0.523     0.463   0.588

Figure 2 (appendix). GradientSHAP stability on CIFAR-10.

B.3 Grad-CAM (CIFAR-10)
Architecture   Rot.    Trans.†  Bright.†  Noise   JPEG†
ResNet-50      0.450   0.529    0.643     0.449   0.703
DenseNet-121   0.535   0.523    0.735     0.553   0.760
ConvNeXt-T     0.548   0.834    0.796     0.645   0.774
ViT-B/16       0.539   0.667    0.744     0.598   0.722

Figure 3 (appendix). Grad-CAM stability on CIFAR-10.

B.4 LIME (CIFAR-10)
Architecture   Rot.    Trans.†  Bright.†  Noise   JPEG†
ResNet-50      0.282   0.346    0.346     0.284   0.350
DenseNet-121   0.279   0.339    0.338     0.273   0.344
ConvNeXt-T     0.295   0.362    0.356     0.288   0.364
ViT-B/16       0.280   0.368    0.357     0.287   0.437

Figure 5 (appendix). IG stability on ImageNet.

C.2 GradientSHAP (ImageNet)
Architecture   Rot.    Trans.†  Bright.†  Noise   JPEG†
ResNet-50      0.470   0.421    0.472     0.572   0.492
DenseNet-121   0.465   0.412    0.468     0.572   0.487
ConvNeXt-T     0.484   0.438    0.501     0.568   0.519
ViT-B/16       0.490   0.443    0.512     0.626   0.531

Figure 6 (appendix). GradientSHAP stability on ImageNet.

C.3 Grad-CAM (ImageNet)
Architecture   Rot.    Trans.†  Bright.†  Noise   JPEG†
ResNet-50      0.726   0.671    0.821     0.877   0.853
DenseNet-121   0.761   0.703    0.873     0.915   0.889
ConvNeXt-T     0.779   0.724    0.869     0.897   0.878
ViT-B/16       0.762   0.712    0.842     0.822   0.852

Figure 7 (appendix). Grad-CAM stability on ImageNet.

C.4 LIME (ImageNet)
Architecture   Rot.    Trans.†  Bright.†  Noise   JPEG†
ResNet-50      0.410   0.340    0.390     0.470   0.430
DenseNet-121   0.370   0.360    0.410     0.490   0.450
ConvNeXt-T     0.390   0.350    0.430     0.470   0.440
ViT-B/16       0.400   0.380    0.440     0.570   0.460

Figure 9 (appendix). IG stability on COCO.

D.2 GradientSHAP (COCO)
Architecture   Rot.    Trans.†  Bright.†  Noise   JPEG
ResNet-50      0.445   0.364    0.425     0.533   0.391
DenseNet-121   0.437   0.339    0.442     0.519   0.367†
ConvNeXt-T     0.475   0.393    0.506     0.568   0.411†
ViT-B/16       0.481   0.404    0.515     0.647   0.494

Figure 10 (appendix). GradientSHAP stability on COCO.

D.3 Grad-CAM (COCO)
Architecture   Rot.    Trans.†  Bright.†  Noise   JPEG
ResNet-50      0.645   0.609    0.676     0.825   0.662
DenseNet-121   0.661   0.595    0.824     0.829   0.707†
ConvNeXt-T     0.692   0.545    0.806     0.838   0.582†
ViT-B/16       0.752   0.716    0.637     0.849   0.658

Figure 11 (appendix). Grad-CAM stability on COCO.

D.4 LIME (COCO)
Architecture   Rot.    Trans.†  Bright.†  Noise   JPEG
ResNet-50      0.427   0.489    0.529     0.415   0.512
DenseNet-121   0.427   0.466    0.526     0.432   0.495†
ConvNeXt-T     0.448   0.492    0.556     0.422   0.476†
ViT-B/16       0.429   0.445    0.517     0.413   0.489

Figure 12 (appendix). LIME stability on COCO.

Summary. COCO displays intermediate stability between CIFAR-10 and ImageNet. Grad-CAM remains the most stable method, though the gap with IG narrows relative to ImageNet. LIME exhibits stronger performance on COCO than on CIFAR-10, consistent with the hypothesis that multi-object scenes constrain perturbation-based sampling variability. Architectural differences are more pronounced…
original abstract

Post-hoc feature attribution methods are widely deployed in safety-critical vision systems, yet their stability under realistic input perturbations remains poorly characterized. Existing metrics evaluate explanations primarily under additive noise, collapse stability to a single scalar, and fail to condition on prediction preservation, conflating explanation fragility with model sensitivity. We introduce the Feature Attribution Stability Suite (FASS), a benchmark that enforces prediction-invariance filtering, decomposes stability into three complementary metrics (structural similarity, rank correlation, and top-k Jaccard overlap), and evaluates across geometric, photometric, and compression perturbations. Evaluating four attribution methods (Integrated Gradients, GradientSHAP, Grad-CAM, LIME) across four architectures and three datasets (ImageNet-1K, MS COCO, and CIFAR-10), FASS shows that stability estimates depend critically on perturbation family and prediction-invariance filtering. Geometric perturbations expose substantially greater attribution instability than photometric changes, and without conditioning on prediction preservation, up to 99% of evaluated pairs involve changed predictions. Under this controlled evaluation, we observe consistent method-level trends, with Grad-CAM achieving the highest stability across datasets.

Editorial analysis

A structured set of objections, weighed in public.

A referee report, a simulated authors' rebuttal, a circularity check, and an axiom and free-parameter ledger. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the Feature Attribution Stability Suite (FASS), a benchmark for post-hoc attribution methods in computer vision. It enforces prediction-invariance filtering and uses three metrics—structural similarity, rank correlation, and top-k Jaccard overlap—to evaluate stability under geometric, photometric, and compression perturbations. The evaluation covers Integrated Gradients, GradientSHAP, Grad-CAM, and LIME across four architectures on ImageNet-1K, MS COCO, and CIFAR-10. The paper claims that stability depends on perturbation family and filtering, with geometric perturbations causing more instability, up to 99% prediction changes without filtering, and Grad-CAM showing the highest stability.

Significance. This benchmark addresses a gap in evaluating attribution stability by separating it from model sensitivity through invariance filtering. If the findings hold, they provide actionable insights for choosing attribution methods in safety-critical systems and underscore the limitations of unfiltered evaluations. The cross-dataset consistency of method rankings adds credibility to the recommendation of Grad-CAM for stable attributions.

major comments (2)
  1. The headline result that 'without conditioning on prediction preservation, up to 99% of evaluated pairs involve changed predictions' is central to arguing for the filtering step; however, the precise criterion for 'changed predictions' (e.g., whether it is top-1 class flip or a probability drop below a threshold) and the exact filtering implementation are not detailed enough to verify this percentage or assess its sensitivity to hyperparameters.
  2. The claim that geometric perturbations expose substantially greater attribution instability than photometric changes relies on the selected transforms being representative of realistic input variations. The manuscript should specify the exact parameter ranges (e.g., rotation angles, translation pixels, compression quality levels) and provide a justification or ablation showing why these families adequately sample the distribution of variations in safety-critical deployments, as disproportionate prediction changes in geometric cases (signaled by the 99% figure) could bias the filtered subset.
minor comments (2)
  1. The abstract states evaluation across four architectures but does not name them; listing the specific models (e.g., ResNet, ViT) in the main text would improve clarity.
  2. The three similarity metrics are introduced without explicit formulas or references to their standard definitions; adding equations for structural similarity (e.g., SSIM) and top-k Jaccard would aid reproducibility.
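For reference, the standard definitions the referee is pointing at; whether FASS instantiates them in exactly this form is an assumption. SSIM (Wang et al., 2004) on two signals $x, y$; Spearman rank correlation over $n$ attribution values with rank differences $d_i$ (tie-free form); and Jaccard overlap of the top-k feature sets $T_k(\cdot)$:

```latex
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}
       {(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}
\qquad
\rho = 1 - \frac{6\sum_i d_i^2}{n(n^2 - 1)}
\qquad
J_k(a, a') = \frac{|T_k(a) \cap T_k(a')|}{|T_k(a) \cup T_k(a')|}
```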

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive feedback on our manuscript. We address each of the major comments below, providing clarifications and committing to revisions where appropriate to improve the clarity and rigor of the paper.

point-by-point responses
  1. Referee: The headline result that 'without conditioning on prediction preservation, up to 99% of evaluated pairs involve changed predictions' is central to arguing for the filtering step; however, the precise criterion for 'changed predictions' (e.g., whether it is top-1 class flip or a probability drop below a threshold) and the exact filtering implementation are not detailed enough to verify this percentage or assess its sensitivity to hyperparameters.

    Authors: We agree that the precise definition of 'changed predictions' requires more detail for reproducibility. In our implementation, a prediction is deemed changed if the argmax (top-1 class) differs between the original and perturbed input. The filtering step retains only those perturbation pairs where the top-1 prediction is preserved. We will revise the manuscript to include a clear description of this criterion, along with pseudocode for the filtering process and an analysis of sensitivity to alternative definitions such as top-5 agreement or probability thresholds. revision: yes

  2. Referee: The claim that geometric perturbations expose substantially greater attribution instability than photometric changes relies on the selected transforms being representative of realistic input variations. The manuscript should specify the exact parameter ranges (e.g., rotation angles, translation pixels, compression quality levels) and provide a justification or ablation showing why these families adequately sample the distribution of variations in safety-critical deployments, as disproportionate prediction changes in geometric cases (signaled by the 99% figure) could bias the filtered subset.

    Authors: We acknowledge the importance of specifying the perturbation parameters and justifying their choice. The revised manuscript will detail the exact ranges: geometric perturbations consist of rotations uniformly sampled from [-15°, 15°], translations up to 10% of image dimensions, and scaling factors from 0.9 to 1.1; photometric perturbations include brightness and contrast adjustments within ±0.2; compression uses JPEG quality from 50 to 95. These ranges are motivated by standard data augmentation practices in computer vision robustness benchmarks (e.g., ImageNet-C). We will add an ablation study examining how varying these ranges affects the percentage of prediction changes and stability metrics, to address potential bias in the filtered subset and better support applicability to safety-critical settings. revision: yes
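A sketch of the three perturbation families under the ranges quoted in the response above, built on torchvision's functional transforms; the sampling scheme and the JPEG round-trip are assumptions about how FASS might realize them, not the paper's implementation:

```python
import io
import random
from PIL import Image
from torchvision.transforms import functional as F

def jpeg_roundtrip(img: Image.Image, quality: int) -> Image.Image:
    """Encode and decode a PIL image as JPEG at the given quality level."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).convert("RGB")

def sample_perturbations(img: Image.Image) -> dict:
    """One random draw from each family, using the ranges stated above."""
    w, h = img.size
    return {
        # geometric: rotation in [-15, 15] degrees, translation up to 10%
        # of image dimensions, scaling in [0.9, 1.1]
        "rotation": F.rotate(img, random.uniform(-15.0, 15.0)),
        "translation": F.affine(
            img, angle=0.0, scale=1.0, shear=0.0,
            translate=(int(random.uniform(-0.1, 0.1) * w),
                       int(random.uniform(-0.1, 0.1) * h))),
        "scaling": F.affine(img, angle=0.0, translate=(0, 0), shear=0.0,
                            scale=random.uniform(0.9, 1.1)),
        # photometric: brightness / contrast factors within ±0.2 of identity
        "brightness": F.adjust_brightness(img, 1.0 + random.uniform(-0.2, 0.2)),
        "contrast": F.adjust_contrast(img, 1.0 + random.uniform(-0.2, 0.2)),
        # compression: JPEG quality sampled uniformly from [50, 95]
        "jpeg": jpeg_roundtrip(img, random.randint(50, 95)),
    }
```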

Circularity Check

0 steps flagged

No circularity: empirical benchmark with independent measurements

full rationale

The paper introduces the FASS benchmark, explicitly defines its three metrics (structural similarity, rank correlation, top-k Jaccard) and perturbation families as design choices, applies prediction-invariance filtering as a stated protocol, and reports measured stability values across methods and datasets. No central claim reduces, through the paper's own equations or self-citations, to a quantity fitted inside the study; the observed trends (Grad-CAM highest stability, geometric perturbations showing greater instability) are direct empirical outputs rather than presupposed by the evaluation setup itself. The work is self-contained and does not lean on external benchmarks for its claims.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the domain assumption that the chosen perturbation families are representative and that the three metrics adequately quantify stability; no free parameters or new invented entities are introduced.

axioms (1)
  • domain assumption: The selected geometric, photometric, and compression perturbations represent realistic input variations in safety-critical vision systems.
    Invoked when the authors state that stability estimates depend critically on perturbation family.

pith-pipeline@v0.9.0 · 5503 in / 1293 out tokens · 38515 ms · 2026-05-13T21:25:56.446891+00:00 · methodology


