UbiQVision: Quantifying Uncertainty in XAI for Image Recognition

Akshat Dubey; Aleksandar Anžel; Bahar İlgen; Georges Hattab

arxiv: 2512.20288 · v2 · submitted 2025-12-23 · 💻 cs.CV · cs.AI

Pith reviewed 2026-05-16 20:07 UTC · model grok-4.3

classification 💻 cs.CV cs.AI

keywords: uncertainty quantification · SHAP explanations · medical imaging · XAI · Dempster-Shafer theory · Dirichlet sampling · image recognition · epistemic uncertainty

The pith

Dirichlet posterior sampling and Dempster-Shafer theory can quantify instability in SHAP explanations for medical image classifiers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a method to measure how unreliable SHAP explanations become when models face uncertainty in medical imaging. It combines Dirichlet sampling to model posterior distributions with Dempster-Shafer theory to assign belief and plausibility values to explanation features. This matters because doctors rely on these explanations to trust AI predictions, yet unstable ones can lead to wrong decisions. The approach generates fusion maps that highlight uncertain regions in the explanation. Evaluation on pathology, ophthalmology, and radiology datasets shows it can track uncertainty arising from image noise and varying resolutions.

Core claim

The central discovery is a framework called UbiQVision that uses Dirichlet posterior sampling to capture epistemic and aleatoric uncertainty in SHAP values, then applies Dempster-Shafer theory to compute belief maps, plausibility maps, and fusion maps, providing a quantitative measure of explanation uncertainty without requiring ground-truth labels.
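
The abstract does not give the sampling procedure, so the following is only a minimal sketch of what Dirichlet posterior sampling over an ensemble of SHAP maps could look like. It assumes that fusion is a weighted sum of per-model attribution maps and that the weights are drawn from a Dirichlet distribution; the function name, the concentration values, and the toy maps are all hypothetical.

```python
import numpy as np

def dirichlet_shap_samples(shap_maps, alpha, n_samples=500, seed=0):
    """Fuse per-model SHAP maps with weights drawn from Dirichlet(alpha).

    shap_maps: array of shape (K, H, W), one attribution map per model.
    alpha:     array of shape (K,), concentration parameters (assumed values).
    Returns an array of shape (n_samples, H, W) of weighted-sum maps.
    """
    rng = np.random.default_rng(seed)
    shap_maps = np.asarray(shap_maps, dtype=float)
    # Each row of `weights` is one draw of ensemble weights and sums to 1.
    weights = rng.dirichlet(alpha, size=n_samples)            # (n_samples, K)
    return np.tensordot(weights, shap_maps, axes=([1], [0]))  # (n_samples, H, W)

# Toy stand-in for three models' SHAP maps on a 4x4 image (not real data).
toy_maps = np.random.default_rng(1).normal(size=(3, 4, 4))
samples = dirichlet_shap_samples(toy_maps, alpha=np.array([5.0, 3.0, 2.0]))

mean_map = samples.mean(axis=0)   # central fused explanation
spread_map = samples.std(axis=0)  # pixel-wise spread induced by weight uncertainty
```

Under these assumptions, pixels where the models disagree show a larger `spread_map` value, because the fused attribution there depends strongly on which weights are drawn.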

What carries the argument

Dirichlet posterior sampling fused with Dempster-Shafer belief and plausibility functions to produce uncertainty maps from SHAP explanations.
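
For readers less familiar with Dempster-Shafer theory, the standard textbook definitions of belief and plausibility are reproduced here. How the paper instantiates the mass function $m$ per pixel is not stated in the abstract, so only the general form is given, for a frame of discernment $\Theta$.

```latex
\begin{align}
\sum_{A \subseteq \Theta} m(A) &= 1, \qquad m(\emptyset) = 0, \\
\mathrm{Bel}(A) &= \sum_{B \subseteq A} m(B), \\
\mathrm{Pl}(A) &= \sum_{B \cap A \neq \emptyset} m(B) \;=\; 1 - \mathrm{Bel}(\bar{A}).
\end{align}
```

Since $\mathrm{Bel}(A) \le \mathrm{Pl}(A)$ always holds, the width of the interval $[\mathrm{Bel}(A), \mathrm{Pl}(A)]$ is commonly read as the amount of evidence that is compatible with $A$ but does not commit to it.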

If this is right

Clinicians can identify which parts of a SHAP explanation are trustworthy in noisy medical scans.
The method allows statistical comparison of uncertainty levels across different imaging modalities.
It provides a way to fuse multiple uncertain explanations into a single reliable visualization.
Models with high uncertainty in explanations can be flagged for further review before clinical use.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This could extend to other XAI methods beyond SHAP, such as LIME, by applying the same sampling and fusion process.
In non-medical domains like autonomous driving, it might help quantify trust in visual explanations under sensor noise.
Future work could test if these uncertainty scores correlate with actual model error rates on held-out test sets.

Load-bearing premise

The assumption that Dirichlet sampling combined with Dempster-Shafer theory yields a faithful measure of SHAP instability without adding new biases or needing separate validation data.

What would settle it

Run the framework on a dataset where SHAP explanations are known to be stable, such as synthetic images with no noise, and check if the uncertainty scores remain near zero.
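
A minimal sketch of this sanity check follows. It assumes, hypothetically, that the uncertainty score is the pixel-wise spread of a Dirichlet-weighted sum of per-model SHAP maps; the paper's actual score may differ. When every model produces the same map, any weights that sum to 1 yield the same fused map, so the spread should vanish.

```python
import numpy as np

def fused_spread(shap_maps, alpha, n_samples=200, seed=0):
    """Pixel-wise std of Dirichlet-weighted sums of per-model maps (assumed fusion)."""
    rng = np.random.default_rng(seed)
    weights = rng.dirichlet(alpha, size=n_samples)  # (n_samples, K)
    fused = np.einsum("nk,khw->nhw", weights, np.asarray(shap_maps, dtype=float))
    return fused.std(axis=0)

base = np.linspace(-1.0, 1.0, 16).reshape(4, 4)  # synthetic attribution map
stable = np.stack([base, base, base])            # three models agree exactly
unstable = np.stack([base, -base, 0.5 * base])   # three models disagree

alpha = np.ones(3)  # flat Dirichlet, chosen only for illustration
stable_spread = fused_spread(stable, alpha)
unstable_spread = fused_spread(unstable, alpha)
```

A score that stays clearly above zero on the `stable` case would point to uncertainty introduced by the method itself rather than by the explanations.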

Figures

Figures reproduced from arXiv: 2512.20288 by Akshat Dubey, Aleksandar Anžel, Bahar İlgen, Georges Hattab.

**Figure 1.** On the left, the plot demonstrates the impact of the temperature parameter (T) on the model weights for Models A, B, and C, as evaluated on the test dataset using a specific metric on a logarithmic scale. A low T value means that a model with even a slightly higher F1 score receives the highest weight, while a high T value means that the models will be allotted equal weight, irrespective of their performan…
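
The behaviour described in this caption matches a temperature-scaled softmax over model scores. The sketch below assumes that form; the truncated caption does not confirm the exact formula, and the F1 values are hypothetical.

```python
import numpy as np

def temperature_weights(scores, temperature):
    """Softmax over scores / T (an assumed form of the weighting in Figure 1)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    z = np.asarray(scores, dtype=float) / temperature
    z -= z.max()  # subtract the maximum for numerical stability
    w = np.exp(z)
    return w / w.sum()

f1_scores = [0.92, 0.90, 0.89]  # hypothetical F1 scores for Models A, B, C

sharp = temperature_weights(f1_scores, 0.001)  # low T: the best model dominates
flat = temperature_weights(f1_scores, 100.0)   # high T: weights approach 1/3 each
```

With these inputs the low-temperature weights concentrate almost entirely on Model A, while the high-temperature weights are nearly uniform, which is the qualitative pattern the caption describes.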

**Figure 2.** The visualization shows the SHAP values for the two classes of the malaria dataset from the predictions of three models: a custom CNN model, a ResNet model, and a ViT model, with weights of 0.37, 0.32, and 0.31, respectively. The "uninfected" class receives positive attribution (𝜙 > 0) from the models for the erythrocyte’s interior regions with low-frequency spatial variation. Specifically, the smooth, hom…

**Figure 3.** This figure shows the fusion of SHAP explanations from the weighted model ensemble (Custom CNN, ResNet, and ViT) for a malaria-infected and uninfected sample. The figure (a) is of the parasitized sample. The belief mass (support) map shows a concentrated area of high belief (dark green), which precisely localizes the parasite. This indicates that the models found strong, consistent evidence at this locatio…

**Figure 4.** Figure (a) shows the distribution of evidence and model confidence for the parasitized sample. It presents the statistical analysis of the fusion process for the infected erythrocyte. The left panel shows the kernel density estimation (KDE) of pixel-wise mass values. The belief mass (green) is heavily skewed toward zero, but it has a noticeable tail that extends into higher values. This statistically confi…

**Figure 5.** This visualization shows the SHAP attribution maps (𝜙) for each dementia stage. It details the additive feature attribution scores for the ensemble members, where the color intensity corresponds to the impact on the model’s log-odds output. Red pixels denote positive SHAP values (𝜙 > 0), indicating morphological regions, such as enlarged ventricles or cortical atrophy, that drive classification toward a …

**Figure 6.** This figure illustrates the correlation between feature distinctness and model confidence by presenting the pixel-wise fusion of SHAP explanations from the weighted model ensemble (Custom CNN, ResNet, and ViT) across four stages of dementia. (a) Mild Dementia: The belief mass (support) map shows localized clusters of evidence (green) that correspond to emerging pathological features. The uncertainty map sh…

**Figure 7.** This figure shows the quantitative analytics of the fusion engine by contrasting the statistical evidence distribution on the left with the assigned Bayesian model confidence on the right. The bar charts confirm that Custom CNN (𝑤 = 0.362) has the greatest influence on the ensemble, followed closely by ResNet and ViT. This indicates a preference for local texture features over global dependencies. The kern…

**Figure 8.** This figure illustrates the fused SHAP explanations and quantitative analytics for the diabetic retinopathy (DR) classification ensemble at different severity levels. The attribution maps in the top rows visualize the pixel-wise contribution of each model (Custom CNN, ResNet, and ViT). Red indicates positive evidence for the target class, and blue indicates suppression. In advanced stages, such as Prolifer…

**Figure 9.** This figure illustrates the Dempster-Shafer fusion of model explanations for classifying diabetic retinopathy (DR), tracking the evolution of evidential support across five distinct severity levels. (a) Healthy: The fusion maps for healthy retinas exhibit minimal belief mass (pale/empty) and high, uniform uncertainty (bright yellow). This indicates that the ensemble’s decision is driven by the absence of p…

**Figure 10.** This figure shows the fusion maps of SHAP explanations produced by a Bayesian-weighted model ensemble (Custom CNN, ResNet, and ViT) at different levels of diabetic retinopathy (DR) severity. The individual attribution maps demonstrate that advanced disease states, such as severe and proliferative DR, result in dense, widespread positive contributions (red pixels) across the various architectures. In contr…

Original abstract

Recent advances in deep learning have led to its widespread adoption across diverse domains, including medical imaging. This progress is driven by increasingly sophisticated model architectures, such as ResNets, Vision Transformers, and Hybrid Convolutional Neural Networks, that offer enhanced performance at the cost of greater complexity. This complexity often compromises model explainability and interpretability. SHAP has emerged as a prominent method for providing interpretable visualizations that aid domain experts in understanding model predictions. However, SHAP explanations can be unstable and unreliable in the presence of epistemic and aleatoric uncertainty. In this study, we address this challenge by using Dirichlet posterior sampling and Dempster-Shafer theory to quantify the uncertainty that arises from these unstable explanations in medical imaging applications. The framework uses a belief, plausible, and fusion map approach alongside statistical quantitative analysis to produce quantification of uncertainty in SHAP. Furthermore, we evaluated our framework on three medical imaging datasets with varying class distributions, image qualities, and modality types which introduces noise due to varying image resolutions and modality-specific aspect covering the examples from pathology, ophthalmology, and radiology, introducing significant epistemic uncertainty.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper sketches a way to add uncertainty maps to SHAP explanations in medical imaging by combining Dirichlet sampling and Dempster-Shafer theory, but the abstract shows no equations, results, or validation that the maps track actual explanation instability.

The main point is a framework that fuses Dirichlet posterior sampling with Dempster-Shafer theory to produce belief, plausibility, and fusion maps for quantifying uncertainty in SHAP outputs on medical images. It targets the known instability of explanations from models like ResNets and Vision Transformers when applied to noisy or varying medical data.

The evaluation covers three datasets spanning pathology, ophthalmology, and radiology, which at least tries to cover differences in resolution, modality, and class balance. That spread is a reasonable choice for showing the method is not tied to one narrow setting. The specific pairing of those two techniques with SHAP is the clearest new element, even if both are established tools elsewhere. The paper deserves credit for focusing on a practical pain point in high-stakes imaging where unreliable explanations matter.

The soft spots are clear and central. The abstract states the intended steps but supplies no equations, sampling details, or numerical outcomes, so there is no way to check the implementation or see effect sizes. More critically, nothing in the provided text tests whether the resulting maps actually correlate with observable SHAP variance across seeds, perturbations, or repeated runs on the same image. Without that check the maps could simply reflect modeling choices rather than the targeted epistemic instability. This concern matters because the central claim rests on faithful quantification, yet no direct evidence for it is shown.

This is for readers already working on XAI extensions in medical imaging who want ideas for adding reliability layers. It shows honest engagement with the instability problem but needs the missing experiments and comparisons before it can be taken as a working solution. I would send it to peer review so the authors can supply the validation and tighten the claims.

Referee Report

2 major / 2 minor

Summary. The paper introduces UbiQVision, a framework that applies Dirichlet posterior sampling combined with Dempster-Shafer theory to quantify uncertainty arising from unstable SHAP explanations in deep learning models for medical image recognition. It generates belief, plausibility, and fusion maps, performs statistical quantitative analysis, and evaluates the approach on three medical imaging datasets spanning pathology, ophthalmology, and radiology that vary in class distribution, image quality, and modality.

Significance. If the method can be shown to produce maps that reliably track actual SHAP instability, the work would address a practical gap in deploying XAI for high-stakes medical decisions where explanation variance can undermine trust. The use of DST belief/plausibility constructs on top of Dirichlet sampling is a plausible direction, but the current manuscript supplies no equations, implementation details, or validation experiments, so the significance cannot yet be assessed.

major comments (2)

[Abstract/Methods] The central claim that Dirichlet posterior sampling plus Dempster-Shafer theory yields a faithful quantification of SHAP instability is unsupported because no equations, sampling procedure, or fusion rule are provided; without these it is impossible to determine whether the belief/plausibility maps reflect epistemic instability in the explanations or merely modeling artifacts.
[Results/Evaluation] No empirical check is reported that the produced belief or fusion maps correlate with observable SHAP variance (e.g., across random seeds, input perturbations, or repeated explanations on identical images), leaving open the possibility that the maps capture aleatoric noise rather than the targeted explanation instability.

minor comments (2)

[Abstract] The abstract refers to 'statistical quantitative analysis' without naming the specific metrics, confidence intervals, or hypothesis tests employed.
[Experiments] Dataset descriptions mention 'varying image resolutions and modality-specific aspect' but do not report exact image sizes, preprocessing steps, or how these factors were controlled in the uncertainty quantification.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. We address each major comment below and will revise the manuscript to incorporate the suggested improvements.

Referee: [Abstract/Methods] The central claim that Dirichlet posterior sampling plus Dempster-Shafer theory yields a faithful quantification of SHAP instability is unsupported because no equations, sampling procedure, or fusion rule are provided; without these it is impossible to determine whether the belief/plausibility maps reflect epistemic instability in the explanations or merely modeling artifacts.

Authors: We agree that the current manuscript lacks explicit equations and procedural details for the Dirichlet posterior sampling and the Dempster-Shafer fusion rules. This omission makes it difficult for readers to fully assess the method. In the revised version, we will expand the Methods section to include the full mathematical formulation: the Dirichlet distribution used for posterior sampling of explanation weights, the Monte Carlo sampling procedure to generate an ensemble of SHAP maps, and the specific combination rules for computing belief and plausibility from the sampled explanations. These additions will demonstrate that the resulting maps specifically capture the variance due to SHAP instability. revision: yes
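
For orientation, one textbook combination rule that such a Methods section could specify is Dempster's rule of combination. The sketch below applies it on a two-hypothesis frame {pos, neg} with an extra mass on the whole frame (theta) standing for ignorance. Whether the paper uses this rule, this frame, or these mass assignments is not stated in the abstract; the function name and the numbers are hypothetical.

```python
import numpy as np

def dempster_combine(m1, m2):
    """Dempster's rule on the frame {pos, neg}; masses are ordered (pos, neg, theta).

    Works on scalars or on NumPy arrays of equal shape (e.g. pixel-wise mass maps).
    """
    p1, n1, t1 = m1
    p2, n2, t2 = m2
    # Conflict: mass assigned to contradictory pairs (pos vs. neg).
    conflict = p1 * n2 + n1 * p2
    if np.any(conflict >= 1.0):
        raise ValueError("total conflict: Dempster's rule is undefined")
    norm = 1.0 - conflict
    pos = (p1 * p2 + p1 * t2 + t1 * p2) / norm
    neg = (n1 * n2 + n1 * t2 + t1 * n2) / norm
    theta = (t1 * t2) / norm
    return pos, neg, theta

# Two hypothetical sources of evidence for a single pixel.
m_a = (0.6, 0.1, 0.3)
m_b = (0.5, 0.2, 0.3)
pos, neg, theta = dempster_combine(m_a, m_b)

belief_pos = pos                # Bel({pos}) = m({pos})
plausibility_pos = pos + theta  # Pl({pos}) = m({pos}) + m(theta)
```

On this frame the gap between plausibility and belief equals the residual mass on theta, which shrinks as agreeing sources are combined.
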
Referee: [Results/Evaluation] No empirical check is reported that the produced belief or fusion maps correlate with observable SHAP variance (e.g., across random seeds, input perturbations, or repeated explanations on identical images), leaving open the possibility that the maps capture aleatoric noise rather than the targeted explanation instability.

Authors: We acknowledge the importance of empirical validation to confirm that the uncertainty maps track SHAP instability rather than other sources of noise. The current evaluation focuses on qualitative and statistical analysis across datasets, but does not include direct correlation studies. We will add new experiments in the revised manuscript: for a subset of images, we will generate multiple SHAP explanations under controlled variations (different seeds, slight input perturbations), compute the variance in the explanation values, and show that the belief and fusion maps have high correlation with these variance measures, supported by quantitative metrics such as Pearson correlation coefficients. revision: yes
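
The proposed check can be sketched as follows, using synthetic data in place of real SHAP runs. The helper name and the simulated instability pattern are hypothetical; only the idea of correlating pixel-wise SHAP variance with an uncertainty map via a Pearson coefficient comes from the response above.

```python
import numpy as np

def variance_correlation(shap_runs, uncertainty_map):
    """Pearson r between pixel-wise SHAP variance across runs and an uncertainty map.

    shap_runs:       array of shape (R, H, W), R repeated explanations of one image.
    uncertainty_map: array of shape (H, W), e.g. a fusion or uncertainty map.
    """
    shap_runs = np.asarray(shap_runs, dtype=float)
    variance = shap_runs.var(axis=0).ravel()
    uncertainty = np.asarray(uncertainty_map, dtype=float).ravel()
    return float(np.corrcoef(variance, uncertainty)[0, 1])

rng = np.random.default_rng(0)
# Synthetic per-pixel instability: noise grows from 0.1 to 1.0 across the image.
noise_scale = np.linspace(0.1, 1.0, 64).reshape(8, 8)
runs = rng.normal(size=(200, 8, 8)) * noise_scale  # 200 simulated repeated explanations

r_informative = variance_correlation(runs, noise_scale ** 2)  # map that tracks variance
r_random = variance_correlation(runs, rng.random((8, 8)))     # unrelated map
```

A map that tracks the true instability should give a coefficient near 1, while an unrelated map should give one near 0; reporting both provides a baseline for interpreting the correlations on real images.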

Circularity Check

0 steps flagged

No circularity detected in the derivation chain

The abstract describes a framework that applies Dirichlet posterior sampling and Dempster-Shafer theory to produce belief, plausibility, and fusion maps for quantifying SHAP instability. No equations, parameter-fitting steps, or self-citations are shown that would reduce any claimed output to an input by construction. The approach introduces new constructs (belief/plausibility maps plus statistical analysis) rather than re-labeling fitted quantities or importing uniqueness results from prior self-work. Without load-bearing reductions visible in the provided text, the derivation chain is self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only access prevents identification of any concrete free parameters, axioms, or invented entities; full text would be required to audit these elements.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · relation: unclear

Paper passage: "we address this challenge by using Dirichlet posterior sampling and Dempster-Shafer theory to quantify the uncertainty that arises from these unstable explanations... belief, plausible, and fusion map approach"

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · alpha_pin_under_high_calibration · relation: unclear

Paper passage: "The framework uses a belief, plausible, and fusion map approach alongside statistical quantitative analysis to produce quantification of uncertainty in SHAP"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

94 extracted references · 94 canonical work pages · 1 internal anchor

[1]

Vision transformers in medical imaging: a comprehensive review of advancements and applications across multiple diseases

Aburass, S., Dorgham, O., Al Shaqsi, J., Abu Rumman, M., Al-Kadi, O., 2025. Vision transformers in medical imaging: a comprehensive review of advancements and applications across multiple diseases. Journal of Imaging Informatics in Medicine , 1–44

work page 2025
[2]

The role of big data in healthcare: A review of implications for patient outcomes and treatment personalization

Adeghe, E.P., Okolo, C.A., Ojeyinka, O.T., 2024. The role of big data in healthcare: A review of implications for patient outcomes and treatment personalization. World Journal of Biology Pharmacy and Health Sciences 17, 198–204

work page 2024
[3]

Improvingmalariadiagnosisthroughinterpretablecustomizedcnnsarchitectures

Ahamed,M.F.,Nahiduzzaman,M.,Mahmud,G.,Shafi,F.B.,Ayari,M.A.,Khandakar,A.,Abdullah-Al-Wadud,M.,Islam, S.R.,2025. Improvingmalariadiagnosisthroughinterpretablecustomizedcnnsarchitectures. ScientificReports15,6484

work page 2025
[4]

Early cancer detection using deep learning and medical imaging: A survey

Ahmad, I., Alqurashi, F., 2024. Early cancer detection using deep learning and medical imaging: A survey. Critical Reviews in Oncology/Hematology 204, 104528

work page 2024
[5]

Hippocampal atrophy and ventricular enlargement in normal aging, mild cognitive impairment (mci), and alzheimer disease

Apostolova, L.G., Green, A.E., Babakchanian, S., Hwang, K.S., Chou, Y.Y., Toga, A.W., Thompson, P.M., 2012. Hippocampal atrophy and ventricular enlargement in normal aging, mild cognitive impairment (mci), and alzheimer disease. Alzheimer Disease & Associated Disorders 26, 17–27

work page 2012
[6]

Detection and grading of diabetic retinopathy in retinal images using deep intelligent systems: A comprehensive review

Asha Gnana Priya, H., Anitha, J., Popescu, D.E., Asokan, A., Jude Hemanth, D., Son, L.H., 2021. Detection and grading of diabetic retinopathy in retinal images using deep intelligent systems: A comprehensive review. Computers, Materials & Continua 66

work page 2021
[7]

Atad, M., Schinz, D., Moeller, H., Graf, R., Wiestler, B., Rueckert, D., Navab, N., Kirschke, J.S., Keicher, M., et al.,

work page
[8]

Machine Learning for Biomedical Imaging 2, 2103–2125

Counterfactual explanations for medical image classification and regression using diffusion autoencoder. Machine Learning for Biomedical Imaging 2, 2103–2125

work page
[9]

Ba,W.,Wu,H.,Chen,W.W.,Wang,S.H.,Zhang,Z.Y.,Wei,X.J.,Wang,W.J.,Yang,L.,Zhou,D.M.,Zhuang,Y.X.,etal.,

work page
[10]

European Journal of Cancer 169, 156–165

Convolutional neural network assistance significantly improves dermatologists’ diagnosis of cutaneous tumours using clinical images. European Journal of Cancer 169, 156–165

work page
[11]

Uncertainty quantification in medical image synthesis, in: Biomedical image synthesis and simulation

Barbano, R., Arridge, S., Jin, B., Tanno, R., 2022. Uncertainty quantification in medical image synthesis, in: Biomedical image synthesis and simulation. Elsevier, pp. 601–641

work page 2022
[12]

Evaluatingtheexplainabilityofvisiontransformersinmedicalimaging

Barekatain,L.,Glocker,B.,2025. Evaluatingtheexplainabilityofvisiontransformersinmedicalimaging. arXivpreprint arXiv:2510.12021

work page arXiv 2025
[13]

Some aspects of dempster-shafer evidence theory for classification of multi-modality medical images taking partial volume effect into account

Bloch, I., 1996. Some aspects of dempster-shafer evidence theory for classification of multi-modality medical images taking partial volume effect into account. Pattern Recognition Letters 17, 905–919

work page 1996
[14]

Grad-cam++: Generalized gradient-based visual explanations for deep convolutional networks, in: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE

Chattopadhay, A., Sarkar, A., Howlader, P., Balasubramanian, V.N., 2018. Grad-cam++: Generalized gradient-based visual explanations for deep convolutional networks, in: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE. pp. 839–847

work page 2018
[15]

Explaining a series of models by propagating shapley values

Chen, H., Lundberg, S.M., Lee, S.I., 2022. Explaining a series of models by propagating shapley values. Nature communications 13, 4512

work page 2022
[16]

Evidence-based uncertainty-aware semi-supervised medical image segmentation

Chen, Y., Yang, Z., Shen, C., Wang, Z., Zhang, Z., Qin, Y., Wei, X., Lu, J., Liu, Y., Zhang, Y., 2024. Evidence-based uncertainty-aware semi-supervised medical image segmentation. Computers in Biology and Medicine 170, 108004

work page 2024
[17]

Uncertainty propagation in xai: A comparison of analytical and empirical estimators, in: World Conference on Explainable Artificial Intelligence, Springer

Chiaburu, T., Bießmann, F., Haußer, F., 2025. Uncertainty propagation in xai: A comparison of analytical and empirical estimators, in: World Conference on Explainable Artificial Intelligence, Springer. pp. 390–411

work page 2025
[18]

Chromatin-mediated epigenetic regulation in the malaria parasite plasmodium falciparum

Cui, L., Miao, J., 2010. Chromatin-mediated epigenetic regulation in the malaria parasite plasmodium falciparum. Eukaryotic cell 9, 1138–1149

work page 2010
[19]

Explainable artificial intelligence (xai) in radiology and nuclear medicine: a literature review

DeVries,B.M.,Zwezerijnen,G.J.,Burchell,G.L.,vanVelden,F.H.,Menke-vanderHouvenvanOordt,C.W.,Boellaard, R., 2023. Explainable artificial intelligence (xai) in radiology and nuclear medicine: a literature review. Frontiers in medicine 10, 1180773

work page 2023
[20]

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Dosovitskiy, A., 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929

work page internal anchor Pith review Pith/arXiv arXiv 2020
[21]

Ubiqtree:UncertaintyquantificationinXAIwithtreeensembles

Dubey,A.,Anžel,A.,İlgen,B.,Hattab,G.,2025. Ubiqtree:UncertaintyquantificationinXAIwithtreeensembles. arXiv preprint arXiv:2508.09639

work page arXiv 2025
[22]

Ai readiness in healthcare through storytelling XAI, in: EXPLIMED@ ECAI

Dubey, A., Yang, Z., Hattab, G., 2024a. Ai readiness in healthcare through storytelling XAI, in: EXPLIMED@ ECAI

work page
[23]

A nested model for AI design and validation

Dubey, A., Yang, Z., Hattab, G., 2024b. A nested model for AI design and validation. Iscience 27

work page
[24]

Diabetic retinopathy detection (2015)

Dugas, E., Jared, J., Cukierski, W., . Diabetic retinopathy detection (2015). URL https://kaggle. com/competitions/diabetic-retinopathy-detection 7

work page 2015
[25]

Deep learning applications in medical image analysis: Advancements, challenges, and future directions

Eli, A.A., Ali, A., 2024. Deep learning applications in medical image analysis: Advancements, challenges, and future directions. arXiv preprint arXiv:2410.14131

work page arXiv 2024
[26]

Esteban, L.M., Borque-Fernando, Á., Escorihuela, M.E., Esteban-Escaño, J., Abascal, J.M., Servian, P., Morote, J.,

work page
[27]

Scientific reports 15, 4261

Integrating radiological and clinical data for clinically significant prostate cancer detection with machine learning techniques. Scientific reports 15, 4261

work page
[28]

Deep learning-enabled medical computer vision

Esteva, A., Chou, K., Yeung, S., Naik, N., Madani, A., Mottaghi, A., Liu, Y., Topol, E., Dean, J., Socher, R., 2021. Deep learning-enabled medical computer vision. NPJ digital medicine 4, 5

work page 2021
[29]

Quantifying uncertainty in deep learning of radiologic images

Faghani, S., Moassefi, M., Rouzrokh, P., Khosravi, B., Baffour, F.I., Ringler, M.D., Erickson, B.J., 2023. Quantifying uncertainty in deep learning of radiologic images. Radiology 308, e222217. :Preprint submitted to ElsevierPage 29 of 32

work page 2023
[30]

Dirichletprocesses,in:StochasticIntegrals:ProceedingsoftheLMSDurhamSymposium,July7–17, 1980, Springer

Föllmer,H.,2006. Dirichletprocesses,in:StochasticIntegrals:ProceedingsoftheLMSDurhamSymposium,July7–17, 1980, Springer. pp. 476–478

work page 2006
[31]

Axiom-based grad-cam: Towards accurate visualization and explanation of cnns

Fu, R., Hu, Q., Dong, X., Guo, Y., Gao, Y., Li, B., 2020. Axiom-based grad-cam: Towards accurate visualization and explanation of cnns. arXiv e-prints , arXiv–2008

work page 2020
[32]

Dropout as a bayesian approximation: Representing model uncertainty in deep learning, in: international conference on machine learning, PMLR

Gal, Y., Ghahramani, Z., 2016. Dropout as a bayesian approximation: Representing model uncertainty in deep learning, in: international conference on machine learning, PMLR. pp. 1050–1059

work page 2016
[33]

Uncertainty-aware visualization in medical imaging-a survey, in: Computer Graphics Forum, Wiley Online Library

Gillmann, C., Saur, D., Wischgoll, T., Scheuermann, G., 2021. Uncertainty-aware visualization in medical imaging-a survey, in: Computer Graphics Forum, Wiley Online Library. pp. 665–689

work page 2021
[34]

Deep residual learning for image recognition, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp

He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778

work page 2016
[35]

Explainable ai for medical data: Current methods, limitations, and future directions

Hossain, M.I., Zamzmi, G., Mouton, P.R., Salekin, M.S., Sun, Y., Goldgof, D., 2025. Explainable ai for medical data: Current methods, limitations, and future directions. ACM Computing Surveys 57, 1–46

work page 2025
[36]

Deepevidentialfusionwithuncertaintyquantificationandreliability learning for multimodal medical image segmentation

Huang,L.,Ruan,S.,Decazes,P.,Denœux,T.,2025. Deepevidentialfusionwithuncertaintyquantificationandreliability learning for multimodal medical image segmentation. Information Fusion 113, 102648

work page 2025
[37]

A review of uncertainty quantification in medical image analysis: Probabilistic and non-probabilistic methods

Huang, L., Ruan, S., Xing, Y., Feng, M., 2024. A review of uncertainty quantification in medical image analysis: Probabilistic and non-probabilistic methods. Medical Image Analysis 97, 103223

work page 2024
[38]

Layercam: Exploring hierarchical class activation maps for localization

Jiang, P.T., Zhang, C.B., Hou, Q., Cheng, M.M., Wei, Y., 2021. Layercam: Exploring hierarchical class activation maps for localization. IEEE transactions on image processing 30, 5875–5888

work page 2021
[39]

Onuncertainty,tempering,anddataaugmentationinbayesian classification

Kapoor,S.,Maddox,W.J.,Izmailov,P.,Wilson,A.G.,2022. Onuncertainty,tempering,anddataaugmentationinbayesian classification. Advances in neural information processing systems 35, 18211–18225

work page 2022
[40]

Diagnosing malaria patients with plasmodium falciparum and vivax using deep learning for thick smear images

Kassim, Y.M., Yang, F., Yu, H., Maude, R.J., Jaeger, S., 2021. Diagnosing malaria patients with plasmodium falciparum and vivax using deep learning for thick smear images. Diagnostics 11, 1994

work page 2021
[41]

Interpretabilitybeyondfeatureattribution: Quantitative testing with concept activation vectors (tcav), in: International conference on machine learning, PMLR

Kim,B.,Wattenberg,M.,Gilmer,J.,Cai,C.,Wexler,J.,Viegas,F.,etal.,2018. Interpretabilitybeyondfeatureattribution: Quantitative testing with concept activation vectors (tcav), in: International conference on machine learning, PMLR. pp. 2668–2677

work page 2018
[42]

Simple and scalable predictive uncertainty estimation using deep ensembles

Lakshminarayanan, B., Pritzel, A., Blundell, C., 2017. Simple and scalable predictive uncertainty estimation using deep ensembles. Advances in neural information processing systems 30

work page 2017
[43]

Trustworthy clinical ai solutions: A unified review of uncertainty quantification in deep learning models for medical image analysis

Lambert, B., Forbes, F., Doyle, S., Dehaene, H., Dojat, M., 2024. Trustworthy clinical ai solutions: A unified review of uncertainty quantification in deep learning models for medical image analysis. Artif. Intell. Medicine 150, 102830

work page 2024
[44]

Oasis-3:longitudinalneuroimaging,clinical,andcognitivedatasetfornormalagingand alzheimer disease

LaMontagne, P.J., Benzinger, T.L., Morris, J.C., Keefe, S., Hornbeck, R., Xiong, C., Grant, E., Hassenstab, J., Moulder, K.,Vlassenko,A.G.,etal.,2019. Oasis-3:longitudinalneuroimaging,clinical,andcognitivedatasetfornormalagingand alzheimer disease. medrxiv , 2019–12

work page 2019
[45]

Biophysical profiling of red blood cells from thin-film blood smears using deep learning

Lamoureux, E.S., Cheng, Y., Islamzada, E., Matthews, K., Duffy, S.P., Ma, H., 2024. Biophysical profiling of red blood cells from thin-film blood smears using deep learning. Heliyon 10

work page 2024
[46]

Gradient-based learning applied to document recognition

LeCun, Y., Bottou, L., Bengio, Y., Haffner, P., 2002. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 2278–2324

work page 2002
[47]

Deep learning in medical imaging: general overview

Lee, J.G., Jun, S., Cho, Y.W., Lee, H., Kim, G.B., Seo, J.B., Kim, N., 2017. Deep learning in medical imaging: general overview. Korean journal of radiology 18, 570–584

work page 2017
[48]

Medical image analysis using deep learning algorithms

Li, M., Jiang, Y., Zhang, Y., Zhu, H., 2023. Medical image analysis using deep learning algorithms. Frontiers in public health 11, 1273253

work page 2023
[49]

Li, X., Zhou, Y., Dvornek, N.C., Gu, Y., Ventola, P., Duncan, J.S., 2020. Efficient shapley explanation for features importance estimation under uncertainty, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer. pp. 792–801

work page 2020
[50]

A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis

Liu, X., Faes, L., Kale, A.U., Wagner, S.K., Fu, D.J., Bruynseels, A., Mahendiran, T., Moraes, G., Shamdas, M., Kern, C., et al., 2019. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. The lancet digital health 1, e271–e297

work page 2019
[51]

Enhancing medical image segmentation via complementary cnn-transformer fusion and boundary perception

Liu, X., Tian, J., Huang, S., Shen, W., 2025. Enhancing medical image segmentation via complementary cnn-transformer fusion and boundary perception. Frontiers in Computer Science 7, 1677905

work page 2025
[52]

Uncertainty-aware deep learning in healthcare: a scoping review

Loftus, T.J., Shickel, B., Ruppert, M.M., Balch, J.A., Ozrazgat-Baslanti, T., Tighe, P.J., Efron, P.A., Hogan, W.R., Rashidi, P., Upchurch Jr, G.R., et al., 2022. Uncertainty-aware deep learning in healthcare: a scoping review. PLOS digital health 1, e0000085

work page 2022
[53]

Towards aleatoric and epistemic uncertainty in medical image classification

Löhr, T., Ingrisch, M., Hüllermeier, E., 2024. Towards aleatoric and epistemic uncertainty in medical image classification, in: International Conference on Artificial Intelligence in Medicine, Springer. pp. 145–155

work page 2024
[54]

A unified approach to interpreting model predictions

Lundberg, S.M., Lee, S.I., 2017. A unified approach to interpreting model predictions. Advances in neural information processing systems 30

work page 2017
[55]

A comprehensive review of deep neural networks for medical image processing: Recent developments and future opportunities

Mall, P.K., Singh, P.K., Srivastav, S., Narayan, V., Paprzycki, M., Jaworska, T., Ganzha, M., 2023. A comprehensive review of deep neural networks for medical image processing: Recent developments and future opportunities. Healthcare Analytics 4, 100216

work page 2023
[56]

Ventricular features as reliable differentiators between bvftd and other dementias

Manera, A.L., Dadar, M., Collins, D.L., Ducharme, S., Initiative, F.L.D.N., (ADNI, A.D.N.I., et al., 2022. Ventricular features as reliable differentiators between bvftd and other dementias. NeuroImage: Clinical 33, 102947

work page 2022
[57]

Deep learning–based detection of diabetic macular edema using optical coherence tomography and fundus images: A meta-analysis

Manikandan, S., Raman, R., Rajalakshmi, R., Tamilselvi, S., Surya, R.J., 2023. Deep learning–based detection of diabetic macular edema using optical coherence tomography and fundus images: A meta-analysis. Indian Journal of Ophthalmology 71, 1783–1796

work page 2023
[58]

Open access series of imaging studies (oasis): cross-sectional mri data in young, middle aged, nondemented, and demented older adults

Marcus, D.S., Wang, T.H., Parker, J., Csernansky, J.G., Morris, J.C., Buckner, R.L., 2007. Open access series of imaging studies (oasis): cross-sectional mri data in young, middle aged, nondemented, and demented older adults. Journal of cognitive neuroscience 19, 1498–1507

work page 2007
[59]

Ganterfactual—counterfactual explanations for medical non-experts using generative adversarial learning

Mertes, S., Huber, T., Weitz, K., Heimerl, A., André, E., 2022. Ganterfactual—counterfactual explanations for medical non-experts using generative adversarial learning. Frontiers in artificial intelligence 5, 825565

work page 2022
[60]

Monte carlo dropout for uncertainty estimation and motor imagery classification

Milanés-Hermosilla, D., Trujillo Codorniú, R., López-Baracaldo, R., Sagaró-Zamora, R., Delisle-Rodriguez, D., Villarejo-Mayor, J.J., Nunez-Alvarez, J.R., 2021. Monte carlo dropout for uncertainty estimation and motor imagery classification. Sensors 21, 7241

work page 2021
[61]

Reinventing radiology: big data and the future of medical imaging

Morris, M.A., Saboury, B., Burkett, B., Gao, J., Siegel, E.L., 2018. Reinventing radiology: big data and the future of medical imaging. Journal of thoracic imaging 33, 4–16

work page 2018
[62]

Unveiling the black box: A systematic review of explainable artificial intelligence in medical image analysis

Muhammad, D., Bendechache, M., 2024. Unveiling the black box: A systematic review of explainable artificial intelligence in medical image analysis. Computational and structural biotechnology journal 24, 542–560

work page 2024
[63]

Survey of explainable artificial intelligence techniques for biomedical imaging with deep neural networks

Nazir, S., Dickson, D.M., Akram, M.U., 2023. Survey of explainable artificial intelligence techniques for biomedical imaging with deep neural networks. Computers in Biology and Medicine 156, 106668

work page 2023
[64]

Nguyen, V.P., Trinh, N.H., Nguyen, D.M.L., Nguyen, P.L., Tran, Q.L., 2025. Aleatoric uncertainty medical image segmentation estimation via flow matching, in: International Workshop on Uncertainty for Safe Utilization of Machine Learning in Medical Imaging, Springer. pp. 134–144

work page 2025
[65]

Emerging trends in ai-powered medical imaging: enhancing diagnostic accuracy and treatment decisions

Oyeniyi, J., Oluwaseyi, P., 2024. Emerging trends in ai-powered medical imaging: enhancing diagnostic accuracy and treatment decisions. International Journal of Enhanced Research In Science Technology & Engineering 13, 81–94

work page 2024
[66]

Parcalabescu, L., Frank, A., 2023. Mm-shap: A performance-agnostic metric for measuring multimodal contributions in vision and language models & tasks, in: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 4032–4059

work page 2023
[67]

Practical guide to shap analysis: Explaining supervised machine learning model predictions in drug development

Ponce-Bobadilla, A.V., Schmitt, V., Maier, C.S., Mensing, S., Stodtmann, S., 2024. Practical guide to shap analysis: Explaining supervised machine learning model predictions in drug development. Clinical and translational science 17, e70056

work page 2024
[68]

Revolutionizing healthcare: a comparative insight into deep learning’s role in medical imaging

Prasad, V.K., Verma, A., Bhattacharya, P., Shah, S., Chowdhury, S., Bhavsar, M., Aslam, S., Ashraf, N., 2024. Revolutionizing healthcare: a comparative insight into deep learning’s role in medical imaging. Scientific Reports 14, 30273

work page 2024
[69]

Enhanced mri brain tumor detection using deep learning in conjunction with explainable ai shap based diverse and multi feature analysis

Rahman, A., Hayat, M., Iqbal, N., Alarfaj, F.K., Alkhalaf, S., Alturise, F., 2025. Enhanced mri brain tumor detection using deep learning in conjunction with explainable ai shap based diverse and multi feature analysis. Scientific Reports 15, 29411

work page 2025
[70]

Ramaswamy, H.G., et al., 2020. Ablation-cam: Visual explanations for deep convolutional network via gradient-free localization, in: proceedings of the IEEE/CVF winter conference on applications of computer vision, pp. 983–991

work page 2020
[71]

Here comes the explanation: A shapley perspective on multi-contrast medical image segmentation

Ren, T., Rivera, J.H., Oswal, H., Pan, Y., Chopra, A., Ruzevick, J., Kurt, M., 2025. Here comes the explanation: A shapley perspective on multi-contrast medical image segmentation. arXiv preprint arXiv:2504.04645

work page arXiv 2025
[72]

why should i trust you?

Ribeiro, M.T., Singh, S., Guestrin, C., 2016. "why should i trust you?" explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp. 1135–1144

work page 2016
[73]

A perspective on explainable artificial intelligence methods: Shap and lime

Salih, A.M., Raisi-Estabragh, Z., Galazzo, I.B., Radeva, P., Petersen, S.E., Lekadir, K., Menegaz, G., 2025. A perspective on explainable artificial intelligence methods: Shap and lime. Advanced Intelligent Systems 7, 2400304

work page 2025
[74]

Explainability and uncertainty: Two sides of the same coin for enhancing the interpretability of deep learning models in healthcare

Salvi, M., Seoni, S., Campagner, A., Gertych, A., Acharya, U.R., Molinari, F., Cabitza, F., 2025. Explainability and uncertainty: Two sides of the same coin for enhancing the interpretability of deep learning models in healthcare. International Journal of Medical Informatics 197, 105846

work page 2025
[75]

Grad-cam: Visual explanations from deep networks via gradient-based localization

Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D., 2017. Grad-cam: Visual explanations from deep networks via gradient-based localization, in: Proceedings of the IEEE international conference on computer vision, pp. 618–626

work page 2017
[76]

Combination of evidence in dempster-shafer theory

Sentz, K., Ferson, S., 2002. Combination of evidence in dempster-shafer theory

work page 2002
[77]

Application of uncertainty quantification to artificial intelligence in healthcare: A review of last decade (2013–2023)

Seoni, S., Jahmunah, V., Salvi, M., Barua, P.D., Molinari, F., Acharya, U.R., 2023. Application of uncertainty quantification to artificial intelligence in healthcare: A review of last decade (2013–2023). Computers in Biology and Medicine 165, 107441

work page 2023
[78]

Dempster-shafer theory

Shafer, G., 1992. Dempster-shafer theory. Encyclopedia of artificial intelligence 1, 330–331

work page 1992
[79]

Probabilistic Modeling and Uncertainty Awareness in Deep Learning

Shen, Y., 2025. Probabilistic Modeling and Uncertainty Awareness in Deep Learning. Ph.D. thesis. Technische Universität München

work page 2025
[80]

Studying ventricular abnormalities in mild cognitive impairment with hyperbolic ricci flow and tensor-based morphometry

Shi, J., Stonnington, C.M., Thompson, P.M., Chen, K., Gutman, B., Reschke, C., Baxter, L.C., Reiman, E.M., Caselli, R.J., Wang, Y., et al., 2015. Studying ventricular abnormalities in mild cognitive impairment with hyperbolic ricci flow and tensor-based morphometry. Neuroimage 104, 1–20

work page 2015


[1] [1]

Vision transformers in medical imaging: a comprehensive review of advancements and applications across multiple diseases

Aburass, S., Dorgham, O., Al Shaqsi, J., Abu Rumman, M., Al-Kadi, O., 2025. Vision transformers in medical imaging: a comprehensive review of advancements and applications across multiple diseases. Journal of Imaging Informatics in Medicine, 1–44

work page 2025

[2] [2]

The role of big data in healthcare: A review of implications for patient outcomes and treatment personalization

Adeghe, E.P., Okolo, C.A., Ojeyinka, O.T., 2024. The role of big data in healthcare: A review of implications for patient outcomes and treatment personalization. World Journal of Biology Pharmacy and Health Sciences 17, 198–204

work page 2024

[3] [3]

Improving malaria diagnosis through interpretable customized cnns architectures

Ahamed, M.F., Nahiduzzaman, M., Mahmud, G., Shafi, F.B., Ayari, M.A., Khandakar, A., Abdullah-Al-Wadud, M., Islam, S.R., 2025. Improving malaria diagnosis through interpretable customized cnns architectures. Scientific Reports 15, 6484

work page 2025

[4] [4]

Early cancer detection using deep learning and medical imaging: A survey

Ahmad, I., Alqurashi, F., 2024. Early cancer detection using deep learning and medical imaging: A survey. Critical Reviews in Oncology/Hematology 204, 104528

work page 2024

[5] [5]

Hippocampal atrophy and ventricular enlargement in normal aging, mild cognitive impairment (mci), and alzheimer disease

Apostolova, L.G., Green, A.E., Babakchanian, S., Hwang, K.S., Chou, Y.Y., Toga, A.W., Thompson, P.M., 2012. Hippocampal atrophy and ventricular enlargement in normal aging, mild cognitive impairment (mci), and alzheimer disease. Alzheimer Disease & Associated Disorders 26, 17–27

work page 2012

[6] [6]

Detection and grading of diabetic retinopathy in retinal images using deep intelligent systems: A comprehensive review

Asha Gnana Priya, H., Anitha, J., Popescu, D.E., Asokan, A., Jude Hemanth, D., Son, L.H., 2021. Detection and grading of diabetic retinopathy in retinal images using deep intelligent systems: A comprehensive review. Computers, Materials & Continua 66

work page 2021

[7] [7]

Atad, M., Schinz, D., Moeller, H., Graf, R., Wiestler, B., Rueckert, D., Navab, N., Kirschke, J.S., Keicher, M., et al.,

work page

[8] [8]

Counterfactual explanations for medical image classification and regression using diffusion autoencoder

Counterfactual explanations for medical image classification and regression using diffusion autoencoder. Machine Learning for Biomedical Imaging 2, 2103–2125

work page

[9] [9]

Ba, W., Wu, H., Chen, W.W., Wang, S.H., Zhang, Z.Y., Wei, X.J., Wang, W.J., Yang, L., Zhou, D.M., Zhuang, Y.X., et al.,

work page

[10] [10]

Convolutional neural network assistance significantly improves dermatologists’ diagnosis of cutaneous tumours using clinical images

Convolutional neural network assistance significantly improves dermatologists’ diagnosis of cutaneous tumours using clinical images. European Journal of Cancer 169, 156–165

work page

[11] [11]

Uncertainty quantification in medical image synthesis

Barbano, R., Arridge, S., Jin, B., Tanno, R., 2022. Uncertainty quantification in medical image synthesis, in: Biomedical image synthesis and simulation. Elsevier, pp. 601–641

work page 2022

[12] [12]

Evaluating the explainability of vision transformers in medical imaging

Barekatain, L., Glocker, B., 2025. Evaluating the explainability of vision transformers in medical imaging. arXiv preprint arXiv:2510.12021

work page arXiv 2025

[13] [13]

Some aspects of dempster-shafer evidence theory for classification of multi-modality medical images taking partial volume effect into account

Bloch, I., 1996. Some aspects of dempster-shafer evidence theory for classification of multi-modality medical images taking partial volume effect into account. Pattern Recognition Letters 17, 905–919

work page 1996

[14] [14]

Grad-cam++: Generalized gradient-based visual explanations for deep convolutional networks

Chattopadhay, A., Sarkar, A., Howlader, P., Balasubramanian, V.N., 2018. Grad-cam++: Generalized gradient-based visual explanations for deep convolutional networks, in: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE. pp. 839–847

work page 2018

[15] [15]

Explaining a series of models by propagating shapley values

Chen, H., Lundberg, S.M., Lee, S.I., 2022. Explaining a series of models by propagating shapley values. Nature communications 13, 4512

work page 2022

[16] [16]

Evidence-based uncertainty-aware semi-supervised medical image segmentation

Chen, Y., Yang, Z., Shen, C., Wang, Z., Zhang, Z., Qin, Y., Wei, X., Lu, J., Liu, Y., Zhang, Y., 2024. Evidence-based uncertainty-aware semi-supervised medical image segmentation. Computers in Biology and Medicine 170, 108004

work page 2024

[17] [17]

Uncertainty propagation in xai: A comparison of analytical and empirical estimators

Chiaburu, T., Bießmann, F., Haußer, F., 2025. Uncertainty propagation in xai: A comparison of analytical and empirical estimators, in: World Conference on Explainable Artificial Intelligence, Springer. pp. 390–411

work page 2025

[18] [18]

Chromatin-mediated epigenetic regulation in the malaria parasite plasmodium falciparum

Cui, L., Miao, J., 2010. Chromatin-mediated epigenetic regulation in the malaria parasite plasmodium falciparum. Eukaryotic cell 9, 1138–1149

work page 2010

[19] [19]

Explainable artificial intelligence (xai) in radiology and nuclear medicine: a literature review

De Vries, B.M., Zwezerijnen, G.J., Burchell, G.L., van Velden, F.H., Menke-van der Houven van Oordt, C.W., Boellaard, R., 2023. Explainable artificial intelligence (xai) in radiology and nuclear medicine: a literature review. Frontiers in medicine 10, 1180773

work page 2023

[20] [20]

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Dosovitskiy, A., 2020. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929

work page arXiv 2020

[21] [21]

Ubiqtree: Uncertainty quantification in XAI with tree ensembles

Dubey, A., Anžel, A., İlgen, B., Hattab, G., 2025. Ubiqtree: Uncertainty quantification in XAI with tree ensembles. arXiv preprint arXiv:2508.09639

work page arXiv 2025

[22] [22]

Ai readiness in healthcare through storytelling XAI

Dubey, A., Yang, Z., Hattab, G., 2024a. Ai readiness in healthcare through storytelling XAI, in: EXPLIMED@ECAI

work page

[23] [23]

A nested model for AI design and validation

Dubey, A., Yang, Z., Hattab, G., 2024b. A nested model for AI design and validation. Iscience 27

work page

[24] [24]

Diabetic retinopathy detection (2015)

Dugas, E., Jared, J., Cukierski, W., . Diabetic retinopathy detection (2015). URL https://kaggle.com/competitions/diabetic-retinopathy-detection 7

work page 2015

[25] [25]

Deep learning applications in medical image analysis: Advancements, challenges, and future directions

Eli, A.A., Ali, A., 2024. Deep learning applications in medical image analysis: Advancements, challenges, and future directions. arXiv preprint arXiv:2410.14131

work page arXiv 2024

[26] [26]

Esteban, L.M., Borque-Fernando, Á., Escorihuela, M.E., Esteban-Escaño, J., Abascal, J.M., Servian, P., Morote, J.,

work page

[27] [27]

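Integrating radiological and clinical data for clinically significant prostate cancer detection with machine learning techniques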

Integrating radiological and clinical data for clinically significant prostate cancer detection with machine learning techniques. Scientific reports 15, 4261

work page

[28] [28]

Deep learning-enabled medical computer vision

Esteva, A., Chou, K., Yeung, S., Naik, N., Madani, A., Mottaghi, A., Liu, Y., Topol, E., Dean, J., Socher, R., 2021. Deep learning-enabled medical computer vision. NPJ digital medicine 4, 5

work page 2021

[29] [29]

Quantifying uncertainty in deep learning of radiologic images

Faghani, S., Moassefi, M., Rouzrokh, P., Khosravi, B., Baffour, F.I., Ringler, M.D., Erickson, B.J., 2023. Quantifying uncertainty in deep learning of radiologic images. Radiology 308, e222217

work page 2023

[30] [30]

Dirichlet processes

Föllmer, H., 2006. Dirichlet processes, in: Stochastic Integrals: Proceedings of the LMS Durham Symposium, July 7–17, 1980, Springer. pp. 476–478

work page 2006

[31] [31]

Axiom-based grad-cam: Towards accurate visualization and explanation of cnns

Fu, R., Hu, Q., Dong, X., Guo, Y., Gao, Y., Li, B., 2020. Axiom-based grad-cam: Towards accurate visualization and explanation of cnns. arXiv e-prints, arXiv–2008

work page 2020

[32] [32]

Dropout as a bayesian approximation: Representing model uncertainty in deep learning

Gal, Y., Ghahramani, Z., 2016. Dropout as a bayesian approximation: Representing model uncertainty in deep learning, in: international conference on machine learning, PMLR. pp. 1050–1059

work page 2016

[33] [33]

Uncertainty-aware visualization in medical imaging-a survey

Gillmann, C., Saur, D., Wischgoll, T., Scheuermann, G., 2021. Uncertainty-aware visualization in medical imaging-a survey, in: Computer Graphics Forum, Wiley Online Library. pp. 665–689

work page 2021

[34] [34]

Deep residual learning for image recognition

He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778

work page 2016

[35] [35]

Explainable ai for medical data: Current methods, limitations, and future directions

Hossain, M.I., Zamzmi, G., Mouton, P.R., Salekin, M.S., Sun, Y., Goldgof, D., 2025. Explainable ai for medical data: Current methods, limitations, and future directions. ACM Computing Surveys 57, 1–46

work page 2025

[36] [36]

Deep evidential fusion with uncertainty quantification and reliability learning for multimodal medical image segmentation

Huang, L., Ruan, S., Decazes, P., Denœux, T., 2025. Deep evidential fusion with uncertainty quantification and reliability learning for multimodal medical image segmentation. Information Fusion 113, 102648

work page 2025

[37] [37]

A review of uncertainty quantification in medical image analysis: Probabilistic and non-probabilistic methods

Huang, L., Ruan, S., Xing, Y., Feng, M., 2024. A review of uncertainty quantification in medical image analysis: Probabilistic and non-probabilistic methods. Medical Image Analysis 97, 103223

work page 2024

[38] [38]

Layercam: Exploring hierarchical class activation maps for localization

Jiang, P.T., Zhang, C.B., Hou, Q., Cheng, M.M., Wei, Y., 2021. Layercam: Exploring hierarchical class activation maps for localization. IEEE transactions on image processing 30, 5875–5888

work page 2021

[39] [39]

Onuncertainty,tempering,anddataaugmentationinbayesian classification

Kapoor,S.,Maddox,W.J.,Izmailov,P.,Wilson,A.G.,2022. Onuncertainty,tempering,anddataaugmentationinbayesian classification. Advances in neural information processing systems 35, 18211–18225

work page 2022

[40] [40]

Diagnosing malaria patients with plasmodium falciparum and vivax using deep learning for thick smear images

Kassim, Y.M., Yang, F., Yu, H., Maude, R.J., Jaeger, S., 2021. Diagnosing malaria patients with plasmodium falciparum and vivax using deep learning for thick smear images. Diagnostics 11, 1994

work page 2021

[41] [41]

Interpretabilitybeyondfeatureattribution: Quantitative testing with concept activation vectors (tcav), in: International conference on machine learning, PMLR

Kim,B.,Wattenberg,M.,Gilmer,J.,Cai,C.,Wexler,J.,Viegas,F.,etal.,2018. Interpretabilitybeyondfeatureattribution: Quantitative testing with concept activation vectors (tcav), in: International conference on machine learning, PMLR. pp. 2668–2677

work page 2018

[42] [42]

Simple and scalable predictive uncertainty estimation using deep ensembles

Lakshminarayanan, B., Pritzel, A., Blundell, C., 2017. Simple and scalable predictive uncertainty estimation using deep ensembles. Advances in neural information processing systems 30

work page 2017

[43] [43]

Trustworthy clinical ai solutions: A unified review of uncertainty quantification in deep learning models for medical image analysis

Lambert, B., Forbes, F., Doyle, S., Dehaene, H., Dojat, M., 2024. Trustworthy clinical ai solutions: A unified review of uncertainty quantification in deep learning models for medical image analysis. Artif. Intell. Medicine 150, 102830

work page 2024

[44] [44]

Oasis-3:longitudinalneuroimaging,clinical,andcognitivedatasetfornormalagingand alzheimer disease

LaMontagne, P.J., Benzinger, T.L., Morris, J.C., Keefe, S., Hornbeck, R., Xiong, C., Grant, E., Hassenstab, J., Moulder, K.,Vlassenko,A.G.,etal.,2019. Oasis-3:longitudinalneuroimaging,clinical,andcognitivedatasetfornormalagingand alzheimer disease. medrxiv , 2019–12

work page 2019

[45] [45]

Biophysical profiling of red blood cells from thin-film blood smears using deep learning

Lamoureux, E.S., Cheng, Y., Islamzada, E., Matthews, K., Duffy, S.P., Ma, H., 2024. Biophysical profiling of red blood cells from thin-film blood smears using deep learning. Heliyon 10

work page 2024

[46] [46]

Gradient-based learning applied to document recognition

LeCun, Y., Bottou, L., Bengio, Y., Haffner, P., 2002. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 2278–2324

work page 2002

[47] [47]

Deep learning in medical imaging: general overview

Lee, J.G., Jun, S., Cho, Y.W., Lee, H., Kim, G.B., Seo, J.B., Kim, N., 2017. Deep learning in medical imaging: general overview. Korean journal of radiology 18, 570–584

work page 2017

[48] [48]

Medical image analysis using deep learning algorithms

Li, M., Jiang, Y., Zhang, Y., Zhu, H., 2023. Medical image analysis using deep learning algorithms. Frontiers in public health 11, 1273253

work page 2023

[49] [49]

Li, X., Zhou, Y., Dvornek, N.C., Gu, Y., Ventola, P., Duncan, J.S., 2020. Efficient shapley explanation for features importance estimation under uncertainty, in: International Conference on Medical Image Computing and Computer- Assisted Intervention, Springer. pp. 792–801

work page 2020

[50] [50]

A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis

Liu, X., Faes, L., Kale, A.U., Wagner, S.K., Fu, D.J., Bruynseels, A., Mahendiran, T., Moraes, G., Shamdas, M., Kern, C., et al., 2019. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis. The lancet digital health 1, e271–e297

work page 2019

[51] [51]

Enhancingmedicalimagesegmentationviacomplementarycnn-transformer fusion and boundary perception

Liu,X.,Tian,J.,Huang,S.,Shen,W.,2025. Enhancingmedicalimagesegmentationviacomplementarycnn-transformer fusion and boundary perception. Frontiers in Computer Science 7, 1677905

work page 2025

[52] [52]

Uncertainty-aware deep learning in healthcare: a scoping review

Loftus,T.J.,Shickel,B.,Ruppert,M.M.,Balch,J.A.,Ozrazgat-Baslanti,T.,Tighe,P.J.,Efron,P.A.,Hogan,W.R.,Rashidi, P., Upchurch Jr, G.R., et al., 2022. Uncertainty-aware deep learning in healthcare: a scoping review. PLOS digital health 1, e0000085

work page 2022

[53] [53]

Towardsaleatoricandepistemicuncertaintyinmedicalimageclassification, in: International Conference on Artificial Intelligence in Medicine, Springer

Löhr,T.,Ingrisch,M.,Hüllermeier,E.,2024. Towardsaleatoricandepistemicuncertaintyinmedicalimageclassification, in: International Conference on Artificial Intelligence in Medicine, Springer. pp. 145–155

work page 2024

[54] [54]

A unified approach to interpreting model predictions

Lundberg, S.M., Lee, S.I., 2017. A unified approach to interpreting model predictions. Advances in neural information processing systems 30

work page 2017

[55] [55]

A comprehensive reviewofdeepneuralnetworksformedicalimageprocessing:Recentdevelopmentsandfutureopportunities

Mall, P.K., Singh, P.K., Srivastav, S., Narayan, V., Paprzycki, M., Jaworska, T., Ganzha, M., 2023. A comprehensive reviewofdeepneuralnetworksformedicalimageprocessing:Recentdevelopmentsandfutureopportunities. Healthcare Analytics 4, 100216

work page 2023

[56] [56]

Ventricular features as reliable differentiators between bvftd and other dementias

Manera, A.L., Dadar, M., Collins, D.L., Ducharme, S., Initiative, F.L.D.N., (ADNI, A.D.N.I., et al., 2022. Ventricular features as reliable differentiators between bvftd and other dementias. NeuroImage: Clinical 33, 102947. :Preprint submitted to ElsevierPage 30 of 32

work page 2022

[57] [57]

Deep learning–based detection of diabetic macular edema using optical coherence tomography and fundus images: A meta-analysis

Manikandan, S., Raman, R., Rajalakshmi, R., Tamilselvi, S., Surya, R.J., 2023. Deep learning–based detection of diabetic macular edema using optical coherence tomography and fundus images: A meta-analysis. Indian Journal of Ophthalmology 71, 1783–1796

work page 2023

[58] [58]

Openaccessseriesofimaging studies (oasis): cross-sectional mri data in young, middle aged, nondemented, and demented older adults

Marcus,D.S.,Wang,T.H.,Parker,J.,Csernansky,J.G.,Morris,J.C.,Buckner,R.L.,2007. Openaccessseriesofimaging studies (oasis): cross-sectional mri data in young, middle aged, nondemented, and demented older adults. Journal of cognitive neuroscience 19, 1498–1507

work page 2007

[59] [59]

Ganterfactual—counterfactual explanations for medical non-experts using generative adversarial learning

Mertes, S., Huber, T., Weitz, K., Heimerl, A., André, E., 2022. Ganterfactual—counterfactual explanations for medical non-experts using generative adversarial learning. Frontiers in artificial intelligence 5, 825565

work page 2022

[60] [60]

Monte carlo dropout for uncertainty estimation and motor imagery classification

Milanés-Hermosilla,D.,TrujilloCodorniú,R.,López-Baracaldo,R.,Sagaró-Zamora,R.,Delisle-Rodriguez,D.,Villarejo- Mayor, J.J., Nunez-Alvarez, J.R., 2021. Monte carlo dropout for uncertainty estimation and motor imagery classification. Sensors 21, 7241

work page 2021

[61] [61]

Reinventing radiology: big data and the future of medical imaging

Morris, M.A., Saboury, B., Burkett, B., Gao, J., Siegel, E.L., 2018. Reinventing radiology: big data and the future of medical imaging. Journal of thoracic imaging 33, 4–16

work page 2018

[62] [62]

Computational and structural biotechnology journal 24, 542–560

Muhammad,D.,Bendechache,M.,2024.Unveilingtheblackbox:Asystematicreviewofexplainableartificialintelligence in medical image analysis. Computational and structural biotechnology journal 24, 542–560

work page 2024

[63] [63]

Survey of explainable artificial intelligence techniques for biomedical imaging with deep neural networks

Nazir, S., Dickson, D.M., Akram, M.U., 2023. Survey of explainable artificial intelligence techniques for biomedical imaging with deep neural networks. Computers in Biology and Medicine 156, 106668

work page 2023

[64] [64]

Nguyen, V.P., Trinh, N.H., Nguyen, D.M.L., Nguyen, P.L., Tran, Q.L., 2025. Aleatoric uncertainty medical image segmentation estimation via flow matching, in: International Workshop on Uncertainty for Safe Utilization of Machine Learning in Medical Imaging, Springer. pp. 134–144

work page 2025

[65] [65]

Emerging trends in ai-powered medical imaging: enhancing diagnostic accuracy and treatment decisions

Oyeniyi, J., Oluwaseyi, P., 2024. Emerging trends in ai-powered medical imaging: enhancing diagnostic accuracy and treatment decisions. International Journal of Enhanced Research In Science Technology & Engineering 13, 81–94

work page 2024

[66] [66]

Parcalabescu, L., Frank, A., 2023. Mm-shap: A performance-agnostic metric for measuring multimodal contributions in vision and language models & tasks, in: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 4032–4059

[67]

Ponce-Bobadilla, A.V., Schmitt, V., Maier, C.S., Mensing, S., Stodtmann, S., 2024. Practical guide to shap analysis: Explaining supervised machine learning model predictions in drug development. Clinical and translational science 17, e70056

[68]

Prasad, V.K., Verma, A., Bhattacharya, P., Shah, S., Chowdhury, S., Bhavsar, M., Aslam, S., Ashraf, N., 2024. Revolutionizing healthcare: a comparative insight into deep learning’s role in medical imaging. Scientific Reports 14, 30273

[69]

Rahman, A., Hayat, M., Iqbal, N., Alarfaj, F.K., Alkhalaf, S., Alturise, F., 2025. Enhanced mri brain tumor detection using deep learning in conjunction with explainable ai shap based diverse and multi feature analysis. Scientific Reports 15, 29411

[70]

Ramaswamy, H.G., et al., 2020. Ablation-cam: Visual explanations for deep convolutional network via gradient-free localization, in: proceedings of the IEEE/CVF winter conference on applications of computer vision, pp. 983–991

[71]

Ren, T., Rivera, J.H., Oswal, H., Pan, Y., Chopra, A., Ruzevick, J., Kurt, M., 2025. Here comes the explanation: A shapley perspective on multi-contrast medical image segmentation. arXiv preprint arXiv:2504.04645

[72]

Ribeiro, M.T., Singh, S., Guestrin, C., 2016. "why should i trust you?" explaining the predictions of any classifier, in: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp. 1135–1144

[73]

Salih, A.M., Raisi-Estabragh, Z., Galazzo, I.B., Radeva, P., Petersen, S.E., Lekadir, K., Menegaz, G., 2025. A perspective on explainable artificial intelligence methods: Shap and lime. Advanced Intelligent Systems 7, 2400304

[74]

Salvi, M., Seoni, S., Campagner, A., Gertych, A., Acharya, U.R., Molinari, F., Cabitza, F., 2025. Explainability and uncertainty: Two sides of the same coin for enhancing the interpretability of deep learning models in healthcare. International Journal of Medical Informatics 197, 105846

[75]

Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D., 2017. Grad-cam: Visual explanations from deep networks via gradient-based localization, in: Proceedings of the IEEE international conference on computer vision, pp. 618–626

[76]

Sentz, K., Ferson, S., 2002. Combination of evidence in dempster-shafer theory

[77]

Seoni, S., Jahmunah, V., Salvi, M., Barua, P.D., Molinari, F., Acharya, U.R., 2023. Application of uncertainty quantification to artificial intelligence in healthcare: A review of last decade (2013–2023). Computers in Biology and Medicine 165, 107441

[78]

Shafer, G., 1992. Dempster-shafer theory. Encyclopedia of artificial intelligence 1, 330–331

[79]

Shen, Y., 2025. Probabilistic Modeling and Uncertainty Awareness in Deep Learning. Ph.D. thesis. Technische Universität München

[80]

Shi, J., Stonnington, C.M., Thompson, P.M., Chen, K., Gutman, B., Reschke, C., Baxter, L.C., Reiman, E.M., Caselli, R.J., Wang, Y., et al., 2015. Studying ventricular abnormalities in mild cognitive impairment with hyperbolic ricci flow and tensor-based morphometry. Neuroimage 104, 1–20
