What do we need to build explainable AI systems for the medical domain?
Original abstract
Artificial intelligence (AI) generally, and machine learning (ML) specifically, demonstrate impressive practical success in many different application domains, e.g. in autonomous driving, speech recognition, or recommender systems. Deep learning approaches, trained on extremely large data sets or using reinforcement learning methods, have even exceeded human performance in visual tasks, particularly in playing games such as Atari or mastering the game of Go. Even in the medical domain there are remarkable results. The central problem of such models is that they are regarded as black-box models: even if we understand the underlying mathematical principles, they lack an explicit declarative knowledge representation and hence have difficulty generating the underlying explanatory structures. This calls for systems that make decisions transparent, understandable and explainable. A strong motivation for our approach is the rise of legal and privacy concerns. The new European General Data Protection Regulation, entering into force on May 25th 2018, will make black-box approaches difficult to use in business. This does not imply a ban on automatic learning approaches or an obligation to explain everything all the time; however, there must be a possibility to make the results re-traceable on demand. In this paper we outline some of our research topics in the context of the relatively new area of explainable AI, with a focus on its application in medicine, which is a very special domain because medical professionals work mostly with distributed, heterogeneous and complex sources of data. Here we concentrate on three sources: images, *omics data and text. We argue that research in explainable AI would generally help to facilitate the implementation of AI/ML in the medical domain, and specifically help to facilitate transparency and trust.
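The abstract's demand that results be "re-traceable on demand" is commonly addressed with post-hoc attribution methods. The following is a minimal illustrative sketch, not taken from the paper: it assumes PyTorch and torchvision and uses an untrained ResNet-18 as a stand-in classifier to compute a vanilla gradient saliency map, one of the simplest ways to trace an image prediction back to the pixels that influenced it.

```python
import torch
import torchvision.models as models

# Untrained stand-in for a diagnostic CNN (assumption; not from the paper).
model = models.resnet18(weights=None)
model.eval()

# Placeholder input; a real pipeline would load and normalise a medical image.
image = torch.rand(1, 3, 224, 224, requires_grad=True)

logits = model(image)
predicted_class = logits.argmax(dim=1).item()

# Back-propagate the predicted class score to the input pixels.
logits[0, predicted_class].backward()

# Pixel-wise importance: gradient magnitude, max over colour channels.
saliency = image.grad.detach().abs().max(dim=1).values.squeeze(0)
print(saliency.shape)  # torch.Size([224, 224]); can be overlaid as a heat map
```

Such gradient maps are only a starting point: as the abstract notes, they do not by themselves supply the declarative, human-understandable explanatory structures the paper argues are needed in medicine.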
Forward citations
Cited by 6 Pith papers
- Measuring Faithfulness in Chain-of-Thought Reasoning
  Chain-of-Thought reasoning in LLMs is often unfaithful; models rely on it to varying degrees across tasks, and less so as they scale larger.
- Atomic Fact-Checking Increases Clinician Trust in Large Language Model Recommendations for Oncology Decision Support: A Randomized Controlled Trial
  Atomic fact-checking of LLM oncology recommendations increased clinician trust from 26.9% to 66.5% (Cohen's d=0.94) in a randomized trial of 356 doctors.
- Seeing What Shouldn't Be There: Counterfactual GANs for Medical Image Attribution
  A cycle-consistent GAN generates counterfactual medical images to attribute classification decisions more comprehensively than standard saliency methods.
- Grounded Multimodal Retrieval-Augmented Drafting of Radiology Impressions Using Case-Based Similarity Search
  A case-based multimodal RAG system for chest radiograph impressions achieves Recall@5 above 0.95 and produces citation-traceable drafts.
- ADP-FL-MedSeg: Adaptive Differential Privacy for Federated Medical Segmentation Across Diverse Modalities
  ADP-FL adaptively tunes differential privacy in federated learning, delivering higher Dice scores, better boundary quality, faster convergence, and greater stability in medical image segmentation across skin lesion, kidney, and brain imaging tasks.