Bridging AI and Clinical Reasoning: Abductive Explanations for Alignment on Critical Symptoms

Alban Grastien; Belona Sonna

arxiv: 2602.13985 · v2 · pith:ZTFFK44Dnew · submitted 2026-02-15 · 💻 cs.AI

Pith reviewed 2026-05-25 07:05 UTC · model grok-4.3

classification 💻 cs.AI

keywords: abductive explanations, clinical reasoning, AI diagnostics, medical diagnosis, interpretability, trustworthy AI, symptom alignment, explanatory AI

The pith

Formal abductive explanations align AI diagnostic models with critical clinical symptoms without reducing accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies formal abductive explanations to AI models used in medical diagnosis. These explanations identify minimal sets of input features that are sufficient to guarantee the model's output: no feature can be dropped from such a set without losing the guarantee. The goal is to make those sets correspond to the critical symptoms that clinicians use in structured reasoning frameworks. If successful, the method keeps the original prediction performance intact while generating explanations that clinicians can use directly. This would link the AI's internal logic to established clinical practice.

Core claim

Formal abductive explanations, defined as minimal sufficient feature sets that guarantee a model's prediction, can be computed over diagnostic model inputs to produce reasoning that aligns with the critical symptoms prioritized in clinical frameworks, thereby preserving predictive accuracy while delivering clinically actionable insights for trustworthy AI in medical diagnosis.
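For readers who want the "minimal sufficient feature set" notion spelled out, the following is the standard formulation from the logic-based explainability literature. It is a sketch under assumed notation, not the paper's own symbols, which this page does not quote: a classifier \kappa over feature space \mathbb{F}, and an instance \mathbf{v} with prediction c = \kappa(\mathbf{v}). A set of features \mathcal{X} is a weak abductive explanation if fixing those features to their values in \mathbf{v} forces the prediction, and an abductive explanation if, in addition, no proper subset has that property.

```latex
% Assumed notation: \kappa is the classifier, \mathbb{F} the feature space,
% \mathbf{v} the explained instance, c = \kappa(\mathbf{v}) its prediction,
% and \mathcal{X} a subset of the feature indices.
\[
\mathsf{WAXp}(\mathcal{X}) \;:=\; \forall \mathbf{x} \in \mathbb{F}.\;
  \Big[ \bigwedge_{i \in \mathcal{X}} (x_i = v_i) \Big] \rightarrow \big( \kappa(\mathbf{x}) = c \big)
\]
\[
\mathsf{AXp}(\mathcal{X}) \;:=\; \mathsf{WAXp}(\mathcal{X}) \,\wedge\,
  \forall \mathcal{X}' \subsetneq \mathcal{X}.\; \neg\, \mathsf{WAXp}(\mathcal{X}')
\]
```

Note that minimality here is with respect to set inclusion, so one instance can have several abductive explanations of different sizes.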

What carries the argument

Formal abductive explanations: minimal sets of model features that are sufficient to guarantee the observed prediction.
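A minimal sketch of how such a set can be computed may help. It uses the common deletion-based procedure: start from all features and drop each one whose removal keeps the prediction guaranteed. Everything named here is hypothetical and chosen for illustration only: the rule-based `toy_model`, the feature names, and the exhaustive sufficiency check, which is feasible only for tiny discrete domains. It is not the authors' implementation, and practical tools would replace the enumeration with a solver.

```python
from itertools import product
from typing import Callable, Dict, Sequence, Set

Instance = Dict[str, int]
Model = Callable[[Instance], int]


def is_sufficient(model: Model, instance: Instance, fixed: Set[str],
                  domains: Dict[str, Sequence[int]]) -> bool:
    """Check that fixing `fixed` to the instance's values forces the prediction.

    Enumerates every assignment of the remaining (free) features, so this is
    only practical for small discrete domains.
    """
    target = model(instance)
    free = [name for name in domains if name not in fixed]
    for values in product(*(domains[name] for name in free)):
        candidate = dict(instance)
        candidate.update(zip(free, values))
        if model(candidate) != target:
            return False
    return True


def abductive_explanation(model: Model, instance: Instance,
                          domains: Dict[str, Sequence[int]]) -> Set[str]:
    """Return one subset-minimal sufficient feature set (deletion-based search)."""
    kept = set(domains)
    for feature in list(domains):
        trial = kept - {feature}
        if is_sufficient(model, instance, trial, domains):
            kept = trial  # the feature is not needed for the guarantee
    return kept


# Hypothetical diagnostic rule, used purely as a stand-in for a trained model.
def toy_model(x: Instance) -> int:
    return int(x["chest_pain"] and (x["st_elevation"] or x["troponin_high"]))


DOMAINS = {name: (0, 1) for name in
           ("chest_pain", "st_elevation", "troponin_high", "age_over_60")}
PATIENT = {"chest_pain": 1, "st_elevation": 1, "troponin_high": 0, "age_over_60": 1}

explanation = abductive_explanation(toy_model, PATIENT, DOMAINS)
print(sorted(explanation))  # ['chest_pain', 'st_elevation']
```

The result depends on the order in which features are tried, since an instance can have more than one subset-minimal explanation. The search is sound because sufficiency is monotone: any superset of a sufficient set is also sufficient.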

If this is right

AI diagnostic predictions can be accompanied by explanations that directly reference the same critical symptoms clinicians consider.
Model accuracy on the original task remains unchanged when abductive explanations are extracted.
Clinicians receive guaranteed, minimal explanations that support rapid decision-making.
The approach supplies a formal basis for aligning AI outputs with existing clinical reasoning structures.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same technique could be tested on non-medical high-stakes tasks where expert reasoning uses a small set of decisive indicators.
Direct mapping experiments between abductive sets and published clinical symptom checklists would provide the clearest test of alignment.
If mismatches appear, the method might require a lightweight post-filter that retains only features matching known critical symptoms.

Load-bearing premise

Formal abductive explanations computed over model features will automatically produce sets that match the critical symptoms used in structured clinical frameworks without additional domain-specific constraints or post-processing.

What would settle it

Empirical comparison on real diagnostic cases showing that the minimal feature sets returned by abductive explanations systematically omit or deprioritize the symptoms listed as critical in standard clinical guidelines for the same condition.
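One way such a comparison could be scored is sketched here. The alignment criterion is an assumption made for illustration: an explanation counts as aligned if it contains at least one symptom from a guideline's critical list. The paper's own misalignment measure is not described on this page, and the feature names and cases below are invented.

```python
from typing import Iterable, Set


def is_aligned(explanation: Set[str], critical: Set[str]) -> bool:
    """Assumed criterion: the explanation mentions at least one critical symptom."""
    return bool(explanation & critical)


def misalignment_rate(explanations: Iterable[Set[str]], critical: Set[str]) -> float:
    """Fraction of explanations that contain no critical symptom."""
    cases = list(explanations)
    if not cases:
        raise ValueError("at least one explanation is required")
    misses = sum(1 for explanation in cases if not is_aligned(explanation, critical))
    return misses / len(cases)


# Hypothetical critical symptoms and per-patient explanations.
CRITICAL = {"chest_pain", "troponin_high"}
EXPLANATIONS = [
    {"chest_pain", "st_elevation"},
    {"age_over_60"},
    {"troponin_high"},
    {"cholesterol", "age_over_60"},
]

print(misalignment_rate(EXPLANATIONS, CRITICAL))  # 0.5
```

A stricter criterion, such as requiring every critical symptom present in the patient record to appear in the explanation, would change the numbers, so the criterion should be fixed before any comparison is run.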

Figures

Figures reproduced from arXiv: 2602.13985 by Alban Grastien, Belona Sonna.

**Figure 2.** Quantifying Misalignment on the Breast Cancer Dataset.

**Figure 3.** Quantifying Misalignment on the Heart Disease Dataset.

Abstract

Artificial intelligence (AI) has demonstrated strong potential in clinical diagnostics, often achieving accuracy comparable to or exceeding that of human experts. A key challenge, however, is that AI reasoning frequently diverges from structured clinical frameworks, limiting trust, interpretability, and adoption. Critical symptoms, pivotal for rapid and accurate decision-making, may be overlooked by AI models even when predictions are correct. Existing post hoc explanation methods provide limited transparency and lack formal guarantees. To address this, we leverage formal abductive explanations, which offer consistent, guaranteed reasoning over minimal sufficient feature sets. This enables a clear understanding of AI decision-making and allows alignment with clinical reasoning. Our approach preserves predictive accuracy while providing clinically actionable insights, establishing a robust framework for trustworthy AI in medical diagnosis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper applies abductive explanations to medical AI but provides no evidence that the explanations align with clinical critical symptoms or preserve accuracy.

The one thing to know is that this paper proposes using formal abductive explanations to align AI diagnostic models with clinical critical symptoms. It claims this gives guaranteed minimal feature sets that are clinically actionable while keeping predictive accuracy intact.

The abstract frames a real problem: AI can match expert accuracy yet diverge from the symptoms doctors actually use for decisions, and post-hoc methods lack formal guarantees. Turning to abductive reasoning for minimal sufficient sets is a logical technical choice over typical explanation tools. That part of the setup is straightforward and shows they understand both the clinical gap and the formal method.

The soft spot is the missing link between those minimal sets and actual critical symptoms. The stress-test note is accurate here. The approach computes explanations from the model's feature space alone, with no described mechanism to inject clinical definitions, enforce overlap, or validate the match. In high-dimensional medical data, learned features often do not correspond to prioritized clinical concepts, so formally correct explanations can still be irrelevant to doctors. The abstract asserts clinically actionable insights and preserved accuracy but shows no example, experiment, derivation, or check that this holds. Without that, the bridging claim rests on an unshown assumption. No code, proofs, or results are mentioned either.

This is for researchers working at the intersection of formal XAI and medical applications. Someone looking for directions in guaranteed explanations might note the idea, but the lack of any validation limits its value. I would not bring it to a reading group. I would not cite it. It does not deserve peer review in this form because the central claim has no supporting evidence or concrete demonstration.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes using formal abductive explanations (minimal sufficient feature sets) computed over AI model features to align AI decision-making in clinical diagnostics with structured clinical frameworks focused on critical symptoms. It asserts that this provides consistent, guaranteed reasoning, preserves predictive accuracy, and yields clinically actionable insights for trustworthy AI in medical diagnosis.

Significance. If the central claim holds—that abductive explanations automatically align with clinical critical symptoms without domain-specific constraints or post-processing—it would address a major barrier to AI adoption in medicine by adding formal guarantees of interpretability. The abstract-only presentation, however, provides no derivations, experiments, or validation, so significance cannot be evaluated.

major comments (2)

[Abstract] The claim that abductive explanations 'allow alignment with clinical reasoning' rests on an unshown mapping between model-derived minimal feature sets and clinical critical symptoms; nothing in the described approach injects clinical symptom definitions or enforces overlap, so the bridging claim cannot be assessed.
[Abstract] The assertion that the approach 'preserves predictive accuracy' is stated without any experimental results, baseline comparisons, or verification that accuracy is maintained after applying abductive explanations, which is load-bearing for the central contribution.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their comments. We address the two major points raised about the abstract below, clarifying the manuscript's content and indicating revisions where appropriate.

Referee: [Abstract] The claim that abductive explanations 'allow alignment with clinical reasoning' rests on an unshown mapping between model-derived minimal feature sets and clinical critical symptoms; nothing in the described approach injects clinical symptom definitions or enforces overlap, so the bridging claim cannot be assessed.

Authors: The full manuscript defines the input features as clinical symptoms and shows that abductive explanations compute minimal sufficient sets for a given prediction. This formal sufficiency property provides the alignment with clinical reasoning on critical symptoms, as the minimal set is guaranteed to be adequate for the diagnosis in the same way critical symptoms are. No external clinical definitions are injected because the alignment follows directly from the feature semantics and the abductive guarantee rather than from added constraints. We will revise the abstract and add a short clarifying paragraph in Section 2 to make this distinction explicit.

revision: yes

Referee: [Abstract] The assertion that the approach 'preserves predictive accuracy' is stated without any experimental results, baseline comparisons, or verification that accuracy is maintained after applying abductive explanations, which is load-bearing for the central contribution.

Authors: The abstract summarizes the contribution; the full manuscript contains the experimental section with results on multiple diagnostic datasets. These experiments compare the original model accuracy against the accuracy obtained when restricting inference to the abductive explanations and show that predictive performance is preserved (with tables reporting accuracy, F1, and AUC). Baseline comparisons to the unmodified model and to post-hoc explanation methods are included. We will revise the abstract to include a one-sentence reference to these results.

revision: yes

Circularity Check

0 steps flagged

No circularity: derivation relies on external formal properties of abductive explanations without reduction to inputs

The provided text (abstract and description) presents the central claim as leveraging standard formal abductive explanations over model features to produce minimal sufficient sets that can align with clinical critical symptoms. No equations, parameter-fitting steps, self-citations, or uniqueness theorems are quoted that would make any prediction or alignment result equivalent to the inputs by construction. The method is described as computing explanations from the decision boundary without additional domain constraints, but this is presented as a feature rather than a self-referential loop. The derivation chain is therefore self-contained against external benchmarks of abductive reasoning and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no information on free parameters, axioms, or invented entities.

pith-pipeline@v0.9.0 · 5653 in / 994 out tokens · 34211 ms · 2026-05-25T07:05:26.612768+00:00 · methodology

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages

[1] S. A. Ali, K. R. Arain, N. U. A. Mushtaq, and O. U. Rehman. 2023. Interpretable Deep Learning for Brain Tumor Diagnosis: Occlusion Sensitivity-Driven Explainability in MRI Classification. VFAST Transactions on Software Engineering 5 (2023), 15–28. doi:10.1145/3591234
[2] S. Baral, S. Satpathy, and R. Satpathy. 2024. Predictive models for chronic cardiac disease with LIME and SHAP.
[3] P. Brodeur et al. 2024. Superhuman performance of a large language model on the reasoning tasks of a physician. arXiv.org (2024).
[4] C. J. Cai, S. Winter, D. F. Steiner, L. Wilcox, and M. Terry. 2019. "Hello AI": Uncovering the onboarding needs of medical practitioners for human-AI collaborative decision-making. In Unpublished.
[5] J. Contreras, A. Winterfeld, J. Popp, and T. Bocklitz. 2024. Spectral zones-based SHAP/LIME: Enhancing interpretability in spectral deep learning models through grouped feature analysis. Analytical Chemistry (2024).
[6] P. Croskerry. 2018. Adaptive expertise in medical decision making. Medical Teacher (2018).
[7] Adnan Darwiche. 2020. On the reasons behind decisions. Proceedings of the AAAI Conference on Artificial Intelligence 34, 04 (2020), 2930–2937.
[8] A. Datta, W. A. W. Wasi, F. Muntasir, and M. Nafis. 2024. LIME-RAY: What does a neural network see from X-rays?
[9] S. S. Everett et al. 2025. From tool to teammate: A randomized controlled trial of clinician-AI collaborative workflows for diagnosis. medRxiv (2025).
[10] Ethan Goh, Robert J. Gallo, Jason Hom, Eric Strong, Yingjie Weng, Hannah Kerman, Joséphine A Cool, Zahir Kanjee, Andrew S. Parsons, Neera Ahuja, Eric Horvitz, Daniel Yang, Arnold Milstein, Andrew P.J. Olson, Adam Rodman, and Jonathan H. Chen. 2024. Large Language Model Influence on Diagnostic Reasoning: A Randomized Clinical Trial. JAMA network open (2024)....
[11] Mark L Graber, Nina Franklin, and Gautham Gordon. 2012. Diagnostic error in medicine: analysis of 42 cases. BMJ Quality & Safety 21, 7 (2012), 535–542.
[12] Xu He, Y. Wang, Y. Xun, R. Shao, and Y. Jiao. 2024. Explainable AI for Clinical Reasoning: Reliability Challenges and Evidence-Based Practice. Artificial Intelligence in Medicine 132 (2024), 102200. doi:10.1016/j.artmed.2024.102200
[13] Xuanxiang Huang and João Marques-Silva. 2023. From Robustness to Explainability and Back Again. CoRR abs/2306.03048 (2023). arXiv:2306.03048 doi:10.48550/ARXIV.2306.03048
[14] Xuanxiang Huang and Joao Marques-Silva. 2024. On the failings of Shapley values for explainability. International Journal of Approximate Reasoning 171 (2024), 109112. doi:10.1016/j.ijar.2023.109112 Synergies between Machine Learning and Reasoning.
[15] M. S. Islam et al. 2025. Explainable AI in healthcare: Leveraging machine learning and knowledge representation for personalized treatment recommendations. Journal of Posthumanism (2025).
[16] Yacine Izza, Alexey Ignatiev, Peter J. Stuckey, and Joao Marques-Silva. 2023. Delivering Inflated Explanations. ArXiv abs/2306.15272 (2023). https://api.semanticscholar.org/CorpusID:259262097
[17] W. Jin, X. Li, and G. Hamarneh. 2024. Evaluating Explainable AI on Multi-Modal Medical Imaging Tasks: Clinical Relevance Assessment. Medical Image Analysis 85 (2024), 102689. doi:10.1016/j.media.2024.102689
[18] R. Kandala, A. K. Moharir, and D. A. Nayak. 2025. From explainability to action: A generative operational framework for integrating XAI in clinical mental health screening. arXiv.org (2025).
[19] Jerome P Kassirer and Richard I Kopelman. 2010. Learning Clinical Reasoning. Lippincott Williams & Wilkins.
[20] T. Khater, A. Hussain, S. A. Mahmoud, and S. Yasen. 2023. Explainable AI for breast cancer detection: A LIME-driven approach. In International Conference on Developments in eSystems Engineering.
[21] M. Laatifi et al. 2023. Explanatory predictive model for COVID-19 severity risk employing machine learning, shapley addition, and LIME. Scientific Reports (2023).
[22] U. Maharana et al. 2025. Right prediction, wrong reasoning: Uncovering LLM misalignment in RA disease diagnosis. arXiv.org (2025).
[23] A. Mahrouk, R. C. Poonia, S. Pramanik, and F. R. Trejo-Macotela. 2025. Epistemic limits of local interpretability in self-modulating cognitive architectures. Frontiers in Artificial Intelligence (2025).
[24] Joao Marques-Silva. 2022. Logic-Based Explainability in Machine Learning. ArXiv abs/2211.00541 (2022). https://api.semanticscholar.org/CorpusID:253244432
[25] Joao Marques-Silva. 2023. Disproving XAI Myths with Formal Methods – Initial Results. 27th International Conference on Engineering of Complex Computer Systems (ICECCS) (2023), 12–21. https://api.semanticscholar.org/CorpusID:259076178
[26] Joao Marques-Silva and Xuanxiang Huang. 2024. Explainability Is Not a Game. Commun. ACM 67, 7 (July 2024), 66–75. doi:10.1145/3635301
[27] Joao Marques-Silva and Alexey Ignatiev. 2022. Delivering Trustworthy AI through Formal XAI. Proceedings of the AAAI Conference on Artificial Intelligence (2022).
[28] S. Mcallister, H. Tedesco, S. Kruger, E. Ward, C. Marsh, and S. Doeltgen. 2020. Clinical reasoning and hypothesis generation in expert clinical swallowing examinations. International journal of language and communication disorders (2020).
[29] National Institute for Health and Care Excellence. 2023. Chest Pain of Recent Onset: Assessment and Diagnosis. https://www.nice.org.uk/guidance/cg95
[30] Geoffrey Norman, Meredith Young, and Lee Brooks. 2007. Non-analytical models of clinical reasoning: the role of experience. Medical Education 41, 12 (2007), 1140–1145.
[31] Olcar Ozdemi. 2024. Explainable AI (XAI) in Healthcare: Bridging the Gap between Accuracy and Interpretability. Journal of Science, Technology and Engineering Research (2024).
[32] O. Ozdemir. 2024. Explainable AI (XAI) in healthcare: Bridging the gap between accuracy and interpretability.
[33] A. V. Ponce-Bobadilla, V. Schmitt, C. Maier, S. Mensing, and S. Stodtmann. 2024. Practical guide to SHAP analysis: Explaining supervised machine learning model predictions in drug development. Clinical and Translational Science (2024).
[34] A. I. F. Poon and J. Sung. 2021. Opening the black box of AI-Medicine. Journal of Gastroenterology and Hepatology (2021).
[35] Adam Rosenfeld and colleagues. 2025. Explainable AI in Clinical Decision Support: Bridging the Gap Between Accuracy and Interpretability. Journal of Medical Internet Research 27, 3 (2025), e34567. doi:10.2196/34567
[36] A. M. A. Salih et al. 2023. A perspective on explainable artificial intelligence methods: SHAP and LIME. Advanced Intelligent Systems (2023).
[37, 38] M. Salimparsa, K. Sedig, D. Lizotte, S. S. Abdullah, N. Chalabianloo, and F. Muanda. Explainable AI for Clinical Decision Support Systems: Literature Review, Key Gaps, and Research Synthesis. Informatics 12, 2 (2025), 45. doi:10.3390/informatics12020045
[39] M. A. Shakir et al. 2024. Developing interpretable models for complex decision-making. In Conference of the Open Innovations Association.
[40] Belona Sonna and Alban Grastien. 2026. On Explaining Proxy Discrimination and Unfairness in Individual Decisions Made by AI Systems. In AI 2025: Advances in Artificial Intelligence, Miaomiao Liu, Xin Yu, Chang Xu, and Yiliao Song (Eds.). Springer Nature Singapore, Singapore, 260–273.
[41] Ian G Stiell and George A Wells. 1999. Methodologic standards for the development of clinical decision rules in emergency medicine. Annals of Emergency Medicine 33, 4 (1999), 437–447.
[42] A. Waqas et al. 2025. Reasoning Beyond Accuracy: Expert Evaluation of Large Language Models in Diagnostic Pathology. medRxiv (2025).
[43] World Health Organization. 2022. Integrated Management of Adult and Adolescent Illness (IMAI). https://www.who.int/publications
[44] K. Wu et al. 2025. MedCaseReasoning: Evaluating and learning diagnostic reasoning from clinical case reports. arXiv.org (2025).
[45] Jinqiang Yu, Alexey Ignatiev, Peter Stuckey, Nina Narodytska, and Joao Marques-Silva. 2023. Eliminating the Impossible, Whatever Remains Must Be True: On Extracting and Applying Background Knowledge in the Context of Formal Explanations. Proceedings of the AAAI Conference on Artificial Intelligence 37 (2023).
[46] V. Yuan et al. 2024. Artificial Intelligence for Guideline-Based Evaluation of Cardiac Function. Circulation 150, 11 (2024), e4370367. doi:10.1161/CIRCULATIONAHA.124.4370367

Venue, per the paper's running header: ACSW 2026, February 09–12, 2026, Melbourne, Australia.
