Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI
In recent years, Artificial Intelligence (AI) has gained notable momentum that may meet the highest expectations across many application sectors. For this to occur, the entire community must overcome the barrier of explainability, an inherent problem of AI techniques brought by sub-symbolism (e.g. ensembles or Deep Neural Networks) that was not present in the previous wave of AI hype. Paradigms addressing this problem fall within the so-called eXplainable AI (XAI) field, which is acknowledged as crucial for the practical deployment of AI models. This overview examines the existing literature in the field of XAI, including an outlook on what is yet to be achieved. We summarize previous efforts to define explainability in Machine Learning and establish a novel definition that covers prior conceptual propositions, with a major focus on the audience for which explainability is sought. We then propose and discuss a taxonomy of recent contributions related to the explainability of different Machine Learning models, including those aimed at Deep Learning methods, for which a second taxonomy is built. This literature analysis serves as the background for a series of challenges faced by XAI, such as the crossroads between data fusion and explainability. Our prospects lead toward the concept of Responsible Artificial Intelligence, namely, a methodology for the large-scale implementation of AI methods in real organizations with fairness, model explainability and accountability at its core. Our ultimate goal is to provide newcomers to XAI with reference material that stimulates future research advances, and also to encourage experts and professionals from other disciplines to embrace the benefits of AI in their activity sectors, without any prior bias against it for its lack of interpretability.
Forward citations
Cited by 34 Pith papers
-
Embodied Explainability and Ontological Obstacles: Why We Struggle to Explain the Answers of Large Language Models (LLMs)
An argument paper reframes LLM explainability as an embodied, situated practice based on Dourish and enactivist cognition, identifying ontological obstacles in internal explanations and advocating affordance-based designs.
-
Probabilistic Attribution For Large Language Models
Develops a model-agnostic attribution score as the log-ratio of conditional response probabilities with and without a marginalized prompt token, derived via Bayes inversion of next-token distributions, and relates it ...
-
Explainable Outlier Detection for Interval-valued Data
Derives a closed-form Shapley value for the squared robust Interval-Mahalanobis distance to explain variable contributions to outlyingness in interval-valued data.
-
The Unseen Hand: Manipulating Model Fairness and SHAP with Targeted Identity Re-Association Attacks
TIRA attacks with PMiS and PRSMP push fairness metrics to ideal values and reduce SHAP attribution for protected features to zero in black-box settings.
-
PsyScore: A Psychometrically-Aware Framework for Trait-Adaptive Essay Scoring and ZPD-Scaffolded Feedback
PsyScore combines a Trait-Adaptive Neural IRT Scorer using GPCM with a ZPD-Scaffolded Feedback Generator to deliver both competitive scoring and pedagogically aligned feedback on the ASAP++ dataset.
-
From Weight Perturbation to Feature Attribution for Explaining Fully Connected Neural Networks
XWP and XWP_c are novel attribution methods for FCNNs that estimate feature importance by perturbing attached weights to avoid added bias and out-of-distribution issues in occlusion approaches.
-
Critic-Driven Voronoi-Quantization for Distilling Deep RL Policies to Explainable Models
Critic-Driven Voronoi State Partitioning distills deep RL policies into piecewise-linear models by iteratively adding linear subpolicies in high-value-error regions identified by the critic.
-
Correcting Influence: Unboxing LLM Outputs with Orthogonal Latent Spaces
A latent mediation framework with sparse autoencoders enables non-additive token-level influence attribution in LLMs by learning orthogonal features and back-propagating attributions.
-
NEURON: A Neuro-symbolic System for Grounded Clinical Explainability
NEURON raises AUC from 0.74-0.77 to 0.84-0.88 on MIMIC-IV heart-failure mortality prediction while lifting human-aligned explanation scores from 0.50 to 0.85 by grounding SHAP values in SNOMED CT and patient notes via...
-
RCProb: Probabilistic Rule Extraction for Efficient Simplification of Tree Ensembles
RCProb uses Dirichlet-smoothed class priors and Beta-smoothed condition likelihoods in a Naive Bayes formulation to extract rules from tree ensembles approximately 22 times faster than RuleCOSI+ while maintaining comp...
-
Option Pricing on Noisy Intermediate-Scale Quantum Computers: A Quantum Neural Network Approach
A compact 2-qubit QNN approximates Black-Scholes-Merton option prices with usable accuracy when executed on multiple commercial NISQ quantum processors.
-
Solution space path planning for supporting en-route air traffic control
The study introduces SSPPV and SSPPE variants of solution-space path planning for en-route ATC, integrating distance-, time-, and zone-based conflict detection, with SSPPV plus zone detection achieving 3.69 ms average...
-
Low-cost concept-based localized explanations: How far can we get with training-free approaches?
Mid-scale MLLMs reach 62-88% object-level exact-match accuracy in zero-shot localized concept naming via closed-set prompting and an embedding-based Open-CoNa strategy across datasets.
-
When LLM Rationales Become User-Facing: Effects on Trust Perception, Decision-Making, and Gaze Behaviors
Two linked user studies find that LLM rationale correctness and certainty framing affect trust and decision confidence while presentation format does not, and incorrect rationales increase gaze attention and pupil size.
-
Foundation Models for Credit Risk Prediction: A Game Changer?
Tabular foundation models outperform standard methods in credit risk PD and LGD tasks, with larger gains on smaller datasets when used out-of-the-box.
-
NEURON: A Neuro-symbolic System for Grounded Clinical Explainability
NEURON integrates SNOMED CT, ML, and RAG LLM to raise AUC from 0.74-0.77 to 0.84-0.88 and human-aligned explainability scores from 0.50 to 0.85 on MIMIC-IV acute heart failure data.
-
Explainable AI for Jet Tagging: A Comparative Study of GNNExplainer, GNNShap, and GradCAM for Jet Tagging in the Lund Jet Plane
Explainability techniques applied to LundNet show that assigned node importance correlates with classical jet substructure observables such as N-subjettiness ratios and energy correlation functions, with shifts across...
-
Imperfectly Cooperative Human-AI Interactions: Comparing the Impacts of Human and AI Attributes in Simulated and User Studies
In real human subjects, AI transparency impacts imperfectly cooperative interactions far more than personality traits, unlike simulations where both are comparably influential.
-
Toward Explanatory Equilibrium: Verifiable Reasoning as a Coordination Mechanism under Asymmetric Information
Structured reasoning artifacts enable coordination in LLM multi-agent systems by preventing approval and welfare collapse under asymmetric information while keeping bad-approval rates low across audit regimes.
-
Explaining Graph Neural Networks for Node Similarity on Graphs
Empirical comparison shows gradient-based explanations for GNN node similarities are actionable, consistent, and retain effects when sparsified, unlike mutual information explanations.
-
Shapley in Context: Explaining Financial Language with Domain Expertise
Shapley values for LLM explanations in financial text are shown via theory and experiments to produce attributions consistent with financial reasoning.
-
A New Technique for AI Explainability using Feature Association Map
FAMeX introduces a graph-theoretic Feature Association Map to explain feature importance in AI classification models and outperforms PFI and SHAP on eight benchmarks.
-
A New Technique for AI Explainability using Feature Association Map
FAMeX creates a graph of feature associations to explain AI classification decisions and outperforms SHAP and permutation feature importance on eight benchmark datasets.
-
LegalCheck: Retrieval- and Context-Augmented Generation for Drafting Municipal Legal Advice Letters
LegalCheck automates drafting of municipal legal advice letters via RAG and CAG, producing near-final drafts in minutes with 80-100% coverage of essential legal reasoning in an Amsterdam deployment.
-
Interpretable Physics-Informed Load Forecasting for U.S. Grid Resilience: SHAP-Guided Ensemble Validation in Hybrid Deep Learning Under Extreme Weather
A hybrid deep learning model with physics regularization and SHAP analysis achieves 1.18% MAPE on ERCOT load data and up to 40.5% better performance on extreme events than its individual branches.
-
Assessing Model-Agnostic XAI Methods against EU AI Act Explainability Requirements
A qualitative-to-quantitative scoring framework is proposed to evaluate how well model-agnostic XAI methods support EU AI Act explainability requirements.
-
DenoGrad: A Gradient-Based Framework for Data Refinement in Tabular and Time-Series Learning
DenoGrad refines noisy tabular and time-series data by optimizing inputs via gradients from a fixed model, yielding better downstream predictions on ten real-world datasets while preserving data statistics.
-
Attribution Graphs and Causal Probing for Mechanistic Discovery and Bias Repair in Multimodal Generative Learning
Reveal-to-Revise integrates cross-modal attention fusion, Grad-CAM++ attribution, and bias feedback in a conditional attention WGAN-GP to report high accuracy, F1, and fairness metrics on multimodal MNIST variants and...
-
Industry Practitioners Perspectives on AI Model Quality: Perceptions, Challenges, and Solutions
Industry AI practitioners view model quality through nine attributes with context-dependent priorities, where data imbalance is a key challenge addressed by strategies like active learning, as confirmed by interviews ...
-
A New Technique for AI Explainability using Feature Association Map
FAMeX is a graph-based XAI algorithm claimed to outperform PFI and SHAP at gauging feature importance for classification.
-
LegalCheck: Retrieval- and Context-Augmented Generation for Drafting Municipal Legal Advice Letters
LegalCheck applies RAG and CAG to generate draft legal advice letters from laws and precedents, achieving 80-100% coverage of essential reasoning in minutes during a municipal deployment.
-
Self-Explainability in Self-Adaptive and Self-Organising Systems: Status and Research Directions
A systematic literature review defines self-explainability, proposes a taxonomy and levels framework, and reports that most approaches are conceptual with no standard evaluation method.
-
Agentic Artificial Intelligence in Finance: A Comprehensive Survey
Agentic AI brings goal-oriented autonomy and multi-agent coordination to finance, promising better efficiency and risk management while introducing stability and regulatory issues.
-
Compliance of AI Systems
The paper analyzes compliance challenges for AI systems under the EU AI Act, especially on edge devices, and proposes initial best practices emphasizing data set compliance for trustworthiness and explainability.