A Unified Approach to Interpreting Model Predictions
Original abstract:
Understanding why a model makes a certain prediction can be as crucial as the prediction's accuracy in many applications. However, the highest accuracy for large modern datasets is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, various methods have recently been proposed to help users interpret the predictions of complex models, but it is often unclear how these methods are related and when one method is preferable over another. To address this problem, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties. The new class unifies six existing methods, notable because several recent methods in the class lack the proposed desirable properties. Based on insights from this unification, we present new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
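As a rough illustration of the additive feature attribution idea the abstract describes (not code from the paper itself): SHAP explanations take the form g(z') = phi_0 + sum_i phi_i z'_i, so the per-feature values plus a base value reconstruct the model's prediction. The sketch below assumes the open-source `shap` Python package with a toy random-forest regressor; the data and model choices are illustrative only.

# Minimal sketch, assuming the `shap` package and a toy sklearn model;
# it checks the "local accuracy" property: base value + summed attributions
# equals the model's prediction for each row.
import numpy as np
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=200, n_features=5, random_state=0)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
phi = explainer.shap_values(X)                     # (n_samples, n_features) attributions
phi0 = np.atleast_1d(explainer.expected_value)[0]  # base value, roughly E[f(X)]

# phi_0 + sum_i phi_i reconstructs each prediction (up to numerical tolerance).
assert np.allclose(phi0 + phi.sum(axis=1), model.predict(X), atol=1e-4)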
This paper has not been read by Pith yet.
Forward citations
Cited by 22 Pith papers
-
Optimal Recourse Summaries via Bi-Objective Decision Tree Learning
SOGAR learns Pareto-optimal recourse summaries by solving a bi-objective decision tree problem, yielding stable, low-cost, and effective group actions that outperform prior methods on effectiveness and cost.
-
Mitigating False Positives in Static Memory Safety Analysis of Rust Programs via Reinforcement Learning
Reinforcement learning on MIR features with fuzz testing feedback reduces false positives in Rust static memory safety analysis, raising precision from 25.6% to 59% and accuracy to 65.2% while keeping 74.6% recall.
-
Mitigating False Positives in Static Memory Safety Analysis of Rust Programs via Reinforcement Learning
Reinforcement learning on MIR features combined with cargo-fuzz validation reduces false positives in Rust static memory safety analysis, raising precision from 25.6% to 59.0% and accuracy to 65.2%.
-
Evaluating Agentic AI in the Wild: Failure Modes, Drift Patterns, and a Production Evaluation Framework
The paper presents a taxonomy of seven production-specific failure modes for agentic AI, demonstrates that existing metrics fail to detect four of them entirely, and proposes the PAEF five-dimension framework for cont...
-
Scale-Aware Adversarial Analysis: A Diagnostic for Generative AI in Multiscale Complex Systems
A new scale-aware diagnostic framework shows that unconstrained diffusion generative models exhibit structural freezing and instability instead of smooth physical responses under multiscale perturbations.
-
Validating the Clinical Utility of CineECG 3D Reconstructions through Cross-Modal Feature Attribution
Cross-modal averaging maps ECG model attributions to CineECG 3D space, raising Dice overlap with expert annotations from 0.47 to 0.56 on 20 cases while filtering attribution noise.
-
TrajOnco: a multi-agent framework for temporal reasoning over longitudinal EHR for multi-cancer early detection
TrajOnco uses a chain-of-agents LLM architecture with memory to perform temporal reasoning on longitudinal EHR, achieving 0.64-0.80 AUROC for 1-year multi-cancer risk prediction in zero-shot mode on matched cohorts wh...
-
Gradient Boosted Risk Scores
Gradient boosting produces risk scores with competitive accuracy but 60% fewer rules on classification tasks and 16% fewer on time-to-event tasks than regression-based methods like AutoScore.
-
Toward a Unified Framework for Collaborative Design of Human-AI Interaction
A framework unifies multimodal intent interpretation, interaction-centric explainability, and agency-preserving controls as interdependent requirements for trustworthy Human-AI collaboration.
-
CoAX: Cognitive-Oriented Attribution eXplanation User Model of Human Understanding of AI Explanations
Cognitive models of user reasoning strategies with XAI methods on tabular data fit human forward-simulation decisions better than ML baselines and support hypothesis testing without new user studies.
-
Characterisation of the Clouds' young stellar Bridge using Gaia DR3
A new sample of young candidate Bridge stars is identified and shown to align with gas structures, with kinematics implying a ~125 Myr crossing time consistent with the last LMC-SMC interaction.
-
Who Audits the Auditor? Tamper-Proof Fraud Detection with Blockchain-Anchored Explainable ML
A blockchain-anchored explainable ML system delivers tamper-evident fraud detection with F1 of 0.895 and sub-25ms latency on Layer-2 networks.
-
Predicting the thermodynamics in the chromosphere from the translation of SDO data into the IRIS$^{2}$ inversion results using a visual transformer model
A visual transformer model trained on IRIS inversions predicts chromospheric temperature and density from SDO data with correlations around 0.8 on 80% of test cases.
-
Surrogate modeling for interpreting black-box LLMs in medical predictions
A surrogate modeling method approximates LLM-encoded medical knowledge via prompting to quantify variable influence and flag inaccuracies and racial biases.
-
Efficient KernelSHAP Explanations for Patch-based 3D Medical Image Segmentation
An optimized KernelSHAP method for 3D medical image segmentation restricts computation to ROI and receptive fields, uses patch logit caching for 15-30% savings, and compares organ units versus supervoxels for clinical...
-
A Wasserstein GAN-based climate scenario generator for risk management and insurance: the case of soil subsidence
A conditional Wasserstein GAN generates plausible future SWI drought trajectories for French insurance risk management under climate change.
-
An Integrative Genome-Scale Metabolic Modeling and Machine Learning Framework for Predicting and Optimizing Single-Cell Protein Production in Saccharomyces cerevisiae
A Yeast9 GEM plus FBA, Random Forest, XGBoost, VAE, SHAP, Bayesian optimization and GAN framework yields a 12-fold biomass flux increase in simulated S. cerevisiae for SCP production.
-
Spectra-Scope : A toolkit for automated and interpretable characterization of material properties from spectral data
Spectra-Scope is a new AutoML framework that trains interpretable machine learning models on spectral data to characterize material properties while enabling users to understand which spectral features drive the predictions.
-
XAI and Statistical Analysis for Reliable Intrusion Detection in the UAVIDS-2025 Dataset: From Tree to Hybrid and Tabular DNN Ensembles
XGBoost with SHAP and statistical distribution analysis on UAVIDS-2025 identifies density support intersection as the cause of false predictions for Wormhole and Blackhole attacks in UAV intrusion detection.
-
Learning from Change: Predictive Models for Incident Prevention in a Regulated IT Environment
LightGBM with team-level features outperforms a bank's existing rule-based change risk process on a one-year dataset while using SHAP for regulatory explainability.
-
SDNGuardStack: An Explainable Ensemble Learning Framework for High-Accuracy Intrusion Detection in Software-Defined Networks
The SDNGuardStack ensemble learning model reports 99.98% accuracy and 0.9998 Cohen's kappa on the InSDN dataset for SDN intrusion detection while providing SHAP-based explanations.
-
Deep Learning for Sequential Decision Making under Uncertainty: Foundations, Frameworks, and Frontiers
A tutorial framing deep learning as a complement to optimization for sequential decision-making under uncertainty, with applications in supply chains, healthcare, and energy.