Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review
Machine learning plays a role in many deployed decision systems, often in ways that are difficult or impossible for human stakeholders to understand. Explaining, in a human-understandable way, the relationship between the input and output of machine learning models is essential to the development of trustworthy machine-learning-based systems. A burgeoning body of research seeks to define the goals and methods of explainability in machine learning. In this paper, we seek to review and categorize research on counterfactual explanations, a specific class of explanation that describes what could have happened had the input to a model been changed in a particular way. Modern approaches to counterfactual explainability in machine learning draw connections to established legal doctrine in many countries, making them appealing for fielded systems in high-impact areas such as finance and healthcare. Thus, we design a rubric with desirable properties of counterfactual explanation algorithms and comprehensively evaluate all currently proposed algorithms against that rubric. Our rubric provides easy comparison and comprehension of the advantages and disadvantages of different approaches and serves as an introduction to major research themes in this field. We also identify gaps and discuss promising research directions in the space of counterfactual explainability.
Forward citations
Cited by 8 Pith papers
- Optimal Counterfactual Search in Tree Ensembles: A Study Across Modeling and Solution Paradigms
  CPCF, a compact finite-domain CP encoding for tree ensembles, outperforms MaxSAT and MILP for optimal counterfactual search in most tested regimes.
- Learning-Augmented Robust Algorithmic Recourse
  Introduces learning-augmented robust algorithmic recourse that trades off consistency with accurate future-model predictions against robustness to inaccurate predictions via a novel algorithm.
- A Meta Reinforcement Learning Approach to Goals-Based Wealth Management
  MetaRL pre-trained on GBWM problems delivers near-optimal dynamic strategies in 0.01s achieving 97.8% of DP optimal utility and handles larger problems where DP fails.
- From Universal to Individualized Actionability: Revisiting Personalization in Algorithmic Recourse
  Formalizing personalization as individual actionability in causal recourse shows hard constraints degrade validity and plausibility while revealing socio-demographic disparities in costs.
- Profit-Based Counterfactual Explanations for Product Improvement: A Case Study of Manga Sales in Japan
  PBCE formulates counterfactual explanations as profit maximization, removing exogenous targets and treating feature changes as modification costs, applied to manga sales prediction in Japan.
- UNR-Explainer: Counterfactual Explanations for Unsupervised Node Representation Learning Models
  UNR-Explainer applies MCTS to find subgraphs that change k-NN relations in unsupervised node embeddings, claiming superior performance on GraphSAGE and DGI across datasets.
- A Neuro-Symbolic Framework for Accountability in Public-Sector AI
  A framework combining legal ontology, rule extraction, and solver reasoning verifies whether AI explanations for CalFresh eligibility align with statutory constraints.
- Explainable bank failure prediction models: Counterfactual explanations to reduce the failure risk
  Compares counterfactual generation methods with balancing strategies on bank failure data, finding NICF with cost-sensitive learning produces the highest quality explanations on validity, proximity, and sparsity.