Explaining Predictions by Approximating the Local Decision Boundary

Georgios Vlassopoulos; Henry Brighton; Tim van Erven; Vlado Menkovski

arxiv: 2006.07985 · v2 · pith:JSXCBDS3new · submitted 2020-06-14 · cs.LG · stat.ML



keywords: data, meaningful, attributes, boundary, decision, latent, local, predictions


Abstract

Constructing accurate model-agnostic explanations for opaque machine learning models remains a challenging task. Classification models for high-dimensional data, like images, are often inherently complex. To reduce this complexity, individual predictions may be explained locally, either in terms of a simpler local surrogate model or by communicating how the predictions contrast with those of another class. However, existing approaches still fall short in the following ways: a) they measure locality using a (Euclidean) metric that is not meaningful for non-linear high-dimensional data; or b) they do not attempt to explain the decision boundary, which is the most relevant characteristic of classifiers that are optimized for classification accuracy; or c) they do not give the user any freedom in specifying attributes that are meaningful to them. We address these issues in a new procedure for local decision boundary approximation (DBA). To construct a meaningful metric, we train a variational autoencoder to learn a Euclidean latent space of encoded data representations. We impose interpretability by exploiting attribute annotations to map the latent space to attributes that are meaningful to the user. A difficulty in evaluating explainability approaches is the lack of a ground truth. We address this by introducing a new benchmark data set with artificially generated Iris images, and showing that we can recover the latent attributes that locally determine the class. We further evaluate our approach on tabular data and on the CelebA image data set.
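The idea of approximating a decision boundary locally in a latent space can be illustrated with a small sketch. This is not the authors' DBA procedure: it is a generic local linear surrogate, fitted by plain logistic regression, and every name in it is hypothetical. In particular, `black_box` is a toy stand-in for an opaque classifier, the 2-D "latent space" is assumed rather than learned by a variational autoencoder, and no attribute annotations are used.

```python
import math
import random

def black_box(z):
    """Toy opaque classifier on 2-D latent codes (hypothetical stand-in)."""
    # Class 1 lies above the curved boundary z[1] = z[0] ** 2.
    return 1 if z[1] > z[0] ** 2 else 0

def sample_near(z0, scale, n, rng):
    """Draw n latent points from an isotropic Gaussian centred on z0."""
    return [[c + rng.gauss(0.0, scale) for c in z0] for _ in range(n)]

def sigmoid(s):
    s = max(-30.0, min(30.0, s))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-s))

def fit_linear_surrogate(points, labels, z0, steps=500, lr=1.0):
    """Logistic regression on offsets from z0, fitted by gradient descent."""
    dim, n = len(z0), len(points)
    offsets = [[p[j] - z0[j] for j in range(dim)] for p in points]
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(offsets, labels):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) - y
            grad_b += err
            for j in range(dim):
                grad_w[j] += err * x[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

rng = random.Random(0)          # fixed seed so the run is repeatable
z0 = [1.0, 1.1]                 # latent code of the prediction to explain
points = sample_near(z0, 0.2, 300, rng)
labels = [black_box(p) for p in points]
w, b = fit_linear_surrogate(points, labels, z0)

def surrogate(z):
    """Linear approximation of the black box, valid only near z0."""
    s = b + sum(wj * (zj - cj) for wj, zj, cj in zip(w, z, z0))
    return 1 if s > 0 else 0

agreement = sum(surrogate(p) == y for p, y in zip(points, labels)) / len(points)
print("surrogate weights:", w, "bias:", b)
print("local agreement with black box:", agreement)
```

Near the point (1, 1) the toy boundary has a normal proportional to (-2, 1), so a negative weight on the first latent coordinate and a positive weight on the second are what one would expect from a reasonable local fit. The sampling scale (0.2 here) controls how local the approximation is and is an arbitrary choice in this sketch.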


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Conditional Attribution for Root Cause Analysis in Time-Series Anomaly Detection
cs.LG 2026-04 unverdicted novelty 6.0

Conditional attribution retrieves contextually similar normal states from VAE latent spaces and UMAP embeddings to explain time-series anomalies while preserving dependencies, improving root-cause accuracy on SWaT and...