arxiv: 1806.10080 · v1 · pith:3UUJVREAnew · submitted 2018-06-26 · 💻 cs.LG · stat.ML

A Theory of Diagnostic Interpretation in Supervised Classification

keywords: model, black-box, decision, diagnostic, interpretation, process, interpretability, known

Interpretable deep learning is a fundamental building block towards safer AI, especially when the deployment of deep learning-based computer-aided medical diagnostic systems is so imminent. However, without a computational formulation of black-box interpretation, general interpretability research relies heavily on subjective bias. The clear decision structure of medical diagnostics lets us approximate the decision process of a radiologist as a model, removed from subjective bias. We define the process of interpretation as a finite communication between a known model and a black-box model to optimally map the black box's decision process in the known model. Consequently, we define interpretability as maximal information gain over the initial uncertainty about the black box's decision within finite communication. We relax this definition based on the observation that diagnostic interpretation is typically achieved by a process of minimal querying. We derive an algorithm to calculate diagnostic interpretability. The usual question of an accuracy-interpretability tradeoff, i.e. whether a black-box model's prediction accuracy depends on its ability to be interpreted by a known source model, does not arise in this theory. With multiple simulation experiments of varying complexity, we demonstrate how this theoretical model works in synthetic supervised classification scenarios.
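
The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the general idea of interpretation as information gain within a finite query budget. Everything in it is an assumption made for illustration: the names `entropy`, `posterior`, `expected_gain`, and `interpret`, the toy family of threshold classifiers, the uniform prior, and the greedy query selection. It should not be read as the paper's method.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def posterior(prior, hypotheses, query, answer):
    """Drop hypotheses that disagree with the observed answer, then renormalise."""
    weights = [p if h(query) == answer else 0.0 for p, h in zip(prior, hypotheses)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_gain(prior, hypotheses, query):
    """Expected entropy reduction from asking the black box about `query`."""
    h0 = entropy(prior)
    gain = 0.0
    for answer in (0, 1):  # binary labels, an assumption of this toy setup
        p_answer = sum(p for p, h in zip(prior, hypotheses) if h(query) == answer)
        if p_answer > 0:
            gain += p_answer * (h0 - entropy(posterior(prior, hypotheses, query, answer)))
    return gain


def interpret(black_box, hypotheses, queries, budget):
    """Greedy querying: return (information gained in bits, final belief)."""
    belief = [1.0 / len(hypotheses)] * len(hypotheses)  # assumed uniform prior
    h0 = entropy(belief)
    remaining = list(queries)
    for _ in range(budget):
        if not remaining:
            break
        best = max(remaining, key=lambda q: expected_gain(belief, hypotheses, q))
        if expected_gain(belief, hypotheses, best) <= 0:
            break  # no remaining query can reduce uncertainty further
        remaining.remove(best)
        belief = posterior(belief, hypotheses, best, black_box(best))
    return h0 - entropy(belief), belief


# Hypothetical "known model": threshold rules h_t(x) = 1 if x >= t, for t = 0..7.
hypotheses = [lambda x, t=t: int(x >= t) for t in range(8)]


def black_box(x):
    """Hypothetical black box whose hidden rule is the threshold t = 5."""
    return int(x >= 5)


gain, belief = interpret(black_box, hypotheses, queries=range(8), budget=3)
print(gain, belief.index(max(belief)))
```

In this toy setting the known model starts with 3 bits of uncertainty over eight candidate rules, and each well-chosen binary query can remove at most one bit, so the budget directly bounds how much of the black box's decision rule can be recovered.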
