Influence-Directed Explanations for Deep Convolutional Networks

Anupam Datta; Klas Leino; Linyi Li; Matt Fredrikson; Shayak Sen

arxiv: 1802.03788 · v2 · pith:FLLLKBATnew · submitted 2018-02-11 · cs.LG · cs.AI · stat.ML

Klas Leino, Shayak Sen, Anupam Datta, Matt Fredrikson, Linyi Li

classification: cs.LG · cs.AI · stat.ML

keywords: explanations · influence-directed · network · networks · approach · class · concepts · convolutional

We study the problem of explaining a rich class of behavioral properties of deep neural networks. Distinctively, our influence-directed explanations approach this problem by peering inside the network to identify neurons with high influence on a quantity and distribution of interest, using an axiomatically-justified influence measure, and then providing an interpretation for the concepts these neurons represent. We evaluate our approach by demonstrating a number of its unique capabilities on convolutional neural networks trained on ImageNet. Our evaluation demonstrates that influence-directed explanations (1) identify influential concepts that generalize across instances, (2) can be used to extract the "essence" of what the network learned about a class, and (3) isolate individual features the network uses to make decisions and distinguish related classes.
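
The core step the abstract describes, ranking internal neurons by their influence on a quantity of interest over a distribution of interest, can be illustrated with a small sketch. The abstract does not define the influence measure, so this example assumes one plausible form: the gradient of the quantity with respect to each internal activation, averaged over samples from the distribution. The toy network, its weights, and every function name (`hidden`, `quantity`, `grad_wrt_z`, `internal_influence`) are hypothetical and chosen only for illustration; gradients are approximated by finite differences to keep the example dependency-free.

```python
import math

def hidden(x, W1):
    """Lower slice of the toy network: input x -> internal activations z (ReLU)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]

def quantity(z, W2, v):
    """Upper slice: internal activations z -> scalar quantity of interest."""
    return sum(vk * math.tanh(sum(w * zj for w, zj in zip(row, z)))
               for vk, row in zip(v, W2))

def grad_wrt_z(z, W2, v, eps=1e-5):
    """Central finite-difference gradient of the quantity w.r.t. each neuron in z."""
    grads = []
    for j in range(len(z)):
        up = z[:j] + [z[j] + eps] + z[j + 1:]
        down = z[:j] + [z[j] - eps] + z[j + 1:]
        grads.append((quantity(up, W2, v) - quantity(down, W2, v)) / (2 * eps))
    return grads

def internal_influence(samples, W1, W2, v):
    """Assumed influence: per-neuron gradient averaged over the given samples."""
    total = [0.0] * len(W1)
    for x in samples:
        g = grad_wrt_z(hidden(x, W1), W2, v)
        total = [t + gj for t, gj in zip(total, g)]
    return [t / len(samples) for t in total]

# Hypothetical toy weights: 2 inputs -> 3 internal neurons -> 2 units -> scalar.
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W2 = [[2.0, 0.0, 0.0], [0.0, 0.1, 0.0]]  # neuron 2 is ignored by the upper slice
v = [1.0, 1.0]

# Samples standing in for the distribution of interest (e.g. one class's inputs).
samples = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3]]

influence = internal_influence(samples, W1, W2, v)
# Neurons ordered from most to least influential by absolute influence.
ranked = sorted(range(len(influence)), key=lambda j: abs(influence[j]), reverse=True)
print(ranked, influence)
```

In a real convolutional network the finite-difference loop would be replaced by automatic differentiation, and the top-ranked neurons would then be visualized to interpret the concepts they represent.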
