arXiv: 1802.03788 · v2 · pith:FLLLKBATnew · submitted 2018-02-11 · cs.LG · cs.AI · stat.ML

Influence-Directed Explanations for Deep Convolutional Networks

classification: cs.LG, cs.AI, stat.ML
keywords: explanations, influence-directed, network, networks, approach, class, concepts, convolutional

We study the problem of explaining a rich class of behavioral properties of deep neural networks. Distinctively, our influence-directed explanations approach this problem by peering inside the network to identify neurons with high influence on a quantity and distribution of interest, using an axiomatically-justified influence measure, and then providing an interpretation for the concepts these neurons represent. We evaluate our approach by demonstrating a number of its unique capabilities on convolutional neural networks trained on ImageNet. Our evaluation demonstrates that influence-directed explanations (1) identify influential concepts that generalize across instances, (2) can be used to extract the "essence" of what the network learned about a class, and (3) isolate individual features the network uses to make decisions and distinguish related classes.
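The abstract does not spell out the influence measure, so the following is only a rough sketch under stated assumptions. It assumes that a neuron's influence can be approximated as the gradient of a quantity of interest with respect to that neuron's activation, averaged over a set of instances standing in for the distribution of interest. The toy weights, the activations, and the names `influence`, `top_neurons`, and `class0_vs_class1` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def influence(quantity, activations, eps=1e-5):
    """Average gradient of `quantity` w.r.t. each internal neuron.

    Assumption: influence = mean over instances of d(quantity)/d(activation),
    estimated here with central finite differences.
    """
    activations = np.asarray(activations, dtype=float)
    n, d = activations.shape
    grads = np.zeros((n, d))
    for j in range(d):
        step = np.zeros(d)
        step[j] = eps
        for i in range(n):
            hi = quantity(activations[i] + step)
            lo = quantity(activations[i] - step)
            grads[i, j] = (hi - lo) / (2 * eps)
    return grads.mean(axis=0)

def top_neurons(scores, k):
    """Indices of the k neurons with the largest influence, highest first."""
    return np.argsort(-scores)[:k]

# Hypothetical toy setup: 4 internal neurons feeding 3 class scores.
rng = np.random.default_rng(0)
W2 = rng.normal(size=(4, 3))              # made-up weights, not from the paper
hidden = np.abs(rng.normal(size=(5, 4)))  # made-up activations for 5 instances

def class0_vs_class1(h):
    """Comparative quantity of interest: score of class 0 minus score of class 1."""
    scores = h @ W2
    return scores[0] - scores[1]

scores = influence(class0_vs_class1, hidden)
ranking = top_neurons(scores, k=2)
```

A comparative quantity such as `class0_vs_class1` is one way to ask which neurons separate two related classes. In a real convolutional network the gradient would come from automatic differentiation rather than finite differences.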
