pith. machine review for the scientific record.

arxiv: 1806.05337 · v2 · submitted 2018-06-14 · cs.LG · cs.AI · cs.CL · cs.CV · stat.ML

Recognition: unknown

Hierarchical interpretations for neural network predictions

classification: cs.LG · cs.AI · cs.CL · cs.CV · stat.ML
keywords: dnns · hierarchical · predictions · features · hierarchy · identify · input · interpretations

Deep neural networks (DNNs) have achieved impressive predictive performance due to their ability to learn complex, non-linear relationships between variables. However, the inability to effectively visualize these relationships has led to DNNs being characterized as black boxes and consequently limited their applications. To ameliorate this problem, we introduce the use of hierarchical interpretations to explain DNN predictions through our proposed method, agglomerative contextual decomposition (ACD). Given a prediction from a trained DNN, ACD produces a hierarchical clustering of the input features, along with the contribution of each cluster to the final prediction. This hierarchy is optimized to identify clusters of features that the DNN learned are predictive. Using examples from Stanford Sentiment Treebank and ImageNet, we show that ACD is effective at diagnosing incorrect predictions and identifying dataset bias. Through human experiments, we demonstrate that ACD enables users both to identify the more accurate of two DNNs and to better trust a DNN's outputs. We also find that ACD's hierarchy is largely robust to adversarial perturbations, implying that it captures fundamental aspects of the input and ignores spurious noise.
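As a rough illustration of what "a hierarchical clustering of the input features, along with the contribution of each cluster" can look like, the sketch below greedily merges adjacent feature spans bottom-up and records a score for every cluster at every level. It is not the authors' ACD procedure: the merge rule (take the neighbouring pair whose merged span has the largest absolute score), the function names `build_hierarchy` and `toy_score`, and the hand-written word scores are all assumptions made for this example. In practice the score would come from the trained DNN rather than from a lookup table.

```python
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # half-open [start, end) range of feature indices


def build_hierarchy(
    n_features: int, score: Callable[[Span], float]
) -> List[List[Tuple[Span, float]]]:
    """Greedy bottom-up merging of adjacent spans (illustrative, not ACD itself).

    Returns one list of (span, score) pairs per level, from single
    features up to the span covering the whole input.
    """
    spans: List[Span] = [(i, i + 1) for i in range(n_features)]
    levels = [[(s, score(s)) for s in spans]]
    while len(spans) > 1:
        # Candidate merges: every pair of neighbouring spans.
        candidates = [(spans[i][0], spans[i + 1][1]) for i in range(len(spans) - 1)]
        # Assumed merge rule: largest absolute score wins (ties go to the leftmost).
        best = max(range(len(candidates)), key=lambda i: abs(score(candidates[i])))
        spans = spans[:best] + [candidates[best]] + spans[best + 2:]
        levels.append([(s, score(s)) for s in spans])
    return levels


# Hypothetical stand-in for a model-derived contribution score.
tokens = ["not", "very", "good"]
word_score = {"not": -0.5, "very": 0.25, "good": 1.0}


def toy_score(span: Span) -> float:
    words = tokens[span[0]:span[1]]
    # Toy interaction: a leading "not" flips the sign of the rest of the span.
    if len(words) > 1 and words[0] == "not":
        return -sum(word_score[w] for w in words[1:])
    return sum(word_score[w] for w in words)


levels = build_hierarchy(len(tokens), toy_score)
for level in levels:
    print([(" ".join(tokens[a:b]), s) for (a, b), s in level])
```

In this toy run "very good" is merged first with a positive score, and the full phrase "not very good" ends up with a negative score, which is the kind of cluster-level interaction a hierarchy can expose and a flat per-word attribution cannot.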

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. H-Sets: Hessian-Guided Discovery of Set-Level Feature Interactions in Image Classifiers

    cs.CV · 2026-04 · unverdicted · novelty 6.0

    H-Sets detects higher-order feature interactions in image classifiers via Hessian-guided pair merging and attributes them with IDG-Vis to generate more interpretable saliency maps than existing marginal or coarse methods.