Not just a black box: Learning important features through propagating activation differences

Avanti Shrikumar, Peyton Greenside, Anna Shcherbina, Anshul Kundaje · 2016 · cs.LG · arXiv 1605.01713

6 Pith papers cite this work. Polarity classification is still indexing.

abstract

Note: This paper describes an older version of DeepLIFT. See https://arxiv.org/abs/1704.02685 for the newer version. Original abstract follows: The purported "black box" nature of neural networks is a barrier to adoption in applications where interpretability is essential. Here we present DeepLIFT (Learning Important FeaTures), an efficient and effective method for computing importance scores in a neural network. DeepLIFT compares the activation of each neuron to its 'reference activation' and assigns contribution scores according to the difference. We apply DeepLIFT to models trained on natural images and genomic data, and show significant advantages over gradient-based methods.
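The abstract states the idea without formulas: each neuron's activation is compared with its reference activation, and contribution scores are assigned according to the difference. A minimal sketch of that idea for one ReLU neuron is given below. It is not the authors' implementation. The function names, the toy weights, and the choice of propagating the difference through the ReLU as the ratio of output difference to input difference are assumptions made for illustration.

```python
def linear(weights, bias, xs):
    """Pre-activation of a single neuron: w . x + b."""
    return sum(w * x for w, x in zip(weights, xs)) + bias


def relu(z):
    return max(0.0, z)


def contributions(weights, bias, xs, refs):
    """Difference-from-reference scores for the inputs of relu(w . x + b).

    Hypothetical helper: returns (per-input scores, output difference).
    """
    z, z_ref = linear(weights, bias, xs), linear(weights, bias, refs)
    y, y_ref = relu(z), relu(z_ref)
    # Assumed rule: scale by (output difference) / (pre-activation difference).
    # When the pre-activations coincide there is no difference to distribute,
    # so this sketch assigns zero to every input.
    multiplier = (y - y_ref) / (z - z_ref) if z != z_ref else 0.0
    scores = [multiplier * w * (x - r) for w, x, r in zip(weights, xs, refs)]
    return scores, y - y_ref


if __name__ == "__main__":
    # Toy values chosen for illustration only; the reference is an all-zero input.
    scores, delta = contributions([2.0, -1.0], 0.5, [1.0, 3.0], [0.0, 0.0])
    print(scores, delta)
```

In this sketch the scores add up to the difference between the neuron's output and its reference output, so the input that pushed the neuron below its reference activation receives a negative score even though the local gradient of the ReLU at the actual input is zero.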

representative citing papers

Manifold-Aligned Guided Integrated Gradients for Reliable Feature Attribution

cs.LG · 2026-05-04 · unverdicted · novelty 7.0 · 2 refs

MA-GIG uses VAE latent space to align Integrated Gradients paths with the data manifold for more faithful feature attributions in deep neural networks.

From Local to Global to Mechanistic: An iERF-Centered Unified Framework for Interpreting Vision Models

cs.CV · 2026-05-01 · unverdicted · novelty 7.0

An iERF-centric framework unifies local, global, and mechanistic interpretability in vision models via SRD for saliency, CAFE for concept anchoring, and ICAT for interlayer attribution.

Graph Neural Network based Hierarchy-Aware Embeddings of Knowledge Graphs: Applications to Yeast Phenotype Prediction

cs.LG · 2026-05-05 · unverdicted · novelty 6.0 · 4 refs

GNNs with ontology-derived semantic loss create hierarchy-aware box embeddings of a yeast knowledge graph that raise double-knockout growth prediction R² to 0.377 and generalize to triple knockouts while identifying a validated trait association.

Causal Attribution via Activation Patching

cs.CV · 2026-03-13 · unverdicted · novelty 6.0

CAAP produces patch attributions in ViTs by direct activation patching on intermediate layers to measure causal contribution to the target class score.

Less is More: Efficient Black-box Attribution via Minimal Interpretable Subset Selection

cs.LG · 2025-04-01 · unverdicted · novelty 6.0

LiMA reformulates attribution as submodular subset selection and uses bidirectional greedy search to identify minimal important regions, reporting 36.3% better insertion and 39.6% better deletion scores than prior methods on eight foundation models.

Heterogeneous Graph Neural Networks with Post-hoc Explanations for Multi-modal and Explainable Land Use Inference

cs.AI · 2024-06-19 · unverdicted · novelty 4.0

Heterogeneous graph neural networks with post-hoc explanations improve accuracy on six land-use indicators from mobility data and provide feature attribution and counterfactual insights aligned with commuting patterns.
