Understanding Neural Networks Through Deep Visualization
Recent years have produced great advances in training large, deep neural networks (DNNs), including notable successes in training convolutional neural networks (convnets) to recognize natural images. However, our understanding of how these models work, especially what computations they perform at intermediate layers, has lagged behind. Progress in the field will be further accelerated by the development of better tools for visualizing and interpreting neural nets. We introduce two such tools here. The first is a tool that visualizes the activations produced on each layer of a trained convnet as it processes an image or video (e.g. a live webcam stream). We have found that looking at live activations that change in response to user input helps build valuable intuitions about how convnets work. The second tool enables visualizing features at each layer of a DNN via regularized optimization in image space. Because previous versions of this idea produced less recognizable images, here we introduce several new regularization methods that combine to produce qualitatively clearer, more interpretable visualizations. Both tools are open source and work on a pre-trained convnet with minimal setup.
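The second tool described in the abstract, regularized optimization in image space, can be illustrated with a minimal sketch. This is not the authors' implementation. The "unit" here is a hypothetical linear filter over a four-pixel flattened image rather than a layer of a trained convnet, and L2 decay stands in as one simple regularizer. All function names and hyperparameters (`maximize_activation`, `lr`, `l2_decay`, `steps`) are assumptions chosen for illustration.

```python
import random


def unit_activation(weights, image):
    """Toy stand-in for one unit: a linear filter response over a flattened image."""
    return sum(w * p for w, p in zip(weights, image))


def unit_gradient(weights, image):
    """Gradient of the toy activation with respect to the image.

    For a linear unit this is just the weight vector; in a real convnet it
    would come from backpropagation through the network.
    """
    return list(weights)


def maximize_activation(weights, steps=200, lr=0.1, l2_decay=0.01, seed=0):
    """Gradient ascent in image space with L2 decay as the regularizer."""
    rng = random.Random(seed)
    # Start from small random noise rather than a natural image.
    image = [rng.gauss(0.0, 0.01) for _ in weights]
    for _ in range(steps):
        grad = unit_gradient(weights, image)
        # Ascent step: move the image in the direction that raises the activation.
        image = [p + lr * g for p, g in zip(image, grad)]
        # Regularization step: shrink every pixel toward zero so values stay bounded.
        image = [(1.0 - l2_decay) * p for p in image]
    return image


if __name__ == "__main__":
    weights = [1.0, -2.0, 0.5, 0.0]  # hypothetical filter weights
    image = maximize_activation(weights)
    print("optimized image:", [round(p, 3) for p in image])
    print("activation:", round(unit_activation(weights, image), 3))
```

Without the decay step the pixel values of this toy unit would grow without bound. With it, the image settles toward a scaled copy of the filter, which is the sense in which a regularizer keeps the optimized input readable.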
Forward citations
Cited by 14 Pith papers
- Concrete Problems in AI Safety
  The paper categorizes five concrete AI safety problems arising from flawed objectives, costly evaluation, and learning dynamics.
- A Distributional View for Visual Mechanistic Interpretability: KL-Minimal Soft-Constraint Principle
  The work introduces a distributional view of visual mechanistic interpretability that casts the task as KL-minimal optimization and realizes it through a soft-constraint principle implemented with energy-guided diffus...
- Data-Free Client Contribution Estimation via Logit Maximization for Federated Learning
  CELM uses class-wise evidence scores from client logits to compute contribution weights that upweight clients strong on underrepresented classes for stable aggregation in non-IID federated learning.
- Inside-Out: Measuring Generalization in Vision Transformers Through Inner Workings
  Circuit-based metrics from Vision Transformer internals provide better label-free proxies for generalization under distribution shift than existing methods like model confidence.
- Fast gradient-free activation maximization for neurons in spiking neural networks
  A Tensor Train decomposition-based method enables efficient gradient-free activation maximization for neurons in spiking neural networks by searching generative model latent spaces.
- Interpretability Beyond Classification Output: Semantic Bottleneck Networks
  Semantic Bottleneck Networks add interpretable semantic concept layers to deep networks, recovering SOTA segmentation performance with drastic channel reduction and enabling failure interpretation at over 99% accuracy...
- Scalable Topological Data Analysis and Visualization for Evaluating Data-Driven Models in Scientific Applications
  A scalable framework combining streaming graphs, topology computation, and topology-aware datacubes enables interactive analysis of high-dimensional functions in scientific ML applications.
- Seeing What Shouldn't Be There: Counterfactual GANs for Medical Image Attribution
  A cycle-consistent GAN generates counterfactual medical images to attribute classification decisions more comprehensively than standard saliency methods.
- NeuroViz: Real-time Interactive Visualization of Forward and Backward Passes in Neural Network Training
  NeuroViz offers interactive real-time visualization of neural network forward and backward passes, achieving top usability scores in a study with 31 participants compared to existing tools.
- Understanding Task Representations in Neural Networks via Bayesian Ablation
  A Bayesian ablation framework combined with information-theoretic metrics is introduced to analyze causal roles, distributedness, manifold complexity, and polysemanticity of task representations in neural networks.
- Visual Interaction with Deep Learning Models through Collaborative Semantic Inference
  Proposes the CSI framework for co-designing visual interactions and deep learning models to expose and allow semantic control over intermediate reasoning processes, shown in a summarization case study.
- Generative Counterfactual Introspection for Explainable Deep Learning
  A generative-model-driven introspection method produces counterfactual image edits to explain deep neural network predictions on MNIST and CelebA.
- Neuron ranking -- an informed way to condense convolutional neural networks architecture
  Shapley value and variational importance switch methods produce consistent rankings of filter importance in CNNs, enabling compression and interpretability.
- What does it mean to understand a neural network?
  Simple training code produces complex neural networks, suggesting that brain learning rules may be easier to understand than mature brain properties and that neuroscience should shift focus accordingly.