
arXiv: 1907.00262 · v1 · pith:RYQMQXQOnew · submitted 2019-06-29 · 💻 cs.LG · cs.CV · cs.NE · stat.ML

Dissecting Pruned Neural Networks

Pith reviewed 2026-05-25 12:41 UTC · model grok-4.3

classification 💻 cs.LG · cs.CV · cs.NE · stat.ML
keywords pruning · interpretability · neural networks · network dissection · ResNet-50 · ImageNet · model compression · disentangled representations

The pith

ResNet-50 models trained on ImageNet keep the same number of interpretable concepts and units until more than 90 percent of their parameters have been pruned.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Pruning removes large numbers of parameters from neural networks while preserving accuracy. The paper measures whether this process also preserves the count of hidden units that represent human-recognizable concepts, using network dissection to identify such units. It finds that the number of these interpretable concepts and units remains unchanged until pruning reaches the point where accuracy begins to decline. The result is reported for ResNet-50 trained on ImageNet, where the counts hold steady until more than 90 percent of parameters have been removed. This indicates that the parameters removed by pruning are not required for maintaining this form of interpretability.
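To make "removing a fraction of parameters" concrete, here is a minimal sketch of magnitude pruning on a flat list of weights. This summary does not say which pruning method the paper uses, so magnitude pruning is an assumption made for illustration, and the function name `magnitude_prune` and the toy weights are hypothetical.

```python
def magnitude_prune(weights, fraction):
    """Zero out the `fraction` of weights with the smallest magnitude.

    Assumption: magnitude pruning stands in for whatever method the
    paper actually uses. `weights` is a flat list of floats.
    Returns (pruned_weights, mask), where mask[i] is 1 if weight i is kept.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    n_prune = int(len(weights) * fraction)
    # Indices sorted by ascending magnitude; the first n_prune are removed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    mask = [0 if i in pruned_idx else 1 for i in range(len(weights))]
    pruned = [w * m for w, m in zip(weights, mask)]
    return pruned, mask


# Toy weights (made up for illustration, not taken from any model).
weights = [0.9, -0.05, 0.3, 0.01, -0.7, 0.02, 0.15, -0.4, 0.06, 0.5]
half_pruned, half_mask = magnitude_prune(weights, 0.5)
mostly_pruned, mostly_mask = magnitude_prune(weights, 0.9)
```

At a fraction of 0.9 only the single largest-magnitude weight survives in this toy list, which gives a sense of how aggressive the 90 percent regime is.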

Core claim

Pruning has no detrimental effect on the measure of interpretability until so few parameters remain that accuracy begins to drop. ResNet-50 models trained on ImageNet maintain the same number of interpretable concepts and units until more than 90% of parameters have been pruned.

What carries the argument

Network dissection, which counts hidden units that learn disentangled representations of human-recognizable concepts.
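A rough sketch of how such a count can be computed: each unit's activations are thresholded at a top quantile, the resulting binary mask is compared with each concept's segmentation mask by intersection over union (IoU), and the unit counts as a detector for its best-matching concept if that IoU clears a cutoff. The default values `top_fraction=0.005` and `iou_cutoff=0.04` are assumed defaults in the style of the original Network Dissection work, not values stated in this summary. The flat per-pixel lists, the function names, and the toy data are hypothetical simplifications.

```python
def top_quantile_threshold(activations, top_fraction=0.005):
    """Return T such that roughly `top_fraction` of activations exceed T."""
    ordered = sorted(activations)
    k = max(0, int(len(ordered) * (1.0 - top_fraction)) - 1)
    return ordered[k]


def iou(unit_mask, concept_mask):
    """Intersection over union of two equal-length binary masks."""
    inter = sum(1 for u, c in zip(unit_mask, concept_mask) if u and c)
    union = sum(1 for u, c in zip(unit_mask, concept_mask) if u or c)
    return inter / union if union else 0.0


def dissect(unit_activations, concept_masks, top_fraction=0.005, iou_cutoff=0.04):
    """Map each unit to (best concept, IoU) when the IoU exceeds the cutoff."""
    detectors = {}
    for unit, acts in unit_activations.items():
        threshold = top_quantile_threshold(acts, top_fraction)
        unit_mask = [a > threshold for a in acts]
        best_concept, best_iou = max(
            ((c, iou(unit_mask, m)) for c, m in concept_masks.items()),
            key=lambda pair: pair[1],
        )
        if best_iou > iou_cutoff:
            detectors[unit] = (best_concept, best_iou)
    return detectors


# Toy data over 8 "pixels" (made up for illustration).
unit_activations = {
    "u0": [0.1, 0.2, 0.9, 0.8, 0.1, 0.0, 0.2, 0.1],  # fires on the "dog" pixels
    "u1": [0.1, 0.1, 0.1, 0.1, 0.1, 0.9, 0.1, 0.8],  # fires on unlabeled pixels
}
concept_masks = {
    "dog": [0, 0, 1, 1, 0, 0, 0, 0],
    "sky": [1, 1, 0, 0, 0, 0, 0, 0],
}
detectors = dissect(unit_activations, concept_masks, top_fraction=0.25)
num_interpretable_units = len(detectors)
num_concepts = len({concept for concept, _ in detectors.values()})
```

Note that the threshold is derived from each unit's own activation distribution, which is why the referee report asks whether the procedure stays calibrated once pruning changes that distribution.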

If this is right

  • The structure removed by pruning does not include the units that encode the measured interpretable concepts.
  • This measure of interpretability remains stable under compression as long as accuracy is preserved.
  • Accuracy and the count of interpretable units decline together once pruning exceeds the point where unnecessary parameters are exhausted.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The encoding of these concepts may be redundant enough to survive removal of most parameters.
  • The same pattern could be tested on other architectures or datasets to check whether it is general.
  • Pruning might serve as a compression method that leaves explanatory units intact.

Load-bearing premise

Network dissection continues to provide a reliable count of disentangled human-recognizable concepts after pruning without the reduced capacity introducing systematic bias into the measurement.

What would settle it

A measured drop in the number of interpretable units in a ResNet-50 on ImageNet that occurs before accuracy declines would falsify the central claim.
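This test can be phrased as a small check over per-sparsity measurements: find the first pruning level at which accuracy falls noticeably below the unpruned baseline, find the first level at which the interpretable-unit count does the same, and flag the claim if the count drops first. The sketch below assumes measurements ordered by increasing sparsity with the unpruned model first. The tolerances, function names, and all numbers are made-up placeholders, not results from the paper.

```python
def first_drop(levels, values, tolerance):
    """Return the first level whose value is more than `tolerance`
    below the baseline (the first entry), or None if there is none."""
    baseline = values[0]
    for level, value in zip(levels, values):
        if baseline - value > tolerance:
            return level
    return None


def claim_falsified(levels, accuracy, unit_counts, acc_tol, count_tol):
    """True if the unit count drops at a lower sparsity than accuracy does."""
    acc_drop = first_drop(levels, accuracy, acc_tol)
    count_drop = first_drop(levels, unit_counts, count_tol)
    if count_drop is None:
        return False
    return acc_drop is None or count_drop < acc_drop


# Placeholder measurements (hypothetical, for illustration only).
levels = [0.0, 0.5, 0.8, 0.9, 0.95]           # fraction of parameters pruned
accuracy = [0.80, 0.80, 0.79, 0.79, 0.70]     # placeholder accuracies
counts_stable = [100, 101, 99, 100, 80]       # count falls together with accuracy
counts_early = [100, 101, 80, 78, 60]         # count falls before accuracy

consistent = claim_falsified(levels, accuracy, counts_stable, 0.02, 5)
contradicted = claim_falsified(levels, accuracy, counts_early, 0.02, 5)
```

With the first placeholder series both quantities drop at the same level, so the check returns `False`. With the second, the count drops at 0.8 while accuracy holds until 0.95, so it returns `True`. Choosing the tolerances is itself a judgment call, and with a single run per pruning level there is no noise estimate to base them on.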

Original abstract

Pruning is a standard technique for removing unnecessary structure from a neural network to reduce its storage footprint, computational demands, or energy consumption. Pruning can reduce the parameter-counts of many state-of-the-art neural networks by an order of magnitude without compromising accuracy, meaning these networks contain a vast amount of unnecessary structure. In this paper, we study the relationship between pruning and interpretability. Namely, we consider the effect of removing unnecessary structure on the number of hidden units that learn disentangled representations of human-recognizable concepts as identified by network dissection. We aim to evaluate how the interpretability of pruned neural networks changes as they are compressed. We find that pruning has no detrimental effect on this measure of interpretability until so few parameters remain that accuracy beings to drop. Resnet-50 models trained on ImageNet maintain the same number of interpretable concepts and units until more than 90% of parameters have been pruned.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that pruning ResNet-50 models trained on ImageNet has no detrimental effect on interpretability (measured as the number of hidden units that learn disentangled, human-recognizable concepts according to network dissection) until more than 90% of parameters are removed, at which point accuracy also begins to decline. The work positions this as evidence that the 'unnecessary structure' removed by pruning does not include the units responsible for these interpretable concepts.

Significance. If the central empirical result is robust, it would indicate that aggressive unstructured pruning preserves the count of concept-aligned units, implying that interpretability is concentrated in a small, resilient subset of parameters. This could guide pruning algorithms that explicitly protect interpretable representations and inform theoretical accounts of how overparameterization relates to disentangled feature learning.

major comments (2)
  1. [Abstract] The central claim equates stable dissection counts with preserved interpretability, yet the text supplies no indication that the network-dissection pipeline (activation thresholds, IoU computation against Broden concepts, selectivity criteria) was re-validated or re-calibrated on the pruned models. Because pruning changes sparsity, dynamic range, and co-activation statistics, these quantities are distribution-dependent; without explicit checks, the plateau until >90% pruning could be a measurement artifact.
  2. [Results] Experimental protocol (implied by the abstract's empirical finding): no details are given on the number of independent runs, statistical tests for the 'same number' claim, or controls that would rule out systematic bias in the dissection metric after pruning. These omissions are load-bearing because the weakest assumption is precisely that the metric remains unbiased under the altered activation regime.
minor comments (1)
  1. [Abstract] 'accuracy beings to drop' is a typographical error and should read 'accuracy begins to drop'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below, indicating planned revisions where appropriate.

Point-by-point responses
  1. Referee: [Abstract] The central claim equates stable dissection counts with preserved interpretability, yet the text supplies no indication that the network-dissection pipeline (activation thresholds, IoU computation against Broden concepts, selectivity criteria) was re-validated or re-calibrated on the pruned models. Because pruning changes sparsity, dynamic range, and co-activation statistics, these quantities are distribution-dependent; without explicit checks, the plateau until >90% pruning could be a measurement artifact.

    Authors: The network dissection procedure followed the exact protocol and hyperparameters from the original Network Dissection paper, applied uniformly to all models. We agree that the manuscript would benefit from explicit discussion of metric stability under pruning-induced distribution shifts. In the revised version we will add a dedicated paragraph and supplementary analysis confirming that activation thresholds and concept IoU distributions do not exhibit systematic drift with increasing sparsity. revision: partial

  2. Referee: [Results] Experimental protocol (implied by the abstract's empirical finding): no details are given on the number of independent runs, statistical tests for the 'same number' claim, or controls that would rule out systematic bias in the dissection metric after pruning. These omissions are load-bearing because the weakest assumption is precisely that the metric remains unbiased under the altered activation regime.

    Authors: Results are reported from the standard single training run per pruning level, consistent with common practice for large-scale ImageNet experiments. No formal statistical tests or multi-seed controls were included. We acknowledge these omissions weaken the robustness claim. The revision will add an explicit experimental-protocol subsection noting the single-run limitation and, where computationally feasible, supplementary multi-seed verification or a bias-control argument based on the observed invariance of the Broden concept set. revision: partial

Circularity Check

0 steps flagged

No circularity: purely empirical measurement

Full rationale

The paper presents an experimental study measuring the number of interpretable units via network dissection before and after pruning ResNet-50 models. No derivation, equations, fitted parameters, or predictions appear in the claim; the result is a direct count from applying an external method to pruned networks. The central observation is therefore self-contained against the reported benchmarks and does not reduce to any self-referential step.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that network dissection yields a stable, meaningful count of interpretable units even after network capacity is reduced by pruning.

axioms (1)
  • domain assumption: Network dissection reliably identifies units that learn disentangled representations of human-recognizable concepts.
    This metric is used to quantify interpretability before and after pruning.

pith-pipeline@v0.9.0 · 5685 in / 1000 out tokens · 34044 ms · 2026-05-25T12:41:18.532580+00:00 · methodology
