SNIP: Single-shot Network Pruning based on Connection Sensitivity

2018 · cs.CV · arXiv 1810.02340

9 Pith papers cite this work. Polarity classification is still indexing.

abstract

Pruning large neural networks while maintaining their performance is often desirable due to the reduced space and time complexity. In existing methods, pruning is done within an iterative optimization procedure with either heuristically designed pruning schedules or additional hyperparameters, undermining their utility. In this work, we present a new approach that prunes a given network once at initialization prior to training. To achieve this, we introduce a saliency criterion based on connection sensitivity that identifies structurally important connections in the network for the given task. This eliminates the need for both pretraining and the complex pruning schedule while making it robust to architecture variations. After pruning, the sparse network is trained in the standard way. Our method obtains extremely sparse networks with virtually the same accuracy as the reference network on the MNIST, CIFAR-10, and Tiny-ImageNet classification tasks and is broadly applicable to various architectures including convolutional, residual and recurrent networks. Unlike existing methods, our approach enables us to demonstrate that the retained connections are indeed relevant to the given task.
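The saliency criterion described in the abstract can be illustrated with a small sketch. It assumes the usual reading of connection sensitivity: the score of a connection is the magnitude of its weight times the loss gradient at initialization, normalized over all connections, and only the highest-scoring fraction is kept. The abstract itself does not spell out this formula, so treat it as an assumption. The toy linear softmax classifier, the random data, and the names `snip_mask`, `softmax_xent_grad`, and `keep_ratio` are hypothetical and chosen only for illustration.

```python
import numpy as np

def snip_mask(weights, grads, keep_ratio):
    """Return (mask, saliency); mask keeps the top `keep_ratio` fraction of connections."""
    # Assumed sensitivity of the loss to connection j: |w_j * dL/dw_j|.
    sens = np.abs(weights * grads)
    total = sens.sum()
    saliency = sens / total if total > 0 else sens
    # Number of connections to keep, clamped to [1, number of connections].
    k = min(saliency.size, max(1, int(round(keep_ratio * saliency.size))))
    # Indices of the k largest saliency scores (order within the k is arbitrary).
    top = np.argpartition(saliency.ravel(), -k)[-k:]
    mask = np.zeros(saliency.size, dtype=weights.dtype)
    mask[top] = 1.0
    return mask.reshape(weights.shape), saliency

def softmax_xent_grad(W, X, y):
    """Gradient of mean softmax cross-entropy w.r.t. W, for logits = X @ W."""
    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(len(y)), y] -= 1.0               # softmax minus one-hot labels
    return X.T @ p / len(y)

# Hypothetical mini-batch and randomly initialized weights (no training yet).
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 20))
y = rng.integers(0, 5, size=64)
W = rng.normal(scale=0.1, size=(20, 5))

grads = softmax_xent_grad(W, X, y)
mask, saliency = snip_mask(W, grads, keep_ratio=0.1)
W_sparse = W * mask  # pruned once, before training; train with the mask held fixed
```

In this sketch the mask is computed once from a single mini-batch and would then be held fixed while the surviving weights are trained in the standard way, which mirrors the prune-once-at-initialization workflow the abstract describes.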

citation-role summary

background: 1 · method: 1

citation-polarity summary

background: 1 · use method: 1

representative citing papers

Hybrid-LoRA: Bridging Full Fine-Tuning and Low-Rank Adaptation for Post-Training

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

Hybrid-LoRA selectively full fine-tunes modules with high sensitivity to low-rank adaptation using a novel score and applies LoRA elsewhere, matching full fine-tuning at 10% budget and outperforming PEFT baselines by up to 5.65%.

XTinyU-Net: Training-Free U-Net Scaling via Initialization-Time Sensitivity

eess.IV · 2026-05-10 · unverdicted · novelty 6.0 · 2 refs

A Jacobian sensitivity curve computed at initialization identifies the narrowest U-Net configuration that avoids performance collapse, matching nnU-Net accuracy with 400-1600x fewer parameters on six medical datasets.

SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation

cs.LG · 2023-10-19 · conditional · novelty 6.0

SalUn uses gradient-based weight saliency to achieve effective machine unlearning of data, classes, or concepts in image classification and generation, narrowing the gap to exact retraining.

Not How Many, But Which: Parameter Placement in Low-Rank Adaptation

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

Gradient-informed placement of LoRA parameters recovers full performance under GRPO while random placement does not, due to differences in gradient rank and stability across training regimes.

Different Prompts, Different Ranks: Prompt-aware Dynamic Rank Selection for SVD-based LLM Compression

cs.LG · 2026-05-09 · unverdicted · novelty 6.0

PARSE trains a prompt-aware linear router on dense-model outputs to select dynamic SVD ranks, improving accuracy up to 10% at 0.6 compression ratio on LLaMA-7B while delivering 2.5x prefill and 2.4x decode speedups.

Representation-Guided Parameter-Efficient LLM Unlearning

cs.CL · 2026-04-19 · unverdicted · novelty 6.0

REGLU guides LoRA-based unlearning via representation subspaces and orthogonal regularization to outperform prior methods on forget-retain trade-off in LLM benchmarks.

Efficient compression of neural networks and datasets

cs.LG · 2025-05-23 · unverdicted · novelty 5.0

Refined probabilistic and smooth l0 pruning techniques approximate minimum description length for neural networks, achieving high compression with minimal accuracy loss and empirically verifying better sample efficiency and generalization on image and text tasks.

On improving deep learning generalization with adaptive sparse connectivity

cs.NE · 2019-06-27 · unverdicted · novelty 4.0

Sparse MLPs trained via SET plus neuron pruning achieve competitive performance on 15 datasets while pruning ~50% of hidden neurons and keeping parameter count linear in neuron count.

Prune-Quantize-Distill: An Ordered Pipeline for Efficient Neural Network Compression

cs.LG · 2026-04-05 · unverdicted · novelty 4.0

The prune-quantize-distill ordering produces a better accuracy-size-latency frontier on CIFAR-10/100 than any single technique or other orderings, with INT8 QAT providing the main runtime gain.

