Compressing Neural Networks with the Hashing Trick

arXiv: 1504.04788 · v1 · pith:JVPUO3KX · submitted 2015-04-19 · cs.LG · cs.NE

Wenlin Chen, James T. Wilson, Stephen Tyree, Kilian Q. Weinberger, Yixin Chen

keywords: hashednets, hash, networks, neural, architecture, data, deep, devices

As deep nets are increasingly used in applications suited for mobile devices, a fundamental dilemma becomes apparent: the trend in deep learning is to grow models to absorb ever-increasing data set sizes; however, mobile devices are designed with very little memory and cannot store such large models. We present a novel network architecture, HashedNets, that exploits inherent redundancy in neural networks to achieve drastic reductions in model sizes. HashedNets uses a low-cost hash function to randomly group connection weights into hash buckets, and all connections within the same hash bucket share a single parameter value. These parameters are tuned to adjust to the HashedNets weight-sharing architecture with standard backprop during training. Our hashing procedure introduces no additional memory overhead, and we demonstrate on several benchmark data sets that HashedNets shrink the storage requirements of neural networks substantially while mostly preserving generalization performance.
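The weight-sharing idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (`bucket`, `virtual_weight`, `forward`, `grad_params`) are hypothetical, and `zlib.crc32` is only a stand-in for whatever low-cost hash the paper uses, which the abstract does not specify. The sketch shows a layer whose full weight matrix is never stored. Each connection (i, j) is hashed to one of a few shared parameters, and the gradient of a shared parameter is the sum of the gradients of all connections in its bucket.

```python
import zlib


def bucket(i, j, num_buckets, seed=0):
    """Map connection (i, j) to a bucket index.

    Assumption: crc32 stands in for the paper's low-cost hash function.
    """
    key = f"{seed}:{i}:{j}".encode()
    return zlib.crc32(key) % num_buckets


def virtual_weight(params, i, j):
    """Return the shared parameter used by connection (i, j)."""
    return params[bucket(i, j, len(params))]


def forward(params, x, n_out):
    """Compute y = V x, where V[i][j] is looked up by hashing, never stored."""
    return [
        sum(virtual_weight(params, i, j) * x[j] for j in range(len(x)))
        for i in range(n_out)
    ]


def grad_params(params, x, grad_y):
    """Gradient of the loss with respect to each shared parameter.

    Connection (i, j) contributes grad_y[i] * x[j] to its bucket, so
    connections that share a bucket have their gradients summed.
    """
    grads = [0.0] * len(params)
    for i, g in enumerate(grad_y):
        for j, xj in enumerate(x):
            grads[bucket(i, j, len(params))] += g * xj
    return grads


params = [0.5, -1.0, 0.25]          # 3 stored parameters
x = [1.0, 2.0, 3.0, 4.0]            # 4 inputs
y = forward(params, x, n_out=5)     # 5 x 4 = 20 virtual weights
```

Here 20 virtual connections are backed by only 3 stored values, and because the bucket index is recomputed from (i, j) on demand, no index table needs to be kept alongside the parameters.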

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
cs.CV 2017-04 accept novelty 7.0

MobileNets introduce depthwise separable convolutions plus width and resolution multipliers to produce efficient CNNs that trade off latency and accuracy for mobile and embedded vision applications.

Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
cs.CV 2015-10 conditional novelty 7.0

A pruning-quantization-Huffman pipeline compresses deep neural networks 35-49x without accuracy loss.