arxiv: 1810.03979 · v1 · pith:XUVRWZW6new · submitted 2018-10-01 · 💻 cs.CV · cs.AI · cs.AR · cs.LG

Extended Bit-Plane Compression for Convolutional Neural Network Accelerators

classification 💻 cs.CV · cs.AI · cs.AR · cs.LG
keywords compression · convolutional · neural · accelerators · constrained · data · networks · achieved

After the tremendous success of convolutional neural networks in image classification, object detection, speech recognition, etc., there is now rising demand for deployment of these compute-intensive ML models on tightly power-constrained embedded and mobile systems at low cost, as well as for pushing the throughput in data centers. This has triggered a wave of research towards specialized hardware accelerators. Their performance is often constrained by I/O bandwidth, and their energy consumption is dominated by I/O transfers to off-chip memory. We introduce and evaluate a novel, hardware-friendly compression scheme for the feature maps present within convolutional neural networks. We show that an average compression ratio of 4.4x relative to uncompressed data and a gain of 60% over existing methods can be achieved for ResNet-34 with a compression block requiring <300 bits of sequential cells and minimal combinational logic.
