arxiv: 1808.05779 · v3 · pith:XIJWF43Snew · submitted 2018-08-17 · 💻 cs.CV

Learning to Quantize Deep Networks by Optimizing Quantization Intervals with Task Loss

classification 💻 cs.CV
keywords: networks, accuracy, quantization, quantize, quantizer, activations, bit-width, bit-widths

Reducing the bit-widths of activations and weights of deep networks makes them more efficient to compute and to store in memory, which is crucial for deployment on resource-limited devices such as mobile phones. However, decreasing bit-widths with quantization generally degrades accuracy drastically. To tackle this problem, we propose to learn to quantize activations and weights via a trainable quantizer that transforms and discretizes them. Specifically, we parameterize the quantization intervals and obtain their optimal values by directly minimizing the task loss of the network. This quantization-interval learning (QIL) allows the quantized networks to maintain the accuracy of the full-precision (32-bit) networks with bit-widths as low as 4 bits, and minimizes the accuracy degradation under further bit-width reduction (i.e., 3 and 2 bits). Moreover, our quantizer can be trained on a heterogeneous dataset, and thus can be used to quantize pretrained networks without access to their training data. We demonstrate the effectiveness of our trainable quantizer on the ImageNet dataset with various network architectures such as ResNet-18, ResNet-34, and AlexNet, on which it outperforms existing methods and achieves state-of-the-art accuracy.
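The "transform, then discretize" structure described in the abstract can be illustrated with a small forward-pass sketch. This is not taken from the abstract: the interval parameterization (a `center` and a `distance` defining the interval, plus an exponent `gamma`), the function names, and the choice of `2**(bits-1) - 1` levels per sign are all assumptions made for illustration. Training the interval parameters against the task loss would also need a gradient rule for the rounding step, which is omitted here.

```python
def transform(w, center, distance, gamma=1.0):
    """Map a weight to [-1, 1] using the interval [center - distance, center + distance].

    Assumed behaviour: magnitudes below the interval are pruned to 0,
    magnitudes above it are clipped to +/-1, and magnitudes inside it are
    mapped linearly to [0, 1] and then raised to the power gamma.
    """
    lo, hi = center - distance, center + distance
    mag = abs(w)
    sign = 1.0 if w >= 0 else -1.0
    if mag < lo:
        return 0.0
    if mag > hi:
        return sign
    alpha = 0.5 / distance                 # slope of the linear map
    beta = 0.5 - 0.5 * center / distance   # offset so that lo -> 0 and hi -> 1
    scaled = min(1.0, max(0.0, alpha * mag + beta))  # guard against float drift
    return sign * scaled ** gamma


def discretize(x, bits):
    """Round a value in [-1, 1] to the nearest of the symmetric levels."""
    levels = 2 ** (bits - 1) - 1
    return round(x * levels) / levels


def quantize(w, center, distance, bits, gamma=1.0):
    """Transform a weight with the (hypothetical) interval, then discretize it."""
    return discretize(transform(w, center, distance, gamma), bits)


if __name__ == "__main__":
    for w in (-2.0, -0.3, 0.05, 0.7, 2.0):
        print(w, quantize(w, center=0.5, distance=0.4, bits=3))
```

In this sketch, `center` and `distance` are the quantities a trainable quantizer would update by gradient descent on the task loss; here they are fixed constants so that only the forward mapping is shown.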

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Evolutionary fine tuning of quantized convolution-based deep learning models

    cs.LG · 2026-04 · unverdicted · novelty 5.0

    Evolutionary fine-tuning of select weights in pre-quantized convolutional networks improves accuracy over standard rounding for VGG, ResNet, and autoencoder models.