Deep neural networks are robust to weight binarization and other non-linear distortions

2016 · cs.NE · arXiv 1606.01981

1 Pith paper cites this work. Polarity classification is still indexing.

abstract

Recent results show that deep neural networks achieve excellent performance even when, during training, weights are quantized and projected to a binary representation. Here, we show that this is just the tip of the iceberg: these same networks, during testing, also exhibit a remarkable robustness to distortions beyond quantization, including additive and multiplicative noise, and a class of non-linear projections of which binarization is just a special case. To quantify this robustness, we show that one such network achieves 11% test error on CIFAR-10 even with 0.68 effective bits per weight. Furthermore, we find that a common training heuristic, namely projecting quantized weights during backpropagation, can be altered (or even removed) and networks still achieve a base level of robustness during testing. Specifically, training with weight projections other than quantization also works, as does simply clipping the weights; neither has been reported before. We confirm our results on the CIFAR-10 and ImageNet datasets. Finally, drawing from these ideas, we propose a stochastic projection rule that leads to a new state-of-the-art network with 7.64% test error on CIFAR-10 using no data augmentation.
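
The weight distortions named in the abstract (clipping, binarization, additive and multiplicative noise) can be illustrated on a plain list of weights. The following is a minimal sketch, not the paper's implementation: the function names, the clipping range of [-1, 1], the Gaussian noise model, and the stochastic rule shown here (mapping a clipped weight to +1 with probability (w + 1) / 2) are all assumptions chosen for illustration, and the stochastic rule is not claimed to be the one the paper proposes.

```python
import random


def clip(w, lo=-1.0, hi=1.0):
    """Clip a single weight to the interval [lo, hi] (range is an assumption)."""
    return max(lo, min(hi, w))


def binarize(w):
    """Deterministic projection to {-1, +1} by sign; zero maps to +1."""
    return 1.0 if w >= 0 else -1.0


def stochastic_binarize(w, rng):
    """Illustrative stochastic projection: +1 with probability (clip(w) + 1) / 2."""
    p = (clip(w) + 1.0) / 2.0
    return 1.0 if rng.random() < p else -1.0


def additive_noise(w, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    return w + rng.gauss(0.0, sigma)


def multiplicative_noise(w, sigma, rng):
    """Scale the weight by (1 + Gaussian noise with standard deviation sigma)."""
    return w * (1.0 + rng.gauss(0.0, sigma))


def distort(weights, projection):
    """Apply a per-weight distortion to every weight in a list."""
    return [projection(w) for w in weights]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
weights = [-1.7, -0.4, 0.0, 0.3, 2.5]  # made-up example values

clipped = distort(weights, clip)
binary = distort(weights, binarize)
stochastic = distort(weights, lambda w: stochastic_binarize(w, rng))
noisy_add = distort(weights, lambda w: additive_noise(w, 0.1, rng))
noisy_mul = distort(weights, lambda w: multiplicative_noise(w, 0.1, rng))
```

In a test-time robustness experiment of the kind the abstract describes, one would apply such a distortion to a trained network's weights and then measure test error; the network and evaluation loop are omitted here.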

representative citing papers

Single-bit-per-weight deep convolutional neural networks without batch-normalization layers for embedded systems

cs.LG · 2019-07-16 · unverdicted · novelty 4.0

Experiments show that shifted-ReLU layers can replace batch-normalization in single-bit-weight wide residual networks on CIFAR-10/100 and ImageNet without consistent accuracy penalty.