SGDR uses periodic warm restarts of the learning rate in SGD to reach new state-of-the-art error rates of 3.14% on CIFAR-10 and 16.21% on CIFAR-100.
8.1 50 K VS 100 K EXAMPLES PER EPOCH Our data augmentation procedure code is inherited from the Lasagne Recipe code for ResNets where flipped images are added to the training set
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2016 1verdicts
ACCEPT 1representative citing papers
citing papers explorer
-
SGDR: Stochastic Gradient Descent with Warm Restarts
SGDR uses periodic warm restarts of the learning rate in SGD to reach new state-of-the-art error rates of 3.14% on CIFAR-10 and 16.21% on CIFAR-100.